Environment: RHEL 5.8, RAC 11.2.0.3.0
1: Review the OCR and voting disk information:
In 11g Release 2 your voting disk data is automatically backed up in the OCR whenever there is a configuration change.
Recovery therefore only requires restoring an OCR backup. Unlike 10g, the voting disks do not need to be backed up separately; backing up the OCR is enough.

2: Check OCR integrity with the cluvfy tool
[grid@rac1 ~]$ cluvfy comp ocr -n all
Verifying OCR integrity
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations
ASM Running check passed. ASM is running on all specified nodes
Checking OCR config file "/etc/oracle/ocr.loc"...
OCR config file "/etc/oracle/ocr.loc" check successful
Disk group for ocr location "+CRSDATA" available on all the nodes
NOTE:
This check does not verify the integrity of the OCR contents. Execute 'ocrcheck' as a privileged user to verify the contents of OCR.
OCR integrity check passed
Verification of OCR integrity was successful.

3: Check the integrity of the OCR contents with ocrcheck
[grid@rac1 ~]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3016
         Available space (kbytes) :     259104
         ID                       : 1236405787
         Device/File Name         :   +CRSDATA
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check bypassed due to non-privileged user
-- When ocrcheck is run as root, it reports "Logical corruption check succeeded" instead.

4: Check the voting disk information
[grid@rac1 ~]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   2b1bd0c122584f5abf72033b2b2d26bd (/dev/asm-b_crs) [CRSDATA]
 2. ONLINE   2bc03776cdd94f5cbfb9165c473fdb0e (/dev/asm-c_crs) [CRSDATA]
 3. ONLINE   3b43c39513a64f2dbf7083a9510ada89 (/dev/asm-d_crs) [CRSDATA]
Located 3 voting disk(s).
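
The voting disk listing in step 4 can also be checked from a script. The sketch below is only an illustration: the function names are made up for this example, and it assumes the output layout shown in step 4 (a numbered row per voting file with the state in the second column). It relies on the rule that CSS needs a strict majority of the configured voting files, which matches the CRS-1705 message later in this walkthrough (2 of 3 required).

```shell
#!/bin/sh
# Hypothetical helpers (not part of Oracle Clusterware) that read the text
# printed by `crsctl query css votedisk` on stdin.

# Count every configured voting file, whatever its state.
# Data rows start with an index such as "1." in the first column.
count_configured_votedisks() {
    awk '$1 ~ /^[0-9]+\.$/ { n++ } END { print n + 0 }'
}

# Count only the voting files whose STATE column reads ONLINE.
count_online_votedisks() {
    awk '$1 ~ /^[0-9]+\.$/ && $2 == "ONLINE" { n++ } END { print n + 0 }'
}

# Succeed when the online files form a strict majority of the configured ones.
# Usage: votedisk_quorum_ok CONFIGURED ONLINE
votedisk_quorum_ok() {
    configured=$1
    online=$2
    required=$(( configured / 2 + 1 ))
    [ "$online" -ge "$required" ]
}
```

On a live cluster the input would come from the real command, for example `crsctl query css votedisk | count_online_votedisks`; the helpers themselves never touch the cluster.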
The ocrcheck and crsctl query css votedisk output in steps 3 and 4 shows that both the OCR and the voting disks reside in the +CRSDATA disk group. Note that +CRSDATA also holds the ASM startup parameter file: at startup, ASM reads the kfdhdb.spfile field in the disk header, which points to the AU number on that disk where the spfile is stored, and reads the spfile from there.

5: Take a manual export of the OCR:
[root@rac1 grid]# ocrconfig -export /tmp/ocr_20130717.dmp
[root@rac1 grid]# ll /tmp/ocr_20130717.dmp -h
-rw------- 1 root root 102K Jul 17 14:45 /tmp/ocr_20130717.dmp

6: List the automatic OCR backups
[grid@rac1 ~]$ ocrconfig -showbackup
rac1 2013/07/16 15:45:24 /u01/app/11.2.0.3/grid/cdata/ad-cluster/backup00.ocr
rac2 2013/07/16 08:13:38 /u01/app/11.2.0.3/grid/cdata/ad-cluster/backup01.ocr
rac2 2013/07/16 04:14:09 /u01/app/11.2.0.3/grid/cdata/ad-cluster/backup02.ocr
rac2 2013/07/16 00:14:38 /u01/app/11.2.0.3/grid/cdata/ad-cluster/day.ocr
rac2 2013/07/07 04:40:11 /u01/app/11.2.0.3/grid/cdata/ad-cluster/week.ocr
PROT-25: Manual backups for the Oracle Cluster Registry are not available

7: Save a copy of the ASM parameter file. If none was saved beforehand, one can be obtained from $CRS_HOME/dbs/init.ora. The full contents of this parameter file are shown later.
[grid@rac1 dbs]$ sqlplus / as sysasm
SQL> create pfile='/tmp/asm_pfile_130717.txt' from spfile;
File created.

8: Destroy the +CRSDATA disk group that holds the OCR
[root@rac1 dev]# dd if=/dev/zero of=/dev/asm-b_crs bs=1024 count=1000
[root@rac1 dev]# dd if=/dev/zero of=/dev/asm-c_crs bs=1024 count=1000

9: After disks b and c were damaged, the checks still passed without errors. Stop CRS on rac1 and rac2.
[root@rac1 dev]# crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.crsd' on 'rac1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'rac1'
.....................
CRS-4133: Oracle High Availability Services has been stopped.
[root@rac2 dev]# crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.crsd' on 'rac1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'rac1'
.....................
CRS-4133: Oracle High Availability Services has been stopped.
[root@rac1 dev]# ps -ef |grep ora_
root 16189 32265 0 16:26 pts/0 00:00:00 grep ora_
[root@rac1 dev]# ps -ef |grep asm_
root 16195 32265 0 16:26 pts/0 00:00:00 grep asm_

10: Start CRS again; it now fails
[root@rac1 dev]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[root@rac1 ~]# tail -50f /u01/app/11.2.0.3/grid/log/rac1/alertrac1.log
[cssd(16559)]CRS-1637:Unable to locate configured voting file with ID 2b1bd0c1-22584f5a-bf72033b-2b2d26bd; details at (:CSSNM00020:) in /u01/app/11.2.0.3/grid/log/rac1/cssd/ocssd.log
2013-07-17 16:28:15.947
[cssd(16559)]CRS-1637:Unable to locate configured voting file with ID 2bc03776-cdd94f5c-bfb9165c-473fdb0e; details at (:CSSNM00020:) in /u01/app/11.2.0.3/grid/log/rac1/cssd/ocssd.log
2013-07-17 16:28:15.947
[cssd(16559)]CRS-1705:Found 1 configured voting files but 2 voting files are required, terminating to ensure data integrity; details at (:CSSNM00021:) in /u01/app/11.2.0.3/grid/log/rac1/cssd/ocssd.log
2013-07-17 16:28:15.948
[cssd(16559)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /u01/app/11.2.0.3/grid/log/rac1/cssd/ocssd.log
2013-07-17 16:28:16.073
[cssd(16559)]CRS-1603:CSSD on node rac1 shutdown by user.

ocrcheck reports an error as well:
[root@rac1 dev]# ocrcheck
PROT-602: Failed to retrieve data from the cluster registry
PROC-26: Error while accessing the physical storage

11: Force CRS to stop:
[root@rac1 dev]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
[root@rac1 dev]# crsctl stop crs -f
CRS-2797: Shutdown is already in progress for 'rac1', waiting for it to complete
CRS-2797: Shutdown is already in progress for 'rac1', waiting for it to complete
CRS-4133: Oracle High Availability Services has been stopped.

12: Start rac1 in exclusive mode
[root@rac1 dev]# crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'
CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'
CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'rac1'
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded
CRS-2676: Start of 'ora.drivers.acfs' on 'rac1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded

12: Create the CRSVOTEDISK disk group and the spfile
[grid@rac1 ~]$ asmcmd
ASMCMD> ls
(empty)
[grid@rac1 ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.3.0 Production on Wed Jul 17 16:58:18 2013
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> show parameter spfile
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string
SQL> create diskgroup CRSVOTEDISK normal redundancy disk '/dev/asm-b_crs','/dev/asm-c_crs', '/dev/asm-d_crs'
  2  attribute 'compatible.asm'='11.2.0.0.0', 'compatible.rdbms'='11.2.0.0.0';
create diskgroup CRSVOTEDISK normal redundancy disk '/dev/asm-b_crs','/dev/asm-c_crs', '/dev/asm-d_crs'
*
ERROR at line 1:
ORA-15018: diskgroup cannot be created
ORA-15033: disk '/dev/asm-d_crs' belongs to diskgroup "CRSDATA"
-- This error occurs because the disk header of asm-d_crs has not been wiped.

Wipe the disk header of asm-d_crs:
[root@rac1 dev]# dd if=/dev/zero of=/dev/asm-d_crs bs=1024 count=1000
SQL> create diskgroup CRSVOTEDISK normal redundancy disk '/dev/asm-b_crs','/dev/asm-c_crs','/dev/asm-d_crs'
  2  attribute 'compatible.asm'='11.2.0.0.0', 'compatible.rdbms'='11.2.0.0.0';
Diskgroup created.
SQL> create spfile='+CRSVOTEDISK' from pfile='/tmp/asm_pfile_130717.txt';
File created.
SQL> quit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
[grid@rac1 ~]$ asmcmd
ASMCMD> ls
CRSVOTEDISK/
ASMCMD> ls CRSVOTEDISK
ad-cluster/
ASMCMD> ls CRSVOTEDISK/ad-cluster/
ASMPARAMETERFILE/
ASMCMD> ls CRSVOTEDISK/ad-cluster/ASMPARAMETERFILE
REGISTRY.253.821034567

13: Restore OCR from backup:
In /etc/oracle/ocr.loc, change the original disk group +CRSDATA to the newly created disk group +CRSVOTEDISK.
[root@rac1 dev]# vim /etc/oracle/ocr.loc
ocrconfig_loc=+CRSVOTEDISK
local_only=FALSE
[root@rac1 dev]# ocrconfig -restore /u01/app/11.2.0.3/grid/cdata/ad-cluster/backup00.ocr
A new OCRFILE directory now appears:
ASMCMD> ls CRSVOTEDISK/ad-cluster
ASMPARAMETERFILE/
OCRFILE/
ASMCMD> ls CRSVOTEDISK/ad-cluster/OCRFILE -l
Type              Redund  Striped  Time             Sys  Name
OCRFILE           MIRROR  COARSE   JUL 17 17:00:00  Y    REGISTRY.255.821036449
ASMCMD> ls CRSVOTEDISK/ad-cluster/ASMPARAMETERFILE -l
Type              Redund  Striped  Time             Sys  Name
ASMPARAMETERFILE  MIRROR  COARSE   JUL 17 17:00:00  Y    REGISTRY.253.821034567
The check succeeds:
[root@rac1 dev]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3016
         Available space (kbytes) :     259104
         ID                       : 1236405787
         Device/File Name         : +CRSVOTEDISK
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded

14: Restore the Voting Disk:
[root@rac1 dev]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. OFFLINE  2b1bd0c122584f5abf72033b2b2d26bd () []
 2. OFFLINE  2bc03776cdd94f5cbfb9165c473fdb0e () []
 3. ONLINE   3b43c39513a64f2dbf7083a9510ada89 (/dev/asm-d_crs) [CRSDATA]
Located 3 voting disk(s).
[root@rac1 dev]# crsctl replace votedisk +CRSVOTEDISK
CRS-4602: Failed 27 to add voting file 5818c2c531394f45bff13c5a7532c8d4.
CRS-4602: Failed 27 to add voting file 1ce0436528624faabf7d4a1dd8dc978a.
CRS-4602: Failed 27 to add voting file 09def2b244af4f42bf13679a8aa0ff73.
Failure 27 with Cluster Synchronization Services while deleting voting disk.
Failure 27 with Cluster Synchronization Services while deleting voting disk.
Failure 27 with Cluster Synchronization Services while deleting voting disk.
Failed to replace voting disk group with +CRSVOTEDISK.
CRS-4000: Command Replace failed, or completed with errors.
This error is a consequence of asm-d_crs not having had its disk header wiped at the beginning.

======== The voting disk restore failed at this point; the recovery is attempted again from the start below ========
Points to note for the next attempt:
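After starting with crsctl start crs -excl -nocrs, shut down ASM right away. Do not create the CRSVOTEDISK disk group immediately; restart ASM with the saved parameter file first.
Otherwise, creating the disk group may fail with errors like the following.
For example (the operations below are for reference only, until the recovery is restarted further down):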
[grid@rac1 ~]$ sqlplus / as sysasm
SQL> create diskgroup CRSVOTEDISK normal redundancy disk '/dev/asm-b_crs','/dev/asm-c_crs','/dev/asm-d_crs'
  2  attribute 'compatible.asm'='11.2.0.0.0', 'compatible.rdbms'='11.2.0.0.0';
create diskgroup CRSVOTEDISK normal redundancy disk '/dev/asm-b_crs','/dev/asm-c_crs','/dev/asm-d_crs'
*
ERROR at line 1:
ORA-15018: diskgroup cannot be created
ORA-15031: disk specification '/dev/asm-d_crs' matches no disks
ORA-15014: path '/dev/asm-d_crs' is not in the discovery set
ORA-15031: disk specification '/dev/asm-c_crs' matches no disks
ORA-15014: path '/dev/asm-c_crs' is not in the discovery set
ORA-15031: disk specification '/dev/asm-b_crs' matches no disks
ORA-15014: path '/dev/asm-b_crs' is not in the discovery set
-- The devices are presumably not found for the same reason as below: no discovery path is specified.
SQL> col PATH for a50
SQL> select group_number, disk_number, mount_status, header_status, path from v$asm_disk;
no rows selected
No disks were discovered. The reason is now clear: the discovery path was simply never set in the parameters.
SQL> show parameter asm
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskgroups                       string      DATA        --- empty when ASM is started with the default parameter file; the disk group that holds the OCR is normally not listed here either
asm_diskstring                       string      /dev/asm*   --- empty when ASM is started with the default parameter file, so no discovery path is specified
asm_power_limit                      integer     1
asm_preferred_read_failure_groups    string
So, to be safe, after crsctl start crs -excl -nocrs shut down ASM right away instead of creating the CRSVOTEDISK disk group immediately, then start ASM with the saved parameter file:
SQL> startup pfile='/tmp/asm_pfile_130717.txt';
[grid@rac1 ~]$ cat /tmp/asm_pfile_130717.txt
+ASM1.__oracle_base='/u01/app/grid'#ORACLE_BASE set from in memory value
+ASM1.asm_diskgroups='DATA'#Manual Mount
+ASM2.asm_diskgroups='DATA'#Manual Mount
*.asm_diskstring='/dev/asm*'
*.asm_power_limit=1
*.diagnostic_dest='/u01/app/grid'
*.instance_type='asm'
*.large_pool_size=12M
*.remote_login_passwordfile='EXCLUSIVE'
After this, v$asm_disk lists the disks and the disk group can be created without trouble. Because I had not shut ASM down immediately and restarted it with the prepared parameter file, disk group creation kept reporting that the disks could not be found, which cost me half a day.
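
Since the missing asm_diskstring was what caused the ORA-15014 errors above, it may be worth checking the saved pfile before starting ASM with it. The following is a minimal sketch, assuming the pfile uses the `*.asm_diskstring='...'` line format shown above; the function names are hypothetical and only parse a text file.

```shell
#!/bin/sh
# Hypothetical helpers that inspect an ASM pfile (plain text) for asm_diskstring.

# Print the asm_diskstring value from the pfile given as $1,
# or print nothing if the parameter is not set there.
pfile_asm_diskstring() {
    # Matches lines such as: *.asm_diskstring='/dev/asm*'
    sed -n "s/^[^#]*\.asm_diskstring='\([^']*\)'.*/\1/p" "$1" | head -n 1
}

# Succeed only if the pfile sets a non-empty asm_diskstring.
pfile_has_diskstring() {
    [ -n "$(pfile_asm_diskstring "$1")" ]
}
```

A possible use is `pfile_has_diskstring /tmp/asm_pfile_130717.txt || echo "asm_diskstring is not set"` before running `startup pfile=...`.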
The recovery is now repeated from the start:
Stop CRS, then start it again in exclusive mode
[root@rac1 dev]# crsctl stop crs
[root@rac1 dev]# crsctl start crs -excl -nocrs
[root@rac1 dev]# crsctl query css votedisk
Located 0 voting disk(s).
Shut down ASM on rac1, start it again with the parameter file, then create the CRS disk group and the spfile
[grid@rac1 ~]$ sqlplus / as sysasm
SQL> shutdown immediate
ASM diskgroups dismounted
ASM instance shutdown
SQL> startup pfile='/tmp/asm_pfile_130717.txt';
ASM instance started
SQL> col path for a50
SQL> set linesize 130
SQL> select group_number, disk_number, mount_status, header_status, path from v$asm_disk;
GROUP_NUMBER DISK_NUMBER MOUNT_S HEADER_STATU PATH
------------ ----------- ------- ------------ ------------------------------
           0           0 CLOSED  MEMBER       /dev/asm-e_data
           0           3 CLOSED  CANDIDATE    /dev/asm-b_crs
           0           2 CLOSED  CANDIDATE    /dev/asm-c_crs
           0           1 CLOSED  CANDIDATE    /dev/asm-d_crs
SQL> create diskgroup CRSVOTEDISK normal redundancy disk '/dev/asm-b_crs','/dev/asm-c_crs','/dev/asm-d_crs'
  2  attribute 'compatible.asm'='11.2.0.0.0', 'compatible.rdbms'='11.2.0.0.0';
Diskgroup created.
SQL> create spfile='+CRSVOTEDISK' from pfile='/tmp/asm_pfile_130717.txt';
File created.
SQL> quit

Restore the OCR
[root@rac1 dev]# ocrconfig -restore /u01/app/11.2.0.3/grid/cdata/ad-cluster/backup00.ocr
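
The backup passed to ocrconfig -restore above was picked by hand from the ocrconfig -showbackup listing in step 6. As a rough sketch, the newest automatic backup could be selected from that listing with a small filter. This assumes the four-column layout shown in step 6 (node, date, time, path); the function name is invented for this example. The node name is printed too, because the backup file is expected to live on the node listed in the first column.

```shell
#!/bin/sh
# Hypothetical helper: read `ocrconfig -showbackup` output on stdin and print
# "<node> <path>" for the most recent backup, or nothing if none is listed.
latest_ocr_backup() {
    awk '
        # Backup rows look like: rac1 2013/07/16 15:45:24 /path/backup00.ocr
        NF >= 4 && $2 ~ /^[0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]$/ && $4 ~ /\.ocr$/ {
            key = $2 " " $3            # YYYY/MM/DD HH:MM:SS sorts correctly as text
            if (key > best) { best = key; node = $1; path = $4 }
        }
        END { if (path != "") print node, path }
    '
}
```

With the listing from step 6 this selects backup00.ocr on rac1, the same file restored above. Messages such as PROT-25 are ignored because they do not match the row pattern.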
Restore the voting disks
[root@rac1 dev]# crsctl replace votedisk +CRSVOTEDISK
Successful addition of voting disk 1b00b0ec4e504f7fbf1f8d20fbbfaa4b.
Successful addition of voting disk 5a3b646433124fdcbf23c3c290de7fe3.
Successful addition of voting disk 5d27d80b96d74f09bf1756be6dee387f.
Successfully replaced voting disk group with +CRSVOTEDISK .
CRS-4266: Voting file(s) successfully replaced

Verify
[root@rac1 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3016
         Available space (kbytes) :     259104
         ID                       : 1236405787
         Device/File Name         : +CRSVOTEDISK
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded
[root@rac1 ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   1b00b0ec4e504f7fbf1f8d20fbbfaa4b (/dev/asm-b_crs) [CRSVOTEDISK ]
 2. ONLINE   5a3b646433124fdcbf23c3c290de7fe3 (/dev/asm-c_crs) [CRSVOTEDISK ]
 3. ONLINE   5d27d80b96d74f09bf1756be6dee387f (/dev/asm-d_crs) [CRSVOTEDISK ]
Located 3 voting disk(s).

Stop CRS and start it in normal mode:
[root@rac1 ~]# crsctl stop crs
[root@rac1 ~]# crsctl start crs
At this point the OCR and the voting disks have been restored. Note, however, that ocrconfig_loc in /etc/oracle/ocr.loc on rac2 must also be changed to +CRSVOTEDISK; otherwise startup fails with:
[/u01/app/11.2.0.3/grid/bin/oraagent.bin(19510)]CRS-5019:All OCR locations are on ASM disk groups [CRSDATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/u01/app/11.2.0.3/grid/log/rac1/agent/ohasd/oraagent_grid/oraagent_grid.log".
2013-07-18 00:10:33.678
[/u01/app/11.2.0.3/grid/bin/oraagent.bin(19510)]CRS-5019:All OCR locations are on ASM disk groups [CRSDATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/u01/app/11.2.0.3/grid/log/rac1/agent/ohasd/oraagent_grid/oraagent_grid.log".
2013-07-18 00:11:03.614
[root@rac2 ~]# crsctl start crs
[root@rac1 ~]# crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.CRSDATA.dg ora....up.type ONLINE    OFFLINE
ora.DATA.dg    ora....up.type ONLINE    ONLINE    rac1
ora....ER.lsnr ora....er.type ONLINE    ONLINE    rac1
ora....N1.lsnr ora....er.type ONLINE    ONLINE    rac1
ora.asm        ora.asm.type   ONLINE    ONLINE    rac1
ora.chris.db   ora....se.type ONLINE    ONLINE    rac1
ora.cvu        ora.cvu.type   ONLINE    ONLINE    rac1
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora....network ora....rk.type ONLINE    ONLINE    rac1
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    rac1
ora.ons        ora.ons.type   ONLINE    ONLINE    rac1
ora....SM1.asm application    ONLINE    ONLINE    rac1
ora....C1.lsnr application    ONLINE    ONLINE    rac1
ora.rac1.gsd   application    OFFLINE   OFFLINE
ora.rac1.ons   application    ONLINE    ONLINE    rac1
ora.rac1.vip   ora....t1.type ONLINE    ONLINE    rac1
ora....SM2.asm application    ONLINE    ONLINE    rac2
ora....C2.lsnr application    ONLINE    ONLINE    rac2
ora.rac2.gsd   application    OFFLINE   OFFLINE
ora.rac2.ons   application    ONLINE    ONLINE    rac2
ora.rac2.vip   ora....t1.type ONLINE    ONLINE    rac2
ora....ry.acfs ora....fs.type ONLINE    ONLINE    rac1
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    rac1
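
The CRS-5019 error above came from a node whose /etc/oracle/ocr.loc still pointed at the old disk group. A small check like the one below could be run on every node before restarting CRS. It is only a sketch: the function names are hypothetical, and it assumes the `ocrconfig_loc=<location>` line format shown in step 13.

```shell
#!/bin/sh
# Hypothetical helpers that read an ocr.loc file; they do not modify anything.

# Print the ocrconfig_loc value from the file given as $1
# (defaults to /etc/oracle/ocr.loc when no argument is passed).
ocr_loc_value() {
    sed -n 's/^ocrconfig_loc=//p' "${1:-/etc/oracle/ocr.loc}" | head -n 1
}

# Succeed when the file's ocrconfig_loc equals the expected location.
# Usage: ocr_loc_matches EXPECTED [FILE]
ocr_loc_matches() {
    [ "$(ocr_loc_value "$2")" = "$1" ]
}
```

For example, `ocr_loc_matches +CRSVOTEDISK || echo "ocr.loc still points elsewhere"` run on rac2 would have flagged the stale +CRSDATA entry before CRS was started.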