Node fails to start because the OLR is missing

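Published: 2016-09-10 15:18:53  Source: Linux website  Author: 丹心明月
Environment: RHEL 6.5 + 11.2.0.4 RAC, two nodes
Problem: the OLR (Oracle Local Registry) was deliberately deleted; after a reboot, GI (Grid Infrastructure) would not start.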
 
Analysis:
1. Determine how far GI startup got
[grid@rac1 ~]$ crsctl status resource -t -init  
CRS-4639: Could not contact Oracle High Availability Services  
CRS-4000: Command Status failed, or completed with errors.  
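Analysis: not even OHASD is running. There are two possibilities: (1) the init.ohasd script was never invoked, or (2) the ohasd.bin daemon failed to start. So check the process list: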
[grid@rac1 ~]$ ps -ef | grep ohas |grep -v grep  
root 960 1  0 09:23 ? 00:00:00 /bin/sh /etc/init.d/init.ohasd run  
The script was invoked, but the daemon did not start successfully.
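
The two-way check above can be written as a small helper. This is only a sketch: `classify_ohasd` is a hypothetical function name, not something shipped with Grid Infrastructure, and it assumes the process names `init.ohasd` and `ohasd.bin` appear in the `ps -ef` output as they do in this walkthrough.

```shell
#!/bin/sh
# classify_ohasd (hypothetical helper): reads `ps -ef` output on stdin
# and prints which of the cases from step 1 applies.
classify_ohasd() {
    ps_out=$(cat)
    if printf '%s\n' "$ps_out" | grep -q 'ohasd\.bin'; then
        # The daemon itself is present.
        echo "running"
    elif printf '%s\n' "$ps_out" | grep -q 'init\.ohasd'; then
        # The init script was invoked, but ohasd.bin is not there.
        echo "daemon_failed"
    else
        # Neither the script nor the daemon shows up.
        echo "script_not_called"
    fi
}

# Usage on a live node:
#   ps -ef | classify_ohasd
```

For the process list shown above, the helper would report `daemon_failed`, which is the case that points to the ohasd log as the next place to look.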
 
2. Check the ohasd log (in 11.2 it is normally found at $GRID_HOME/log/<hostname>/ohasd/ohasd.log)
2016-04-18 12:26:25.918: [ default][1661986592] OHASD Daemon Starting. Command string :restart  
2016-04-18 12:26:25.919: [ default][1661986592] Initializing OLR  
2016-04-18 12:26:25.919: [  OCROSD][1661986592]utopen:6m': failed in stat OCR file/disk /u01/app/11.2.0.1/grid/cdata/rac1.olr, errno=2, os err string=No such file or directory  
2016-04-18 12:26:25.919: [  OCROSD][1661986592]utopen:7: failed to open any OCR file/disk, errno=2, os err string=No such file or directory  
2016-04-18 12:26:25.919: [  OCRRAW][1661986592]proprinit: Could not open raw device  
2016-04-18 12:26:25.919: [  OCRAPI][1661986592]a_init:16!: Backend init unsuccessful : [26]  
2016-04-18 12:26:25.920: [  CRSOCR][1661986592] OCR context init failure.  Error: PROCL-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]  
2016-04-18 12:26:25.920: [ default][1661986592] Created alert : (:OHAS00106:) :  OLR initialization failed, error: PROCL-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]  
2016-04-18 12:26:25.920: [ default][1661986592][PANIC] OHASD exiting; Could not init OLR  
2016-04-18 12:26:25.920: [ default][1661986592] Done.  
Analysis: the errors say the OLR cannot be opened, so check whether the file exists (it was deleted by hand, so of course it does not).
[grid@rac1 cdata]$ ll  
total 12  
drwxrwxr-x 2 grid oinstall 4096 Apr 18 07:51 liming-cluster  
drwxr-xr-x 2 grid oinstall 4096 Apr 18 07:49 localhost  
drwxr-xr-x 2 grid oinstall 4096 Apr 18 08:11 rac1  
The OLR file rac1.olr is gone.
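
The path to check can be pulled straight out of the log instead of being read by eye. The sketch below assumes the "failed in stat OCR file/disk <path>, errno=..." wording seen in the excerpt above; `missing_olr_path` is a hypothetical helper name.

```shell
#!/bin/sh
# missing_olr_path (hypothetical helper): reads ohasd log text on stdin
# and prints each distinct path named in a "failed in stat" line.
missing_olr_path() {
    sed -n 's/.*failed in stat OCR file\/disk \([^,]*\),.*/\1/p' | sort -u
}

# Usage, given the path to the ohasd log in $log:
#   missing_olr_path < "$log" | while read -r p; do
#       [ -e "$p" ] || echo "missing: $p"
#   done
```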
 
3. Check whether an OLR backup exists
[grid@rac1 rac1]$ ll  
total 6644  
-rw------- 1 root root 6803456 Apr 18 08:11 backup_20160418_081108.olr  
A backup exists.
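
If the directory holds several backups, the newest one can be picked by name. This sketch assumes every backup follows the backup_YYYYMMDD_HHMMSS.olr pattern of the single file listed above, so that lexical order equals chronological order; `latest_olr_backup` is a hypothetical helper name.

```shell
#!/bin/sh
# latest_olr_backup DIR (hypothetical helper): prints the backup file in
# DIR whose name sorts last. Assumes names embed a sortable timestamp,
# as in backup_20160418_081108.olr.
latest_olr_backup() {
    latest=
    # Shell globs expand in sorted order, so the last match is the newest.
    for f in "$1"/backup_*.olr; do
        [ -e "$f" ] || continue   # skip the literal pattern if nothing matched
        latest=$f
    done
    if [ -n "$latest" ]; then
        printf '%s\n' "$latest"
    else
        return 1
    fi
}

# Usage:
#   latest_olr_backup /u01/app/11.2.0.1/grid/cdata/rac1
```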
 
4. Restore the OLR
[root@rac1 bin]# ./ocrconfig -local -restore /u01/app/11.2.0.1/grid/cdata/rac1/backup_20160418_081108.olr   
PROTL-35: The configured OLR location is not accessible.  
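Here comes the step the book leaves out: the restore fails with PROTL-35 because the configured OLR file does not exist at all. Creating an empty file at that location first lets the restore go through.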
[grid@rac1 cdata]$ touch rac1.olr  
[root@rac1 bin]# ./ocrconfig -local -restore /u01/app/11.2.0.1/grid/cdata/rac1/backup_20160418_081108.olr   
[root@rac1 bin]#   
[grid@rac1 cdata]$ ll  
total 6660  
drwxrwxr-x 2 grid oinstall      4096 Apr 18 07:51 liming-cluster  
drwxr-xr-x 2 grid oinstall      4096 Apr 18 07:49 localhost  
drwxr-xr-x 2 grid oinstall      4096 Apr 18 08:11 rac1  
-rw-r--r-- 1 grid oinstall 272756736 Apr 18 13:02 rac1.olr
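
The touch-then-restore sequence can be captured in one function. This is a sketch under assumptions, not a vetted procedure: `restore_olr`, `run_cmd`, and the `DRY_RUN` variable are hypothetical names, `ocrconfig` is assumed to be on the PATH of the root user, and unlike the transcript above (where grid ran `touch`) the placeholder here is created by whoever runs the script, so ownership and permissions may need adjusting afterwards.

```shell
#!/bin/sh
# run_cmd (hypothetical): prints the command when DRY_RUN=1, else runs it.
run_cmd() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# restore_olr OLR_PATH BACKUP_PATH (hypothetical helper)
restore_olr() {
    olr=$1
    backup=$2
    # The restore reported PROTL-35 while the OLR file was absent,
    # so create an empty placeholder first if needed.
    [ -e "$olr" ] || run_cmd touch "$olr"
    run_cmd ocrconfig -local -restore "$backup"
}

# Preview the commands without touching anything:
#   DRY_RUN=1 restore_olr /u01/app/11.2.0.1/grid/cdata/rac1.olr \
#       /u01/app/11.2.0.1/grid/cdata/rac1/backup_20160418_081108.olr
```

Running with DRY_RUN=1 first is a cheap way to confirm both paths before anything is written.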
 
5. Start GI; the node comes back up normally
[root@rac1 bin]# ./crsctl start crs
 
Permanent link to this article: http://www.linuxdiyf.com/linux/24041.html