ASM resource online and ASM idle instance
Hi all,
OS..........: CentOS 5.8 32-bit (VM - VirtualBox)
Oracle...: 11gR2 (11.2.0.1.0)
In my test environment, I have a VM running Oracle Clusterware and two 11g instances. The problem is this: the ASM service appears online (through ./crsctl status resource -t, see below), but when I try to connect to the ASM instance as sysasm, I am told that the instance is idle, and if I try to start it up, I get the message "ORA-15149: another ASM instance found running on the host".
--output of ./crsctl status resource -t
NAME           TARGET  STATE        SERVER        STATE_DETAILS
Local Resources
ora.DATA.dg
               ONLINE  ONLINE       vm-cent5
ora.DG_MANOBRA.dg
               ONLINE  ONLINE       vm-cent5
ora.LISTENER.lsnr
               ONLINE  ONLINE       vm-cent5
ora.asm
               ONLINE  OFFLINE      vm-cent5      Instance Shutdown
Cluster Resources
ora.auxdb.db
      1        ONLINE  ONLINE       vm-cent5      Open
ora.cssd
      1        ONLINE  ONLINE       vm-cent5
ora.diskmon
      1        ONLINE  ONLINE       vm-cent5
If I try to start the ora.asm resource through ./crsctl start, the following output is shown:
[root@VM-Cent5 bin]# ./crsctl start resource ora.asm
CRS-2672: Attempting to start 'ora.asm' on 'vm-cent5'
ORA-01012: not logged on
CRS-2674: Start of 'ora.asm' on 'vm-cent5' failed
CRS-2679: Attempting to clean 'ora.asm' on 'vm-cent5'
CRS-2681: Clean of 'ora.asm' on 'vm-cent5' succeeded
CRS-4000: Command Start failed, or completed with errors.
I am probably doing something stupid, but I still cannot see what I'm doing wrong.
Any help will be appreciated. Thanks in advance.
One more thing I noticed in my recent tests: I restarted CRS (crsctl stop | start resource -all), and the ASM resource (ora.asm) came back online, both target and state, as you can see below:
[root@VM-Cent5 bin]# ./crsctl status resource -t
NAME           TARGET  STATE        SERVER        STATE_DETAILS
Local Resources
ora.DATA.dg
               ONLINE  ONLINE       vm-cent5
ora.DG_MANOBRA.dg
               ONLINE  ONLINE       vm-cent5
ora.LISTENER.lsnr
               ONLINE  ONLINE       vm-cent5
ora.asm
               ONLINE  ONLINE       vm-cent5      Started
Cluster Resources
ora.auxdb.db
      1        ONLINE  ONLINE       vm-cent5      Open
ora.cssd
      1        ONLINE  ONLINE       vm-cent5
ora.diskmon
      1        ONLINE  ONLINE       vm-cent5
... but when I connect as sysasm as user oracle (SID +ASM), I again see an idle instance.
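A possible cause worth ruling out (this thread does not confirm it) is an environment mismatch: SQL*Plus reports an idle instance whenever ORACLE_SID or ORACLE_HOME in the shell differs from the values the running instance was started with, and on Linux the SID comparison is case-sensitive. The Python sketch below compares a given ORACLE_SID with the ASM process-monitor name found in `ps -ef` output. The sample `ps` text, the user name, and the PIDs are made up for illustration.

```python
import re

# ASM's process monitor appears in `ps -ef` as asm_pmon_<SID>, e.g. asm_pmon_+ASM.
PMON_RE = re.compile(r"\basm_pmon_(\S+)")


def running_asm_sids(ps_output):
    """Return the SID of every ASM pmon process found in `ps -ef` output."""
    sids = []
    for line in ps_output.splitlines():
        match = PMON_RE.search(line)
        if match:
            sids.append(match.group(1))
    return sids


def check_sid(ps_output, oracle_sid):
    """Compare ORACLE_SID (case-sensitive) with the running ASM instance(s)."""
    sids = running_asm_sids(ps_output)
    if not sids:
        return "no ASM pmon process found: the instance is really down"
    if oracle_sid in sids:
        return "ORACLE_SID matches: next, compare ORACLE_HOME with the home ASM runs from"
    return "ORACLE_SID %r does not match running instance(s) %s" % (oracle_sid, sids)


# Hypothetical `ps -ef` excerpt; the user name and PIDs are invented.
SAMPLE_PS = """\
oracle    4321     1  0 10:02 ?        00:00:00 asm_pmon_+ASM
oracle    5000  4000  0 10:05 pts/0    00:00:00 grep asm_pmon
"""

print(check_sid(SAMPLE_PS, "+asm"))
```

On a real host, the input would come from `subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout` and `os.environ.get("ORACLE_SID")`. If the SID matches, the next thing to compare is ORACLE_HOME, which must be the Grid Infrastructure home that ASM was started from.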
Similar Messages
-
RAC DB and ASM SYS user passwords
Hi,
We are planning to change the SYS user password of our RAC instances. One of my colleagues says that if we change the DB password, we need to change the ASM password too, and that the ASM password must match the DB password, i.e. the new DB password should be the same as the new ASM password.
Is it true that we need to change the ASM password too if we change the DB password? And does the ASM password need to be the same as the DB password?
Thanks,
Mahipal Reddy
Hi,
Not true. You can have different SYS passwords for the database and for ASM. ASM has its own password file, and the DB instance has its own password file. You can change one without changing the other, and the two do not have to be the same.
-
All rac resource status is unknown only vip and asm is running fine
I have just installed Oracle 10g Clusterware and the Oracle software on RHEL 5.
Only ASM and the VIP are working fine; the rest of the resources are not working, and their status is UNKNOWN.
[oracle@rac1 bin]$ ./crs_stat -t
Name           Type         Target  State    Host
ora....SM1.asm application  ONLINE  ONLINE   rac1
ora....C1.lsnr application  ONLINE  UNKNOWN  rac1
ora.rac1.gsd   application  ONLINE  UNKNOWN  rac1
ora.rac1.ons   application  ONLINE  UNKNOWN  rac1
ora.rac1.vip   application  ONLINE  ONLINE   rac1
ora....SM2.asm application  ONLINE  ONLINE   rac2
ora....C2.lsnr application  ONLINE  UNKNOWN  rac2
ora.rac2.gsd   application  ONLINE  UNKNOWN  rac2
ora.rac2.ons   application  ONLINE  UNKNOWN  rac2
ora.rac2.vip   application  ONLINE  ONLINE   rac2
I tried to start it manually, but it returns errors like this:
[oracle@rac1 bin]$ ./crs_stop ora.rac1.LISTENER_RAC1.lsnr -f
Attempting to stop `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1`
Stop of `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1` succeeded.
[oracle@rac1 bin]$ ./crs_start ora.rac1.LISTENER_RAC1.lsnr -f
Attempting to start `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1`
`ora.rac1.LISTENER_RAC1.lsnr` on member `rac1` has experienced an unrecoverable failure.
Human intervention required to resume its availability.
CRS-0215: Could not start resource 'ora.rac1.LISTENER_RAC1.lsnr'.
[oracle@rac1 bin]$ ./crs_start ora.rac1.LISTENER_RAC1.lsnr -f
CRS-1028: Dependency analysis failed because of:
'Resource in UNKNOWN state: ora.rac1.LISTENER_RAC1.lsnr'
CRS-0223: Resource 'ora.rac1.LISTENER_RAC1.lsnr' has placement error.
[oracle@rac1 bin]$ ./crs_start ora.rac2.LISTENER_RAC2.lsnr -f
CRS-1028: Dependency analysis failed because of:
'Resource in UNKNOWN state: ora.rac2.LISTENER_RAC2.lsnr'
CRS-0223: Resource 'ora.rac2.LISTENER_RAC2.lsnr' has placement error.
I rebooted the system three times, but the problem is the same. Please help me solve this problem.
Edited by: 969752 on 18-Dec-2012 00:33
Hi,
Can't you start/stop the resources using srvctl?
Please see:
CRS: Resource in UNKNOWN state [ID 845709.1]
-
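When many resources are involved, it can help to pull the ones in UNKNOWN state out of the `crs_stat -t` listing before deciding which to stop and restart. The following Python sketch parses the tabular output shown in the post above. It assumes whitespace-separated columns and resource names starting with `ora.`; the function names are illustrative and not part of any Oracle tool.

```python
def parse_crs_stat(text):
    """Parse `crs_stat -t` output into a list of dicts, one per resource row."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip the header, shell prompts, and anything that is not a resource row.
        if len(parts) < 4 or not parts[0].startswith("ora."):
            continue
        name, rtype, target, state = parts[:4]
        # OFFLINE resources may have no host column at all.
        host = parts[4] if len(parts) > 4 else None
        rows.append({"name": name, "type": rtype, "target": target,
                     "state": state, "host": host})
    return rows


def resources_in_state(text, state="UNKNOWN"):
    """Return the names of all resources whose State column equals `state`."""
    return [row["name"] for row in parse_crs_stat(text) if row["state"] == state]


# Excerpt copied from the crs_stat -t output in the post above.
SAMPLE = """\
[oracle@rac1 bin]$ ./crs_stat -t
Name Type Target State Host
ora....SM1.asm application ONLINE ONLINE rac1
ora....C1.lsnr application ONLINE UNKNOWN rac1
ora.rac1.gsd application ONLINE UNKNOWN rac1
ora.rac1.ons application ONLINE UNKNOWN rac1
ora.rac1.vip application ONLINE ONLINE rac1
"""

for name in resources_in_state(SAMPLE):
    print(name)
```

Note that `crs_stat -t` abbreviates long names (for example `ora....C1.lsnr`), so the names returned here may need to be matched against the full names printed by plain `crs_stat` before being passed to another command.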
Rac resource status is unknown only vip and asm is running fine
I have just installed Oracle 10g Clusterware and the Oracle software on RHEL 5.
Only ASM and the VIP are working fine; the rest of the resources are not working, and their status is UNKNOWN.
I rebooted the system three times, but the problem is the same. Please help me solve this problem.
/opt/app/crs/log/rac1/alertrac1.log output:
[cssd(5441)]CRS-1605:CSSD voting file is online: /dev/raw/raw2. Details in /opt/app/crs/log/rac1/cssd/ocssd.log.
2012-12-19 05:17:02.561
[cssd(5441)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac1 .
2012-12-19 05:17:03.998
[crsd(4718)]CRS-1012:The OCR service started on node rac1.
2012-12-19 05:17:04.028
[evmd(5327)]CRS-1401:EVMD started on node rac1.
2012-12-19 05:17:12.456
[crsd(4718)]CRS-1201:CRSD started on node rac1.
2012-12-19 05:17:23.668
[cssd(5441)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac1 rac2 .
2012-12-19 07:23:46.211
[cssd(5216)]CRS-1605:CSSD voting file is online: /dev/raw/raw2. Details in /opt/app/crs/log/rac1/cssd/ocssd.log.
2012-12-19 07:23:49.399
[cssd(5216)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac1 .
2012-12-19 07:23:50.458
[crsd(4709)]CRS-1012:The OCR service started on node rac1.
2012-12-19 07:23:50.490
[evmd(5098)]CRS-1401:EVMD started on node rac1.
2012-12-19 07:23:55.776
[crsd(4709)]CRS-1201:CRSD started on node rac1.
2012-12-19 07:25:00.583
[cssd(5216)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac1 rac2 .
2012-12-20 00:09:11.199
[cssd(5286)]CRS-1605:CSSD voting file is online: /dev/raw/raw2. Details in /opt/app/crs/log/rac1/cssd/ocssd.log.
2012-12-20 00:09:14.907
[cssd(5286)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac1 .
2012-12-20 00:09:16.446
[evmd(5128)]CRS-1401:EVMD started on node rac1.
2012-12-20 00:09:16.459
[crsd(4756)]CRS-1012:The OCR service started on node rac1.
2012-12-20 00:10:02.406
[crsd(4756)]CRS-1201:CRSD started on node rac1.
2012-12-20 00:10:39.220
[cssd(5286)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac1 rac2 .
/opt/app/crs/log/rac1/crsd/crsd.log output:-
2012-12-20 00:09:15.606: [ CRSD][7390912]0Daemon Version: 10.2.0.1.0 Active Version: 10.2.0.1.0
2012-12-20 00:09:15.606: [ CRSD][7390912]0Active Version and Software Version are same
2012-12-20 00:09:15.606: [ CRSMAIN][7390912]0Initializing OCR
2012-12-20 00:09:15.801: [ OCRRAW][7390912]proprioo: for disk 0 (/dev/raw/raw1), id match (1), my id set (1669906634,1028247821) total id sets (1), 1st set (1669906634,1028247821), 2nd set (0,0) my votes (2), total votes (2)
2012-12-20 00:09:16.264: [ OCRMAS][3065346960]th_master:12: I AM THE NEW OCR MASTER at incar 1. Node Number = 1
2012-12-20 00:09:16.310: [ OCRRAW][3065346960]proprioo: for disk 0 (/dev/raw/raw1), id match (1), my id set (1669906634,1028247821) total id sets (1), 1st set (1669906634,1028247821), 2nd set (0,0) my votes (2), total votes (2)
2012-12-20 00:09:16.524: [ CRSD][7390912]0ENV Logging level for Module: allcomp 0
2012-12-20 00:09:16.528: [ CRSD][7390912]0ENV Logging level for Module: default 0
2012-12-20 00:09:16.534: [ CRSD][7390912]0ENV Logging level for Module: COMMCRS 0
2012-12-20 00:09:16.536: [ CRSD][7390912]0ENV Logging level for Module: COMMNS 0
2012-12-20 00:09:16.549: [ CRSD][7390912]0ENV Logging level for Module: CRSUI 0
2012-12-20 00:09:16.556: [ CRSD][7390912]0ENV Logging level for Module: CRSCOMM 0
2012-12-20 00:09:16.559: [ CRSD][7390912]0ENV Logging level for Module: CRSRTI 0
2012-12-20 00:09:16.562: [ CRSD][7390912]0ENV Logging level for Module: CRSMAIN 0
2012-12-20 00:09:16.564: [ CRSD][7390912]0ENV Logging level for Module: CRSPLACE 0
2012-12-20 00:09:16.567: [ CRSD][7390912]0ENV Logging level for Module: CRSAPP 0
2012-12-20 00:09:16.570: [ CRSD][7390912]0ENV Logging level for Module: CRSRES 0
2012-12-20 00:09:16.573: [ CRSD][7390912]0ENV Logging level for Module: CRSOCR 0
2012-12-20 00:09:16.576: [ CRSD][7390912]0ENV Logging level for Module: CRSTIMER 0
2012-12-20 00:09:16.582: [ CRSD][7390912]0ENV Logging level for Module: CRSEVT 0
2012-12-20 00:09:16.586: [ CRSD][7390912]0ENV Logging level for Module: CRSD 0
2012-12-20 00:09:16.590: [ CRSD][7390912]0ENV Logging level for Module: CLUCLS 0
2012-12-20 00:09:16.593: [ CRSD][7390912]0ENV Logging level for Module: OCRRAW 0
2012-12-20 00:09:16.596: [ CRSD][7390912]0ENV Logging level for Module: OCROSD 0
2012-12-20 00:09:16.600: [ CRSD][7390912]0ENV Logging level for Module: CSSCLNT 0
2012-12-20 00:09:16.603: [ CRSD][7390912]0ENV Logging level for Module: OCRAPI 0
2012-12-20 00:09:16.606: [ CRSD][7390912]0ENV Logging level for Module: OCRUTL 0
2012-12-20 00:09:16.609: [ CRSD][7390912]0ENV Logging level for Module: OCRMSG 0
2012-12-20 00:09:16.613: [ CRSD][7390912]0ENV Logging level for Module: OCRCLI 0
2012-12-20 00:09:16.651: [ CRSD][7390912]0ENV Logging level for Module: OCRCAC 0
2012-12-20 00:09:16.671: [ CRSD][7390912]0ENV Logging level for Module: OCRSRV 0
2012-12-20 00:09:16.678: [ CRSD][7390912]0ENV Logging level for Module: OCRMAS 0
2012-12-20 00:09:16.678: [ CRSMAIN][7390912]0Filename is /opt/app/crs/crs/init/rac1.pid
2012-12-20 00:09:16.956: [ CRSMAIN][7390912]0Using Authorizer location: /opt/app/crs/crs/auth/
2012-12-20 00:09:17.080: [ CRSMAIN][7390912]0Initializing RTI
2012-12-20 00:09:17.085: [CRSTIMER][2845059984]0Timer Thread Starting.
[ clsdmt][2866039696]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=rac1DBG_CRSD))
2012-12-20 00:09:17.236: [ CRSRES][7390912]0Parameter SECURITY = 1, running in USER Mode
2012-12-20 00:09:17.236: [ CRSMAIN][7390912]0Initializing EVMMgr
2012-12-20 00:09:17.475: [ COMMCRS][2834570128]clsc_connect: (0xa4e7cc8) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))
2012-12-20 00:09:18.437: [ COMMCRS][2834570128]clsc_connect: (0xa45b6b8) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))
2012-12-20 00:09:18.888: [ COMMCRS][2834570128]clsc_connect: (0xa45af68) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))
2012-12-20 00:09:19.575: [ COMMCRS][2834570128]clsc_connect: (0xa456f50) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))
2012-12-20 00:09:20.029: [ COMMCRS][2834570128]clsc_connect: (0xa45b330) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))
2012-12-20 00:09:47.675: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [172339896] retval lht [-27] Signal CV.
2012-12-20 00:09:55.947: [ CRSMAIN][7390912]0CRSD locked during state recovery, please wait.
2012-12-20 00:10:01.118: [ CRSMAIN][7390912]0CRSD recovered, unlocked.
2012-12-20 00:10:01.127: [ CRSMAIN][7390912]0QS socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=ora_crsqs))
2012-12-20 00:10:02.329: [ CRSMAIN][7390912]0CRSD UI socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=CRSD_UI_SOCKET))
2012-12-20 00:10:02.400: [ CRSMAIN][7390912]0E2E socket on: (ADDRESS=(PROTOCOL=tcp)(HOST=rac1-priv)(PORT=49896))
2012-12-20 00:10:02.401: [ CRSMAIN][7390912]0Starting Threads
2012-12-20 00:10:02.406: [ CRSMAIN][7390912]0CRS Daemon Started.
2012-12-20 00:10:09.239: [ CRSRES][2740161424]0startRunnable: setting CLI values
2012-12-20 00:10:09.612: [ CRSRES][2729671568]0startRunnable: setting CLI values
2012-12-20 00:10:10.089: [ CRSRES][2740161424]0Attempting to start `ora.rac1.vip` on member `rac1`
2012-12-20 00:10:10.147: [ CRSRES][2729671568]0Attempting to start `ora.rac2.vip` on member `rac1`
2012-12-20 00:10:25.883: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:25.895: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:25.907: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:25.929: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.009: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.036: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.077: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.099: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.112: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.124: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.138: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.156: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.181: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.197: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.213: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.225: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.239: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.253: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.265: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.275: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.288: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.299: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.312: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.324: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.335: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.351: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.366: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.379: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.389: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.400: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.414: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.426: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.438: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.449: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.460: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.473: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.486: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.500: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.513: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.523: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.537: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.551: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.563: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.574: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.587: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.607: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.620: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.636: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.650: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.662: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.680: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:26.694: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [176455408] retval lht [-27] Signal CV.
2012-12-20 00:10:40.640: [ CRSRES][2740161424]0Start of `ora.rac1.vip` on member `rac1` succeeded.
2012-12-20 00:10:42.653: [ CRSRES][2729671568]0Start of `ora.rac2.vip` on member `rac1` succeeded.
2012-12-20 00:10:44.290: [ CRSRES][2740161424]0startRunnable: setting CLI values
2012-12-20 00:10:44.600: [ CRSRES][2729671568]0StopResource: setting CLI values
2012-12-20 00:10:44.601: [ CRSRES][2740161424]0Attempting to start `ora.rac1.ASM1.asm` on member `rac1`
2012-12-20 00:10:44.669: [ CRSRES][2729671568]0Attempting to stop `ora.rac2.vip` on member `rac1`
2012-12-20 00:10:46.051: [ CRSRES][2729671568]0Stop of `ora.rac2.vip` on member `rac1` succeeded.
2012-12-20 00:10:46.154: [ COMMCRS][2696936336]clsc_send_msg: (0xa84ecb0) NS err (12571, 12560), transport (530, 111, 0)
2012-12-20 00:10:46.154: [ CRSCOMM][2729671568]0CLSC connection failed, ret = 9
2012-12-20 00:10:46.154: [ CRSEVT][2729671568]0invokepeer ret 200
2012-12-20 00:10:46.302: [ CRSRES][2729671568]0Remote start never sent to rac2: X_E2E_NotSent : Failed to connect to node: rac2
(File: caa_CmdRTI.cpp, line: 492
2012-12-20 00:10:46.303: [ CRSRES][2729671568][ALERT]0Remote start for `ora.rac2.vip` failed on member `rac2`
2012-12-20 00:10:46.446: [ CRSEVT][2740161424]0CAAMonitorHandler :: 0:Action Script /opt/app/oracle/product/db_1/bin/racgwrap(start) timed out for ora.rac1.ASM1.asm! (timeout=600)
2012-12-20 00:10:46.446: [ CRSAPP][2740161424]0StartResource error for ora.rac1.ASM1.asm error code = -2
2012-12-20 00:10:46.558: [ CRSRES][2729671568]0startRunnable: setting CLI values
2012-12-20 00:10:46.625: [ CRSEVT][2740161424]0CAAMonitorHandler :: 0:Action Script /opt/app/oracle/product/db_1/bin/racgwrap(stop) timed out for ora.rac1.ASM1.asm! (timeout=600)
2012-12-20 00:10:46.626: [ CRSAPP][2740161424]0StopResource error for ora.rac1.ASM1.asm error code = -2
2012-12-20 00:10:46.665: [ CRSRES][2729671568]0Attempting to start `ora.rac2.vip` on member `rac1`
2012-12-20 00:10:46.750: [ CRSRES][2740161424]0X_OP_StopResourceFailed : Stop Resource failed
(File: rti.cpp, line: 1698
2012-12-20 00:10:46.750: [ CRSRES][2740161424][ALERT]0`ora.rac1.ASM1.asm` on member `rac1` has experienced an unrecoverable failure.
2012-12-20 00:10:46.750: [ CRSRES][2740161424]0Human intervention required to resume its availability.
2012-12-20 00:10:46.938: [ CRSRES][2740161424]0startRunnable: setting CLI values
2012-12-20 00:10:46.978: [ CRSRES][2740161424]0Attempting to start `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1`
2012-12-20 00:10:47.541: [ CRSEVT][2740161424]0CAAMonitorHandler :: 0:Action Script /opt/app/oracle/product/db_1/bin/racgwrap(start) timed out for ora.rac1.LISTENER_RAC1.lsnr! (timeout=600)
2012-12-20 00:10:47.541: [ CRSAPP][2740161424]0StartResource error for ora.rac1.LISTENER_RAC1.lsnr error code = -2
2012-12-20 00:10:47.807: [ CRSEVT][2740161424]0CAAMonitorHandler :: 0:Action Script /opt/app/oracle/product/db_1/bin/racgwrap(stop) timed out for ora.rac1.LISTENER_RAC1.lsnr! (timeout=600)
2012-12-20 00:10:47.807: [ CRSAPP][2740161424]0StopResource error for ora.rac1.LISTENER_RAC1.lsnr error code = -2
2012-12-20 00:10:48.181: [ CRSRES][2740161424]0X_OP_StopResourceFailed : Stop Resource failed
(File: rti.cpp, line: 1698
2012-12-20 00:10:48.181: [ CRSRES][2740161424][ALERT]0`ora.rac1.LISTENER_RAC1.lsnr` on member `rac1` has experienced an unrecoverable failure.
2012-12-20 00:10:48.181: [ CRSRES][2740161424]0Human intervention required to resume its availability.
2012-12-20 00:10:50.692: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [177094664] retval lht [-27] Signal CV.
2012-12-20 00:11:00.139: [ CRSRES][2729671568]0Start of `ora.rac2.vip` on member `rac1` succeeded.
2012-12-20 00:11:00.257: [ CRSRES][2729671568]0StopResource: setting CLI values
2012-12-20 00:11:00.269: [ CRSRES][2729671568]0Attempting to stop `ora.rac2.vip` on member `rac1`
2012-12-20 00:11:00.882: [ CRSRES][2729671568]0Stop of `ora.rac2.vip` on member `rac1` succeeded.
2012-12-20 00:11:00.938: [ CRSRES][2729671568]0Attempting to start `ora.rac2.vip` on member `rac2`
2012-12-20 00:11:23.937: [ CRSRES][2729671568]0Start of `ora.rac2.vip` on member `rac2` failed.
2012-12-20 00:11:24.051: [ CRSRES][2729671568]0startRunnable: setting CLI values
2012-12-20 00:11:24.067: [ CRSRES][2729671568]0Attempting to start `ora.rac2.vip` on member `rac1`
2012-12-20 00:11:36.257: [ OCRSRV][3044367248]th_select_handler: Failed to retrieve procctx from ht. constr = [177090520] retval lht [-27] Signal CV.
2012-12-20 00:11:43.061: [ CRSAPP][2729671568]0StartResource error for ora.rac2.vip error code = 1
2012-12-20 00:11:46.191: [ CRSAPP][2740161424]0CheckResource error for ora.rac1.vip error code = 1
2012-12-20 00:11:46.196: [ CRSRES][2740161424]0In stateChanged, ora.rac1.vip target is ONLINE
2012-12-20 00:11:46.197: [ CRSRES][2740161424]0ora.rac1.vip on rac1 went OFFLINE unexpectedly
2012-12-20 00:11:46.197: [ CRSRES][2740161424]0StopResource: setting CLI values
2012-12-20 00:11:46.211: [ CRSRES][2740161424]0Attempting to stop `ora.rac1.vip` on member `rac1`
2012-12-20 00:11:46.422: [ CRSRES][2729671568]0Start of `ora.rac2.vip` on member `rac1` failed.
2012-12-20 00:11:46.815: [ CRSRES][2686446480]0startRunnable: setting CLI values
2012-12-20 00:11:46.848: [ CRSRES][2729671568]0startRunnable: setting CLI values
2012-12-20 00:11:47.139: [ CRSRES][2686446480]0Attempting to start `ora.rac1.ons` on member `rac1`
2012-12-20 00:11:47.163: [ CRSRES][2729671568]0Attempting to start `ora.rac1.gsd` on member `rac1`
2012-12-20 00:11:47.266: [ CRSRES][2675956624]0Attempting to start `ora.rac2.gsd` on member `rac2`
2012-12-20 00:11:47.571: [ CRSRES][2665466768]0Attempting to start `ora.rac2.ons` on member `rac2`
2012-12-20 00:11:49.679: [ CRSEVT][2686446480]0CAAMonitorHandler :: 0:Action Script /opt/app/crs/bin/racgwrap(start) timed out for ora.rac1.ons! (timeout=600)
2012-12-20 00:11:49.680: [ CRSAPP][2686446480]0StartResource error for ora.rac1.ons error code = -2
2012-12-20 00:11:49.710: [ CRSEVT][2729671568]0CAAMonitorHandler :: 0:Action Script /opt/app/crs/bin/racgwrap(start) timed out for ora.rac1.gsd! (timeout=600)
2012-12-20 00:11:49.710: [ CRSAPP][2729671568]0StartResource error for ora.rac1.gsd error code = -2
2012-12-20 00:11:49.794: [ CRSEVT][2686446480]0CAAMonitorHandler :: 0:Action Script /opt/app/crs/bin/racgwrap(stop) timed out for ora.rac1.ons! (timeout=600)
2012-12-20 00:11:49.794: [ CRSAPP][2686446480]0StopResource error for ora.rac1.ons error code = -2
2012-12-20 00:11:49.813: [ CRSRES][2686446480]0X_OP_StopResourceFailed : Stop Resource failed
(File: rti.cpp, line: 1698
2012-12-20 00:11:49.813: [ CRSRES][2686446480][ALERT]0`ora.rac1.ons` on member `rac1` has experienced an unrecoverable failure.
2012-12-20 00:11:49.813: [ CRSRES][2686446480]0Human intervention required to resume its availability.
2012-12-20 00:11:49.839: [ CRSEVT][2729671568]0CAAMonitorHandler :: 0:Action Script /opt/app/crs/bin/racgwrap(stop) timed out for ora.rac1.gsd! (timeout=600)
2012-12-20 00:11:49.839: [ CRSAPP][2729671568]0StopResource error for ora.rac1.gsd error code = -2
2012-12-20 00:11:49.865: [ CRSRES][2729671568]0X_OP_StopResourceFailed : Stop Resource failed
(File: rti.cpp, line: 1698
2012-12-20 00:11:49.865: [ CRSRES][2729671568][ALERT]0`ora.rac1.gsd` on member `rac1` has experienced an unrecoverable failure.
2012-12-20 00:11:49.865: [ CRSRES][2729671568]0Human intervention required to resume its availability.
2012-12-20 00:11:50.076: [ CRSRES][2740161424]0Stop of `ora.rac1.vip` on member `rac1` succeeded.
2012-12-20 00:11:50.079: [ CRSRES][2740161424]0ora.rac1.vip RESTART_COUNT=0 RESTART_ATTEMPTS=0
2012-12-20 00:11:50.103: [ CRSRES][2740161424]0ora.rac1.vip failed on rac1 relocating.
2012-12-20 00:11:50.520: [ CRSRES][2740161424]0Attempting to start `ora.rac1.vip` on member `rac2`
2012-12-20 00:11:50.773: [ CRSRES][2675956624][ALERT]0`ora.rac2.gsd` on member `rac2` has experienced an unrecoverable failure.
2012-12-20 00:11:50.774: [ CRSRES][2675956624]0Human intervention required to resume its availability.
2012-12-20 00:11:50.810: [ CRSRES][2665466768][ALERT]0`ora.rac2.ons` on member `rac2` has experienced an unrecoverable failure.
2012-12-20 00:11:50.810: [ CRSRES][2665466768]0Human intervention required to resume its availability.
2012-12-20 00:12:13.625: [ CRSRES][2740161424]0Start of `ora.rac1.vip` on member `rac2` failed.
2012-12-20 00:12:13.994: [ CRSRES][2686446480]0startRunnable: setting CLI values
2012-12-20 00:12:26.925: [ COMMCRS][2824080272]clsc_receive: (0xa975368) Lock release 1 failed, rc 2
2012-12-20 00:12:26.925: [ COMMCRS][2824080272]clsc_receive: (0xa975368) error 2
2012-12-20 00:12:33.062: [ CRSAPP][2686446480]0StartResource error for ora.rac2.vip error code = 1
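The crsd.log above is long, and the lines that matter are the action-script timeouts and the "unrecoverable failure" alerts. One detail visible in the excerpt is that the racgwrap timeouts are logged about two seconds after the start attempt even though the reported timeout is 600, which may mean the action script is failing immediately rather than running for that long. The Python sketch below extracts both kinds of lines. The regular expressions are written against the lines in this post only and are an assumption about the format, not an official log grammar.

```python
import re

# "Action Script <path>(start|stop) timed out for <resource>! (timeout=N)"
TIMEOUT_RE = re.compile(
    r"Action Script (\S+)\((start|stop)\) timed out for (\S+)! \(timeout=(\d+)\)"
)
# "`<resource>` on member `<node>` has experienced an unrecoverable failure."
FAILURE_RE = re.compile(
    r"`([^`]+)` on member `([^`]+)` has experienced an unrecoverable failure"
)


def summarize_crsd_log(log_text):
    """Return (timeouts, failures) found in crsd.log text.

    timeouts maps resource name -> list of (action, script path) tuples;
    failures is a list of (resource, node) tuples in log order.
    """
    timeouts = {}
    failures = []
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            script, action, resource, _timeout = match.groups()
            timeouts.setdefault(resource, []).append((action, script))
        match = FAILURE_RE.search(line)
        if match:
            failures.append(match.groups())
    return timeouts, failures


# Three lines copied from the crsd.log excerpt above.
SAMPLE_LOG = """\
2012-12-20 00:10:46.446: [ CRSEVT][2740161424]0CAAMonitorHandler :: 0:Action Script /opt/app/oracle/product/db_1/bin/racgwrap(start) timed out for ora.rac1.ASM1.asm! (timeout=600)
2012-12-20 00:10:46.625: [ CRSEVT][2740161424]0CAAMonitorHandler :: 0:Action Script /opt/app/oracle/product/db_1/bin/racgwrap(stop) timed out for ora.rac1.ASM1.asm! (timeout=600)
2012-12-20 00:10:46.750: [ CRSRES][2740161424][ALERT]0`ora.rac1.ASM1.asm` on member `rac1` has experienced an unrecoverable failure.
"""

timeouts, failures = summarize_crsd_log(SAMPLE_LOG)
for resource, events in sorted(timeouts.items()):
    print(resource, [action for action, _script in events])
for resource, node in failures:
    print("unrecoverable failure:", resource, "on", node)
```

Grouping the timeouts by script path shows which racgwrap copy is involved (the database home's or the CRS home's), which is a reasonable place to start looking.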
[root@rac1 bin]# ./crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
[root@rac1 bin]#
Current output of crs_stat:
[root@rac1 bin]# ./crs_stat -t
Name           Type         Target  State    Host
ora....SM1.asm application  ONLINE  UNKNOWN  rac1
ora....C1.lsnr application  ONLINE  UNKNOWN  rac1
ora.rac1.gsd   application  ONLINE  UNKNOWN  rac1
ora.rac1.ons   application  ONLINE  UNKNOWN  rac1
ora.rac1.vip   application  ONLINE  OFFLINE
ora....SM2.asm application  ONLINE  OFFLINE
ora....C2.lsnr application  ONLINE  OFFLINE
ora.rac2.gsd   application  ONLINE  UNKNOWN  rac2
ora.rac2.ons   application  ONLINE  UNKNOWN  rac2
ora.rac2.vip   application  ONLINE  OFFLINE
-
Cluster with one 2 Node RAC and a Single Instance using ASM
Hi there,
I am not sure about a planned installation and want to ask whether I am on the right track.
Some Facts:
Clusterware 11g
ASM 11g
Database 10gR2
AIX 5.3
3 Machines
2 Storages DS4700
My Plan
On Node 1 and Node 2 we install a RAC Database for an ERP Software
On Node 3 we install a single Instance Database for a Logistic Software
So i will install on all three Nodes Clusterware and an 3 Instances ASM - Cluster
I create 2 Diskgroups, one for the FRA and one for the Data, both on Luns on the DS4700
The RAC-Database and the Logistic-Database are using the same Diskgroups.
Is this the way to go for this circumstances?
The alternative is, as far as i see
Clusterware on an 3 Servers
One 2 Node ASM for the ERP Software
one Single Node ASM for the Logistcs
4 Diskgroups, because of the 2 ASM-Database 2 for the RAC and 2 for the Single Instance.
Please give me some hints on which way I should prefer.
I am leaning towards the first alternative. I like the idea of sharing the diskgroups across more than one database because it makes administration easy.
The loads of the 2 databases are completely different; the logistics software will do almost nothing compared to the ERP software, so this shouldn't be a problem.
But maybe I am overlooking something, so please do not hesitate to tell me if I am completely wrong ;)
Thanks a lot
Jörg
Chris Slattery wrote:
why clusterware on 3rd machine ?
I'd have separate DGs but that's just me.
If you wish to install ASM, you need OCS installed on the machine, even if it is just a single node.
It is a kind of dependency: no OCS, no ASM.
cu
Jörg -
Dear all,
I am trying to perform a clean install of Oracle 11.2.0.3 Grid Infrastructure on a two-node cluster running on Solaris 5.10.
- Cluster verification was successful on both nodes; no warnings or issues;
- I am using 2 network cards for the public and 2 for the private interconnect;
- OCR is stored on ASM
- Firewall is disabled on both nodes
- SCAN is configured in DNS (not added in /etc/hosts)
- GNS is not used
- hosts file is identical (except the primary hostname)
The problem: root.sh fails on the 2nd (remote) node because it cannot start the "ora.asm" resource, although root.sh completed successfully on the 1st node. Somehow, root.sh doesn't create the +ASM2 instance on the remote (host2) node.
root.sh was executed first on the local node (host1) and, after it completed successfully, was started on the remote (host2) node.
Output from host1 (working):
===================
Adding Clusterware entries to inittab
CRS-2672: Attempting to start 'ora.mdnsd' on 'host1'
CRS-2676: Start of 'ora.mdnsd' on 'host1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'host1'
CRS-2676: Start of 'ora.gpnpd' on 'host1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'host1'
CRS-2672: Attempting to start 'ora.gipcd' on 'host1'
CRS-2676: Start of 'ora.cssdmonitor' on 'host1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'host1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'host1'
CRS-2672: Attempting to start 'ora.diskmon' on 'host1'
CRS-2676: Start of 'ora.diskmon' on 'host1' succeeded
CRS-2676: Start of 'ora.cssd' on 'host1' succeeded
ASM created and started successfully.
Disk Group CRS created successfully.
clscfg: -install mode specified
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
CRS-4256: Updating the profile
Successful addition of voting disk 4373be34efab4f01bf79f6c5362acfd3.
Successful addition of voting disk 7fd725fa4d904f07bf76cecf96791547.
Successful addition of voting disk a9c85297bdd74f3abfd86899205aaf17.
Successfully replaced voting disk group with +CRS.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced
## STATE File Universal Id File Name Disk group
1. ONLINE 4373be34efab4f01bf79f6c5362acfd3 (/dev/rdsk/c4t600A0B80006E2CC40000C6674E82AA57d0s4) [CRS]
2. ONLINE 7fd725fa4d904f07bf76cecf96791547 (/dev/rdsk/c4t600A0B80006E2CC40000C6694E82AADDd0s4) [CRS]
3. ONLINE a9c85297bdd74f3abfd86899205aaf17 (/dev/rdsk/c4t600A0B80006E2F100000C7744E82AC7Ad0s4) [CRS]
Located 3 voting disk(s).
CRS-2672: Attempting to start 'ora.asm' on 'host1'
CRS-2676: Start of 'ora.asm' on 'host1' succeeded
CRS-2672: Attempting to start 'ora.CRS.dg' on 'host1'
CRS-2676: Start of 'ora.CRS.dg' on 'host1' succeeded
CRS-2672: Attempting to start 'ora.registry.acfs' on 'host1'
CRS-2676: Start of 'ora.registry.acfs' on 'host1' succeeded
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
Name Type Target State Host
ora.CRS.dg ora....up.type ONLINE ONLINE host1
ora....ER.lsnr ora....er.type ONLINE ONLINE host1
ora....N1.lsnr ora....er.type ONLINE ONLINE host1
ora....N2.lsnr ora....er.type ONLINE ONLINE host1
ora....N3.lsnr ora....er.type ONLINE ONLINE host1
ora.asm ora.asm.type ONLINE ONLINE host1
ora....SM1.asm application ONLINE ONLINE host1
ora....B1.lsnr application ONLINE ONLINE host1
ora....db1.gsd application OFFLINE OFFLINE
ora....db1.ons application ONLINE ONLINE host1
ora....db1.vip ora....t1.type ONLINE ONLINE host1
ora.cvu ora.cvu.type ONLINE ONLINE host1
ora.gsd ora.gsd.type OFFLINE OFFLINE
ora....network ora....rk.type ONLINE ONLINE host1
ora.oc4j ora.oc4j.type ONLINE ONLINE host1
ora.ons ora.ons.type ONLINE ONLINE host1
ora....ry.acfs ora....fs.type ONLINE ONLINE host1
ora.scan1.vip ora....ip.type ONLINE ONLINE host1
ora.scan2.vip ora....ip.type ONLINE ONLINE host1
ora.scan3.vip ora....ip.type ONLINE ONLINE host1
Output from host2 (failing):
===================
OLR initialization - successful
Adding Clusterware entries to inittab
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node billdb1, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
Start of resource "ora.asm" failed
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'host2'
CRS-2676: Start of 'ora.drivers.acfs' on 'host2' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'host2'
CRS-5017: The resource action "ora.asm start" encountered the following error:
ORA-03113: end-of-file on communication channel
Process ID: 0
Session ID: 0 Serial number: 0
. For details refer to "(:CLSN00107:)" in "/u01/11.2.0/grid/log/host2/agent/ohasd/oraagent_grid/oraagent_grid.log".
CRS-2674: Start of 'ora.asm' on 'host2' failed
CRS-2679: Attempting to clean 'ora.asm' on 'host2'
CRS-2681: Clean of 'ora.asm' on 'host2' succeeded
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'host2'
CRS-2677: Stop of 'ora.drivers.acfs' on 'host2' succeeded
CRS-4000: Command Start failed, or completed with errors.
Failed to start Oracle Grid Infrastructure stack
Failed to start ASM at /u01/11.2.0/grid/crs/install/crsconfig_lib.pm line 1272.
/u01/11.2.0/grid/perl/bin/perl -I/u01/11.2.0/grid/perl/lib -I/u01/11.2.0/grid/crs/install /u01/11.2.0/grid/crs/install/rootcrs.pl execution failed
Contents of "/u01/11.2.0/grid/cfgtoollogs/crsconfig/rootcrs_host2.log"
=============================================
CRS-2672: Attempting to start 'ora.asm' on 'host2'
CRS-5017: The resource action "ora.asm start" encountered the following error:
ORA-03113: end-of-file on communication channel
Process ID: 0
Session ID: 0 Serial number: 0
. For details refer to "(:CLSN00107:)" in "/u01/11.2.0/grid/log/host2/agent/ohasd/oraagent_grid/oraagent_grid.log".
CRS-2674: Start of 'ora.asm' on 'host2' failed
CRS-2679: Attempting to clean 'ora.asm' on 'host2'
CRS-2681: Clean of 'ora.asm' on 'host2' succeeded
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'host2'
CRS-2677: Stop of 'ora.drivers.acfs' on 'host2' succeeded
CRS-4000: Command Start failed, or completed with errors.
2011-10-24 19:36:54: Failed to start Oracle Grid Infrastructure stack
2011-10-24 19:36:54: ###### Begin DIE Stack Trace ######
2011-10-24 19:36:54: Package File Line Calling
2011-10-24 19:36:54: --------------- -------------------- ---- ----------
2011-10-24 19:36:54: 1: main rootcrs.pl 375 crsconfig_lib::dietrap
2011-10-24 19:36:54: 2: crsconfig_lib crsconfig_lib.pm 1272 main::__ANON__
2011-10-24 19:36:54: 3: crsconfig_lib crsconfig_lib.pm 1171 crsconfig_lib::start_cluster
2011-10-24 19:36:54: 4: main rootcrs.pl 803 crsconfig_lib::perform_start_cluster
2011-10-24 19:36:54: ####### End DIE Stack Trace #######
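When comparing the host1 and host2 output, it can help to reduce the CRS messages to one outcome per resource. The following is a minimal Python sketch, not an Oracle tool: the function name `start_outcomes` is made up for illustration, and the regular expressions assume only the CRS-2672, CRS-2676 and CRS-2674 message shapes that appear in the output above.

```python
import re

# Message shapes copied from the root.sh output above.
ATTEMPT = re.compile(r"CRS-2672: Attempting to start '([^']+)' on '([^']+)'")
SUCCESS = re.compile(r"CRS-2676: Start of '([^']+)' on '([^']+)' succeeded")
FAILURE = re.compile(r"CRS-2674: Start of '([^']+)' on '([^']+)' failed")

PATTERNS = ((ATTEMPT, "pending"), (SUCCESS, "succeeded"), (FAILURE, "failed"))


def start_outcomes(log_text):
    """Return {(resource, node): last status seen} for every start attempt."""
    outcomes = {}
    for line in log_text.splitlines():
        for pattern, status in PATTERNS:
            match = pattern.search(line)
            if match:
                outcomes[(match.group(1), match.group(2))] = status
                break
    return outcomes


sample = """\
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'host2'
CRS-2676: Start of 'ora.drivers.acfs' on 'host2' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'host2'
CRS-5017: The resource action "ora.asm start" encountered the following error:
ORA-03113: end-of-file on communication channel
CRS-2674: Start of 'ora.asm' on 'host2' failed
"""

outcomes = start_outcomes(sample)
failed = sorted(key for key, status in outcomes.items() if status == "failed")
print(failed)
```

Run against the host2 excerpt, this leaves `ora.asm` as the only failed start, which matches what the log says in prose.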
Shortened output from "/u01/11.2.0/grid/log/host2/agent/ohasd/oraagent_grid/oraagent_grid.log"
2011-10-24 19:35:48.726: [ora.asm][9] {0:0:224} [start] clean {
2011-10-24 19:35:48.726: [ora.asm][9] {0:0:224} [start] InstAgent::stop_option stop mode immediate option 1
2011-10-24 19:35:48.726: [ora.asm][9] {0:0:224} [start] InstAgent::stop {
2011-10-24 19:35:48.727: [ora.asm][9] {0:0:224} [start] InstAgent::stop original reason system do shutdown abort
2011-10-24 19:35:48.727: [ora.asm][9] {0:0:224} [start] ConnectionPool::resetConnection s_statusOfConnectionMap 00ab1948
2011-10-24 19:35:48.727: [ora.asm][9] {0:0:224} [start] ConnectionPool::resetConnection sid +ASM2 status 2
2011-10-24 19:35:48.728: [ora.asm][9] {0:0:224} [start] Gimh::check OH /u01/11.2.0/grid SID +ASM2
2011-10-24 19:35:48.728: [ora.asm][9] {0:0:224} [start] Gimh::check condition changes to (GIMH_NEXT_NUM) 0,1,7 exists
2011-10-24 19:35:48.729: [ora.asm][9] {0:0:224} [start] (:CLSN00006:)AsmAgent::check failed gimh state 0
2011-10-24 19:35:48.729: [ora.asm][9] {0:0:224} [start] AsmAgent::check ocrCheck 1 m_OcrOnline 0 m_OcrTimer 0
2011-10-24 19:35:48.729: [ora.asm][9] {0:0:224} [start] DgpAgent::initOcrDgpSet { entry
2011-10-24 19:35:48.730: [ora.asm][9] {0:0:224} [start] DgpAgent::initOcrDgpSet procr_get_conf: retval [0] configured [1] local only [0] error buffer []
2011-10-24 19:35:48.730: [ora.asm][9] {0:0:224} [start] DgpAgent::initOcrDgpSet procr_get_conf: OCR loc [0], Disk Group : [+CRS]
2011-10-24 19:35:48.730: [ora.asm][9] {0:0:224} [start] DgpAgent::initOcrDgpSet m_ocrDgpSet 015fba90 dgName CRS
2011-10-24 19:35:48.731: [ora.asm][9] {0:0:224} [start] DgpAgent::initOcrDgpSet ocrret 0 found 1
2011-10-24 19:35:48.731: [ora.asm][9] {0:0:224} [start] DgpAgent::initOcrDgpSet ocrDgpSet CRS
2011-10-24 19:35:48.731: [ora.asm][9] {0:0:224} [start] DgpAgent::initOcrDgpSet exit }
2011-10-24 19:35:48.731: [ora.asm][9] {0:0:224} [start] DgpAgent::ocrDgCheck Entry {
2011-10-24 19:35:48.732: [ora.asm][9] {0:0:224} [start] DgpAgent::getConnxn new pool
2011-10-24 19:35:48.732: [ora.asm][9] {0:0:224} [start] DgpAgent::getConnxn new pool m_oracleHome:/u01/11.2.0/grid m_oracleSid:+ASM2 m_usrOraEnv:
2011-10-24 19:35:48.732: [ora.asm][9] {0:0:224} [start] ConnectionPool::ConnectionPool 2 m_oracleHome:/u01/11.2.0/grid, m_oracleSid:+ASM2, m_usrOraEnv:
2011-10-24 19:35:48.733: [ora.asm][9] {0:0:224} [start] ConnectionPool::addConnection m_oracleHome:/u01/11.2.0/grid m_oracleSid:+ASM2 m_usrOraEnv: pConnxn:01fcdf10
2011-10-24 19:35:48.733: [ora.asm][9] {0:0:224} [start] Utils::getCrsHome crsHome /u01/11.2.0/grid
2011-10-24 19:35:51.969: [ora.asm][14] {0:0:224} [check] makeConnectStr = (DESCRIPTION=(ADDRESS=(PROTOCOL=beq)(PROGRAM=/u01/11.2.0/grid/bin/oracle)(ARGV0=oracle+ASM2)(ENVS='ORACLE_HOME=/u01/11.2.0/grid,ORACLE_SID=+ASM2')(ARGS='(DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))'))(CONNECT_DATA=(SID=+ASM2)))
2011-10-24 19:35:51.971: [ora.asm][14] {0:0:224} [check] ConnectionPool::getConnection 260 pConnxn 013e40a0
2011-10-24 19:35:51.971: [ora.asm][14] {0:0:224} [check] DgpAgent::getConnxn connected
2011-10-24 19:35:51.971: [ora.asm][14] {0:0:224} [check] InstConnection::connectInt: server not attached
2011-10-24 19:35:52.190: [ora.asm][14] {0:0:224} [check] ORA-01034: ORACLE not available
ORA-27101: shared memory realm does not exist
SVR4 Error: 2: No such file or directory
Process ID: 0
Session ID: 0 Serial number: 0
2011-10-24 19:35:52.190: [ora.asm][14] {0:0:224} [check] InstConnection::connectInt (2) Exception OCIException
2011-10-24 19:35:52.190: [ora.asm][14] {0:0:224} [check] InstConnection:connect:excp OCIException OCI error 1034
2011-10-24 19:35:52.190: [ora.asm][14] {0:0:224} [check] DgpAgent::queryDgStatus excp ORA-01034: ORACLE not available
ORA-27101: shared memory realm does not exist
SVR4 Error: 2: No such file or directory
Process ID: 0
Session ID: 0 Serial number: 0
2011-10-24 19:35:52.190: [ora.asm][14] {0:0:224} [check] DgpAgent::queryDgStatus asm inst is down or going down
2011-10-24 19:35:52.191: [ora.asm][14] {0:0:224} [check] DgpAgent::queryDgStatus dgName CRS ret 1
2011-10-24 19:35:52.191: [ora.asm][14] {0:0:224} [check] (:CLSN00100:)DgpAgent::ocrDgCheck OCR dgName CRS state 1
2011-10-24 19:35:52.192: [ora.asm][14] {0:0:224} [check] ConnectionPool::releaseConnection InstConnection 013e40a0
2011-10-24 19:35:52.192: [ora.asm][14] {0:0:224} [check] AsmAgent::check ocrCheck 2 m_OcrOnline 0 m_OcrTimer 0
2011-10-24 19:35:52.193: [ora.asm][14] {0:0:224} [check] CrsCmd::ClscrsCmdData::stat entity 1 statflag 32 useFilter 0
2011-10-24 19:35:52.197: [ COMMCRS][23]clsc_connect: (1020d39d0) no listener at (ADDRESS=(PROTOCOL=IPC)(KEY=CRSD_UI_SOCKET))
Please advise on any workaround or a MetaLink note.
Thanks in advance!
Thanks for the fast reply!
- Yes, the shared storage is accessible.
- The alert log for +ASM2 clearly shows that the ASM instance started normally using default parameters and that at one point the PMON process dumped.
- The system logs just show that there is an error executing "crswrapexece.pl"
System Log
===================
Oct 24 19:25:03 host2 root: [ID 702911 user.error] exec /u01/11.2.0/grid/perl/bin/perl -I/u01/11.2.0/grid/perl/lib /u01/11.2.0/grid/bin/crswrapexece.pl /u01/11.2.0/grid/crs/install/s_crsconfig_host2_env.txt /u01/11.2.0/grid/bin/ohasd.bin "reboot"
Oct 24 19:26:33 host2 oracleoks: [ID 902884 kern.notice] [Oracle OKS] mallocing log buffer, size=10485760
Oct 24 19:26:33 host2 oracleoks: [ID 714332 kern.notice] [Oracle OKS] log buffer = 0x301780fcb50, size 10485760
Oct 24 19:26:33 host2 oracleoks: [ID 400061 kern.notice] NOTICE: [Oracle OKS] ODLM hash size 16384
Oct 24 19:26:33 host2 oracleoks: [ID 160659 kern.notice] NOTICE: OKSK-00004: Module load succeeded. Build information: (LOW DEBUG) USM_11.2.0.3.0_SOLARIS.SPARC64_110803.1 2011/08/11 02:38:30
Oct 24 19:26:33 host2 pseudo: [ID 129642 kern.info] pseudo-device: oracleadvm0
Oct 24 19:26:33 host2 genunix: [ID 936769 kern.info] oracleadvm0 is /pseudo/oracleadvm@0
Oct 24 19:26:33 host2 oracleoks: [ID 141287 kern.notice] NOTICE: ADVMK-00001: Module load succeeded. Build information: (LOW DEBUG) - USM_11.2.0.3.0_SOLARIS.SPARC64_110803.1 built on 2011/08/11 02:40:17.
Oct 24 19:26:33 host2 oracleacfs: [ID 202941 kern.notice] NOTICE: [Oracle ACFS] FCB hash size 16384
Oct 24 19:26:33 host2 oracleacfs: [ID 671725 kern.notice] NOTICE: [Oracle ACFS] buffer cache size 511MB (79884 buckets)
Oct 24 19:26:33 host2 oracleacfs: [ID 730054 kern.notice] NOTICE: [Oracle ACFS] DLM hash size 16384
Oct 24 19:26:33 host2 oracleoks: [ID 617314 kern.notice] NOTICE: ACFSK-0037: Module load succeeded. Build information: (LOW DEBUG) USM_11.2.0.3.0_SOLARIS.SPARC64_110803.1 2011/08/11 02:42:45
Oct 24 19:26:33 host2 pseudo: [ID 129642 kern.info] pseudo-device: oracleacfs0
Oct 24 19:26:33 host2 genunix: [ID 936769 kern.info] oracleacfs0 is /pseudo/oracleacfs@0
Oct 24 19:26:36 host2 oracleoks: [ID 621795 kern.notice] NOTICE: OKSK-00010: Persistent OKS log opened at /u01/11.2.0/grid/log/host2/acfs/acfs.log.0.
Oct 24 19:31:37 host2 last message repeated 1 time
Oct 24 19:33:05 host2 CLSD: [ID 770310 daemon.notice] The clock on host host2 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.
ASM alert log
====================================================================
<msg time='2011-10-24T19:35:48.776+01:00' org_id='oracle' comp_id='asm'
client_id='' type='UNKNOWN' level='16'
host_id='host2' host_addr='10.172.16.200' module=''
pid='26406'>
<txt>System state dump requested by (instance=2, osid=26396 (PMON)), summary=[abnormal instance termination].
</txt>
</msg>
<msg time='2011-10-24T19:35:48.778+01:00' org_id='oracle' comp_id='asm'
client_id='' type='UNKNOWN' level='16'
host_id='host2' host_addr='10.172.16.200' module=''
pid='26406'>
<txt>System State dumped to trace file /u01/app/oracle/diag/asm/+asm/+ASM2/trace/+ASM2_diag_26406.trc
</txt>
</msg>
<msg time='2011-10-24T19:35:48.927+01:00' org_id='oracle' comp_id='asm'
type='UNKNOWN' level='16' host_id='host2'
host_addr='10.172.16.200' pid='26470'>
<txt>ORA-1092 : opitsk aborting process
</txt>
</msg>
<msg time='2011-10-24T19:35:49.128+01:00' org_id='oracle' comp_id='asm'
type='UNKNOWN' level='16' host_id='host2'
host_addr='10.172.16.200' pid='26472'>
<txt>ORA-1092 : opitsk aborting process
</txt>
</msg>
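The alert log excerpt above is a sequence of sibling `<msg>` elements without a single root, so a standard XML parser rejects it as-is. One way to read it, sketched here in Python under the assumption that the text contains no unescaped `&` or `<` characters (true for this excerpt, not guaranteed for every alert log), is to wrap the fragment in a synthetic root first. The function name `parse_alert_fragment` and the `<alert>` wrapper are illustrative only.

```python
import xml.etree.ElementTree as ET


def parse_alert_fragment(fragment):
    """Wrap sibling <msg> elements in a synthetic root; return (time, pid, text) tuples."""
    root = ET.fromstring("<alert>" + fragment + "</alert>")
    records = []
    for msg in root.findall("msg"):
        text = (msg.findtext("txt") or "").strip()
        records.append((msg.get("time"), msg.get("pid"), text))
    return records


# Two records copied from the alert log excerpt above.
sample = """
<msg time='2011-10-24T19:35:48.776+01:00' org_id='oracle' comp_id='asm'
 client_id='' type='UNKNOWN' level='16'
 host_id='host2' host_addr='10.172.16.200' module=''
 pid='26406'>
 <txt>System state dump requested by (instance=2, osid=26396 (PMON)), summary=[abnormal instance termination].
 </txt>
</msg>
<msg time='2011-10-24T19:35:48.927+01:00' org_id='oracle' comp_id='asm'
 type='UNKNOWN' level='16' host_id='host2'
 host_addr='10.172.16.200' pid='26470'>
 <txt>ORA-1092 : opitsk aborting process
 </txt>
</msg>
"""

records = parse_alert_fragment(sample)
for time, pid, text in records:
    print(time, pid, text)
```

Sorting or filtering the resulting tuples by time makes it easier to line the ASM messages up against the oraagent_grid.log timestamps.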
Output from "/u01/app/oracle/diag/asm/+asm/+ASM2/trace/+ASM2_diag_26406.trc"
REQUEST:system state dump at level 10, requested by (instance=2, osid=26396 (PMON)), summary=[abnormal instance termination].
kjzdattdlm: Can not attach to DLM (LMON up=[TRUE], DB mounted=[FALSE]).
===================================================
SYSTEM STATE (level=10)
Orapids on dead process list: [count = 0]
PROCESS 1:
SO: 0x3df098b50, type: 2, owner: 0x0, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0x3df098b50, name=process, file=ksu.h LINE:12616 ID:, pg=0
(process) Oracle pid:1, ser:0, calls cur/top: 0x0/0x0
flags : (0x20) PSEUDO
flags2: (0x0), flags3: (0x10)
intr error: 0, call error: 0, sess error: 0, txn error 0
intr queue: empty
ksudlp FALSE at location: 0
(post info) last post received: 0 0 0
last post received-location: No post
last process to post me: none
last post sent: 0 0 0
last post sent-location: No post
last process posted by me: none
(latch info) wait_event=0 bits=0
O/S info: user: , term: , ospid: (DEAD)
OSD pid info: Unix process pid: 0, image: PSEUDO
SO: 0x38000cef0, type: 5, owner: 0x3df098b50, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0x0, name=kss parent, file=kss2.h LINE:138 ID:, pg=0
PSO child state object changes :
Dump of memory from 0x00000003DF722AC0 to 0x00000003DF722CC8
3DF722AC0 00000000 00000000 00000000 00000000 [................]
Repeat 31 times
3DF722CC0 00000000 00000000 [........]
PROCESS 2: PMON
SO: 0x3df099bf8, type: 2, owner: 0x0, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0x3df099bf8, name=process, file=ksu.h LINE:12616 ID:, pg=0
(process) Oracle pid:2, ser:1, calls cur/top: 0x3db6c8d30/0x3db6c8d30
flags : (0xe) SYSTEM
flags2: (0x0), flags3: (0x10)
intr error: 0, call error: 0, sess error: 0, txn error 0
intr queue: empty
ksudlp FALSE at location: 0
(post info) last post received: 0 0 136
last post received-location: kjm.h LINE:1228 ID:kjmdmi: pmon to attach
last process to post me: 3df0a2138 1 6
last post sent: 0 0 137
last post sent-location: kjm.h LINE:1230 ID:kjiath: pmon attached
last process posted by me: 3df0a2138 1 6
(latch info) wait_event=0 bits=0
Process Group: DEFAULT, pseudo proc: 0x3debbbf40
O/S info: user: grid, term: UNKNOWN, ospid: 26396
OSD pid info: Unix process pid: 26396, image: oracle@host2 (PMON)
SO: 0x3d8800c18, type: 30, owner: 0x3df099bf8, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0x3df099bf8, name=ges process, file=kji.h LINE:3669 ID:, pg=0
GES MSG BUFFERS: st=emp chunk=0x0 hdr=0x0 lnk=0x0 flags=0x0 inc=0
outq=0 sndq=0 opid=0 prmb=0x0
mbg=(0 0) mbg=(0 0) mbg[r]=(0 0)
fmq=(0 0) fmq=(0 0) fmq[r]=(0 0)
mop[s]=0 mop[q]=0 pendq=0 zmbq=0
nonksxp_recvs=0
------------process 3d8800c18--------------------
proc version : 0
Local inst : 2
pid : 26396
lkp_inst : 2
svr_mode : 0
proc state : KJP_FROZEN
Last drm hb acked : 0
flags : x50
ast_rcvd_svrmod : 0
current lock op : 0
Total accesses : 1
Imm. accesses : 0
Locks on ASTQ : 0
Locks Pending AST : 0
Granted locks : 0
AST_Q:
PENDING_Q:
GRANTED_Q:
SO: 0x3d9835198, type: 14, owner: 0x3df099bf8, flag: INIT/-/-/0x00 if: 0x1 c: 0x1
proc=0x3df099bf8, name=channel handle, file=ksr2.h LINE:367 ID:, pg=0
(broadcast handle) 3d9835198 flag: (2) ACTIVE SUBSCRIBER,
owner: 3df099bf8 - ospid: 26396
event: 1, last message event: 1,
last message waited event: 1,
next message: 0(0), messages read: 0
channel: (3d9934df8) PMON actions channel [name: 2]
scope: 7, event: 1, last mesage event: 0,
publishers/subscribers: 0/1,
messages published: 0
heuristic msg queue length: 0
SO: 0x3d9835008, type: 14, owner: 0x3df099bf8, flag: INIT/-/-/0x00 if: 0x1 c: 0x1
proc=0x3df099bf8, name=channel handle, file=ksr2.h LINE:367 ID:, pg=0
(broadcast handle) 3d9835008 flag: (2) ACTIVE SUBSCRIBER,
owner: 3df099bf8 - ospid: 26396
event: 1, last message event: 1,
last message waited event: 1,
next message: 0(0), messages read: 0
channel: (3d9941e40) scumnt mount lock [name: 157]
scope: 1, event: 12, last mesage event: 0,
publishers/subscribers: 0/12,
messages published: 0
heuristic msg queue length: 0
SO: 0x3de4a2b80, type: 4, owner: 0x3df099bf8, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0x3df099bf8, name=session, file=ksu.h LINE:12624 ID:, pg=0
(session) sid: 33 ser: 1 trans: 0x0, creator: 0x3df099bf8
flags: (0x51) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x409) -/-/INC
DID: , short-term DID:
txn branch: 0x0
oct: 0, prv: 0, sql: 0x0, psql: 0x0, user: 0/SYS
ksuxds FALSE at location: 0
service name: SYS$BACKGROUND
Current Wait Stack:
Not in wait; last wait ended 0.666415 sec ago
Wait State:
fixed_waits=0 flags=0x21 boundary=0x0/-1
Session Wait History:
elapsed time of 0.666593 sec since last wait
0: waited for 'pmon timer'
duration=0x12c, =0x0, =0x0
wait_id=63 seq_num=64 snap_id=1
wait times: snap=3.000089 sec, exc=3.000089 sec, total=3.000089 sec
wait times: max=3.000000 sec
wait counts: calls=1 os=1
occurred after 0.002067 sec of elapsed time
1: waited for 'pmon timer'
duration=0x12c, =0x0, =0x0
wait_id=62 seq_num=63 snap_id=1
wait times: snap=3.010111 sec, exc=3.010111 sec, total=3.010111 sec
wait times: max=3.000000 sec
wait counts: calls=1 os=1
occurred after 0.001926 sec of elapsed time
2: waited for 'pmon timer'
duration=0x12c, =0x0, =0x0
wait_id=61 seq_num=62 snap_id=1
wait times: snap=3.125286 sec, exc=3.125286 sec, total=3.125286 sec
wait times: max=3.000000 sec
wait counts: calls=1 os=1
occurred after 0.003361 sec of elapsed time
3: waited for 'pmon timer'
duration=0x12c, =0x0, =0x0
wait_id=60 seq_num=61 snap_id=1
wait times: snap=3.000081 sec, exc=3.000081 sec, total=3.000081 sec
wait times: max=3.000000 sec
wait counts: calls=1 os=1
occurred after 0.002102 sec of elapsed time
4: waited for 'pmon timer'
duration=0x12c, =0x0, =0x0 -
Hi,
I have noticed the following 'strange' behaviour of Oracle Restart and ASM.
starting position:
-bash-3.2 $ crsctl status resource -t
NAME TARGET STATE SERVER STATE_DETAILS
Local Resources
ora.DATA.dg
ONLINE ONLINE oracle-restart
ora.LISTENERASM.lsnr
ONLINE ONLINE oracle-restart
ora.asm
ONLINE ONLINE oracle-restart Started
Cluster Resources
ora.cssd
1 ONLINE ONLINE oracle-restart
ora.diskmon
1 ONLINE ONLINE oracle-restart
step 1:
-bash-3.2 $ srvctl stop asm
-bash-3.2 $ srvctl stop diskgroup -g data
-bash-3.2 $ srvctl disable diskgroup -g data
step 2:
Start the ASM instance via sqlplus:
SQL> startup
ASM instance started
Total System Global Area 283930624 bytes
Fixed Size 2212656 bytes
Variable Size 256552144 bytes
ASM Cache 25165824 bytes
ASM diskgroups mounted
ASM diskgroups volume enabled
SQL> select * from v$asm_diskgroup;
GROUP_NUMBER NAME SECTOR_SIZE BLOCK_SIZE
ALLOCATION_UNIT_SIZE STATE TYPE TOTAL_MB FREE_MB HOT_USED_MB
COLD_USED_MB REQUIRED_MIRROR_FREE_MB USABLE_FILE_MB OFFLINE_DISKS
COMPATIBILITY
DATABASE_COMPATIBILITY V
1 DATA 512 4096
1048576 MOUNTED EXTERN 10236 10177 0
59 0 10177 0
GROUP_NUMBER NAME SECTOR_SIZE BLOCK_SIZE
ALLOCATION_UNIT_SIZE STATE TYPE TOTAL_MB FREE_MB HOT_USED_MB
COLD_USED_MB REQUIRED_MIRROR_FREE_MB USABLE_FILE_MB OFFLINE_DISKS
COMPATIBILITY
DATABASE_COMPATIBILITY V
11.2.0.0.0
10.1.0.0.0 N
-bash-3.2 $ crsctl status resource -t
NAME TARGET STATE SERVER STATE_DETAILS
Local Resources
ora.DATA.dg
OFFLINE OFFLINE oracle-restart <== funny !!!
ora.LISTENERASM.lsnr
ONLINE ONLINE oracle-restart
ora.asm
ONLINE ONLINE oracle-restart Started
Cluster Resources
ora.cssd
1 ONLINE ONLINE oracle-restart
ora.diskmon
1 ONLINE ONLINE oracle-restart
Is this behaviour a 'feature' or a bug?
Has anyone had a similar experience?
thanks,
goran
Hi,
the asm resource depends on the diskgroup resource ... if the diskgroup resource is not available and crsctl status shows it offline, I would expect asm to be shown as 'offline' too (and brought offline), as they are dependent.
What is the point of managing resources via srvctl when it doesn't take care of dependencies? For me it's wrong.
ora.asm : is the ASM instance
ora.*.dg : is a diskgroup
ora.*.dg depends on ora.asm, not the other way around.
I can have more than one diskgroup and want only one diskgroup disabled, so I need the ASM instance (ora.asm) online.
Important:
If you shut down the database with SQL*Plus, Oracle Restart does not interpret this as a database failure and does not attempt to restart the database.
Similarly, if you shut down the Oracle ASM instance with SQL*Plus or ASMCMD, Oracle Restart does not attempt to restart it.
An important difference between starting a component with SRVCTL and starting it with SQL*Plus (or another utility) is the following:
When you start a component with SRVCTL, any components on which this component depends are automatically started first, and in the proper order.
When you start a component with SQL*Plus (or another utility), other components in the dependency chain are not automatically started; you must ensure that any components on which this component depends are started.
Oracle Restart also manages the weak dependency between database instances and the Oracle Net listener (the listener): When a database instance is started, Oracle Restart attempts to start the listener. If the listener startup fails, then the database is still started. If the listener later fails, Oracle Restart does not shut down and restart any database instances.
It makes no sense for Oracle Restart to shut down the whole environment (databases) just because the listener is down.
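The direction of the dependency is the key point: a diskgroup needs ASM, but ASM does not need any particular diskgroup. As a rough Python illustration (not how Oracle Restart is implemented), the sketch below computes a start order from a small dependency map. Only the `ora.DATA.dg` to `ora.asm` edge comes from the reply above; the `ora.cssd` edge is an assumption and `ora.example.db` is a hypothetical database resource.

```python
from graphlib import TopologicalSorter

# resource -> resources it depends on (hard dependencies only)
depends_on = {
    "ora.asm": {"ora.cssd"},            # assumption: ASM needs CSS
    "ora.DATA.dg": {"ora.asm"},         # stated in the reply above
    "ora.example.db": {"ora.DATA.dg"},  # hypothetical database resource
}


def start_order(target, graph):
    """Return target and everything it depends on, dependencies first."""
    needed = set()
    stack = [target]
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(graph.get(name, ()))
    subgraph = {name: graph.get(name, set()) & needed for name in needed}
    return list(TopologicalSorter(subgraph).static_order())


print(start_order("ora.DATA.dg", depends_on))
print(start_order("ora.asm", depends_on))
```

Starting the diskgroup pulls in ASM, while starting ASM does not pull in the diskgroup, which is consistent with `ora.asm` staying ONLINE while a disabled `ora.DATA.dg` shows OFFLINE.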
Regards,
Levi Pereira -
Hi, I wonder if anyone has experience running the above configuration in SUN Cluster. As far as I know, ASM is not yet supported by the SC Oracle HA Agent, and for this reason ASM is only certified for RAC on SC. Anyway, there is no good reason why ASM should not be used for HA solutions as well (instead of RAW devices or a filesystem-based configuration), even though the ASM instance must be started by a GDS resource before the database comes up. Are there any plans to integrate ASM into the SC Oracle HA Agent?
Thanks, Marc
Tim,
thanks for this and sorry for my late response. Anyway, I disagree with most of the issues:
1) Yes, the hostname is stored in the OCR, but without CRS, the OCR is not shared, but per default locally stored beneath $ORACLE_HOME, for instance: /orahome/oracle/product/10.2.0.2/db/cdata/localhost/local.ocr
So there is no problem with the OCR as long as $ORACLE_HOME is stored locally on each cluster node. Furthermore, SUN provides a library for older OAS environments (libloghost_64.so) that keeps the hostname presented to OAS consistent in SUN Cluster regardless of the real hostname of the cluster node. This could additionally solve the problem in the case of a shared $ORACLE_HOME, but in fact there is no need for a shared $ORACLE_HOME, and local installations avoid the problem altogether.
2) I fully agree, you will have a local ASM instance on each node using its own OCR (see above).
3) I also agree, the only point to take care of is integrating startup, shutdown and monitoring of the ASM instance(s) needed to serve the database into SC. This can be done easily by using GDS.
4) Yes, the great advantage of ASM is that DG configuration information is stored as part of the header information on the disks. But there is no need to zero out the configuration at all to import the DGs on another server. Just shut down the original ASM instance first (without RAC, the ASM instances should be prevented from importing the DGs concurrently), switch over the DCS and start the other ASM instance on the failover node. On startup, ASM imports all DGs stored on the disks that match the asm_diskstring, in my case asm_diskstring='/dev/did/rdsk/*', because there is just one database.
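To make point 4 concrete: which devices an ASM instance considers at startup is driven by the asm_diskstring pattern. The Python sketch below only illustrates the pattern-matching part, using shell-style globbing from the standard library as a stand-in. Real ASM discovery also reads disk headers and checks permissions, and its wildcard rules may differ in detail. The device paths in `candidates` are made up for the example.

```python
from fnmatch import fnmatchcase

ASM_DISKSTRING = "/dev/did/rdsk/*"  # value quoted in the post above

# Hypothetical device paths, for illustration only.
candidates = [
    "/dev/did/rdsk/d4s6",
    "/dev/did/rdsk/d5s6",
    "/dev/rdsk/c1t0d0s6",
]


def discovered(paths, diskstring):
    """Return the paths that match the discovery pattern (glob match only)."""
    return [path for path in paths if fnmatchcase(path, diskstring)]


print(discovered(candidates, ASM_DISKSTRING))
```

Because the match is on the path pattern rather than on a specific device ID, any disk that shows up under the matching directory on the failover node is a discovery candidate.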
Moreover this way of failing over ASM is supported by Oracle (for instance for an OAS infrastructure database, see http://download-east.oracle.com/docs/cd/B25006_01/install.1012/install/ha_cfc.htm#BABEADDA). As mentioned in my previous posting I tested this constellation running a local ASM instance (+ASM) on each cluster node and one failover database:
Resource Name Node Name State Status Message
Resource: HW1P-lh-res fix Online Online - LogicalHostname online.
Resource: HW1P-lh-res foxi Offline Offline - LogicalHostname offline.
Resource: HW1P-dg-res fix Online Online
Resource: HW1P-dg-res foxi Offline Offline
Resource: HW1P-asm-res fix Online Online - Service is online.
Resource: HW1P-asm-res foxi Offline Offline
Resource: HW1P-db-res fix Online Online
Resource: HW1P-db-res foxi Offline Offline
I have to admit that things will get more difficult with several failover databases in the same cluster. But there is generally no problem with having several ASM instances on each node, each of them serving another database. So after all I can't see any technical reason for not supporting ASM for failover databases in SUN Cluster. In fact this could also be a solution to get rid of problems in environments using storage-based data replication. As DG information is stored on the disks and each disk matching the asm_diskstring will be scanned, ASM DGs can be imported even if the DIDs are different due to a storage failover. Thus there is no need to combine/fake DIDs like in SVM to use storage-based replication. -
Hi,
After successfully installing PSU 11.2.0.3.6 on the node, the services did not start automatically;
2 Node RAC on HP-UX 11.31
$ crsctl stat res -t -init
NAME TARGET STATE SERVER STATE_DETAILS
Cluster Resources
ora.asm
1 ONLINE OFFLINE
ora.cluster_interconnect.haip
1 ONLINE ONLINE proddb02
ora.crsd
1 ONLINE ONLINE proddb02
ora.cssd
1 ONLINE ONLINE proddb02
ora.cssdmonitor
1 ONLINE ONLINE proddb02
ora.ctssd
1 ONLINE ONLINE proddb02 OBSERVER
ora.diskmon
1 OFFLINE OFFLINE
ora.evmd
1 ONLINE ONLINE proddb02
ora.gipcd
1 ONLINE ONLINE proddb02
ora.gpnpd
1 ONLINE ONLINE proddb02
ora.mdnsd
1 ONLINE ONLINE proddb02
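In a long listing it is easy to miss the one resource whose STATE differs from its TARGET. A small Python sketch for scanning pasted `crsctl stat res -t` output follows. It assumes the whitespace-flattened layout shown in this thread (a line starting with `ora.` names the resource, the following lines carry TARGET and STATE), and the function name `parse_status` is illustrative.

```python
STATES = {"ONLINE", "OFFLINE", "INTERMEDIATE", "UNKNOWN"}


def parse_status(text):
    """Return (resource, target, state, rest_of_line) tuples from pasted output."""
    rows = []
    current = None
    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        if tokens[0].startswith("ora."):
            current = tokens[0]
            continue
        if tokens[0].isdigit():  # cardinality column of cluster resources
            tokens = tokens[1:]
        if current and len(tokens) >= 2 and tokens[0] in STATES and tokens[1] in STATES:
            rows.append((current, tokens[0], tokens[1], " ".join(tokens[2:])))
    return rows


# Excerpt copied from the listing above.
sample = """\
NAME TARGET STATE SERVER STATE_DETAILS
Cluster Resources
ora.asm
1 ONLINE OFFLINE
ora.cluster_interconnect.haip
1 ONLINE ONLINE proddb02
ora.ctssd
1 ONLINE ONLINE proddb02 OBSERVER
ora.diskmon
1 OFFLINE OFFLINE
"""

rows = parse_status(sample)
mismatches = [name for name, target, state, _ in rows if target != state]
print(mismatches)
```

Here `ora.diskmon` is not reported because its target is also OFFLINE; only `ora.asm`, which should be up but is not, stands out.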
$ crsctl start resource ora.asm
CRS-2672: Attempting to start 'ora.asm' on proddb02'
CRS-2672: Attempting to start 'ora.asm' on 'proddb02'
CRS-5017: The resource action "ora.asm start" encountered the following error:
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:check if failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpcini3
ORA-27303: additional information: requested interface lan1:801 interface not up _disable_interface_checking = TRUE to disable this check for single instance cluster. Check output from ifconfig c
. For details refer to "(:CLSN00107:)" in "/u01/app/oracle/11.2.0/grid/log/proddb02/agent/ohasd/oraagent_oracle/oraagent_oracle.log".
CRS-2674: Start of 'ora.asm' on 'proddb02' failed
CRS-2679: Attempting to clean 'ora.asm' on 'proddb02'
CRS-2681: Clean of 'ora.asm' on 'proddb02' succeeded
CRS-2674: Start of 'ora.asm' on 'proddb02' failed
CRS-4000: Command Start failed, or completed with errors.
$ crsctl stat res -t
NAME TARGET STATE SERVER STATE_DETAILS
Local Resources
ora.DATADG.dg
ONLINE ONLINE proddb01
ONLINE OFFLINE proddb02
ora.FLASHDG.dg
ONLINE ONLINE proddb01
ONLINE OFFLINE proddb02
ora.LISTENER.lsnr
ONLINE ONLINE proddb01
ONLINE ONLINE proddb02
ora.asm
ONLINE ONLINE proddb01 Started
ONLINE OFFLINE proddb02
ora.gsd
OFFLINE OFFLINE proddb01
OFFLINE OFFLINE proddb02
ora.net1.network
ONLINE ONLINE proddb01
ONLINE ONLINE proddb02
ora.ons
ONLINE ONLINE proddb01
ONLINE ONLINE proddb02
Cluster Resources
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE proddb01
ora.cvu
1 ONLINE ONLINE proddb01
ora.proddb01.vip
1 ONLINE ONLINE proddb01
ora.proddb02.vip
1 ONLINE ONLINE proddb02
ora.oc4j
1 OFFLINE OFFLINE
ora.orcl.db
1 ONLINE ONLINE proddb01 Open
2 ONLINE OFFLINE Instance Shutdown
ora.orcl.orcltaf.svc
1 ONLINE ONLINE proddb01
2 ONLINE OFFLINE
ora.scan1.vip
1 ONLINE ONLINE proddb01
The interface lan1:801 is not correct; it was not there initially.
NODE2
$ oifcfg getif
lan1 192.168.1.0 global cluster_interconnect
lan0 10.241.16.128 global public
$ oifcfg iflist
lan1 192.168.1.0
lan1 169.254.0.0
lan0 10.141.15.128
NODE1
$ oifcfg getif
lan1 192.168.1.0 global cluster_interconnect
lan0 10.141.15.128 global public
$ oifcfg iflist
lan1 192.168.1.0
lan0 10.141.15.128
ORA-27303: additional information: requested interface lan1:801 interface not up _disable_interface_checking = TRUE to disable this check for single instance cluster. Check output from ifconfig c
Instead of looking for lan1 192.168.1.0, the clusterware is using lan1:801, which was not there before installing PSU 11.2.0.3.6.
This may be why the ASM service is not starting.
Any help would be highly appreciated.
Trying to remove the virtual lan1:801
# netstat -inw
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
lan1 1500 192.168.1.0 192.168.1.11 69344 0 73890 0 0
lan0 1500 10.141.15.128 10.141.15.136 7343 0 7293 0 0
lo0 32808 127.0.0.0 127.0.0.1 250567 0 250567 0 0
lan1:801 1500 169.254.0.0 169.254.150.220 138 0 138 0 0
lan0:801 1500 10.141.15.128 10.141.15.138 280 0 0 0 0
# ifconfig lan1:801 0.0.0.0 down
As soon as lan1:801 was brought down, a new interface, lan1:802, came into existence automatically with the same IP address.
# netstat -inw
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
lan1 1500 192.168.1.0 192.168.1.11 69423 0 73998 0 0
lan0 1500 10.141.15.128 10.141.15.136 7421 0 7351 0 0
lo0 32808 127.0.0.0 127.0.0.1 252055 0 252055 0 0
lan1:801* 1500 none none 0 0 0 0 0
lan1:802 1500 169.254.0.0 169.254.150.220 0 0 0 0 0
lan0:801 1500 10.141.15.128 10.141.15.138 280 0 0 0 0
# ifconfig lan1:802 0.0.0.0 down
As soon as lan1:802 was brought down, a new interface, lan1:803, came into existence automatically with the same IP address.
# netstat -inw
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
lan1 1500 192.168.1.0 192.168.1.11 69491 0 74087 0 0
lan0 1500 10.141.15.128 10.141.15.136 7437 0 7367 0 0
lo0 32808 127.0.0.0 127.0.0.1 252208 0 252208 0 0
lan1:801* 1500 none none 0 0 0 0 0
lan1:803 1500 169.254.0.0 169.254.150.220 0 0 0 0 0
lan1:802* 1500 none none 0 0 0 0 0
lan0:801 1500 10.141.15.128 10.141.15.138 280 0 0 0 0
A new virtual interface is created each time the old one is disabled.
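The alias that keeps coming back always carries an address in 169.254.0.0, which is the link-local range, and the `crsctl stat res -t -init` listing in this thread shows an `ora.cluster_interconnect.haip` resource online, so the alias may well be created by Clusterware itself rather than being a stray interface. A minimal Python sketch, assuming the `netstat -inw` column layout shown above, that picks out such aliases (the function name `link_local_aliases` and the sample text are illustrative only):

```python
import ipaddress

# 169.254.0.0/16 is the IPv4 link-local range.
LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")

def link_local_aliases(netstat_output):
    """Return (name, address) for each interface alias with a link-local address."""
    found = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
        if len(fields) < 4 or ":" not in fields[0]:
            continue  # header, physical interface, or blank line
        name = fields[0].rstrip("*")  # the trailing * appears on aliases that were brought down
        try:
            address = ipaddress.ip_address(fields[3])
        except ValueError:
            continue  # e.g. "none" for a disabled alias
        if address in LINK_LOCAL:
            found.append((name, str(address)))
    return found

sample = """\
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
lan1 1500 192.168.1.0 192.168.1.11 69491 0 74087 0 0
lan0 1500 10.141.15.128 10.141.15.136 7437 0 7367 0 0
lo0 32808 127.0.0.0 127.0.0.1 252208 0 252208 0 0
lan1:801* 1500 none none 0 0 0 0 0
lan1:803 1500 169.254.0.0 169.254.150.220 0 0 0 0 0
lan1:802* 1500 none none 0 0 0 0 0
lan0:801 1500 10.141.15.128 10.141.15.138 280 0 0 0 0
"""

print(link_local_aliases(sample))
```

Run against the third listing above, this reports only lan1:803, the alias that currently holds the 169.254 address.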
I brought down the interface lan1:803 and tried to start the "ora.asm" resource:
# ifconfig lan1:803 down
# netstat -inw
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
lan1 1500 192.168.1.0 192.168.1.11 74078 0 79108 0 0
lan0 1500 10.241.16.128 10.241.16.136 7817 0 7749 0 0
lo0 32808 127.0.0.0 127.0.0.1 266335 0 266335 0 0
lan1:801* 1500 none none 0 0 0 0 0
lan1:803* 1500 none none 0 0 0 0 0
lan1:802* 1500 none none 0 0 0 0 0
lan0:801 1500 10.241.16.128 10.241.16.138 314 0 0 0 0
Resource status
$ crsctl stat res -t -init
NAME TARGET STATE SERVER STATE_DETAILS
Cluster Resources
ora.asm
1 ONLINE OFFLINE Instance Shutdown
ora.cluster_interconnect.haip
1 ONLINE ONLINE proddb02
ora.crsd
1 ONLINE ONLINE proddb02
ora.cssd
1 ONLINE ONLINE proddb02
ora.cssdmonitor
1 ONLINE ONLINE proddb02
ora.ctssd
1 ONLINE ONLINE proddb02 OBSERVER
ora.diskmon
1 OFFLINE OFFLINE
ora.evmd
1 ONLINE ONLINE proddb02
ora.gipcd
1 ONLINE ONLINE proddb02
ora.gpnpd
1 ONLINE ONLINE proddb02
ora.mdnsd
1 ONLINE ONLINE proddb02
$
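The listing above shows ora.asm with TARGET ONLINE but STATE OFFLINE. A minimal Python sketch, assuming the flattened two-line layout shown here (a resource name line followed by its state rows), that lists every resource in that condition; `offline_but_wanted` is a hypothetical helper, not part of any Oracle tool:

```python
def offline_but_wanted(status_output):
    """Return (resource, remaining columns) where TARGET is ONLINE but STATE is not."""
    problems = []
    resource = None
    for line in status_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("ora."):
            resource = fields[0]  # a resource name line; its state rows follow
            continue
        if fields[0].isdigit():
            fields = fields[1:]  # drop the instance number shown for cluster resources
        if resource is None or len(fields) < 2:
            continue  # header lines before the first resource
        target, state = fields[0], fields[1]
        if target == "ONLINE" and state != "ONLINE":
            problems.append((resource, " ".join(fields[2:])))
    return problems

sample = """\
NAME TARGET STATE SERVER STATE_DETAILS
Cluster Resources
ora.asm
1 ONLINE OFFLINE Instance Shutdown
ora.cluster_interconnect.haip
1 ONLINE ONLINE proddb02
ora.diskmon
1 OFFLINE OFFLINE
"""

print(offline_but_wanted(sample))
```

Resources whose TARGET is OFFLINE, such as ora.diskmon here, are deliberately ignored because nothing is asking for them to run.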
$ crsctl start resource ora.asm
CRS-2672: Attempting to start 'ora.asm' on 'proddb02'
CRS-2672: Attempting to start 'ora.asm' on 'proddb02'
CRS-5017: The resource action "ora.asm start" encountered the following error:
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:check if failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpcini3
ORA-27303: additional information: requested interface lan1:803 interface not up _disable_interface_checking = TRUE to disable this check for single instance cluster. Check output from ifconfig c
. For details refer to "(:CLSN00107:)" in "/u01/app/oracle/11.2.0/grid/log/proddb02/agent/ohasd/oraagent_oracle/oraagent_oracle.log".
CRS-2674: Start of 'ora.asm' on 'proddb02' failed
CRS-2679: Attempting to clean 'ora.asm' on 'proddb02'
CRS-2681: Clean of 'ora.asm' on 'proddb02' succeeded
CRS-2674: Start of 'ora.asm' on 'proddb02' failed
CRS-4000: Command Start failed, or completed with errors.
# ifconfig lan1:804 up
$ crsctl start resource ora.asm
CRS-2672: Attempting to start 'ora.asm' on 'proddb02'
CRS-2672: Attempting to start 'ora.asm' on 'proddb02'
CRS-5017: The resource action "ora.asm start" encountered the following error:
ORA-03113: end-of-file on communication channel
Process ID: 0
Session ID: 0 Serial number: 0
. For details refer to "(:CLSN00107:)" in "/u01/app/oracle/11.2.0/grid/log/proddb02/agent/ohasd/oraagent_oracle/oraagent_oracle.log".
CRS-2674: Start of 'ora.asm' on 'proddb02' failed
CRS-2679: Attempting to clean 'ora.asm' on 'proddb02'
CRS-2681: Clean of 'ora.asm' on 'proddb02' succeeded
CRS-2674: Start of 'ora.asm' on 'proddb02' failed
CRS-4000: Command Start failed, or completed with errors. -
CRS-0210: Could not find resource while start ASM on 10G RAC
Dear All
I'm trying to install 10g RAC on Redhat 4.1. At the point at which I try to start ASM it fails with the message
ORA-03113: end-of-file on communication channel
select value from v$parameter where name='instance_type'
ERROR at line 1:
ORA-01034: ORACLE not available
This seems to be due to a CRS failure. The trace file shows:
[Thread-11] [17:14:2:907] [HAOperationImpl.runCommand:1254] CRS cmd is: /u01/app/oracle/product/10.2.0/crs/bin/crs_stat -u ora.bfhxx-sql012.ASM1.asm
[Thread-41] [17:14:2:964] [StreamReader.run:65] OUTPUT>CRS-0210: Could not find resource ora.bfhxx-sql012.ASM1.asm.
Executing
/u01/app/oracle/product/10.2.0/crs/bin/crs_stat
produces:
NAME=ora.bfhxx-sql012.LISTENER_BFHXX-SQL012.lsnr
TYPE=application
TARGET=ONLINE
STATE=ONLINE on bfhxx-sql012
NAME=ora.bfhxx-sql012.gsd
TYPE=application
TARGET=ONLINE
STATE=ONLINE on bfhxx-sql012
NAME=ora.bfhxx-sql012.ons
TYPE=application
TARGET=ONLINE
STATE=ONLINE on bfhxx-sql012
NAME=ora.bfhxx-sql012.vip
TYPE=application
TARGET=ONLINE
STATE=ONLINE on bfhxx-sql012
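The trace complains about ora.bfhxx-sql012.ASM1.asm, and no such NAME appears in the listing above. A small Python sketch, assuming the NAME=/TYPE=/TARGET=/STATE= layout shown, that groups the output into records so the check can be made mechanically; `parse_crs_stat` is a hypothetical helper and the sample is abbreviated:

```python
def parse_crs_stat(output):
    """Group NAME=/TYPE=/TARGET=/STATE= lines into one dict per resource."""
    resources = []
    for line in output.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue  # not a KEY=value line
        if key == "NAME":
            resources.append({})  # every NAME line starts a new resource
        if resources:
            resources[-1][key] = value
    return resources

sample = """\
NAME=ora.bfhxx-sql012.LISTENER_BFHXX-SQL012.lsnr
TYPE=application
TARGET=ONLINE
STATE=ONLINE on bfhxx-sql012
NAME=ora.bfhxx-sql012.vip
TYPE=application
TARGET=ONLINE
STATE=ONLINE on bfhxx-sql012
"""

registered = {r["NAME"] for r in parse_crs_stat(sample)}
print("ora.bfhxx-sql012.ASM1.asm" in registered)
```

With the listing from this post the membership check is false, which matches the CRS-0210 message in the trace.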
Is this a red herring, or is it the reason ASM won't start? At what point should this entry have been created (i.e. should it already exist when ASM starts, or should ASM create it)? Can I manually create the entry using crs_stat?
Many thanks
Paul
It all depends on what your mapping problem was and what you did to resolve it. Here are a couple of things you could try:
- Make sure the ocr.loc file on ebdb1 has the same entry as on ebdb2. Also, if you are using shared raw devices for CRS, make sure the raw devices are consistent on both nodes.
If the above does not work:
- Make sure the shared devices are consistent on both nodes and restore one of the OCR backups. I am sure at least one of the backups (4hrs/8hrs/1day/2day/1week) must be worthwhile. If you do restore the OCR from a backup, make sure to run cluvfy before you start CRS.
It is always good practice to take a backup of the status quo before you restore from the backups.
It would be interesting to know what led to your disk mapping problem in the first place.
Good Luck! -
Cluster and ASM services don't start automatically
Hi,
I have configured a 2 node 10g RAC environment (rac, rac2) in RHEL 4 through vmware.
However, the cluster services and ASM services do not come up automatically after a server reboot. I have to bring them up manually using the ./srvctl command.
[root@rac2 bin]# ./crs_stat -t
Name Type Target State Host
ora....SM1.asm application ONLINE UNKNOWN rac
ora....AC.lsnr application ONLINE UNKNOWN rac
ora.rac.gsd application ONLINE UNKNOWN rac
ora.rac.ons application ONLINE UNKNOWN rac
ora.rac.vip application ONLINE ONLINE rac
[email protected] application ONLINE UNKNOWN rac2
ora....C2.lsnr application ONLINE UNKNOWN rac2
ora.rac2.gsd application ONLINE UNKNOWN rac2
ora.rac2.ons application ONLINE UNKNOWN rac2
ora.rac2.vip application ONLINE ONLINE rac2
Could someone please guide me as to how these services can be brought up automatically upon every reboot?
Normally all services start automatically; I am not sure how you installed and configured your RAC.
Here are all the options.
Option1:-
Re: CRS auto-start
Option2:-
See the below links.
http://jaffardba.blogspot.com/2009/03/how-to-startup-rac-database-services.html
http://www.dannorris.com/2009/03/12/start-database-services-automatically-after-instance-startup/
Option3:-
If you simply want to start the services upon server reboot, put the srvctl command in the rc.local script under the /etc directory.
Hope this solves your issue.
Regards
http://www.oracleracexpert.com -
OCR and ASM dependency in 11.2
Grid Version: 11.2.0.2
Platform : Solaris 10
Question1.
Since ASM's configuration information is stored in the OCR, if the OCR is lost (e.g. due to a corrupt LUN in the OCR's disk group), will the ASM instance crash?
Question2.
If the ASM instance crashes (e.g. someone accidentally killed a mandatory ASM process), will the OCR be accessible, given that the OCR is stored in an ASM disk group?
Question3.
What precautions can I take to protect the OCR from failures?
Tom wrote:
Hi Levi, Kuljeet
Because of OLR (Oracle local registry) , I was under the impression that ASM won't crash even if OCR is lost.
http://www.linkedin.com/groups/How-restore-OCR-in-11gR2-3156190.S.93908910
Hi Tom,
When Clusterware starts, three files are involved.
OLR - This is the first file to be read and opened. It is local to the node and records where the voting disk is stored, along with the information needed to start ASM (e.g. the ASM discovery string).
VOTING DISK - This is the second file to be opened and read. Reading the voting file depends only on the OLR being accessible. ASM starts after CSSD; ASM does not start if CSSD is offline (i.e. the voting file is missing).
OCR - Finally, the ASM instance starts and mounts all disk groups; then the Clusterware daemon (CRSD) opens and reads the OCR, which is stored in a disk group.
So, once ASM has started, it does not depend on the OCR or OLR being online. ASM depends on CSSD (voting disk) being online.
There is an exclusive mode for starting ASM without CSSD (but it is meant for restoring the OCR or voting disk).
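The startup order described above can be pictured as a simple dependency chain. The following Python sketch is only an illustration of that description, not a model of the real Clusterware agents; the `depends_on` mapping is my own reading of the explanation, and `graphlib` requires Python 3.9 or later:

```python
from graphlib import TopologicalSorter

# Each key becomes usable only once everything in its set is available.
# The edges follow the explanation above and are a simplification.
depends_on = {
    "voting disk": {"OLR"},    # the OLR records where the voting disk is
    "CSSD": {"voting disk"},   # CSSD needs the voting file
    "ASM": {"CSSD"},           # ASM does not start if CSSD is offline
    "OCR": {"ASM"},            # the OCR lives in an ASM disk group
    "CRSD": {"OCR"},           # CRSD opens and reads the OCR
}

startup_order = list(TopologicalSorter(depends_on).static_order())
print(" -> ".join(startup_order))
```

Read in this direction, nothing that ASM needs comes after it in the chain, which is the point of the answer: losing the OCR affects CRSD, not an ASM instance that is already running.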
Regards,
Levi Pereira -
Hi All,
I am getting a message in the ASM alert log saying "NOTE: ASMB process exiting due to lack of ASM file activity".
This leads to frequent crashes of node 1. Please check the detailed error below and suggest a solution.
Thu Mar 24 07:05:11 2011
LMD0 (ospid: 32493) has not called a wait for 94 secs.
GES: System Load is HIGH.
GES: Current load is 55.87 and high load threshold is 20.00
Thu Mar 24 07:06:32 2011
LMD0 (ospid: 32493) has not called a wait for 174 secs.
GES: System Load is HIGH.
GES: Current load is 71.23 and high load threshold is 20.00
Thu Mar 24 07:06:36 2011
Trace dumping is performing id=[cdmp_20110324070635]
Thu Mar 24 07:07:49 2011
Trace dumping is performing id=[cdmp_20110324070635]
Thu Mar 24 07:08:16 2011
Waiting for clusterware split-brain resolution
Thu Mar 24 07:18:17 2011
Errors in file /u01/app/oracle/diag/asm/asm/+ASM1/trace/+ASM1_lmon_32484.trc (incident=60073):
ORA-29740: evicted by member 1, group incarnation 120
Incident details in: /u01/app/oracle/diag/asm/asm/+ASM1/incident/incdir_60073/+ASM1_lmon_32484_i60073.trc
Thu Mar 24 07:18:19 2011
Trace dumping is performing id=[cdmp_20110324071819]
Errors in file /u01/app/oracle/diag/asm/asm/+ASM1/trace/+ASM1_lmon_32484.trc:
ORA-29740: evicted by member 1, group incarnation 120
LMON (ospid: 32484): terminating the instance due to error 29740
System state dump is made for local instance
System State dumped to trace file /u01/app/oracle/diag/asm/asm/+ASM1/trace/+ASM1_diag_32459.trc
Trace dumping is performing id=[cdmp_20110324071820]
Instance terminated by LMON, pid = 32484
Thu Mar 24 07:18:31 2011
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Interface type 1 eth1 172.20.223.0 configured from OCR for use as a cluster interconnect
Interface type 1 eth0 172.20.222.0 configured from OCR for use as a public interface
Picked latch-free SCN scheme 2
Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/product/11.1.0/asm_1/dbs/arch
Autotune of undo retention is turned on.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up ORACLE RDBMS Version: 11.1.0.7.0.
Using parameter settings in server-side pfile /u01/app/oracle/product/11.1.0/asm_1/dbs/initASM1.ora+
System parameters with non-default values:
large_pool_size = 12M
instance_type = "asm"
cluster_database = TRUE
instance_number = 1
asm_diskstring = "ORCL:*"
asm_diskgroups = "REDO01"
asm_diskgroups = "REDO02"
asm_diskgroups = "DATA"
asm_diskgroups = "RECOVERY"
diagnostic_dest = "/u01/app/oracle"
Cluster communication is configured to use the following interface(s) for this instance
172.20.223.25
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
Thu Mar 24 07:18:36 2011
PMON started with pid=2, OS id=23120
Thu Mar 24 07:18:36 2011
VKTM started with pid=3, OS id=23123 at elevated priority
VKTM running at (20)ms precision
Thu Mar 24 07:18:36 2011
DIAG started with pid=4, OS id=23127
Thu Mar 24 07:18:37 2011
PING started with pid=5, OS id=23129
Thu Mar 24 07:18:37 2011
PSP0 started with pid=6, OS id=23131
Thu Mar 24 07:18:37 2011
DIA0 started with pid=7, OS id=23133
Thu Mar 24 07:18:37 2011
LMON started with pid=8, OS id=23135
Thu Mar 24 07:18:37 2011
LMD0 started with pid=9, OS id=23137
Thu Mar 24 07:18:37 2011
LMS0 started with pid=10, OS id=23148 at elevated priority
Thu Mar 24 07:18:37 2011
MMAN started with pid=11, OS id=23152
Thu Mar 24 07:18:38 2011
DBW0 started with pid=12, OS id=23170
Thu Mar 24 07:18:38 2011
LGWR started with pid=13, OS id=23176
Thu Mar 24 07:18:38 2011
CKPT started with pid=14, OS id=23218
Thu Mar 24 07:18:38 2011
SMON started with pid=15, OS id=23224
Thu Mar 24 07:18:38 2011
RBAL started with pid=16, OS id=23237
Thu Mar 24 07:18:38 2011
GMON started with pid=17, OS id=23239
lmon registered with NM - instance id 1 (internal mem no 0)
Reconfiguration started (old inc 0, new inc 124)
ASM instance
List of nodes:
0 1 2
Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
Communication channels reestablished
* allocate domain 1, invalid = TRUE
* allocate domain 2, invalid = TRUE
* allocate domain 3, invalid = TRUE
* allocate domain 4, invalid = TRUE
* domain 0 valid = 1 according to instance 1
* domain 1 valid = 1 according to instance 1
* domain 2 valid = 1 according to instance 1
* domain 3 valid = 1 according to instance 1
* domain 4 valid = 1 according to instance 1
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
LMS 0: 0 GCS shadows traversed, 0 replayed
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
Thu Mar 24 07:18:40 2011
LCK0 started with pid=18, OS id=23277
ORACLE_BASE from environment = /u01/app/oracle
Thu Mar 24 07:18:41 2011
SQL> ALTER DISKGROUP ALL MOUNT
NOTE: cache registered group DATA number=1 incarn=0xf7063e39
NOTE: cache began mount (not first) of group DATA number=1 incarn=0xf7063e39
NOTE: cache registered group RECOVERY number=2 incarn=0xf7063e3a
NOTE: cache began mount (not first) of group RECOVERY number=2 incarn=0xf7063e3a
NOTE: cache registered group REDO01 number=3 incarn=0xf7163e3b
NOTE: cache began mount (not first) of group REDO01 number=3 incarn=0xf7163e3b
NOTE: cache registered group REDO02 number=4 incarn=0xf7163e3c
NOTE: cache began mount (not first) of group REDO02 number=4 incarn=0xf7163e3c
NOTE:Loaded lib: /opt/oracle/extapi/32/asm/orcl/1/libasm.so
NOTE: Assigning number (1,0) to disk (ORCL:ASM_DATA1)
NOTE: Assigning number (1,1) to disk (ORCL:ASM_DATA2)
NOTE: Assigning number (2,0) to disk (ORCL:ASM_RECO1)
NOTE: Assigning number (3,0) to disk (ORCL:ASM_LOG1)
NOTE: Assigning number (4,0) to disk (ORCL:ASM_LOG2)
kfdp_query(): 5
kfdp_queryBg(): 5
NOTE: cache opening disk 0 of grp 1: DATA1 label:ASM_DATA1
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache opening disk 1 of grp 1: DATA2 label:ASM_DATA2
NOTE: cache mounting (not first) group 1/0xF7063E39 (DATA)
kjbdomatt send to node 1
kjbdomatt send to node 2
NOTE: attached to recovery domain 1
NOTE: LGWR attempting to mount thread 1 for diskgroup 1
NOTE: LGWR mounted thread 1 for disk group 1
NOTE: opening chunk 1 at fcn 0.10794571 ABA
NOTE: seq=81 blk=1313
NOTE: cache mounting group 1/0xF7063E39 (DATA) succeeded
NOTE: cache ending mount (success) of group DATA number=1 incarn=0xf7063e39
kfdp_query(): 6
kfdp_queryBg(): 6
NOTE: cache opening disk 0 of grp 2: RECO1 label:ASM_RECO1
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache mounting (not first) group 2/0xF7063E3A (RECOVERY)
kjbdomatt send to node 1
kjbdomatt send to node 2
NOTE: attached to recovery domain 2
NOTE: LGWR attempting to mount thread 1 for diskgroup 2
NOTE: LGWR mounted thread 1 for disk group 2
NOTE: opening chunk 1 at fcn 0.10436377 ABA
NOTE: seq=48 blk=4298
NOTE: cache mounting group 2/0xF7063E3A (RECOVERY) succeeded
NOTE: cache ending mount (success) of group RECOVERY number=2 incarn=0xf7063e3a
kfdp_query(): 7
kfdp_queryBg(): 7
NOTE: cache opening disk 0 of grp 3: LOG1 label:ASM_LOG1
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache mounting (not first) group 3/0xF7163E3B (REDO01)
kjbdomatt send to node 1
kjbdomatt send to node 2
NOTE: attached to recovery domain 3
NOTE: LGWR attempting to mount thread 1 for diskgroup 3
NOTE: LGWR mounted thread 1 for disk group 3
NOTE: opening chunk 1 at fcn 0.229332 ABA
NOTE: seq=30 blk=10690
NOTE: cache mounting group 3/0xF7163E3B (REDO01) succeeded
NOTE: cache ending mount (success) of group REDO01 number=3 incarn=0xf7163e3b
kfdp_query(): 8
kfdp_queryBg(): 8
NOTE: cache opening disk 0 of grp 4: LOG2 label:ASM_LOG2
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache mounting (not first) group 4/0xF7163E3C (REDO02)
kjbdomatt send to node 1
kjbdomatt send to node 2
NOTE: attached to recovery domain 4
NOTE: LGWR attempting to mount thread 1 for diskgroup 4
NOTE: LGWR mounted thread 1 for disk group 4
NOTE: opening chunk 1 at fcn 0.225880 ABA
NOTE: seq=30 blk=10556
NOTE: cache mounting group 4/0xF7163E3C (REDO02) succeeded
NOTE: cache ending mount (success) of group REDO02 number=4 incarn=0xf7163e3c
kfdp_query(): 9
kfdp_queryBg(): 9
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 1
SUCCESS: diskgroup DATA was mounted
kfdp_query(): 10
kfdp_queryBg(): 10
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 2
SUCCESS: diskgroup RECOVERY was mounted
kfdp_query(): 11
kfdp_queryBg(): 11
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 3
SUCCESS: diskgroup REDO01 was mounted
kfdp_query(): 12
kfdp_queryBg(): 12
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 4
SUCCESS: diskgroup REDO02 was mounted
SUCCESS: ALTER DISKGROUP ALL MOUNT
Thu Mar 24 08:26:28 2011
Starting background process ASMB
Thu Mar 24 08:26:28 2011
ASMB started with pid=20, OS id=9597
NOTE: ASMB process exiting due to lack of ASM file activity for 5 seconds
Thu Mar 24 08:27:39 2011
Starting background process ASMB
Thu Mar 24 08:27:39 2011
ASMB started with pid=25, OS id=10735
NOTE: ASMB process exiting due to lack of ASM file activity for 5 seconds
[oracle@qa1crmrac1 trace]$ tail -1500 alert_ASM1.log
Do I need to set the compatible parameter?
Regards,
Vish
It looks to me like your server is absolutely buried, and ASM may just be an innocent bystander. What is going on in the database when this happens? Also, run sar samples at 30-second intervals up to the point when this happens to see what the server is doing. It adds overhead, but you need to find what is causing the problem with the server(s).
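The alert log in the question already carries some load evidence in its GES and LMD0 lines. A minimal Python sketch, assuming the message wording shown in that log, that pulls those figures out so they can be lined up against sar samples; the function name `load_warnings` is illustrative:

```python
import re

LOAD_RE = re.compile(
    r"GES: Current load is ([\d.]+) and high load threshold is ([\d.]+)"
)
WAIT_RE = re.compile(r"(\w+) \(ospid: (\d+)\) has not called a wait for (\d+) secs")

def load_warnings(alert_log):
    """Return ([(load, threshold), ...], [(process, seconds), ...]) found in the log text."""
    loads = [(float(cur), float(thr)) for cur, thr in LOAD_RE.findall(alert_log)]
    waits = [(name, int(secs)) for name, _pid, secs in WAIT_RE.findall(alert_log)]
    return loads, waits

sample = """\
Thu Mar 24 07:05:11 2011
LMD0 (ospid: 32493) has not called a wait for 94 secs.
GES: System Load is HIGH.
GES: Current load is 55.87 and high load threshold is 20.00
Thu Mar 24 07:06:32 2011
LMD0 (ospid: 32493) has not called a wait for 174 secs.
GES: System Load is HIGH.
GES: Current load is 71.23 and high load threshold is 20.00
"""

loads, waits = load_warnings(sample)
print(loads)
print(waits)
```

In the posted log both samples are well above the 20.00 threshold in the minutes before the eviction, which fits the suggestion that the server itself is overloaded.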
Are you swapping? -
Unable to remove ASM resource from OCR
Hi all,
I have a new 3-node RAC installation with Clusterware and OCFS on Win2003.
Clusterware was installed and patched successfully, and so was the database (10.2.0.4 patch 5).
When I tried to configure ASM, the 1st node crashed and rebooted, but a "ghost" ASM instance was left in the services.
I removed it with oradim and started the ASM configuration again.
On the second attempt the 2nd instance died, and now I have ora.node#.ASM#.asm for all 3 nodes in the OCR in UNKNOWN state, and I am not able to delete them and restart the ASM configuration.
I tried crs_unregister but it was unsuccessful (CRS-0216).
Any suggestions?
Why did you try crs_unregister? Did "srvctl -f" not work?
crs_unregister is often used when it is not needed. You should use srvctl remove to remove CRS resources. crs_unregister should only be used if you are trying to repair an inconsistency; I'm not sure whether that is the case here.
Is the clusterware working fine, and is OCFS2 configured properly? Did you successfully mount a CFS on all nodes?
If this does not help and you have a valid CSI, take a look at Metalink note 357261.1 -
11g CRS and ASM, 10.2.0.4 database, how to set up RMAN
Hi all, I am fairly green when it comes to RAC so please bear with me. I have a Linux 2 node RAC environment. The servers are lux148 and lux149. The 2 node cluster is running 11g CRS and ASM. The database is 10.2.0.4. It had to be set up this way in order for IBM DataStage to work. The database name is fictrp0. The instances are fictrp01 (lux148) and fictrp02 (lux149). I am trying to register the database with an RMAN catalog. I am under the impression that I need to register the database (fictrp0) not the instances (fictrp01 and fictrp02) with the RMAN catalog. Is this correct? So I log into lux148 and set my environment so the ORACLE_SID=FICTRP0. From the command line I issue the following:
fictrp0:/u01/app/oracle> rman target / catalog rman102/[email protected]
The command returns the following:
Recovery Manager: Release 10.2.0.4.0 - Production on Wed Oct 29 14:05:50 2008
Copyright (c) 1982, 2007, Oracle. All rights reserved.
connected to target database (not started)
connected to recovery catalog database
Is this normal "connected to target database (not started)"? I was expecting to see a DBID=FICTRP0 for the target. If this is typical, how will RMAN know the ID of the database? Should I be trying to register an instance perhaps such as FICTRP01? If so do I need to register both instances (FICTRP01 and FICTRP02)?
Bottom line: I am very confused about how RAC and RMAN work together. Any help would be greatly appreciated.
You do not need to register the instances. You only need to register the database; only the database has a DBID.