RAC database auto down/restart

Hi Experts,
We have a 4-node Oracle 11.1 RAC with a 10.2.0.4 database on Red Hat Linux 5.0.
The alert log file shows that the database instance shut down and restarted at about 9:30 PM.
The sysadmin told me that the Linux server was never rebooted.
I found the messages below in cssdOUT.log under /u01/app/crs/product/11.1.0/log/sale/cssd:
[oracle@sale cssd]$ tail -3000 cssdOUT.log
s0clssscGetEnvOracleUser: calling getpwnam_r for user oracle
s0clssscGetEnvOracleUser: info for user oracle complete
s0main: IP addr file already exists(8).
setsid: failed with -1/1
s0clssscGetEnvOracleUser: calling getpwnam_r for user oracle
s0clssscGetEnvOracleUser: info for user oracle complete
09/30/09 21:24:30: CSSD starting
What do the messages above mean? Where should I look for more information about them?
Thanks,
Jim

Thanks for your help.
I am posting all the messages from the crsd log file since that point.
I look forward to your help.
Jim
============crsd log
2009-09-30 21:18:14.346: [  CRSRES][1509046592] In stateChanged, ora.sale.sales1.inst target is ONLINE
2009-09-30 21:18:14.346: [  CRSRES][1509046592] ora.sale.sale1.inst on sale_east went OFFLINE unexpectedly
2009-09-30 21:18:14.347: [  CRSRES][1509046592] StopResource: setting CLI values
2009-09-30 21:18:14.347: [ COMMCRS][1167051072]clsc_receive: (0xc69ef30) message type 0 of size 0
2009-09-30 21:18:14.347: [ COMMCRS][1167051072]clsc_receive: (0xc69ef30) recv failed type 0, size 2120, msgSize 0, flags 0x0000
2009-09-30 21:18:21.639: [  CRSRES][1496455488] In stateChanged, ora.sale_east.ASM1.asm target is ONLINE
2009-09-30 21:18:21.639: [  CRSRES][1496455488] ora.sale_east.ASM1.asm on sale_east went OFFLINE unexpectedly
2009-09-30 21:18:21.639: [  CRSRES][1496455488] StopResource: setting CLI values
2009-09-30 21:24:25.942: [ default][4046827776] CRS Daemon Starting
2009-09-30 21:24:25.990: [ CRSMAIN][4046827776] Checking the OCR device
2009-09-30 21:24:25.996: [ CRSMAIN][4046827776] Connecting to the CSS Daemon
2009-09-30 21:24:26.268: [ COMMCRS][1092147520]clsc_connect: (0x1a218590) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_sale_east_))
2009-09-30 21:24:26.269: [ CSSCLNT][4046827776]clsssInitNative: failed to connect to (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_sale_east_)), rc 9
2009-09-30 21:24:26.269: [  CRSRTI][4046827776] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2009-09-30 21:24:27.449: [ COMMCRS][1092147520]clsc_connect: (0x1a218620) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_sale_east_))
2009-09-30 21:24:27.450: [ CSSCLNT][4046827776]clsssInitNative: failed to connect to (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_sale_east_)), rc 9
2009-09-30 21:24:27.450: [  CRSRTI][4046827776] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2009-09-30 21:24:28.631: [ COMMCRS][1092147520]clsc_connect: (0x1a218620) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_sale_east_))
2009-09-30 21:24:28.631: [ CSSCLNT][4046827776]clsssInitNative: failed to connect to (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_sale_east_)), rc 9
2009-09-30 21:24:28.631: [  CRSRTI][4046827776] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2009-09-30 21:24:29.812: [ COMMCRS][1092147520]clsc_connect: (0x1a218620) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_sale_east_))
2009-09-30 21:24:29.812: [ CSSCLNT][4046827776]clsssInitNative: failed to connect to (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_sale_east_)), rc 9
2009-09-30 21:24:29.812: [  CRSRTI][4046827776] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2009-09-30 21:24:30.993: [ COMMCRS][1092147520]clsc_connect: (0x1a218620) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_sale_east_))
2009-09-30 21:24:30.993: [ CSSCLNT][4046827776]clsssInitNative: failed to connect to (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_sale_east_)), rc 9
2009-09-30 21:24:30.993: [  CRSRTI][4046827776] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2009-09-30 21:24:33.371: [ CRSMAIN][4046827776] CRSD running as the Privileged user
2009-09-30 21:24:33.589: [  CLSVER][4046827776] Active Version from OCR:11.1.0.7.0
2009-09-30 21:24:33.589: [  CLSVER][4046827776] Active Version and Software Version are same
2009-09-30 21:24:33.589: [ CRSMAIN][4046827776] Initializing OCR
2009-09-30 21:24:33.596: [  OCRRAW][4046827776]proprioo: for disk 0 (/dev/ocr1), id match (1), my id set (531437975,1028247821) total id sets (2), 1st set (531437975,308780202), 2nd set (531437975,1028247821) my votes (2), total votes (2)
2009-09-30 21:24:33.617: [  OCRSRV][4046827776]th_init: Failure in smwait. Waiting to be signalled by Master elector thread.
2009-09-30 21:24:33.639: [    CRSD][4046827776] ENV Logging level for Module: allcomp 0
2009-09-30 21:24:33.648: [    CRSD][4046827776] ENV Logging level for Module: default 0
2009-09-30 21:24:33.657: [    CRSD][4046827776] ENV Logging level for Module: COMMCRS 0
2009-09-30 21:24:33.666: [    CRSD][4046827776] ENV Logging level for Module: COMMNS 0
2009-09-30 21:24:33.675: [    CRSD][4046827776] ENV Logging level for Module: CRSUI 0
2009-09-30 21:24:33.684: [    CRSD][4046827776] ENV Logging level for Module: CRSCOMM 0
2009-09-30 21:24:33.693: [    CRSD][4046827776] ENV Logging level for Module: CRSRTI 0
2009-09-30 21:24:33.702: [    CRSD][4046827776] ENV Logging level for Module: CRSMAIN 0
2009-09-30 21:24:33.711: [    CRSD][4046827776] ENV Logging level for Module: CRSPLACE 0
2009-09-30 21:24:33.712: [    CRSD][4046827776] ENV Logging level for Module: CRSAPP 0
2009-09-30 21:24:33.713: [    CRSD][4046827776] ENV Logging level for Module: CRSRES 0
2009-09-30 21:24:33.714: [    CRSD][4046827776] ENV Logging level for Module: CRSOCR 0
2009-09-30 21:24:33.725: [    CRSD][4046827776] ENV Logging level for Module: CRSTIMER 0
2009-09-30 21:24:33.734: [    CRSD][4046827776] ENV Logging level for Module: CRSEVT 0
2009-09-30 21:24:33.735: [    CRSD][4046827776] ENV Logging level for Module: CRSD 0
2009-09-30 21:24:33.736: [    CRSD][4046827776] ENV Logging level for Module: CLUCLS 0
2009-09-30 21:24:33.737: [    CRSD][4046827776] ENV Logging level for Module: CLSVER 0
2009-09-30 21:24:33.738: [    CRSD][4046827776] ENV Logging level for Module: OCRRAW 0
2009-09-30 21:24:33.739: [    CRSD][4046827776] ENV Logging level for Module: OCROSD 0
2009-09-30 21:24:33.740: [    CRSD][4046827776] ENV Logging level for Module: OCRCAC 0
2009-09-30 21:24:33.741: [    CRSD][4046827776] ENV Logging level for Module: CSSCLNT 0
2009-09-30 21:24:33.750: [    CRSD][4046827776] ENV Logging level for Module: OCRAPI 0
2009-09-30 21:24:33.751: [    CRSD][4046827776] ENV Logging level for Module: OCRUTL 0
2009-09-30 21:24:33.752: [    CRSD][4046827776] ENV Logging level for Module: OCRMSG 0
2009-09-30 21:24:33.753: [    CRSD][4046827776] ENV Logging level for Module: OCRCLI 0
2009-09-30 21:24:33.754: [    CRSD][4046827776] ENV Logging level for Module: OCRSRV 0
2009-09-30 21:24:33.763: [    CRSD][4046827776] ENV Logging level for Module: OCRMAS 0
2009-09-30 21:24:33.763: [ CRSMAIN][4046827776] Filename is /u01/app/crs/product/11.1.0/crs/init/sale_east.pid
[  clsdmt][1409100096]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=sale_eastDBG_CRSD ))
2009-09-30 21:24:33.881: [ CRSMAIN][4046827776] Using Authorizer location: /u01/app/crs/product/11.1.0/crs/auth/
2009-09-30 21:24:33.905: [ CRSMAIN][4046827776] Initializing RTI
2009-09-30 21:24:33.965: [ CRSMAIN][4046827776] Initializing EVMMgr
2009-09-30 21:24:33.965: [CRSTIMER][1430079808] Timer Thread Starting.
2009-09-30 21:24:34.145: [ COMMCRS][1440569664]clsc_connect: (0x2aaaac02be90) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))
2009-09-30 21:24:34.827: [ COMMCRS][1440569664]clsc_connect: (0x2aaaac02be90) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))
2009-09-30 21:24:37.054: [ CRSMAIN][4046827776] CRSD locked during state recovery, please wait.
2009-09-30 21:24:37.222: [ CRSMAIN][4046827776] CRSD recovered, unlocked.
2009-09-30 21:24:37.223: [ CRSMAIN][4046827776] QS socket on: (ADDRESS=(PROTOCOL =ipc)(KEY=ora_crsqs))
2009-09-30 21:24:37.236: [ CRSMAIN][4046827776] CRSD UI socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=CRSD_UI_SOCKET))
2009-09-30 21:24:37.255: [ CRSMAIN][4046827776] E2E socket on: (ADDRESS=(PROTOCOL=tcp)(HOST=sale_east-priv)(PORT=49896))
2009-09-30 21:24:37.255: [ CRSMAIN][4046827776] Starting Threads
2009-09-30 21:24:37.255: [ CRSMAIN][4046827776] CRS Daemon Started.
2009-09-30 21:24:37.255: [ CRSMAIN][1104738624] Starting runCommandServer for (UI = 1, E2E = 0). 0
2009-09-30 21:24:37.255: [ CRSMAIN][1484630336] Starting runCommandServer for (UI = 1, E2E = 0). 1
2009-09-30 21:24:37.464: [  CRSRES][4046827776] startup = 1
2009-09-30 21:24:37.497: [  CRSRES][4046827776] startup = 1
2009-09-30 21:24:37.533: [  CRSRES][4046827776] startup = 1
2009-09-30 21:24:37.615: [  CRSRES][4046827776] startup = 1
2009-09-30 21:24:37.649: [  CRSRES][4046827776] startup = 1
2009-09-30 21:24:37.676: [  CRSRES][4046827776] startup = 1
2009-09-30 21:24:37.741: [  CRSRES][4046827776] startup = 1
2009-09-30 21:24:37.767: [  CRSRES][4046827776] startup = 1
2009-09-30 21:24:37.802: [  CRSRES][4046827776] startup = 1
2009-09-30 21:24:37.861: [  CRSRES][4046827776] startup = 1
2009-09-30 21:24:37.896: [  CRSRES][4046827776] startup = 1
2009-09-30 21:24:37.922: [  CRSRES][4046827776] startup = 1
2009-09-30 21:24:38.038: [  CRSRES][4046827776] startup = 1
2009-09-30 21:24:38.117: [  CRSRES][4046827776] startup = 1
2009-09-30 21:24:38.144: [  CRSRES][4046827776] startup = 1
2009-09-30 21:24:38.261: [  CRSRES][4046827776] startup = 1
2009-09-30 21:24:38.427: [  CRSRES][1490934080] StopResource: setting CLI values
2009-09-30 21:24:38.486: [  CRSRES][1490934080] Attempting to stop `ora.sale_east.vip` on member `sale_north`
2009-09-30 21:24:38.590: [  CRSRES][1493035328] startRunnable: setting CLI values
2009-09-30 21:24:38.635: [  CRSRES][1493035328] Attempting to start `ora.sale_east.ASM1.asm` on member `sale_east`
2009-09-30 21:24:38.758: [  CRSRES][1490934080] Stop of `ora.sale_east.vip` on member `sale_north` succeeded.
2009-09-30 21:24:38.829: [  CRSRES][1490934080] startRunnable: setting CLI values
2009-09-30 21:24:38.829: [  CRSRES][1490934080] Attempting to start `ora.sale_east.vip` on member `sale_east`
2009-09-30 21:24:42.916: [  CRSRES][1490934080] Start of `ora.sale_east.vip` on member `sale_east` succeeded.
2009-09-30 21:24:43.087: [  CRSRES][1490934080] startRunnable: setting CLI values
2009-09-30 21:24:43.160: [  CRSRES][1490934080] Attempting to start `ora.sale_east.LISTENER_sale_east.lsnr` on member `sale_east`
2009-09-30 21:24:46.215: [  CRSRES][1490934080] Start of `ora.sale_east.LISTENER_sale_east.lsnr` on member `sale_east` succeeded.
2009-09-30 21:24:46.524: [  CRSRES][1541298496] CRS-1002: Resource 'ora.sale_east.LISTENER_sale_east.lsnr' is already running on member 'sale_east'
2009-09-30 21:24:50.375: [  CRSRES][1541298496] startRunnable: setting CLI values
2009-09-30 21:24:50.424: [  CRSRES][1541298496] Attempting to start `ora.sale_east.ons` on member `sale_east`
2009-09-30 21:24:52.209: [  CRSRES][1541298496] Start of `ora.sale_east.ons` on member `sale_east` succeeded.
2009-09-30 21:24:52.257: [  CRSRES][1493035328] Start of `ora.sale_east.ASM1.asm on member `sale_east` succeeded.
2009-09-30 21:24:52.334: [  CRSRES][1493035328] startRunnable: setting CLI values
2009-09-30 21:24:52.387: [  CRSRES][1493035328] Attempting to start `ora.sale.sale1.inst` on member `sale_east`
2009-09-30 21:25:21.833: [  CRSRES][1493035328] Start of `ora.sale.sale1.inst` on member `sale_east` succeeded.
2009-09-30 21:25:21.836: [  CRSRES][1545500992] Skip online resource: ora.sale_east.ons
2009-09-30 21:25:22.318: [  CRSRES][1497237824] StopResource: setting CLI values
2009-09-30 21:25:22.424: [  CRSRES][1543399744] startRunnable: setting CLI values
2009-09-30 21:25:22.449: [  CRSRES][1490934080] StopResource: setting CLI values
2009-09-30 21:25:22.541: [  CRSRES][1497237824] Attempting to stop `ora.sale_south.vip` on member `sale_north`
2009-09-30 21:25:22.596: [  CRSRES][1493035328] Attempting to start `ora.sale_south.gsd` on member `sale_south`
2009-09-30 21:25:22.648: [  CRSRES][1543399744] Attempting to start `ora.sale_east.gsd` on member `sale_east`
2009-09-30 21:25:22.661: [  CRSRES][1495136576] Attempting to start `ora.sale_south.ons` on member `sale_south`
2009-09-30 21:25:22.717: [  CRSRES][1490934080] Attempting to stop `ora.sale_west.vip` on member `sale_north`
2009-09-30 21:25:22.932: [  CRSRES][1541298496] Attempting to start `ora.sale_west.ons` on member `sale_west`
2009-09-30 21:25:22.951: [  CRSRES][1545500992] Attempting to start `ora.sale.vmshq.sale2.srv` on member `sale_west`
2009-09-30 21:25:22.969: [  CRSRES][1547602240] Attempting to start `ora.sale_west.gsd` on member `sale_west`
2009-09-30 21:25:23.167: [  CRSRES][1497237824] Stop of `ora.sale_south.vip` on member `sale_north` succeeded.
2009-09-30 21:25:23.316: [  CRSRES][1490934080] Stop of `ora.sale_west.vip` on member `sale_north` succeeded.
2009-09-30 21:25:23.595: [  CRSRES][1497237824] Attempting to start `ora.sale_south.vip` on member `sale_south`
2009-09-30 21:25:23.767: [  CRSRES][1543399744] Start of `ora.sale_east.gsd` on member `sale_east` succeeded.
2009-09-30 21:25:23.809: [  CRSRES][1490934080] Attempting to start `ora.sale_west.vip` on member `sale_west`
2009-09-30 21:25:24.951: [  CRSRES][1493035328] Start of `ora.sale_south.gsd` on member `sale_south` succeeded.
2009-09-30 21:25:25.875: [  CRSRES][1547602240] Start of `ora.sale_west.gsd` on member `sale_west` succeeded.
2009-09-30 21:25:26.073: [  CRSRES][1495136576] Start of `ora.sale_south.ons` on member `sale_south` succeeded.
2009-09-30 21:25:27.303: [  CRSRES][1541298496] Start of `ora.sale_west.ons` on member `sale_west` succeeded.
2009-09-30 21:25:28.069: [  CRSRES][1497237824] Start of `ora.sale_south.vip` on member `sale_south` succeeded.
2009-09-30 21:25:28.179: [  CRSRES][1497237824] Attempting to start `ora.sale_south.LISTENER_sale_south.lsnr` on member `sale_south`
2009-09-30 21:25:29.000: [  CRSRES][1490934080] Start of `ora.sale_west.vip` on member `sale_west` succeeded.
2009-09-30 21:25:29.172: [  CRSRES][1490934080] Attempting to start `ora.sale_west.LISTENER_sale_west.lsnr` on member `sale_west`
2009-09-30 21:25:32.919: [  CRSRES][1497237824] Start of `ora.sale_south.LISTENER_sale_south.lsnr` on member `sale_south` succeeded.
2009-09-30 21:25:32.981: [  CRSRES][1490934080] Start of `ora.sale_west.LISTENER_sale_west.lsnr` on member `sale_west` succeeded.
2009-09-30 21:26:10.724: [  CRSRES][1545500992] Start of `ora.sale.vmshq.sales2.srv` on member `sale_west` succeeded.
2009-09-30 21:26:10.930: [    CRSD][1104738624] SM: rE2Ec: 4
2009-09-30 21:26:10.951: [    CRSD][1547602240] SM:dE2Ec: all E2E cmds done. 0
2009-09-30 21:26:10.953: [    CRSD][1104738624] SM: rE2Ec: 4
2009-09-30 21:26:10.957: [    CRSD][1547602240] SM:dE2Ec: all E2E cmds done. 0
[oracle@sale_east crsd]$
=====messages in crsdOUT.log
Variable Items:
hostname (STRING) = "sale_north"
Formatted Message:
CRS is requested to perform action fail on resource
ora.sale.sales1.inst by Instance Monitor
Event Data Items:
Event Name : sys.ora.clu.crs.app.trigger._name.orasalesales1inst._action.fail
Cluster Event : True
Priority : 200
PID : 7493
PPID : 1
Event Id : 17318
Member Id : 1
Timestamp : 30-Sep-2009 21:18:14
Host Name : sale_east
Cluster Name : crs
User Name : oracle
Format : CRS is requested to perform action $_action on
resource $_name by $source
Reference : cat:evmexp_crs.cat
Variable Items:
_name (STRING) = "ora.sale.sale1.inst"
_action (STRING) = "fail"
source (STRING) = "Instance Monitor"
Formatted Message:
CRS is requested to perform action fail on resource ora.sale_east.ASM1.asm
by Instance Monitor
Event Data Items:
Event Name : 2009-09-30 21:24:25 : Changing directory to /u01/app/crs/product/11.1.0/log/sale_east/crsd
2009-09-30 21:24:25 : CRSD REBOOT
[oracle@sale_east crsd]$ tail -4000 crsdOUT.log
Edited by: user589812 on Oct 1, 2009 2:06 PM
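
One way to narrow down where to look is to reduce the crsd log to a short timeline of the interesting events. The sketch below is only an illustration: the sample lines are copied from the crsd log above, the list of marker strings is my own choice (not an Oracle-defined set), and the function names `timeline` and `outage_gap` are hypothetical.

```python
import re
from datetime import datetime

# Matches crsd.log lines of the form seen above, e.g.
# 2009-09-30 21:18:14.346: [  CRSRES][1509046592] <message>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"\[\s*(?P<module>\w+)\]\[(?P<thread>\d+)\]\s*(?P<msg>.*)$"
)

# Substrings worth flagging. This list is an assumption for illustration.
MARKERS = (
    "went OFFLINE unexpectedly",
    "CRS Daemon Starting",
    "CSS is not ready",
)

# Sample lines copied from the crsd log in this thread.
SAMPLE = """\
2009-09-30 21:18:14.346: [  CRSRES][1509046592] ora.sale.sale1.inst on sale_east went OFFLINE unexpectedly
2009-09-30 21:18:21.639: [  CRSRES][1496455488] ora.sale_east.ASM1.asm on sale_east went OFFLINE unexpectedly
2009-09-30 21:24:25.942: [ default][4046827776] CRS Daemon Starting
2009-09-30 21:24:26.269: [  CRSRTI][4046827776] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2009-09-30 21:24:33.371: [ CRSMAIN][4046827776] CRSD running as the Privileged user
"""


def timeline(lines):
    """Return (timestamp, module, message) for every line containing a marker."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # continuation lines and prompts are skipped
        msg = m.group("msg")
        if any(marker in msg for marker in MARKERS):
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
            events.append((ts, m.group("module"), msg))
    return events


def outage_gap(events):
    """Seconds from the first unexpected OFFLINE to the next CRS daemon start."""
    first_offline = next(
        (ts for ts, _, msg in events if "went OFFLINE unexpectedly" in msg), None
    )
    if first_offline is None:
        return None
    restart = next(
        (ts for ts, _, msg in events
         if "CRS Daemon Starting" in msg and ts >= first_offline),
        None,
    )
    if restart is None:
        return None
    return (restart - first_offline).total_seconds()


if __name__ == "__main__":
    for ts, module, msg in timeline(SAMPLE.splitlines()):
        print(ts, module, msg)
    print("gap in seconds:", outage_gap(timeline(SAMPLE.splitlines())))
```

On the lines quoted here, the resources go OFFLINE at 21:18 and the CRS daemon starts again at 21:24, so the window to compare against the cssd logs and the OS messages file is those roughly six minutes.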

Similar Messages

  • Can we put RAC database in Archivelog mode without shutting down

    All,
    Can we put a RAC database in archivelog mode without shutting it down?
    Our new production database (2-node RAC) is currently in noarchivelog mode, and we need to enable archive logging.
    I believe we need to set cluster_database=false, put the database in archivelog mode, and then bounce the database for the change to take effect.
    Just curious: in 11gR2, can we put a RAC database in archivelog mode without any downtime?

    Whether RAC or non-RAC, the database must be bounced; archivelog mode can only be enabled or disabled while the database is in the MOUNT state.
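
To make the "bounce and switch from MOUNT" point concrete, here is a minimal Python sketch that only lists the steps in order; it does not run anything. It assumes a release on which a single instance can switch the mode while mounted with all other instances down, and it leaves out the cluster_database=false step mentioned in the question. The database name `orcl` and the function name `archivelog_steps` are hypothetical.

```python
def archivelog_steps(db_name):
    """Return ordered (tool, command) pairs for enabling archivelog mode.

    Assumption: all instances are stopped first, then one instance is
    mounted to switch the mode. Nothing here is executed.
    """
    return [
        ("srvctl", f"srvctl stop database -d {db_name}"),   # stop every instance
        ("sqlplus", "STARTUP MOUNT"),                        # mount on one node only
        ("sqlplus", "ALTER DATABASE ARCHIVELOG;"),           # switch the mode
        ("sqlplus", "SHUTDOWN IMMEDIATE"),                   # stop the single instance
        ("srvctl", f"srvctl start database -d {db_name}"),  # start all instances again
        ("sqlplus", "ARCHIVE LOG LIST"),                     # verify the new mode
    ]


if __name__ == "__main__":
    for tool, command in archivelog_steps("orcl"):
        print(f"[{tool}] {command}")
```

Either way there is downtime: the switch happens between the stop and the start of the database.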

  • What steps to follow to make RAC Database down.

    Hi all,
    I need to know the order to follow when bringing a RAC database down completely.
    Information regarding the database:
    OS: IBM AIX,
    ASM storage,
    2-node RAC,
    2 databases.
    By order I mean which component to shut down first (database, ASM, cluster, etc.). Please also give the respective commands for reference.
    Regards,
    vamsi.

    Stopping the Oracle RAC 10g Environment
    The first step is to stop the Oracle instance. When the instance (and related services) is down, then bring down the ASM instance. Finally, shut down the node applications (Virtual IP, GSD, TNS Listener, and ONS).
    $ export ORACLE_SID=orcl1
    $ emctl stop dbconsole
    $ srvctl stop instance -d orcl -i orcl1
    $ srvctl stop asm -n linux1
    $ srvctl stop nodeapps -n linux1
    Starting the Oracle RAC 10g Environment
    The first step is to start the node applications (Virtual IP, GSD, TNS Listener, and ONS). When the node applications are successfully started, then bring up the ASM instance. Finally, bring up the Oracle instance (and related services) and the Enterprise Manager Database console.
    $ export ORACLE_SID=orcl1
    $ srvctl start nodeapps -n linux1
    $ srvctl start asm -n linux1
    $ srvctl start instance -d orcl -i orcl1
    $ emctl start dbconsole
    Start/Stop All Instances with SRVCTL
    Start/stop all the instances and their enabled services. I have included this step just for fun as a way to bring down all instances!
    $ srvctl start database -d orcl
    $ srvctl stop database -d orcl
    Reference: http://www.rampant-books.com/art_hunter_rac_start_stop_cluster.htm
    Refer to these links for more information:
    Starting and Stopping Instances and Oracle Real Application Clusters Databases
    http://download.oracle.com/docs/cd/B19306_01/rac.102/b14197/dbinstmgt.htm#BCEBGHHC
    Server Control Utility Reference
    http://download.oracle.com/docs/cd/B19306_01/rac.102/b14197/srvctladmin.htm
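
The stop and start sequences quoted above are mirror images of each other, which a small Python sketch can make explicit. It only builds the command strings and does not execute them; the names `orcl`, `orcl1`, and `linux1` come from the example above, while the helper names `stop_commands` and `start_commands` are hypothetical.

```python
def stop_commands(db, instance, node):
    """Commands for stopping one node, in the order given above."""
    return [
        "emctl stop dbconsole",                          # Enterprise Manager console
        f"srvctl stop instance -d {db} -i {instance}",   # database instance
        f"srvctl stop asm -n {node}",                    # ASM instance
        f"srvctl stop nodeapps -n {node}",               # VIP, GSD, listener, ONS
    ]


def start_commands(db, instance, node):
    """Start is the stop sequence reversed, with 'stop' swapped for 'start'."""
    return [
        cmd.replace(" stop ", " start ", 1)
        for cmd in reversed(stop_commands(db, instance, node))
    ]


if __name__ == "__main__":
    print("# stop")
    for cmd in stop_commands("orcl", "orcl1", "linux1"):
        print(cmd)
    print("# start")
    for cmd in start_commands("orcl", "orcl1", "linux1"):
        print(cmd)
```

Deriving the start list from the stop list keeps the two from drifting apart if a step is added later.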
    answered by ssolbach
    Just a minor comment on stopping the nodeapps.
    While it is fine to stop the nodeapps on a server, the drawback is that the VIP will not fail over when you stop the nodeapps; it will simply be stopped.
    Hence, if you shut down only one server, clients will fail to connect to the VIP and will have to wait for the TCP timeout.
    So if you are not going to shut down all the servers, but just want to shut down one node, you should fail over the VIP to the other node.
    See: Note 749160.1 Vip Does Not Failover When Nodeapps Stopped
    So it is sometimes better, instead of stopping the nodeapps, to simply shut down the cluster with crsctl stop crs (which will fail over the VIP).
    Sebastian
    reference:-
    Re: RAC Questions

  • Rolling restart of rac database

    What happens to existing connections during a rolling restart of a RAC database?
    If a session is already connected to a node and is doing some work, what will happen? Will the restart wait until that session completes what it is doing?

    >>what happens to the already present connections while doing a rolling restart of rac database?
    >> if it is already established connection to a node and doing something what will happen, will it wait till that session completes what it is doing?
    As already said, if TAF is configured, only SELECT queries will fail over to another instance. If it is DDL, its success will depend on the option (such as IMMEDIATE or TRANSACTIONAL) supplied with the SHUTDOWN command.
    HTH,
    Pradeep

  • Can not start Rac database

    Hi,
    Oracle RAC database 10.2.0.3 on Red Hat 4 with 2 nodes.
    In the beginning we had an ORA-600 [12803] error, so only SYS could connect to the database. I found note 1026653.6, which says that we need to create the AUDSES$ sequence, but that we have to restart the database before doing so.
    When we stopped the database we got another ORA-600, and now it is impossible to start it!
    Here is a copy of our alert file:
    Picked latch-free SCN scheme 2
    Autotune of undo retention is turned on.
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    ksdpec: called for event 13740 prior to event group initialization
    Starting up ORACLE RDBMS Version: 10.2.0.3.0.
    System parameters with non-default values:
    processes = 300
    sessions = 335
    sga_max_size = 524288000
    __shared_pool_size = 310378496
    __large_pool_size = 4194304
    __java_pool_size = 8388608
    __streams_pool_size = 8388608
    spfile = +DATA/osista/spfileosista.ora
    nls_language = FRENCH
    nls_territory = FRANCE
    nls_length_semantics = CHAR
    sga_target = 524288000
    control_files = DATA/osista/controlfile/control01.ctl, DATA/osista/controlfile/control02.ctl
    db_block_size = 8192
    __db_cache_size = 184549376
    compatible = 10.2.0.3.0
    log_archive_dest_1 = LOCATION=USE_DB_RECOVERY_FILE_DEST
    db_file_multiblock_read_count= 16
    cluster_database = TRUE
    cluster_database_instances= 2
    db_create_file_dest = +DATA
    db_recovery_file_dest = +FLASH
    db_recovery_file_dest_size= 68543315968
    thread = 2
    instance_number = 2
    undo_management = AUTO
    undo_tablespace = UNDOTBS2
    undo_retention = 29880
    remote_login_passwordfile= EXCLUSIVE
    db_domain =
    dispatchers = (PROTOCOL=TCP) (SERVICE=OSISTAXDB)
    local_listener = (address=(protocol=tcp)(port=1521)(host=132.147.160.243))
    remote_listener = LISTENERS_OSISTA
    job_queue_processes = 10
    background_dump_dest = /oracle/product/admin/OSISTA/bdump
    user_dump_dest = /oracle/product/admin/OSISTA/udump
    core_dump_dest = /oracle/product/admin/OSISTA/cdump
    audit_file_dest = /oracle/product/admin/OSISTA/adump
    db_name = OSISTA
    open_cursors = 300
    pga_aggregate_target = 104857600
    aq_tm_processes = 1
    Cluster communication is configured to use the following interface(s) for this instance
    172.16.0.2
    Wed Jun 13 11:04:30 2012
    cluster interconnect IPC version:Oracle UDP/IP (generic)
    IPC Vendor 1 proto 2
    PMON started with pid=2, OS id=8560
    DIAG started with pid=3, OS id=8562
    PSP0 started with pid=4, OS id=8566
    LMON started with pid=5, OS id=8570
    LMD0 started with pid=6, OS id=8574
    LMS0 started with pid=7, OS id=8576
    LMS1 started with pid=8, OS id=8580
    MMAN started with pid=9, OS id=8584
    DBW0 started with pid=10, OS id=8586
    LGWR started with pid=11, OS id=8588
    CKPT started with pid=12, OS id=8590
    SMON started with pid=13, OS id=8592
    RECO started with pid=14, OS id=8594
    CJQ0 started with pid=15, OS id=8596
    MMON started with pid=16, OS id=8598
    Wed Jun 13 11:04:31 2012
    starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
    MMNL started with pid=17, OS id=8600
    Wed Jun 13 11:04:31 2012
    starting up 1 shared server(s) ...
    Wed Jun 13 11:04:31 2012
    lmon registered with NM - instance id 2 (internal mem no 1)
    Wed Jun 13 11:04:31 2012
    Reconfiguration started (old inc 0, new inc 2)
    List of nodes:
    1
    Global Resource Directory frozen
    * allocate domain 0, invalid = TRUE
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Jun 13 11:04:31 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Wed Jun 13 11:04:31 2012
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Post SMON to start 1st pass IR
    Wed Jun 13 11:04:31 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:04:31 2012
    LMS 1: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:04:31 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    Reconfiguration complete
    LCK0 started with pid=20, OS id=8877
    Wed Jun 13 11:04:43 2012
    alter database mount
    Wed Jun 13 11:04:43 2012
    This instance was first to mount
    Wed Jun 13 11:04:43 2012
    Starting background process ASMB
    ASMB started with pid=25, OS id=10068
    Starting background process RBAL
    RBAL started with pid=26, OS id=10072
    Wed Jun 13 11:04:47 2012
    SUCCESS: diskgroup DATA was mounted
    Wed Jun 13 11:04:51 2012
    Setting recovery target incarnation to 1
    Wed Jun 13 11:04:52 2012
    Successful mount of redo thread 2, with mount id 3005749259
    Wed Jun 13 11:04:52 2012
    Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
    Completed: alter database mount
    Wed Jun 13 11:05:06 2012
    alter database open
    Wed Jun 13 11:05:06 2012
    This instance was first to open
    Wed Jun 13 11:05:06 2012
    Beginning crash recovery of 1 threads
    parallel recovery started with 2 processes
    Wed Jun 13 11:05:07 2012
    Started redo scan
    Wed Jun 13 11:05:07 2012
    Completed redo scan
    61 redo blocks read, 4 data blocks need recovery
    Wed Jun 13 11:05:07 2012
    Started redo application at
    Thread 1: logseq 7924, block 3, scn 506098125
    Wed Jun 13 11:05:07 2012
    Recovery of Online Redo Log: Thread 1 Group 2 Seq 7924 Reading mem 0
    Mem# 0: +DATA/osista/onlinelog/group_2.372.742132543
    Wed Jun 13 11:05:07 2012
    Completed redo application
    Wed Jun 13 11:05:07 2012
    Completed crash recovery at
    Thread 1: logseq 7924, block 64, scn 506118186
    4 data blocks read, 4 data blocks written, 61 redo blocks read
    Switch log for thread 1 to sequence 7925
    Picked broadcast on commit scheme to generate SCNs
    db_recovery_file_dest_size of 65368 MB is 0.61% used. This is a
    user-specified limit on the amount of space that will be used by this
    database for recovery-related files, and does not reflect the amount of
    space available in the underlying filesystem or ASM diskgroup.
    SUCCESS: diskgroup FLASH was mounted
    SUCCESS: diskgroup FLASH was dismounted
    Thread 1 advanced to log sequence 7926
    SUCCESS: diskgroup FLASH was mounted
    SUCCESS: diskgroup FLASH was dismounted
    Thread 1 advanced to log sequence 7927
    Wed Jun 13 11:05:11 2012
    LGWR: STARTING ARCH PROCESSES
    ARC0 started with pid=31, OS id=12747
    Wed Jun 13 11:05:11 2012
    ARC0: Archival started
    ARC1: Archival started
    LGWR: STARTING ARCH PROCESSES COMPLETE
    ARC1 started with pid=32, OS id=12749
    Wed Jun 13 11:05:12 2012
    Thread 2 opened at log sequence 7176
    Current log# 4 seq# 7176 mem# 0: +DATA/osista/onlinelog/group_4.289.742134597
    Wed Jun 13 11:05:12 2012
    ARC1: Becoming the 'no FAL' ARCH
    ARC1: Becoming the 'no SRL' ARCH
    Wed Jun 13 11:05:12 2012
    Successful open of redo thread 2
    Wed Jun 13 11:05:12 2012
    MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
    Wed Jun 13 11:05:12 2012
    ARC0: Becoming the heartbeat ARCH
    Wed Jun 13 11:05:12 2012
    SMON: enabling cache recovery
    Wed Jun 13 11:05:15 2012
    Successfully onlined Undo Tablespace 20.
    Wed Jun 13 11:05:15 2012
    SMON: enabling tx recovery
    Wed Jun 13 11:05:15 2012
    Database Characterset is AL32UTF8
    Wed Jun 13 11:05:16 2012
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_9174.trc:
    ORA-00600: internal error code, arguments: [kokiasg1], [], [], [], [], [], [], []
    Wed Jun 13 11:05:16 2012
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_9174.trc:
    ORA-00600: internal error code, arguments: [kokiasg1], [], [], [], [], [], [], []
    Error 600 happened during db open, shutting down database
    USER: terminating instance due to error 600
    Instance terminated by USER, pid = 9174
    ORA-1092 signalled during: alter database open...
    Wed Jun 13 11:06:16 2012
    Starting ORACLE instance (normal)
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Interface type 1 eth0 172.16.0.0 configured from OCR for use as a cluster interconnect
    Interface type 1 bond0 132.147.160.0 configured from OCR for use as a public interface
    Picked latch-free SCN scheme 2
    Autotune of undo retention is turned on.
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    ksdpec: called for event 13740 prior to event group initialization
    Starting up ORACLE RDBMS Version: 10.2.0.3.0.
    System parameters with non-default values:
    processes = 300
    sessions = 335
    sga_max_size = 524288000
    __shared_pool_size = 314572800
    __large_pool_size = 4194304
    __java_pool_size = 8388608
    __streams_pool_size = 8388608
    spfile = +DATA/osista/spfileosista.ora
    nls_language = FRENCH
    nls_territory = FRANCE
    nls_length_semantics = CHAR
    sga_target = 524288000
    control_files = DATA/osista/controlfile/control01.ctl, DATA/osista/controlfile/control02.ctl
    db_block_size = 8192
    __db_cache_size = 180355072
    compatible = 10.2.0.3.0
    log_archive_dest_1 = LOCATION=USE_DB_RECOVERY_FILE_DEST
    db_file_multiblock_read_count= 16
    cluster_database = TRUE
    cluster_database_instances= 2
    db_create_file_dest = +DATA
    db_recovery_file_dest = +FLASH
    db_recovery_file_dest_size= 68543315968
    thread = 2
    instance_number = 2
    undo_management = AUTO
    undo_tablespace = UNDOTBS2
    undo_retention = 29880
    remote_login_passwordfile= EXCLUSIVE
    db_domain =
    dispatchers = (PROTOCOL=TCP) (SERVICE=OSISTAXDB)
    local_listener = (address=(protocol=tcp)(port=1521)(host=132.147.160.243))
    remote_listener = LISTENERS_OSISTA
    job_queue_processes = 10
    background_dump_dest = /oracle/product/admin/OSISTA/bdump
    user_dump_dest = /oracle/product/admin/OSISTA/udump
    core_dump_dest = /oracle/product/admin/OSISTA/cdump
    audit_file_dest = /oracle/product/admin/OSISTA/adump
    db_name = OSISTA
    open_cursors = 300
    pga_aggregate_target = 104857600
    aq_tm_processes = 1
    Cluster communication is configured to use the following interface(s) for this instance
    172.16.0.2
    Wed Jun 13 11:06:16 2012
    cluster interconnect IPC version:Oracle UDP/IP (generic)
    IPC Vendor 1 proto 2
    PMON started with pid=2, OS id=18682
    DIAG started with pid=3, OS id=18684
    PSP0 started with pid=4, OS id=18695
    LMON started with pid=5, OS id=18704
    LMD0 started with pid=6, OS id=18721
    LMS0 started with pid=7, OS id=18735
    LMS1 started with pid=8, OS id=18753
    MMAN started with pid=9, OS id=18767
    DBW0 started with pid=10, OS id=18788
    LGWR started with pid=11, OS id=18796
    CKPT started with pid=12, OS id=18799
    SMON started with pid=13, OS id=18801
    RECO started with pid=14, OS id=18803
    CJQ0 started with pid=15, OS id=18805
    MMON started with pid=16, OS id=18807
    Wed Jun 13 11:06:17 2012
    starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
    MMNL started with pid=17, OS id=18809
    Wed Jun 13 11:06:17 2012
    starting up 1 shared server(s) ...
    Wed Jun 13 11:06:17 2012
    lmon registered with NM - instance id 2 (internal mem no 1)
    Wed Jun 13 11:06:17 2012
    Reconfiguration started (old inc 0, new inc 2)
    List of nodes:
    1
    Global Resource Directory frozen
    * allocate domain 0, invalid = TRUE
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Jun 13 11:06:18 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Wed Jun 13 11:06:18 2012
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Post SMON to start 1st pass IR
    Wed Jun 13 11:06:18 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:06:18 2012
    LMS 1: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:06:18 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    Reconfiguration complete
    LCK0 started with pid=20, OS id=18816
    Wed Jun 13 11:06:18 2012
    ALTER DATABASE MOUNT
    Wed Jun 13 11:06:18 2012
    This instance was first to mount
    Wed Jun 13 11:06:18 2012
    Reconfiguration started (old inc 2, new inc 4)
    List of nodes:
    0 1
    Wed Jun 13 11:06:18 2012
    Starting background process ASMB
    Wed Jun 13 11:06:18 2012
    Global Resource Directory frozen
    Communication channels reestablished
    ASMB started with pid=22, OS id=18913
    Starting background process RBAL
    * domain 0 valid = 0 according to instance 0
    Wed Jun 13 11:06:18 2012
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Jun 13 11:06:18 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Wed Jun 13 11:06:18 2012
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Wed Jun 13 11:06:18 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:06:18 2012
    LMS 1: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:06:18 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    RBAL started with pid=23, OS id=18917
    Reconfiguration complete
    Wed Jun 13 11:06:22 2012
    SUCCESS: diskgroup DATA was mounted
    Wed Jun 13 11:06:26 2012
    Setting recovery target incarnation to 1
    Wed Jun 13 11:06:26 2012
    Successful mount of redo thread 2, with mount id 3005703530
    Wed Jun 13 11:06:26 2012
    Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
    Completed: ALTER DATABASE MOUNT
    Wed Jun 13 11:06:27 2012
    ALTER DATABASE OPEN
    This instance was first to open
    Wed Jun 13 11:06:27 2012
    Beginning crash recovery of 1 threads
    parallel recovery started with 2 processes
    Wed Jun 13 11:06:27 2012
    Started redo scan
    Wed Jun 13 11:06:27 2012
    Completed redo scan
    61 redo blocks read, 4 data blocks need recovery
    Wed Jun 13 11:06:28 2012
    Started redo application at
    Thread 2: logseq 7176, block 3
    Wed Jun 13 11:06:28 2012
    Recovery of Online Redo Log: Thread 2 Group 4 Seq 7176 Reading mem 0
    Mem# 0: +DATA/osista/onlinelog/group_4.289.742134597
    Wed Jun 13 11:06:28 2012
    Completed redo application
    Wed Jun 13 11:06:28 2012
    Completed crash recovery at
    Thread 2: logseq 7176, block 64, scn 506138248
    4 data blocks read, 4 data blocks written, 61 redo blocks read
    Picked broadcast on commit scheme to generate SCNs
    Wed Jun 13 11:06:28 2012
    LGWR: STARTING ARCH PROCESSES
    ARC0 started with pid=28, OS id=19692
    Wed Jun 13 11:06:28 2012
    ARC0: Archival started
    ARC1: Archival started
    LGWR: STARTING ARCH PROCESSES COMPLETE
    ARC1 started with pid=29, OS id=19695
    Wed Jun 13 11:06:28 2012
    Thread 2 advanced to log sequence 7177
    Thread 2 opened at log sequence 7177
    Current log# 3 seq# 7177 mem# 0: +DATA/osista/onlinelog/group_3.291.742134597
    Successful open of redo thread 2
    Wed Jun 13 11:06:28 2012
    MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
    Wed Jun 13 11:06:28 2012
    ARC0: Becoming the 'no FAL' ARCH
    ARC0: Becoming the 'no SRL' ARCH
    Wed Jun 13 11:06:28 2012
    ARC1: Becoming the heartbeat ARCH
    Wed Jun 13 11:06:28 2012
    SMON: enabling cache recovery
    Wed Jun 13 11:06:28 2012
    db_recovery_file_dest_size of 65368 MB is 0.61% used. This is a
    user-specified limit on the amount of space that will be used by this
    database for recovery-related files, and does not reflect the amount of
    space available in the underlying filesystem or ASM diskgroup.
    SUCCESS: diskgroup FLASH was mounted
    SUCCESS: diskgroup FLASH was dismounted
    Wed Jun 13 11:06:31 2012
    Successfully onlined Undo Tablespace 20.
    Wed Jun 13 11:06:31 2012
    SMON: enabling tx recovery
    Wed Jun 13 11:06:31 2012
    Database Characterset is AL32UTF8
    Wed Jun 13 11:06:31 2012
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_19596.trc:
    ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
    Wed Jun 13 11:06:32 2012
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_19596.trc:
    ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
    Error 600 happened during db open, shutting down database
    USER: terminating instance due to error 600
    Instance terminated by USER, pid = 19596
    ORA-1092 signalled during: ALTER DATABASE OPEN...
    Wed Jun 13 11:11:35 2012
    Starting ORACLE instance (normal)
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Interface type 1 eth0 172.16.0.0 configured from OCR for use as a cluster interconnect
    Interface type 1 bond0 132.147.160.0 configured from OCR for use as a public interface
    Picked latch-free SCN scheme 2
    Autotune of undo retention is turned on.
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    ksdpec: called for event 13740 prior to event group initialization
    Starting up ORACLE RDBMS Version: 10.2.0.3.0.
    System parameters with non-default values:
    processes = 300
    sessions = 335
    sga_max_size = 524288000
    __shared_pool_size = 318767104
    __large_pool_size = 4194304
    __java_pool_size = 8388608
    __streams_pool_size = 8388608
    spfile = +DATA/osista/spfileosista.ora
    nls_language = FRENCH
    nls_territory = FRANCE
    nls_length_semantics = CHAR
    sga_target = 524288000
    control_files = DATA/osista/controlfile/control01.ctl, DATA/osista/controlfile/control02.ctl
    db_block_size = 8192
    __db_cache_size = 176160768
    compatible = 10.2.0.3.0
    log_archive_dest_1 = LOCATION=USE_DB_RECOVERY_FILE_DEST
    db_file_multiblock_read_count= 16
    cluster_database = TRUE
    cluster_database_instances= 2
    db_create_file_dest = +DATA
    db_recovery_file_dest = +FLASH
    db_recovery_file_dest_size= 68543315968
    thread = 2
    instance_number = 2
    undo_management = AUTO
    undo_tablespace = UNDOTBS2
    undo_retention = 29880
    remote_login_passwordfile= EXCLUSIVE
    db_domain =
    dispatchers = (PROTOCOL=TCP) (SERVICE=OSISTAXDB)
    local_listener = (address=(protocol=tcp)(port=1521)(host=132.147.160.243))
    remote_listener = LISTENERS_OSISTA
    job_queue_processes = 10
    background_dump_dest = /oracle/product/admin/OSISTA/bdump
    user_dump_dest = /oracle/product/admin/OSISTA/udump
    core_dump_dest = /oracle/product/admin/OSISTA/cdump
    audit_file_dest = /oracle/product/admin/OSISTA/adump
    db_name = OSISTA
    open_cursors = 300
    pga_aggregate_target = 104857600
    aq_tm_processes = 1
    Cluster communication is configured to use the following interface(s) for this instance
    172.16.0.2
    Wed Jun 13 11:11:35 2012
    cluster interconnect IPC version:Oracle UDP/IP (generic)
    IPC Vendor 1 proto 2
    PMON started with pid=2, OS id=16101
    DIAG started with pid=3, OS id=16103
    PSP0 started with pid=4, OS id=16105
    LMON started with pid=5, OS id=16107
    LMD0 started with pid=6, OS id=16110
    LMS0 started with pid=7, OS id=16112
    LMS1 started with pid=8, OS id=16116
    MMAN started with pid=9, OS id=16120
    DBW0 started with pid=10, OS id=16132
    LGWR started with pid=11, OS id=16148
    CKPT started with pid=12, OS id=16169
    SMON started with pid=13, OS id=16185
    RECO started with pid=14, OS id=16203
    CJQ0 started with pid=15, OS id=16219
    MMON started with pid=16, OS id=16227
    Wed Jun 13 11:11:36 2012
    starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
    MMNL started with pid=17, OS id=16229
    Wed Jun 13 11:11:36 2012
    starting up 1 shared server(s) ...
    Wed Jun 13 11:11:36 2012
    lmon registered with NM - instance id 2 (internal mem no 1)
    Wed Jun 13 11:11:36 2012
    Reconfiguration started (old inc 0, new inc 2)
    List of nodes:
    1
    Global Resource Directory frozen
    * allocate domain 0, invalid = TRUE
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Jun 13 11:11:36 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Wed Jun 13 11:11:36 2012
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Post SMON to start 1st pass IR
    Wed Jun 13 11:11:36 2012
    LMS 1: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:11:36 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:11:36 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    Reconfiguration complete
    LCK0 started with pid=20, OS id=16235
    Wed Jun 13 11:11:37 2012
    ALTER DATABASE MOUNT
    Wed Jun 13 11:11:37 2012
    This instance was first to mount
    Wed Jun 13 11:11:37 2012
    Starting background process ASMB
    ASMB started with pid=22, OS id=16343
    Starting background process RBAL
    RBAL started with pid=23, OS id=16347
    Wed Jun 13 11:11:44 2012
    SUCCESS: diskgroup DATA was mounted
    Wed Jun 13 11:11:49 2012
    Setting recovery target incarnation to 1
    Wed Jun 13 11:11:49 2012
    Successful mount of redo thread 2, with mount id 3005745065
    Wed Jun 13 11:11:49 2012
    Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
    Completed: ALTER DATABASE MOUNT
    Wed Jun 13 11:22:25 2012
    alter database open
    This instance was first to open
    Wed Jun 13 11:22:26 2012
    Beginning crash recovery of 1 threads
    parallel recovery started with 2 processes
    Wed Jun 13 11:22:26 2012
    Started redo scan
    Wed Jun 13 11:22:26 2012
    Completed redo scan
    61 redo blocks read, 4 data blocks need recovery
    Wed Jun 13 11:22:26 2012
    Started redo application at
    Thread 1: logseq 7927, block 3
    Wed Jun 13 11:22:26 2012
    Recovery of Online Redo Log: Thread 1 Group 1 Seq 7927 Reading mem 0
    Mem# 0: +DATA/osista/onlinelog/group_1.283.742132543
    Wed Jun 13 11:22:26 2012
    Completed redo application
    Wed Jun 13 11:22:26 2012
    Completed crash recovery at
    Thread 1: logseq 7927, block 64, scn 506178382
    4 data blocks read, 4 data blocks written, 61 redo blocks read
    Switch log for thread 1 to sequence 7928
    Picked broadcast on commit scheme to generate SCNs
    Wed Jun 13 11:22:27 2012
    LGWR: STARTING ARCH PROCESSES
    ARC0 started with pid=31, OS id=13010
    Wed Jun 13 11:22:27 2012
    ARC0: Archival started
    ARC1: Archival started
    LGWR: STARTING ARCH PROCESSES COMPLETE
    ARC1 started with pid=32, OS id=13033
    Wed Jun 13 11:22:27 2012
    Thread 2 opened at log sequence 7178
    Current log# 4 seq# 7178 mem# 0: +DATA/osista/onlinelog/group_4.289.742134597
    Successful open of redo thread 2
    Wed Jun 13 11:22:27 2012
    MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
    Wed Jun 13 11:22:27 2012
    ARC0: Becoming the 'no FAL' ARCH
    ARC0: Becoming the 'no SRL' ARCH
    Wed Jun 13 11:22:27 2012
    ARC1: Becoming the heartbeat ARCH
    Wed Jun 13 11:22:27 2012
    SMON: enabling cache recovery
    Wed Jun 13 11:22:30 2012
    db_recovery_file_dest_size of 65368 MB is 0.61% used. This is a
    user-specified limit on the amount of space that will be used by this
    database for recovery-related files, and does not reflect the amount of
    space available in the underlying filesystem or ASM diskgroup.
    SUCCESS: diskgroup FLASH was mounted
    SUCCESS: diskgroup FLASH was dismounted
    Wed Jun 13 11:22:31 2012
    Successfully onlined Undo Tablespace 20.
    Wed Jun 13 11:22:31 2012
    SMON: enabling tx recovery
    Wed Jun 13 11:22:32 2012
    Database Characterset is AL32UTF8
    Wed Jun 13 11:22:32 2012
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_11751.trc:
    ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
    Wed Jun 13 11:22:33 2012
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_11751.trc:
    ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
    Error 600 happened during db open, shutting down database
    USER: terminating instance due to error 600
    Instance terminated by USER, pid = 11751
    ORA-1092 signalled during: alter database open...
    regards,

    Hi;
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_9174.trc:
    Did you check the trc file?
    ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
    You are getting an Oracle internal error (ORA-600), which means you may need to work with the Oracle Support team. Please see the note below; if it does not help, I suggest logging an SR:
    Troubleshoot an ORA-600 or ORA-7445 Error Using the Error Lookup Tool [ID 153788.1]
    For future RAC issues, please use Forum Home » High Availability » RAC, ASM & Clusterware Installation, which is the dedicated RAC forum.
    Regards,
    Helios
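
    Before opening an SR it can help to pull every ORA-00600 entry and its trace file out of the alert log. The following is only a rough sketch in Python, assuming the 10g-style alert log layout shown in the post above; the function name find_ora600 and the regular expressions are illustrative, not part of any Oracle tool.

```python
import re

# Matches alert-log lines such as "Errors in file /path/to/file.trc:"
TRACE_RE = re.compile(r"Errors in file (\S+\.trc):")
# Captures the first bracketed argument of an ORA-00600 line,
# whatever language the message text is in
ORA600_RE = re.compile(r"ORA-00600:.*?\[([^\]]*)\]")


def find_ora600(alert_lines):
    """Return distinct (trace_file, first_argument) pairs for ORA-00600 entries."""
    found = []
    trace = None
    for line in alert_lines:
        line = line.strip()
        m = TRACE_RE.search(line)
        if m:
            trace = m.group(1)  # remember the trace file for the next error line
            continue
        m = ORA600_RE.search(line)
        if m:
            pair = (trace, m.group(1))
            if pair not in found:
                found.append(pair)
    return found


sample = """\
Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_19596.trc:
ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_19596.trc:
ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
""".splitlines()

for trace_file, arg in find_ora600(sample):
    print(trace_file, arg)
```

    The first argument ([kokiasg1] here) is the value the ORA-600 lookup tool asks for, and the trace file is what Support will normally want attached to the SR.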

  • Restoring Back To 10g After A Failed 11g Upgrade Of a RAC Database

    I'm testing this out on a small test RAC database. I successfully upgraded it from 10.2.0.4 to 11.2.0.2 but wanted to test the scenario of having to go back to 10g if the upgrade really hosed up. The first recovery attempt seemed to be successful but after bringing the DB down with srvctl, it failed on the next startup saying it needed to be started in upgrade mode. Something from 11g was still in place or the fact that I was trying to restart a 10g database managed by 11g clusterware was the issue. I tried starting the DB from both 10g and 11g environments and got the same result. Even starting each instance individually got the same result.
    In everything I tried, I got the usual incarnation and "until time before reset time" messages. I've been doing this all through RMAN, without EM or Grid Control. As usual, the docs I found each had just a little information, and I had to piece my own instructions together from all of them, not knowing for sure whether all the steps would apply in my situation.
    Can anybody point me to a good doc or other resource that might help me out? Many thanks!

    Now I have a different issue with apparently the same problem. I successfully did the restore/recover as before but the thing pukes when I open resetlogs at the end. Something, somewhere is still pointing to 11g but I have no idea where or what has changed since the last time I did this. Maybe I've messed things up by doing this multiple times. Here are my RMAN commands:
    RMAN> connect target
    RMAN> startup force nomount;
    RMAN> RESTORE SPFILE TO '+DATA/jimg/spfilejimg.ora' from '/local/oracle/10.2.0/db_1/dbs/c-2526333028-20110915-01';
    RMAN> shutdown immediate;
    (in a different session, from command line)
    % mv /local/oracle/10.2.0/db_1/dbs/initJIMG1.ora.bak /local/oracle/10.2.0/db_1/dbs/initJIMG1.ora
    RMAN> startup force nomount pfile='/local/oracle/10.2.0/db_1/dbs/initJIMG1.ora';
    RMAN> restore controlfile from '/local/oracle/10.2.0/db_1/dbs/c-2526333028-20110915-01';
    RMAN> alter database mount;
    RMAN> run {
    restore database;
    recover database;
    }
    RMAN> alter database open resetlogs;
    The errors I get after resetlogs are:
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure of alter db command at 09/16/2011 09:24:50
    ORA-01092: ORACLE instance terminated. Disconnection forced
    ORA-00704: bootstrap process failure
    ORA-39700: database must be opened with UPGRADE option
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    ORA-03114: not connected to ORACLE
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure of alter db command at 09/16/2011 09:24:50
    ORA-01092: ORACLE instance terminated. Disconnection forced
    ORA-00704: bootstrap process failure
    ORA-39700: database must be opened with UPGRADE option
    This is being done on a small, expendable database for practice. I think I've learned enough about what NOT to do to keep me from getting in this jam to begin with. Any thoughts?
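
    RMAN repeats the same stack several times, so it can be easier to reduce the output to the distinct error codes before searching for them. This is a minimal sketch, assuming output shaped like the stack above; distinct_errors is a hypothetical helper name, not an RMAN feature.

```python
import re

# An RMAN-nnnnn or ORA-nnnnn code followed by its message text
CODE_RE = re.compile(r"^\s*((?:ORA|RMAN)-\d{5}):\s*(.*)$")


def distinct_errors(output):
    """Return (code, message) pairs in first-seen order, skipping banner lines."""
    seen = {}
    for line in output.splitlines():
        m = CODE_RE.match(line)
        if not m:
            continue
        code, text = m.groups()
        if text.startswith("="):
            continue  # "=====" banner lines carry no diagnostic information
        seen.setdefault(code, text)  # keep only the first occurrence
    return list(seen.items())


stack = """\
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of alter db command at 09/16/2011 09:24:50
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-39700: database must be opened with UPGRADE option
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
ORA-03114: not connected to ORACLE
"""

for code, text in distinct_errors(stack):
    print(code, text)
```

    In this stack the most specific entry is ORA-39700, which generally points to a mismatch between the data dictionary version and the Oracle binaries used to open the database.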

  • Will database Auto-Start?

    I have two databases running on a Windows server. Due to some other issue, the server got rebooted. When I went to start the DBs, I found that one of my databases was already up and running, while the other was down and I had to start it manually.
    Is it possible to have a DB auto-start on Windows? If so, how do I do it?

    user011232 wrote:
    Hi,
    Install Oracle Grid Infrastructure for a Standalone Server and use the Oracle Restart feature; this is best
    Define "Best".
    I'd consider installing GI for the sole purpose of getting Oracle Restart to be a case of severe over-engineering.
    Besides, a server shutdown should be considered "A Big Deal" (tm)*.  Personally, I don't consider having a database auto-start after a server shutdown to be desirable.  I want to check out the server itself and confirm that things are working correctly after whatever caused the shutdown.  Only after the server itself is checked out do I want to try to bring up a database.
    * Yes, I know that with Windows, a weekly reboot is often considered "normal operations".  One more reason I consider Windows to be a poor excuse of an OS.

  • Possible to start a rac database from another standard oracle kernel ?

    With oracle RAC 10g R2 :
    If all nodes of an Oracle RAC platform go down, is it possible to restart the database (with the existing datafiles) from another standard Oracle 10g R2 kernel?
    Thanks for help : )

    That is an interesting question. It should be possible. You will definitely need a copy of your init.ora file. As long as the database files (control files, archive logs and redo logs) are on shared storage, you should be able to start the database. Of course, you will not have all the clustering features. I think it would be safe to disable the cluster database mode by setting the parameter CLUSTER_DATABASE = 'FALSE'.
    Hmm... what if you are using ASM? It could be a bit more tricky in that case, depending on the O/S and the other ASM components used, for example ASMLib. I think it should be possible, though I have never tried it.
    You may need to turn off the CLUSTER_DATABASE parameter on ASM as well.
    You will need to create an ASM instance and get the diskgroups mounted manually. If you plan to try it, do post your findings for the benefit of other readers.
    Message was edited by:
    Murali Vallath
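
    The CLUSTER_DATABASE change suggested above could be applied to a copy of the init.ora text with something like the following. It is only a sketch: the helper name to_single_instance is made up for illustration, and it touches only the one parameter the reply mentions, leaving thread, instance_number, undo_tablespace and the rest for you to review by hand.

```python
def to_single_instance(pfile_text):
    """Return pfile text with cluster_database forced to FALSE.

    Every other line is passed through unchanged.
    """
    out = []
    for line in pfile_text.splitlines():
        # Parameter name is everything before the first "="
        name = line.split("=", 1)[0].strip().lower()
        # Drop an optional "*." or "SID." prefix used in RAC parameter files
        name = name.split(".")[-1]
        if name == "cluster_database":
            out.append("cluster_database = FALSE")
        else:
            out.append(line)
    return "\n".join(out)


sample = """\
cluster_database = TRUE
cluster_database_instances= 2
thread = 2
undo_tablespace = UNDOTBS2"""

print(to_single_instance(sample))
```

    Working on a copy of the parameter file keeps the original RAC configuration intact for when the cluster nodes come back.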

  • Metalink Document IDs for RAC database

    Hello,
    This is Khalid. Could anybody please give me Metalink document IDs for managing and monitoring RAC databases, backup and recovery, etc.? Any document ID related to RAC databases would help.
    Thanks & Regards
    Khalid

    Clusterware References
    Metalink Notes
    259301.1      CRS and 10g RAC
         This note contains a useful awk script to improve the output of crs_stat -ls
    436067.1      Windows CRS_STAT script to display long names correctly
    309541.1      How to start/stop the 10g CRS Clusterware
    263897.1      How to stop Cluster Ready Services (CRS)
    298073.1      How to remove CRS auto start and restart for a RAC instance
    295871.1      How to verify if CRS install is valid
    316583.1      VIPCA fails complaining that interface is not public
    341214.1      How to cleanup after a failed (or successful) Oracle Clusterware installation
    280589.1      How to install Oracle 10g CRS on a cluster where one or more nodes are not to be configured with CRS immediately
    357808.1      CRS Diagnostics
    272331.1      CRS 10g Diagnostic Guide
    330358.1      CRS 10g R2 Diagnostic Collection Guide
    331168.1      Oracle Clusterware consolidated logging in 10gR2
    342590.1      CRS logs not being written
    357808.1      Diagnosability for CRS/EVM/RACG
    459694.1      Procwatcher: Script to Monitor and Examine Oracle and CRS Processes
    289690.1      Data Gathering for Troubleshooting RAC and CRS issues
    265769.1      Troubleshooting CRS Reboots
    240001.1      Troubleshooting CRS root.sh problems (10g RAC)
    239989.1      10g RAC - Stopping Reboot Loops when CRS problems occur
    294430.1      CSS Timeout Computation in 10g RAC
    284752.1      10gRAC: Steps to Increase CSS Misscount, Reboottime and Disktimeout
    462616.1      Reconfiguring the CSS disktimeout of 10gR2 Clusterware for proper LUN failover
    293819.1      Placement of voting and OCR disk file in 10g RAC
    317628.1      How to replace a corrupt OCR mirror file
    452486.1      Moving OCR and Voting Disk to another location
    399482.1      How to recreate OCR/Voting disk accidentally deleted
    358620.1      How to recreate OCR/Voting disk in 10gR1/R2 RAC
    279793.1      How to Restore a Lost Voting Disk in 10g
    264847.1      How to Configure Virtual IPs for 10g RAC
    283684.1      How to change interconnect/public interface IP subnet in a 10g cluster
    276434.1      Modifying the VIP or VIP Hostname of an Oracle 10g Clusterware Node
    294336.1      Changing the check interval for the Oracle 10g VIP
    219361.1      Troubleshooting Instance Evictions (ORA-29740)
    297498.1      Resolving Instance Evictions on Windows platforms
    315125.1      What to check if the Cluster Synchronization Services daemon (OCSSD) does not start
    270512.1      Adding a node to a 10g RAC Cluster
    269320.1      Removing a node from a 10g RAC Cluster
    338706.1      Cluster Ready Services (CRS) rolling upgrade
    399031.1      Step-by-step installation of Oracle Clusterware one-off and bundle patches for Oracle 10g
    401783.1      Changes in Oracle Clusterware after applying 10.2.0.3 Patchset
    405820.1      Known Issues After Applying 10.2 CRS bundle patches
    316817.1      Cluster Verification Utility (CLUVFY) FAQ
    372358.1      Shared disk check with the Cluster Verification Utility
    338924.1      CLUVFY Fails with error - could not find a suitable set of interfaces for VIPs
    Bugs
    5849200      CRS LOGS ARE NOT BEING WRITTEN
    5137401      OPROCD LOGFILE IS CLEARED AFTER A REBOOT
         Fixed in Oracle 10.2.0.4+ and 11.1.0.6+
    source: http://www.juliandyke.com/References/Clusterware.html
    regards,
    Rajeshkumar Govindarajan.
    http://oracleinstance.blogspot.com

  • Help needed in deleting nodes from RAC database

    Our DB is a 10g RAC database and the servers are Windows 2003 servers. Initially the database was configured with 4 nodes. For some reason we stopped the instances on the first two nodes, and the database currently runs on node3 and node4. I queried the v$thread view and it shows 3 records: thread 2 is closed and disabled, thread 3 is open and public, and thread 4 is open and private. Now we need to disconnect nodes 1 and 2 from the RAC DB and from the clusterware (we use Oracle Clusterware), and we plan to use those two servers for other purposes. I read through the Oracle doc on deleting a node from RAC, “11 Adding and Deleting Nodes and Instances on Windows-Based Systems”, and wrote down the steps we need to take, but I am still not comfortable doing it, since this is a production environment and we don't have a dev environment in which to practice those steps. So I would like to borrow your experience and the lessons you learned while doing this. Please share your thoughts and insights with us.
    Thank you so much for your help,
    Shirley

    What's your full version? I can warn about specific issues in 10.1.0.3 and 10.2.0.1. For example, in some cases (depending on how the listener was configured), it will be impossible to delete the listener for the deleted node. This is a known bug.
    To avoid many, many known bugs, you may want to upgrade to at least 10.2.0.2 before removing a node (in my experience, this is the first stable RAC version).
    In any case, deleting a node is a rather delicate process. I really, really recommend practicing. Take any PC or laptop, install VMware, define two virtual machines, install RAC and remove one node. It will take you an extra day or two, and could save your production.

  • How to drop oracle 10.2.0.4 rac database on RHEL5

    Hi,
    I have an Oracle 10.2.0.4 RAC database with an ASM configuration on RHEL5. Every week/day I take level 0 / level 1 backups to disk using RMAN. Now I would like to test those backups. For that I need to drop the existing database and then restore and recover it from those backups. My database is configured with ASM, which has two disk groups:
    DATA for storing data
    FRA for archive logs
    How can I drop the RAC 10.2.0.4 database and then restore and recover it from the RMAN backups with ASM? Could anyone send clean steps for this test?
    Thanks
    Vamshi
    Edited by: user12052260 on Jun 17, 2011 10:41 AM

    I asked about instances. Before performing your steps, is it mandatory to bring down all the Oracle instances? Before dropping this database, how do I deregister these services from Clusterware? After recovering the database, how do I register the database services with Clusterware?
    If you want to register or deregister, you need to use SRVCTL in Clusterware. For the commands to add/remove resources from Clusterware, refer to the note below.
    http://download.oracle.com/docs/cd/E14072_01/server.112/e10595/restart005.htm
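
    As a rough illustration of the SRVCTL reply above, the sketch below only builds the command strings for removing and re-adding a database and its instances; it runs nothing. The database name, Oracle home and instance-to-node mapping are hypothetical placeholders, and the exact srvctl options should be checked against the documentation for your release.

```python
def deregister_commands(db_name):
    """Commands to stop a database and remove it from the cluster registry."""
    return [
        f"srvctl stop database -d {db_name}",
        f"srvctl remove database -d {db_name}",
    ]


def register_commands(db_name, oracle_home, instances):
    """Commands to re-add a database; instances maps instance name -> node."""
    cmds = [f"srvctl add database -d {db_name} -o {oracle_home}"]
    for inst, node in instances.items():
        cmds.append(f"srvctl add instance -d {db_name} -i {inst} -n {node}")
    return cmds


# Hypothetical values, for illustration only
db = "testdb"
home = "/u01/app/oracle/product/10.2.0/db_1"
insts = {"testdb1": "node1", "testdb2": "node2"}

for cmd in deregister_commands(db) + register_commands(db, home, insts):
    print(cmd)
```

    Printing the commands first makes it easy to review them before running anything against the cluster.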

  • Upgrade of RAC database in 10g or 11g

    Hi,
    Can we upgrade a RAC database without downtime, as we do when patching?
    Regards

    user602441 wrote:
    Hi,
    Can we upgrade a RAC database without downtime, as we do when patching?
    Regards
    Hi,
    The Oracle Clusterware software always fully supports rolling upgrades, while the ASM software is rolling upgradeable at version 11.1.0.6 and beyond.
    By rolling upgrade we mean upgrading software (Oracle Database, Oracle Clusterware, ASM or the OS itself) while the cluster is operational: shut down a node, upgrade the software on that node, reintegrate it into the cluster, and so forth, one node at a time, until all the nodes in the cluster are at the new software level.
    For the Oracle Database software (RAC), this is possible only for certain single patches that are marked as rolling upgrade compatible. Most bundle patches and Critical Patch Updates (CPU) are rolling upgradeable. Patchsets and DB version changes (10g to 11g) are not supported in a rolling fashion. One reason this may be impossible is that, across major releases, there may be incompatible versions of the system tablespace, for example. To upgrade these in a rolling fashion, one will need to use a logical standby with Oracle Database 10g or 11g.
    Regards,
    Levi Pereira
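
    The node-at-a-time order described above can be modelled in a few lines. This is purely a toy simulation with hypothetical node names, not an upgrade script; it only shows that a rolling sequence never takes more than one node out of service at once.

```python
def rolling_upgrade_steps(nodes):
    """Yield (action, node) pairs that upgrade one node at a time."""
    for node in nodes:
        yield ("stop", node)
        yield ("upgrade", node)
        yield ("rejoin", node)


def min_nodes_in_service(nodes):
    """Replay the steps and return the lowest number of nodes in service."""
    in_service = set(nodes)
    lowest = len(in_service)
    for action, node in rolling_upgrade_steps(nodes):
        if action == "stop":
            in_service.discard(node)
        elif action == "rejoin":
            in_service.add(node)
        lowest = min(lowest, len(in_service))
    return lowest


nodes = ["node1", "node2", "node3", "node4"]  # hypothetical node names
for action, node in rolling_upgrade_steps(nodes):
    print(action, node)
print("minimum nodes in service:", min_nodes_in_service(nodes))
```

    For a 4-node cluster the minimum is 3, which is why the database stays available throughout a rolling change.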

  • Database shut down and file systems are still busy

    We have been copying our production DB over to a reporting DB for the last 3 years. The process we use on the reporting database is: shut the DB down with ABORT, restart it, then shut it down with IMMEDIATE. Then we unmount the file systems, copy production over using a FlexClone on the NetApp, remount the file systems and start the database back up. We used to shut down the listener, which does not share the Oracle binaries on the Unix system, but lately we have left the listener up because other DBs use it. Every so often, after shutting down the reporting database in preparation for the copy, the file system that contains the binaries is busy and we can't unmount the Oracle binaries.
    So my question is: how does the listener know if the database is up or down? Is it reading something on the volumes we're trying to unmount to see if the database is up? We have users who schedule jobs at odd hours, and I was wondering whether their hitting the listener while the database is down is what makes the file systems busy.
    Any help is appreciated.

    Mi**** wrote:
    So my question is how does the listener know if the database is up or down?
    If/when the DB responds, the listener knows it is up; otherwise the listener reports an error to the client.
    Is it reading something on the volumes we're trying to unmount to see if the database is up? We have users that schedule jobs in odd hours and was wondering if them hitting the listener and even though the database is down it's making the file systems busy?
    The listener is NEVER involved in any ongoing packet exchange between the client and the DB.
    The listener takes the original connection request and, if the DB is up, passes the request to the DB.
    After this initial handoff, the listener has NO further involvement in the packet exchange between the DB and the client.

  • How do I find a RAC database in Oracle SR support

    Hi Friends,
    I could not find a product name for Oracle RAC / database in the product box when I try to create an SR.
    I need to select a name from the drop-down list, but I cannot find one for Oracle RAC, Real Application Clusters or RAC database.
    Any suggestions?
    thanks
    Jim

    Oracle Server - Enterprise Edition
    Oracle Server - Personal Edition
    Oracle Server - Standard Edition
    It's there. Look harder!
    RAC is selected in the Problem drop-down.

  • How to start Oracle 10g RAC database and clusterware?

    I have the steps to stop the 10g RAC database and clusterware, but I am not sure about starting them.
    I have heard that executing
    $crsctl start crs --as root
    on each node
    will start the database, ASM and nodeapps. Is that true?
    Or do we have to do it step by step, as we do when stopping the clusterware and database below?
    1. Stop the agent:
    cd to $AGENT_HOME/corpng04.amhc.amhealthways.net/bin, then run: ./emctl stop agent
    2. Stop the full database:
    $ oracle_home/bin/srvctl stop database -d db_name
    3. Stop the ASM instances on node1 and node2:
    $ oracle_home/bin/srvctl stop asm -n node -- I guess you can't give multiple nodes in one command separated by commas; you need to run this once per node with a different node name
    4. Stop the nodeapps (vip, listener, ons and gsd):
    $ oracle_home/bin/srvctl stop nodeapps -n node -- I guess you can't give multiple nodes in one command separated by commas; you need to run this once per node with a different node name
    5. Stop the CRS cluster processes (those bloody 3: evmd, ocssd, crsd):
    $su - root
    $CRS_home/bin/crsctl stop crs

    Paul R @ NL wrote:
    Before shutting down CRS I tend to stop the instances and services via srvctl, then stop CRS via crsctl.
    That is just the way I do it. I'm not saying it's the right way, but it is the one I am comfortable with.
    Good :-) If we stop CRS but forget to shut down the Oracle instances, we'll see a shutdown abort in the alert log file (meaning the instances were shut down with abort).
    We should shut down the instances before stopping CRS anyway.
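
    The stop order discussed in this thread, and the start order as its mirror image, can be sketched as below. This only prints a plan; it executes nothing. The function name plan and the node names are illustrative, and it is an assumption here that starting in exactly the reverse order is appropriate: on many installations starting CRS may already bring up some or all of the registered resources, in which case the later start steps would be no-ops.

```python
def plan(verb, db_name, nodes):
    """Return (host, user, command) steps for verb 'start' or 'stop'.

    Start order is CRS, nodeapps, ASM, database, agent;
    stop order is the exact reverse.
    """
    if verb not in ("start", "stop"):
        raise ValueError("verb must be 'start' or 'stop'")
    steps = []
    steps += [(n, "root", f"crsctl {verb} crs") for n in nodes]
    steps += [("any", "oracle", f"srvctl {verb} nodeapps -n {n}") for n in nodes]
    steps += [("any", "oracle", f"srvctl {verb} asm -n {n}") for n in nodes]
    steps.append(("any", "oracle", f"srvctl {verb} database -d {db_name}"))
    steps.append(("agent host", "oracle", f"emctl {verb} agent"))
    if verb == "stop":
        steps.reverse()
    return steps


nodes = ["node1", "node2"]  # hypothetical node names
for host, user, cmd in plan("stop", "db_name", nodes):
    print(f"[{host}] as {user}: {cmd}")
```

    Calling plan("start", "db_name", nodes) gives the same layers bottom-up, starting with crsctl start crs as root on each node.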
