Parallel recovery on DR on RAC

Hi all,
We have a 2-node DR standby for a 2-node RAC primary. I just wanted to know whether it is possible to do recovery from both standby nodes in parallel, i.e. standby node 1 applies the archives from primary node 1 and standby node 2 applies the archives from primary node 2.
If this is possible, how is it done, and if it is not, why not?
Regards
ID

Hello;
Have you looked at:
D.1.1 Setting Up a Multi-Instance Primary with a Single-Instance Standby
Data Guard Concepts and Administration 11g Release 2 (11.2) E25608-04
If yes, what would parallel recovery give you?
I have never tried this setup, but the main issue I would see is that the second standby instance would not be available
if something happened to the first standby instance.
By using only one instance for apply, it's much simpler to keep track of the redo coming from the two primary instances.
If you don't use the standby as a reader database, it does not have a lot of load on it. Generally, if you check the alert
log, there are several minutes between the "waiting for log..." messages.
Great question!
Best Regards
mseberg
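
For reference, single-instance redo apply on the standby is typically started and checked like this (a minimal sketch using the standard managed-recovery commands, run on the standby's apply instance):

-- Start real-time apply on one standby instance only
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

-- Confirm the single MRP0 process is applying redo arriving from both primary threads
SELECT process, thread#, sequence#, status
FROM   v$managed_standby
WHERE  process IN ('MRP0', 'RFS');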

Similar Messages

  • Should I specify the Parallel parameter for an non-RAC database?

    The Oracle documentation states the following:
    "The Oracle Database 10g Release 2 database controls and balances all parallel operations, based upon available resources, request priorities and actual system load." This suggests that Oracle can optimize the degree of parallelism automatically.
    Should I specify the Parallel parameter for a non-RAC database? Most of the transactions are small OLTP.

    What parallel parameter are you talking about?
    Generally, you may benefit from parallelization in a very similar manner on RAC as on a single-instance system. In both cases it is not sufficient to simply change the value of an initialization parameter to achieve parallelization of queries, DDL or DML.
    Kind regards
    Uwe
    http://uhesse.wordpress.com
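
    To illustrate Uwe's point that parallelism is requested per object or per statement rather than through an init parameter alone, a minimal sketch (table name and degree are placeholders):
    -- Give the table a default degree of parallelism ...
    ALTER TABLE big_fact PARALLEL 4;
    -- ... or request parallelism per statement with a hint
    SELECT /*+ PARALLEL(f, 4) */ COUNT(*)
    FROM   big_fact f;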

  • Number of parallel recovery processes in standby

    Hi,
    How do I find the number of parallel recovery processes that the standby is started with, i.e. MRP parallelism?

    user13179227 wrote:
    Hi,
    How do I find the number of parallel recovery processes that the standby is started with, i.e. MRP parallelism?

    What is "MRP parallel"?
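
    Assuming the question is about how many recovery slave processes managed recovery (MRP) started, a rough sketch of where to look on the standby; the 'PR%' slave-name pattern is an assumption about how parallel media recovery slaves are named, so treat it as illustrative:
    -- Background processes that look like the apply coordinator or its recovery slaves
    SELECT pname, pid, spid
    FROM   v$process
    WHERE  pname LIKE 'MRP%' OR pname LIKE 'PR%';
    -- The apply coordinator itself and its current state
    SELECT process, status, thread#, sequence#
    FROM   v$managed_standby
    WHERE  process LIKE 'MRP%';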

  • Multi-table INSERT with PARALLEL hint on 2 node RAC

    A multi-table INSERT statement with parallelism set to 5 works fine and spawns multiple parallel
    servers to execute it. It's just that it sticks to only one instance of a 2-node RAC. The code I
    used is given below.
    create table t1 ( x int );
    create table t2 ( x int );
    insert /*+ APPEND parallel(t1,5) parallel (t2,5) */
    when (dummy='X') then into t1(x) values (y)
    when (dummy='Y') then into t2(x) values (y)
    select dummy, 1 y from dual;
    I can see multiple sessions using the query below, but only on one instance. This happens not
    only for the above statement but also for statements where real tables (tables with more
    than 20 million records) are used.
    select p.server_name,ps.sid,ps.qcsid,ps.inst_id,ps.qcinst_id,degree,req_degree,
    sql.sql_text
    from Gv$px_process p, Gv$sql sql, Gv$session s , gv$px_session ps
    WHERE p.sid = s.sid
    and p.serial# = s.serial#
    and p.sid = ps.sid
    and p.serial# = ps.serial#
    and s.sql_address = sql.address
    and s.sql_hash_value = sql.hash_value
    and qcsid=945
    Won't parallel servers be spawned across instances for multi-table insert with parallelism on RAC?
    Thanks,
    Mahesh

    Please take a look at these 2 articles below
    http://christianbilien.wordpress.com/2007/09/12/strategies-for-rac-inter-instance-parallelized-queries-part-12/
    http://christianbilien.wordpress.com/2007/09/14/strategies-for-parallelized-queries-across-rac-instances-part-22/
    thanks
    http://swervedba.wordpress.com
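
    Building on the articles above, a quick sketch for checking where the PX slaves of a given coordinator actually ran, and which 10g parameters can restrict slave placement (:qcsid is the coordinator SID, as in the query in the question):
    -- How many PX slaves ran on each instance for this coordinator
    SELECT inst_id, COUNT(*) AS px_slaves
    FROM   gv$px_session
    WHERE  qcsid = :qcsid
    GROUP  BY inst_id;
    -- SQL*Plus: parameters that can pin slaves to an instance group
    SHOW PARAMETER instance_groups
    SHOW PARAMETER parallel_instance_group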

  • Transaction Recovery within an Oracle RAC environment

    Good evening everyone.
    I need some help with Oracle 11gR1 RAC transaction-level recovery issues. Here's the scenario.
    We have a three (3) node RAC cluster running Oracle 11g R1. The Web UI portion of the application connects through WLS 9.2.3 with connection pooling set. We also have a command-line/SQL*Developer component that uses a TNSNAMES file that allows for both failover and load balancing. Within either the UI or the command-line portion of the application, a user can run a process which invokes one or more PL/SQL packages. The exact location of the physical connection to the database depends on which server is chosen by either the connection pooling or the TNSNAMES.ORA load-balancing option.
    In the normal world, the process executes and all is good. The status of the execution of this process is updated by the packages once completed. The problem we are encountering is when an Oracle instance fails. Here's where I need some help. For application-level (transaction-level) recovery, the database instances are first recovered by the database background processes, and then users must determine which processes were in flight and either re-execute them (if restart processing is part of the process) or remove any changes and restart from scratch. Given that the database instance does not record which processes are "in flight", it is the responsibility of the application to perform its own recovery processing. Is this still true?
    If an instance fails, are "in flight" transactions/connections moved to other instances in the Grid/RAC environment? I don't think this is possible, but I don't remember if this was accomplished through a combination of application and database server features that provide feedback to each other. How is the underlying application notified of the change if such an issue occurs? I remember something similar to this in older versions of Oracle but I cannot remember what it was called.
    Any help or guidance would be great as our client is being extremely difficult in pressing this issue.
    Thanks in advance
    Stephen Karniotis
    Project Architect - Compuware
    [email protected]
    (248) 408-2918

    You have not indicated whether you are using TAF or FCF ... that would be the first place to start.
    My recommendation would be to let Oracle roll back the database changes and have the application resubmit the most recent work.
    If the application knows what it did since the last COMMIT, then you should be fine, with the possible exception of variables stored
    in packages. Depending on packages retaining values is an issue best solved with PRAGMA SERIALLY_REUSABLE ... in other words,
    by not using the retention feature.
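
    For the package-state point above, a minimal sketch of a serially reusable package spec (the package and variable names are made up for illustration):
    CREATE OR REPLACE PACKAGE app_state AS
      PRAGMA SERIALLY_REUSABLE;
      g_last_batch_id NUMBER;  -- state is reset between server calls rather than retained
    END app_state;
    /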

  • How to write Parallel DML in 2 node RAC Cluster

    Any ideas on how to write a DML that will run on a two node cluster in parallel? I would like to scale a DML statement within a RAC environment. Thanks

    Check out [this article|http://www.oracle.com/technology/pub/articles/conlon_rac.html].
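
    The article aside, the basic pattern for statement-level parallel DML looks roughly like this (a sketch; table names and the degree are placeholders):
    -- Parallel DML must be enabled at the session level first
    ALTER SESSION ENABLE PARALLEL DML;
    INSERT /*+ APPEND PARALLEL(sales_hist, 4) */ INTO sales_hist
    SELECT /*+ PARALLEL(s, 4) */ *
    FROM   sales s;
    COMMIT;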

  • Which difference parallel database and RAC database

    Hi Experts,
    I saw some documents about parallel databases and RAC databases.
    My boss confused these two products.
    What is the difference between a parallel database and a RAC database?
    Is a parallel database an "old RAC"?
    Thanks
    Jim

    RAC is the new name, with many additional features, for the old deprecated Oracle Parallel Server (OPS).
    OPS was available up to 8i; since 9i it has been called RAC.
    Regards

  • SMON: Parallel transaction recovery tried

    Hi,
    I got the following messages in a trace file in BDUMP. Around this time my process was always getting blocked while updating a table. Do these messages mean anything critical? Here is the trace:
    Dump file d:\oracle\admin\usmdb\bdump\usmdb_smon_1564.trc
    Sun Mar 07 03:17:27 2004
    ORACLE V9.2.0.1.0 - Production vsnsta=0
    vsnsql=12 vsnxtr=3
    Windows 2000 Version 5.0 Service Pack 3, CPU type 586
    Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production
    With the Partitioning, OLAP and Oracle Data Mining options
    JServer Release 9.2.0.1.0 - Production
    Windows 2000 Version 5.0 Service Pack 3, CPU type 586
    Instance name: usmdb
    Redo thread mounted by this instance: 1
    Oracle process number: 6
    Windows thread id: 1564, image: ORACLE.EXE
    *** 2004-03-07 03:17:27.000
    *** SESSION ID:(5.1) 2004-03-07 03:17:27.000
    *** 2004-03-07 03:17:27.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 03:58:28.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 04:38:38.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 04:39:23.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 05:20:16.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 06:01:25.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 06:42:27.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 07:23:28.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 08:04:33.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 08:45:31.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 09:26:36.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 10:07:42.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 10:48:46.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 11:29:41.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 12:10:56.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 12:51:52.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 13:32:55.000
    SMON: Parallel transaction recovery tried
    Thanks,
    Tuhin

    That would occur because a large transaction (or transactions) had been killed or interrupted while the instance was running (or when the instance was shut down with ABORT). SMON takes over the job of "cleanup" and may use parallel recovery. You should be able to monitor the recovery in the V$FAST_START_TRANSACTIONS view.
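
    A sketch of the kind of queries that show that progress (V$FAST_START_SERVERS is the companion view listing the parallel rollback slaves):
    -- Dead transactions being rolled back and how far along they are
    SELECT usn, state, undoblocksdone, undoblockstotal
    FROM   v$fast_start_transactions;
    -- Parallel rollback slaves currently in use
    SELECT state, COUNT(*)
    FROM   v$fast_start_servers
    GROUP  BY state;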

  • Can not start Rac database

    Hi,
    Oracle RAC database 10.2.0.3/RedHat4 with 2 nodes.
    In the beginning we had an error ORA-600 [12803], so only SYS could connect to the database. I found note 1026653.6, which says that we need to create the AUDSES$ sequence, but before that we have to restart the database.
    When we stopped the database we got another ORA-600 and it is impossible to start it!
    Here is a copy of our alert file:
    Picked latch-free SCN scheme 2
    Autotune of undo retention is turned on.
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    ksdpec: called for event 13740 prior to event group initialization
    Starting up ORACLE RDBMS Version: 10.2.0.3.0.
    System parameters with non-default values:
    processes = 300
    sessions = 335
    sga_max_size = 524288000
    __shared_pool_size = 310378496
    __large_pool_size = 4194304
    __java_pool_size = 8388608
    __streams_pool_size = 8388608
    spfile = +DATA/osista/spfileosista.ora
    nls_language = FRENCH
    nls_territory = FRANCE
    nls_length_semantics = CHAR
    sga_target = 524288000
    control_files = DATA/osista/controlfile/control01.ctl, DATA/osista/controlfile/control02.ctl
    db_block_size = 8192
    __db_cache_size = 184549376
    compatible = 10.2.0.3.0
    log_archive_dest_1 = LOCATION=USE_DB_RECOVERY_FILE_DEST
    db_file_multiblock_read_count= 16
    cluster_database = TRUE
    cluster_database_instances= 2
    db_create_file_dest = +DATA
    db_recovery_file_dest = +FLASH
    db_recovery_file_dest_size= 68543315968
    thread = 2
    instance_number = 2
    undo_management = AUTO
    undo_tablespace = UNDOTBS2
    undo_retention = 29880
    remote_login_passwordfile= EXCLUSIVE
    db_domain =
    dispatchers = (PROTOCOL=TCP) (SERVICE=OSISTAXDB)
    local_listener = (address=(protocol=tcp)(port=1521)(host=132.147.160.243))
    remote_listener = LISTENERS_OSISTA
    job_queue_processes = 10
    background_dump_dest = /oracle/product/admin/OSISTA/bdump
    user_dump_dest = /oracle/product/admin/OSISTA/udump
    core_dump_dest = /oracle/product/admin/OSISTA/cdump
    audit_file_dest = /oracle/product/admin/OSISTA/adump
    db_name = OSISTA
    open_cursors = 300
    pga_aggregate_target = 104857600
    aq_tm_processes = 1
    Cluster communication is configured to use the following interface(s) for this instance
    172.16.0.2
    Wed Jun 13 11:04:30 2012
    cluster interconnect IPC version:Oracle UDP/IP (generic)
    IPC Vendor 1 proto 2
    PMON started with pid=2, OS id=8560
    DIAG started with pid=3, OS id=8562
    PSP0 started with pid=4, OS id=8566
    LMON started with pid=5, OS id=8570
    LMD0 started with pid=6, OS id=8574
    LMS0 started with pid=7, OS id=8576
    LMS1 started with pid=8, OS id=8580
    MMAN started with pid=9, OS id=8584
    DBW0 started with pid=10, OS id=8586
    LGWR started with pid=11, OS id=8588
    CKPT started with pid=12, OS id=8590
    SMON started with pid=13, OS id=8592
    RECO started with pid=14, OS id=8594
    CJQ0 started with pid=15, OS id=8596
    MMON started with pid=16, OS id=8598
    Wed Jun 13 11:04:31 2012
    starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
    MMNL started with pid=17, OS id=8600
    Wed Jun 13 11:04:31 2012
    starting up 1 shared server(s) ...
    Wed Jun 13 11:04:31 2012
    lmon registered with NM - instance id 2 (internal mem no 1)
    Wed Jun 13 11:04:31 2012
    Reconfiguration started (old inc 0, new inc 2)
    List of nodes:
    1
    Global Resource Directory frozen
    * allocate domain 0, invalid = TRUE
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Jun 13 11:04:31 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Wed Jun 13 11:04:31 2012
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Post SMON to start 1st pass IR
    Wed Jun 13 11:04:31 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:04:31 2012
    LMS 1: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:04:31 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    Reconfiguration complete
    LCK0 started with pid=20, OS id=8877
    Wed Jun 13 11:04:43 2012
    alter database mount
    Wed Jun 13 11:04:43 2012
    This instance was first to mount
    Wed Jun 13 11:04:43 2012
    Starting background process ASMB
    ASMB started with pid=25, OS id=10068
    Starting background process RBAL
    RBAL started with pid=26, OS id=10072
    Wed Jun 13 11:04:47 2012
    SUCCESS: diskgroup DATA was mounted
    Wed Jun 13 11:04:51 2012
    Setting recovery target incarnation to 1
    Wed Jun 13 11:04:52 2012
    Successful mount of redo thread 2, with mount id 3005749259
    Wed Jun 13 11:04:52 2012
    Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
    Completed: alter database mount
    Wed Jun 13 11:05:06 2012
    alter database open
    Wed Jun 13 11:05:06 2012
    This instance was first to open
    Wed Jun 13 11:05:06 2012
    Beginning crash recovery of 1 threads
    parallel recovery started with 2 processes
    Wed Jun 13 11:05:07 2012
    Started redo scan
    Wed Jun 13 11:05:07 2012
    Completed redo scan
    61 redo blocks read, 4 data blocks need recovery
    Wed Jun 13 11:05:07 2012
    Started redo application at
    Thread 1: logseq 7924, block 3, scn 506098125
    Wed Jun 13 11:05:07 2012
    Recovery of Online Redo Log: Thread 1 Group 2 Seq 7924 Reading mem 0
    Mem# 0: +DATA/osista/onlinelog/group_2.372.742132543
    Wed Jun 13 11:05:07 2012
    Completed redo application
    Wed Jun 13 11:05:07 2012
    Completed crash recovery at
    Thread 1: logseq 7924, block 64, scn 506118186
    4 data blocks read, 4 data blocks written, 61 redo blocks read
    Switch log for thread 1 to sequence 7925
    Picked broadcast on commit scheme to generate SCNs
    db_recovery_file_dest_size of 65368 MB is 0.61% used. This is a
    user-specified limit on the amount of space that will be used by this
    database for recovery-related files, and does not reflect the amount of
    space available in the underlying filesystem or ASM diskgroup.
    SUCCESS: diskgroup FLASH was mounted
    SUCCESS: diskgroup FLASH was dismounted
    Thread 1 advanced to log sequence 7926
    SUCCESS: diskgroup FLASH was mounted
    SUCCESS: diskgroup FLASH was dismounted
    Thread 1 advanced to log sequence 7927
    Wed Jun 13 11:05:11 2012
    LGWR: STARTING ARCH PROCESSES
    ARC0 started with pid=31, OS id=12747
    Wed Jun 13 11:05:11 2012
    ARC0: Archival started
    ARC1: Archival started
    LGWR: STARTING ARCH PROCESSES COMPLETE
    ARC1 started with pid=32, OS id=12749
    Wed Jun 13 11:05:12 2012
    Thread 2 opened at log sequence 7176
    Current log# 4 seq# 7176 mem# 0: +DATA/osista/onlinelog/group_4.289.742134597
    Wed Jun 13 11:05:12 2012
    ARC1: Becoming the 'no FAL' ARCH
    ARC1: Becoming the 'no SRL' ARCH
    Wed Jun 13 11:05:12 2012
    Successful open of redo thread 2
    Wed Jun 13 11:05:12 2012
    MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
    Wed Jun 13 11:05:12 2012
    ARC0: Becoming the heartbeat ARCH
    Wed Jun 13 11:05:12 2012
    SMON: enabling cache recovery
    Wed Jun 13 11:05:15 2012
    Successfully onlined Undo Tablespace 20.
    Wed Jun 13 11:05:15 2012
    SMON: enabling tx recovery
    Wed Jun 13 11:05:15 2012
    Database Characterset is AL32UTF8
    Wed Jun 13 11:05:16 2012
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_9174.trc:
    ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
    Wed Jun 13 11:05:16 2012
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_9174.trc:
    ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
    Error 600 happened during db open, shutting down database
    USER: terminating instance due to error 600
    Instance terminated by USER, pid = 9174
    ORA-1092 signalled during: alter database open...
    Wed Jun 13 11:06:16 2012
    Starting ORACLE instance (normal)
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Interface type 1 eth0 172.16.0.0 configured from OCR for use as a cluster interconnect
    Interface type 1 bond0 132.147.160.0 configured from OCR for use as a public interface
    Picked latch-free SCN scheme 2
    Autotune of undo retention is turned on.
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    ksdpec: called for event 13740 prior to event group initialization
    Starting up ORACLE RDBMS Version: 10.2.0.3.0.
    System parameters with non-default values:
    processes = 300
    sessions = 335
    sga_max_size = 524288000
    __shared_pool_size = 314572800
    __large_pool_size = 4194304
    __java_pool_size = 8388608
    __streams_pool_size = 8388608
    spfile = +DATA/osista/spfileosista.ora
    nls_language = FRENCH
    nls_territory = FRANCE
    nls_length_semantics = CHAR
    sga_target = 524288000
    control_files = DATA/osista/controlfile/control01.ctl, DATA/osista/controlfile/control02.ctl
    db_block_size = 8192
    __db_cache_size = 180355072
    compatible = 10.2.0.3.0
    log_archive_dest_1 = LOCATION=USE_DB_RECOVERY_FILE_DEST
    db_file_multiblock_read_count= 16
    cluster_database = TRUE
    cluster_database_instances= 2
    db_create_file_dest = +DATA
    db_recovery_file_dest = +FLASH
    db_recovery_file_dest_size= 68543315968
    thread = 2
    instance_number = 2
    undo_management = AUTO
    undo_tablespace = UNDOTBS2
    undo_retention = 29880
    remote_login_passwordfile= EXCLUSIVE
    db_domain =
    dispatchers = (PROTOCOL=TCP) (SERVICE=OSISTAXDB)
    local_listener = (address=(protocol=tcp)(port=1521)(host=132.147.160.243))
    remote_listener = LISTENERS_OSISTA
    job_queue_processes = 10
    background_dump_dest = /oracle/product/admin/OSISTA/bdump
    user_dump_dest = /oracle/product/admin/OSISTA/udump
    core_dump_dest = /oracle/product/admin/OSISTA/cdump
    audit_file_dest = /oracle/product/admin/OSISTA/adump
    db_name = OSISTA
    open_cursors = 300
    pga_aggregate_target = 104857600
    aq_tm_processes = 1
    Cluster communication is configured to use the following interface(s) for this instance
    172.16.0.2
    Wed Jun 13 11:06:16 2012
    cluster interconnect IPC version:Oracle UDP/IP (generic)
    IPC Vendor 1 proto 2
    PMON started with pid=2, OS id=18682
    DIAG started with pid=3, OS id=18684
    PSP0 started with pid=4, OS id=18695
    LMON started with pid=5, OS id=18704
    LMD0 started with pid=6, OS id=18721
    LMS0 started with pid=7, OS id=18735
    LMS1 started with pid=8, OS id=18753
    MMAN started with pid=9, OS id=18767
    DBW0 started with pid=10, OS id=18788
    LGWR started with pid=11, OS id=18796
    CKPT started with pid=12, OS id=18799
    SMON started with pid=13, OS id=18801
    RECO started with pid=14, OS id=18803
    CJQ0 started with pid=15, OS id=18805
    MMON started with pid=16, OS id=18807
    Wed Jun 13 11:06:17 2012
    starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
    MMNL started with pid=17, OS id=18809
    Wed Jun 13 11:06:17 2012
    starting up 1 shared server(s) ...
    Wed Jun 13 11:06:17 2012
    lmon registered with NM - instance id 2 (internal mem no 1)
    Wed Jun 13 11:06:17 2012
    Reconfiguration started (old inc 0, new inc 2)
    List of nodes:
    1
    Global Resource Directory frozen
    * allocate domain 0, invalid = TRUE
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Jun 13 11:06:18 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Wed Jun 13 11:06:18 2012
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Post SMON to start 1st pass IR
    Wed Jun 13 11:06:18 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:06:18 2012
    LMS 1: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:06:18 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    Reconfiguration complete
    LCK0 started with pid=20, OS id=18816
    Wed Jun 13 11:06:18 2012
    ALTER DATABASE MOUNT
    Wed Jun 13 11:06:18 2012
    This instance was first to mount
    Wed Jun 13 11:06:18 2012
    Reconfiguration started (old inc 2, new inc 4)
    List of nodes:
    0 1
    Wed Jun 13 11:06:18 2012
    Starting background process ASMB
    Wed Jun 13 11:06:18 2012
    Global Resource Directory frozen
    Communication channels reestablished
    ASMB started with pid=22, OS id=18913
    Starting background process RBAL
    * domain 0 valid = 0 according to instance 0
    Wed Jun 13 11:06:18 2012
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Jun 13 11:06:18 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Wed Jun 13 11:06:18 2012
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Wed Jun 13 11:06:18 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:06:18 2012
    LMS 1: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:06:18 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    RBAL started with pid=23, OS id=18917
    Reconfiguration complete
    Wed Jun 13 11:06:22 2012
    SUCCESS: diskgroup DATA was mounted
    Wed Jun 13 11:06:26 2012
    Setting recovery target incarnation to 1
    Wed Jun 13 11:06:26 2012
    Successful mount of redo thread 2, with mount id 3005703530
    Wed Jun 13 11:06:26 2012
    Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
    Completed: ALTER DATABASE MOUNT
    Wed Jun 13 11:06:27 2012
    ALTER DATABASE OPEN
    This instance was first to open
    Wed Jun 13 11:06:27 2012
    Beginning crash recovery of 1 threads
    parallel recovery started with 2 processes
    Wed Jun 13 11:06:27 2012
    Started redo scan
    Wed Jun 13 11:06:27 2012
    Completed redo scan
    61 redo blocks read, 4 data blocks need recovery
    Wed Jun 13 11:06:28 2012
    Started redo application at
    Thread 2: logseq 7176, block 3
    Wed Jun 13 11:06:28 2012
    Recovery of Online Redo Log: Thread 2 Group 4 Seq 7176 Reading mem 0
    Mem# 0: +DATA/osista/onlinelog/group_4.289.742134597
    Wed Jun 13 11:06:28 2012
    Completed redo application
    Wed Jun 13 11:06:28 2012
    Completed crash recovery at
    Thread 2: logseq 7176, block 64, scn 506138248
    4 data blocks read, 4 data blocks written, 61 redo blocks read
    Picked broadcast on commit scheme to generate SCNs
    Wed Jun 13 11:06:28 2012
    LGWR: STARTING ARCH PROCESSES
    ARC0 started with pid=28, OS id=19692
    Wed Jun 13 11:06:28 2012
    ARC0: Archival started
    ARC1: Archival started
    LGWR: STARTING ARCH PROCESSES COMPLETE
    ARC1 started with pid=29, OS id=19695
    Wed Jun 13 11:06:28 2012
    Thread 2 advanced to log sequence 7177
    Thread 2 opened at log sequence 7177
    Current log# 3 seq# 7177 mem# 0: +DATA/osista/onlinelog/group_3.291.742134597
    Successful open of redo thread 2
    Wed Jun 13 11:06:28 2012
    MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
    Wed Jun 13 11:06:28 2012
    ARC0: Becoming the 'no FAL' ARCH
    ARC0: Becoming the 'no SRL' ARCH
    Wed Jun 13 11:06:28 2012
    ARC1: Becoming the heartbeat ARCH
    Wed Jun 13 11:06:28 2012
    SMON: enabling cache recovery
    Wed Jun 13 11:06:28 2012
    db_recovery_file_dest_size of 65368 MB is 0.61% used. This is a
    user-specified limit on the amount of space that will be used by this
    database for recovery-related files, and does not reflect the amount of
    space available in the underlying filesystem or ASM diskgroup.
    SUCCESS: diskgroup FLASH was mounted
    SUCCESS: diskgroup FLASH was dismounted
    Wed Jun 13 11:06:31 2012
    Successfully onlined Undo Tablespace 20.
    Wed Jun 13 11:06:31 2012
    SMON: enabling tx recovery
    Wed Jun 13 11:06:31 2012
    Database Characterset is AL32UTF8
    Wed Jun 13 11:06:31 2012
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_19596.trc:
    ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
    Wed Jun 13 11:06:32 2012
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_19596.trc:
    ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
    Error 600 happened during db open, shutting down database
    USER: terminating instance due to error 600
    Instance terminated by USER, pid = 19596
    ORA-1092 signalled during: ALTER DATABASE OPEN...
    Wed Jun 13 11:11:35 2012
    Starting ORACLE instance (normal)
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Interface type 1 eth0 172.16.0.0 configured from OCR for use as a cluster interconnect
    Interface type 1 bond0 132.147.160.0 configured from OCR for use as a public interface
    Picked latch-free SCN scheme 2
    Autotune of undo retention is turned on.
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    ksdpec: called for event 13740 prior to event group initialization
    Starting up ORACLE RDBMS Version: 10.2.0.3.0.
    System parameters with non-default values:
    processes = 300
    sessions = 335
    sga_max_size = 524288000
    __shared_pool_size = 318767104
    __large_pool_size = 4194304
    __java_pool_size = 8388608
    __streams_pool_size = 8388608
    spfile = +DATA/osista/spfileosista.ora
    nls_language = FRENCH
    nls_territory = FRANCE
    nls_length_semantics = CHAR
    sga_target = 524288000
    control_files = DATA/osista/controlfile/control01.ctl, DATA/osista/controlfile/control02.ctl
    db_block_size = 8192
    __db_cache_size = 176160768
    compatible = 10.2.0.3.0
    log_archive_dest_1 = LOCATION=USE_DB_RECOVERY_FILE_DEST
    db_file_multiblock_read_count= 16
    cluster_database = TRUE
    cluster_database_instances= 2
    db_create_file_dest = +DATA
    db_recovery_file_dest = +FLASH
    db_recovery_file_dest_size= 68543315968
    thread = 2
    instance_number = 2
    undo_management = AUTO
    undo_tablespace = UNDOTBS2
    undo_retention = 29880
    remote_login_passwordfile= EXCLUSIVE
    db_domain =
    dispatchers = (PROTOCOL=TCP) (SERVICE=OSISTAXDB)
    local_listener = (address=(protocol=tcp)(port=1521)(host=132.147.160.243))
    remote_listener = LISTENERS_OSISTA
    job_queue_processes = 10
    background_dump_dest = /oracle/product/admin/OSISTA/bdump
    user_dump_dest = /oracle/product/admin/OSISTA/udump
    core_dump_dest = /oracle/product/admin/OSISTA/cdump
    audit_file_dest = /oracle/product/admin/OSISTA/adump
    db_name = OSISTA
    open_cursors = 300
    pga_aggregate_target = 104857600
    aq_tm_processes = 1
    Cluster communication is configured to use the following interface(s) for this instance
    172.16.0.2
    Wed Jun 13 11:11:35 2012
    cluster interconnect IPC version:Oracle UDP/IP (generic)
    IPC Vendor 1 proto 2
    PMON started with pid=2, OS id=16101
    DIAG started with pid=3, OS id=16103
    PSP0 started with pid=4, OS id=16105
    LMON started with pid=5, OS id=16107
    LMD0 started with pid=6, OS id=16110
    LMS0 started with pid=7, OS id=16112
    LMS1 started with pid=8, OS id=16116
    MMAN started with pid=9, OS id=16120
    DBW0 started with pid=10, OS id=16132
    LGWR started with pid=11, OS id=16148
    CKPT started with pid=12, OS id=16169
    SMON started with pid=13, OS id=16185
    RECO started with pid=14, OS id=16203
    CJQ0 started with pid=15, OS id=16219
    MMON started with pid=16, OS id=16227
    Wed Jun 13 11:11:36 2012
    starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
    MMNL started with pid=17, OS id=16229
    Wed Jun 13 11:11:36 2012
    starting up 1 shared server(s) ...
    Wed Jun 13 11:11:36 2012
    lmon registered with NM - instance id 2 (internal mem no 1)
    Wed Jun 13 11:11:36 2012
    Reconfiguration started (old inc 0, new inc 2)
    List of nodes:
    1
    Global Resource Directory frozen
    * allocate domain 0, invalid = TRUE
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Jun 13 11:11:36 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Wed Jun 13 11:11:36 2012
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Post SMON to start 1st pass IR
    Wed Jun 13 11:11:36 2012
    LMS 1: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:11:36 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:11:36 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    Reconfiguration complete
    LCK0 started with pid=20, OS id=16235
    Wed Jun 13 11:11:37 2012
    ALTER DATABASE MOUNT
    Wed Jun 13 11:11:37 2012
    This instance was first to mount
    Wed Jun 13 11:11:37 2012
    Starting background process ASMB
    ASMB started with pid=22, OS id=16343
    Starting background process RBAL
    RBAL started with pid=23, OS id=16347
    Wed Jun 13 11:11:44 2012
    SUCCESS: diskgroup DATA was mounted
    Wed Jun 13 11:11:49 2012
    Setting recovery target incarnation to 1
    Wed Jun 13 11:11:49 2012
    Successful mount of redo thread 2, with mount id 3005745065
    Wed Jun 13 11:11:49 2012
    Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
    Completed: ALTER DATABASE MOUNT
    Wed Jun 13 11:22:25 2012
    alter database open
    This instance was first to open
    Wed Jun 13 11:22:26 2012
    Beginning crash recovery of 1 threads
    parallel recovery started with 2 processes
    Wed Jun 13 11:22:26 2012
    Started redo scan
    Wed Jun 13 11:22:26 2012
    Completed redo scan
    61 redo blocks read, 4 data blocks need recovery
    Wed Jun 13 11:22:26 2012
    Started redo application at
    Thread 1: logseq 7927, block 3
    Wed Jun 13 11:22:26 2012
    Recovery of Online Redo Log: Thread 1 Group 1 Seq 7927 Reading mem 0
    Mem# 0: +DATA/osista/onlinelog/group_1.283.742132543
    Wed Jun 13 11:22:26 2012
    Completed redo application
    Wed Jun 13 11:22:26 2012
    Completed crash recovery at
    Thread 1: logseq 7927, block 64, scn 506178382
    4 data blocks read, 4 data blocks written, 61 redo blocks read
    Switch log for thread 1 to sequence 7928
    Picked broadcast on commit scheme to generate SCNs
    Wed Jun 13 11:22:27 2012
    LGWR: STARTING ARCH PROCESSES
    ARC0 started with pid=31, OS id=13010
    Wed Jun 13 11:22:27 2012
    ARC0: Archival started
    ARC1: Archival started
    LGWR: STARTING ARCH PROCESSES COMPLETE
    ARC1 started with pid=32, OS id=13033
    Wed Jun 13 11:22:27 2012
    Thread 2 opened at log sequence 7178
    Current log# 4 seq# 7178 mem# 0: +DATA/osista/onlinelog/group_4.289.742134597
    Successful open of redo thread 2
    Wed Jun 13 11:22:27 2012
    MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
    Wed Jun 13 11:22:27 2012
    ARC0: Becoming the 'no FAL' ARCH
    ARC0: Becoming the 'no SRL' ARCH
    Wed Jun 13 11:22:27 2012
    ARC1: Becoming the heartbeat ARCH
    Wed Jun 13 11:22:27 2012
    SMON: enabling cache recovery
    Wed Jun 13 11:22:30 2012
    db_recovery_file_dest_size of 65368 MB is 0.61% used. This is a
    user-specified limit on the amount of space that will be used by this
    database for recovery-related files, and does not reflect the amount of
    space available in the underlying filesystem or ASM diskgroup.
    SUCCESS: diskgroup FLASH was mounted
    SUCCESS: diskgroup FLASH was dismounted
    Wed Jun 13 11:22:31 2012
    Successfully onlined Undo Tablespace 20.
    Wed Jun 13 11:22:31 2012
    SMON: enabling tx recovery
    Wed Jun 13 11:22:32 2012
    Database Characterset is AL32UTF8
    Wed Jun 13 11:22:32 2012
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_11751.trc:
    ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
    Wed Jun 13 11:22:33 2012
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_11751.trc:
    ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
    Error 600 happened during db open, shutting down database
    USER: terminating instance due to error 600
    Instance terminated by USER, pid = 11751
    ORA-1092 signalled during: alter database open...
    regards,

    Hi;
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_9174.trc:
    Did you check the trace file?
    ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
    You are getting an Oracle internal error (ORA-600), which means you may need to work with the Oracle support team. Please see the note below; if it does not help, I suggest you log an SR:
    Troubleshoot an ORA-600 or ORA-7445 Error Using the Error Lookup Tool [ID 153788.1]
    For future RAC issues, please use Forum Home » High Availability » RAC, ASM & Clusterware Installation, which is the dedicated RAC forum.
    Regards
    Helios
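
    As a side note on the original ORA-600 [12803] symptom: since the note the poster refers to involves the SYS.AUDSES$ sequence, it may be worth confirming whether that sequence actually exists before going further (a hedged sketch only):
    -- Check whether the AUDSES$ sequence is present
    SELECT sequence_owner, sequence_name, last_number
    FROM   dba_sequences
    WHERE  sequence_name = 'AUDSES$';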

  • How unhealthy is this RAC?

    Here is the contents of v$system_event.
    Are these events
    EVENT     TOTAL_WAITS     TIME_WAITED     AVERAGE_WAIT
    enq: TX - index contention     40564851     214701526     5.29
    enq: TX - row lock contention     188846     12454614     65.95
    enq: SQ - contention     141971     70568     0.5
    cause for concern? The full dump follows:
    EVENT     TOTAL_WAITS     TIME_WAITED     AVERAGE_WAIT
    SQL*Net message to client     6015051449     607254     0
    SQL*Net message from client     6015048542     178177969892     29.62
    gcs remote message     2948555287     2633481757     0.89
    CGS wait for IPC msg     1517805027     634397     0
    db file sequential read     1500615188     816364485     0.54
    ges remote message     1247679701     1407300224     1.13
    gc cr multi block request     778432813     9913464     0.01
    gc current block 2-way     747852637     38030616     0.05
    db file scattered read     709428365     460939295     0.65
    rdbms ipc message     708473316     37650068633     53.14
    gc buffer busy acquire     671285134     1033621285     1.54
    PX Deq: reap credit     667784615     484449     0
    gcs log flush sync     592376026     171712257     0.29
    gc cr block 2-way     530861847     19607062     0.04
    library cache pin     437937120     15126237     0.03
    log file sync     379523272     797193932     2.1
    DIAG idle wait     359607166     2822108755     7.85
    log file parallel write     351225436     259263769     0.74
    LNS ASYNC end of log     350170653     1398410516     3.99
    LNS wait on SENDREQ     321652621     3209301     0.01
    PX qref latch     297396661     94308     0
    read by other session     289140108     148440270     0.51
    buffer deadlock     163505781     983055     0.01
    gc current block busy     119223825     467716658     3.92
    PX Deq: Table Q Normal     117332841     23574867     0.2
    ksxr poll remote instances     110480324     90333     0
    buffer busy waits     106938153     19933900     0.19
    direct path read     93429599     108427028     1.16
    SQL*Net more data from client     86471785     23026529     0.27
    gc current grant busy     84978157     28215346     0.33
    control file sequential read     82646297     23694583     0.29
    PX Deq Credit: send blkd     78641669     9569299     0.12
    latch: cache buffers chains     74218671     690277     0.01
    gc current grant 2-way     72557796     1920419     0.03
    library cache: mutex X     71106697     75993     0
    DFS lock handle     70722498     2716407     0.04
    gc cr grant 2-way     64558237     1633004     0.03
    PX Deq: Execution Msg     61706261     314222076     5.09
    gc cr block busy     61469863     119850802     1.95
    library cache lock     52428649     3773354     0.07
    PX Deq: Slave Session Stats     48040224     1886805     0.04
    db file parallel read     46415188     118467902     2.55
    IPC send completion sync     46250594     965101     0.02
    enq: TX - index contention     40564851     214701526     5.29
    PX Deq: Execute Reply     39689685     17243721     0.43
    gc buffer busy release     36976909     242714774     6.56
    SQL*Net more data to client     36627952     44167     0
    PX Deq: Msg Fragment     30501244     343397     0.01
    rdbms ipc reply     29725302     1352370     0.05
    RMAN backup & recovery I/O     28824547     37722662     1.31
    reliable message     27892263     3082134     0.11
    PX Idle Wait     27356097     4651277341     170.03
    ASM file metadata operation     25098749     8850323     0.35
    gc object scan     22705857     7485     0
    db file parallel write     19896252     52152606     2.62
    latch: ges resource hash list     19336183     427451     0.02
    enq: PS - contention     19143961     707455     0.04
    PX Deq: Parse Reply     19093356     895799     0.05
    gc cr disk read     17816846     448909     0.03
    ASM background timer     16101806     1383957874     85.95
    PX Deq: Slave Join Frag     16044789     233149     0.01
    wait for unread message on broadcast channel     15056320     1413552546     93.88
    cursor: mutex X     13435193     24140     0
    KJC: Wait for msg sends to complete     13268497     11397     0
    PX Deq: Signal ACK RSG     13214824     101941     0.01
    KSV master wait     13206286     4235645     0.32
    direct path read temp     12617694     5487608     0.43
    PX Deq Credit: need buffer     11675868     879967     0.08
    row cache lock     11536185     398216     0.03
    PX Deq Credit: Session Stats     9480862     78910     0.01
    SQL*Net message to dblink     9312894     1538     0
    SQL*Net message from dblink     9312894     6279631     0.67
    control file parallel write     7760982     11854435     1.53
    pmon timer     7558889     1412576090     186.88
    PX Deq: Join ACK     7548816     498931     0.07
    gc current multi block request     6035173     155898     0.03
    PING     5706961     1413230267     247.63
    enq: XR - database force logging     4662671     198813     0.04
    class slave wait     4561877     7097429006     1555.81
    Streams AQ: waiting for messages in the queue     4495828     1543411682     343.3
    SQL*Net more data from dblink     3696582     444575     0.12
    LGWR wait for redo copy     3655353     17840     0
    log file sequential read     3387305     6610414     1.95
    Log archive I/O     2990486     276772     0.09
    SQL*Net break/reset to client     2971976     2385935     0.8
    direct path write temp     2839390     2522114     0.89
    Space Manager: slave idle wait     2827526     1412987186     499.73
    latch: shared pool     2808517     298150     0.11
    latch: gc element     2421717     24688     0.01
    SGA: MMAN sleep for component shrink     2336447     2458094     1.05
    latch: enqueue hash chains     2279645     15435     0.01
    latch free     2089418     78732     0.04
    gc current split     2044784     1864009     0.91
    PX Deq: Signal ACK EXT     1976164     19263     0.01
    enq: FB - contention     1473469     61036     0.04
    cursor: pin S wait on X     1313129     1464789     1.12
    SQL*Net more data to dblink     1232891     986     0
    Streams AQ: RAC qmn coordinator idle wait     1211300     788     0
    enq: HW - contention     1175390     2077008     1.77
    latch: session allocation     1167768     21883     0.02
    Streams AQ: qmn coordinator idle wait     1144699     1412546634     1233.99
    Streams AQ: qmn slave idle wait     1031585     2227183681     2158.99
    lock deadlock retry     962937     4698     0
    enq: CF - contention     956154     609647     0.64
    latch: cache buffers lru chain     902764     37552     0.04
    latch: object queue header operation     817911     27717     0.03
    global enqueue expand wait     768633     654105     0.85
    Data file init write     756191     329758     0.44
    latch: gcs resource hash     647021     4147     0.01
    local write wait     603007     286191     0.47
    latch: row cache objects     599358     6453     0.01
    ges lmd/lmses to freeze in rcfg - mrcvr     481759     156345     0.32
    shared server idle wait     471190     1413238589     2999.3
    enq: RF - DG Broker Current File ID     469833     23209     0.05
    smon timer     432383     1411851085     3265.28
    SGA: allocation forcing component growth     363333     379008     1.04
    gc current retry     341104     1121252     3.29
    enq: RF - synch: DG Broker metadata     319143     588290     1.84
    enq: PG - contention     313659     14830     0.05
    enq: TT - contention     260134     11207172     43.08
    enq: KO - fast object checkpoint     236745     820808     3.47
    dispatcher timer     236637     1413242481     5972.2
    direct path write     231382     191008     0.83
    cursor: pin S     229011     394     0
    Streams AQ: waiting for time management or cleanup tasks     199981     1413148548     7066.41
    enq: TX - row lock contention     188846     12454614     65.95
    enq: TX - allocate ITL entry     153703     54252     0.35
    enq: SQ - contention     141971     70568     0.5
    ksdxexeother     141885     56     0
    latch: redo allocation     138912     1858     0.01
    recovery area: computing applied logs     126415     45925     0.36
    gc current block congested     126318     21768     0.17
    resmgr:cpu quantum     123074     151384     1.23
    jobq slave wait     120678     35574221     294.79
    Datapump dump file I/O     90431     9127     0.1
    ges inquiry response     89402     4041     0.05
    os thread startup     83809     222586     2.66
    cr request retry     80062     71896     0.9
    PX Deq: Table Q Sample     79665     133402     1.67
    gc cr block congested     79026     14792     0.19
    gc cr failure     77521     25019     0.32
    enq: WF - contention     73983     825388     11.16
    enq: TQ - TM contention     72871     3319     0.05
    lock escalate retry     65714     1574     0.02
    buffer exterminate     59775     64919     1.09
    fbar timer     47136     1413183353     29980.98
    log file switch completion     46911     452097     9.64
    recovery area: computing obsolete files     45699     8547     0.19
    enq: US - contention     40401     8805     0.22
    enq: TM - contention     39149     5435032     138.83
    library cache load lock     36311     382575     10.54
    kjbdrmcvtq lmon drm quiesce: ping completion     31668     47443     1.5
    enq: TD - KTF dump entries     31468     1424     0.05
    enq: RO - fast object reuse     28422     31772     1.12
    parallel recovery slave wait for change     27558     3163     0.11
    name-service call wait     23694     181533     7.66
    control file single write     22375     1624     0.07
    kksfbc child completion     21239     106926     5.03
    PX Deq: Table Q qref     19325     245     0.01
    enq: TX - contention     18805     113253     6.02
    latch: messages     17203     181     0.01
    enq: RS - prevent file delete     16913     1013     0.06
    enq: RS - prevent aging list update     15682     642     0.04
    PX Deq: Table Q Get Keys     14322     42935     3
    gc current grant congested     14292     2192     0.15
    cursor: mutex S     13285     8     0
    log file single write     13164     5371     0.41
    latch: undo global data     12649     178     0.01
    kksfbc research     11894     12680     1.07
    parallel recovery slave idle wait     11193     5872     0.52
    wait list latch free     11026     11794     1.07
    enq: CT - state     11001     417     0.04
    latch: checkpoint queue latch     10526     132     0.01
    enq: PE - contention     10506     1139     0.11
    ARCH wait on SENDREQ     9957     216480     21.74
    gc cr grant congested     9465     1584     0.17
    wait for scn ack     9377     3155     0.34
    enq: TA - contention     8856     324     0.04
    log buffer space     8777     89323     10.18
    enq: TK - Auto Task Serialization     8542     343     0.04
    enq: DR - contention     7842     323     0.04
    process diagnostic dump     7707     2072     0.27
    JOX Jit Process Sleep     7612     11286431     1482.72
    enq: TC - contention     7357     340817     46.33
    ges global resource directory to be frozen     7140     12299     1.72
    enq: CO - master slave det     6850     312     0.05
    enq: JS - job run lock - synchronize     6704     397     0.06
    gcs drm freeze in enter server mode     6542     40742     6.23
    enq: TS - contention     5959     89332     14.99
    ARCH wait for archivelog lock     5600     36     0.01
    PX Nsq: PQ load info query     5377     104798     19.49
    db file single write     5373     3452     0.64
    gc remaster     5315     50625     9.52
    latch: parallel query alloc buffer     4939     1906     0.39
    enq: TO - contention     4799     143     0.03
    enq: AF - task serialization     4395     161     0.04
    enq: PI - contention     4251     163     0.04
    ges2 LMON to wake up LMD - mrcvr     4210     28     0.01
    enq: DL - contention     3889     239     0.06
    kjctssqmg: quick message send wait     3408     22     0.01
    LNS wait on DETACH     3275     741     0.23
    ksfd: async disk IO     3274     1     0
    LNS wait on ATTACH     3273     51940     15.87
    ARCH wait on DETACH     3231     714     0.22
    ARCH wait on ATTACH     3226     43238     13.4
    enq: BR - file shrink     2787     116     0.04
    write complete waits     2631     1070     0.41
    enq: MD - contention     2596     67     0.03
    enq: WL - contention     2198     266518     121.25
    single-task message     2098     25896     12.34
    enq: OD - Serializing DDLs     2054     66     0.03
    resmgr:internal state change     2001     14735     7.36
    ARCH wait on c/f tx acquire 2     1751     175230     100.07
    enq: WR - contention     1636     69     0.04
    latch: cache buffer handles     1610     29     0.02
    statement suspended, wait error to be cleared     1497     22626     15.11
    Streams AQ: qmn coordinator waiting for slave to start     1214     678966     559.28
    enq: PD - contention     1182     33     0.03
    JS kgl get object wait     1096     4922     4.49
    undo segment extension     1070     10065     9.41
    PL/SQL lock timer     949     8739819     9209.5
    enq: AE - lock     937     28     0.03
    LGWR-LNS wait on channel     832     913     1.1
    ges DFS hang analysis phase 2 acks     816     495     0.61
    latch: redo writing     729     9     0.01
    gc quiesce     665     564     0.85
    enq: JS - queue lock     482     2111     4.38
    PX Deq: Test for credit     442     13     0.03
    enq: SS - contention     386     274     0.71
    recovery area: computing dropped files     328     1400     4.27
    recovery area: computing backed up files     328     496     1.51
    ksdxexeotherwait     279     10592     37.97
    log switch/archive     250     137570     550.28
    gc domain validation     223     39964     179.21
    auto-sqltune: wait graph update     195     96514     494.95
    wait for a undo record     170     1214     7.14
    parallel recovery coord send blocked     168     4     0.02
    enq: JS - wdw op     168     3741     22.27
    enq: KT - contention     165     5     0.03
    switch logfile command     156     6290     40.32
    gcs resource directory to be unfrozen     149     12839     86.17
    Data Guard Broker Wait     139     10906     78.46
    enq: SK - contention     129     4     0.03
    enq: JS - job recov lock     128     4     0.03
    gc cr block lost     125     6772     54.17
    virtual circuit wait     122     3     0.03
    ges LMON to get to FTDONE      100     187     1.87
    enq: CU - contention     80     242     3.02
    enq: JQ - contention     78     7     0.09
    cursor: pin X     73     83     1.14
    parallel recovery coord wait for reply     70     510     7.29
    PX Deq: Txn Recovery Start     67     3436     51.29
    SQL*Net break/reset to dblink     60     0     0
    gc current block lost     57     2869     50.33
    ges LMD suspend for testing event     51     709     13.89
    inactive session     46     4550     98.91
    recovery read     45     5     0.11
    JS kill job wait     41     3548     86.53
    enq: AS - service activation     40     1     0.03
    enq: TL - contention     35     2     0.05
    enq: UL - contention     34     524     15.42
    gcs enter server mode     33     1559     47.24
    wait for stopper event to be increased     30     218     7.27
    enq: TQ - DDL contention     24     300     12.52
    enq: MR - contention     21     1     0.03
    ges reconfiguration to start     20     54     2.72
    ges enter server mode     20     502     25.08
    buffer latch     18     1337     74.26
    enq: SR - contention     18     1     0.05
    Streams: RAC waiting for inter instance ack     18     3748     208.21
    enq: PR - contention     17     46     2.72
    kupp process wait     16     166     10.39
    checkpoint completed     15     3678     245.19
    PX Deque wait     14     68     4.87
    enq: BF - allocation contention     14     1     0.08
    enq: XL - fault extent map     14     51     3.66
    enq: FU - contention     14     17     1.18
    enq: TH - metric threshold evaluation     13     114     8.78
    enq: MW - contention     12     0     0.04
    enq: DD - contention     10     0     0.04
    process terminate     8     41     5.08
    ges cgs registration     8     151     18.9
    buffer resize     7     0     0
    ktm: instance recovery     7     698     99.66
    LNS wait on LGWR     6     0     0
    ASM background starting     6     381     63.43
    gc cr block 3-way     5     0     0.08
    enq: PV - syncstart     5     9     1.74
    Global transaction acquire instance locks     4     4     1.09
    enq: RS - read alert level     4     0     0.04
    LGWR wait on LNS     3     0     0
    gc recovery     3     540     179.85
    Streams AQ: enqueue blocked on low memory     3     544     181.2
    DBWR range invalidation sync     3     17     5.83
    enq: DM - contention     3     0     0.03
    enq: RF - FSFO Observer Heartbeat     3     0     0.03
    enq: JS - q mem clnup lck     3     0     0
    DG Broker configuration file I/O     2     0     0
    enq: RC - Result Cache: Contention     2     493     246.6
    enq: KM - contention     2     0     0.03
    enq: RT - contention     2     0     0.04
    instance state change     2     0     0.12
    kkdlgon     2     10     5.11
    enq: TQ - INI contention     2     292     146.07
    enq: JS - contention     2     0     0
    ARCH wait for netserver start     1     400     400.02
    log file switch (checkpoint incomplete)     1     3     3.44
    JS coord start wait     1     50     50.09
    ges lmd and pmon to attach     1     1     1.26
    wait for tmc2 to complete     1     3     3.03
    control file heartbeat     1     400     400.02
    enq: SW - contention     1     0     0.04
    enq: PW - perwarm status in dbw0     1     0     0.09
    enq: FS - contention     1     0     0.04
    enq: XR - quiesce database     1     0     0.04
    enq: RS - write alert level     1     0     0.02
    enq: CN - race with init     1     0     0.03
    enq: FE - contention     1     4     3.77
    Wait for shrink lock2     1     10     10.03
    enq: IA - contention     1     0     0.02
    enq: RF - atomicity     1     0     0.05
    enq: RF - synchronization: aifo master     1     0     0.02
    enq: RF - RF - Database Automatic Disable     1     0     0.06
    enq: WP - contention     1     0     0.02
    enq: TB - SQL Tuning Base Cache Load     1     0     0.05
    enq: JS - evt notify     1     0     0.02

    Text can be formatted by tagging the beginning and end of the block of text with the code tag. When cutting and pasting text such as execution plans, excerpts from AWR reports, etc., it will maintain spacing and formatting, and make for much easier reading.
    As to the content, well, dumping the contents of v$system_event is pretty close to useless.
    As to the first three events you listed, 'enq: TX - index contention', 'enq: TX - row lock contention', 'enq: SQ - contention', well, all of those are easily tunable.
    First, for 'enq: SQ - contention', check your sequences. Do you have any NOCACHE sequences? Or sequences with small caches?
    As for 'enq: TX - row lock contention', well that's fairly self-explanatory. You have multiple sessions trying to lock the same row in the same table at the same time.
    Last, 'enq: TX - index contention', that's non-row level contention on an index. For example, if you have a unique index, insert a row w/ column value 1, but don't commit, then try to insert that same value from another session.
    But, before you do any of that, I think you need to clearly understand where the bottlenecks are. Try taking some AWR snapshots, about 5 minutes apart, when you're having performance problems. Look at the AWR report for that 5 minute period. In particular, look at your Top 5 timed events.
    Hope that helps,
    -Mark
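
    For the NOCACHE / small-cache sequence check suggested above, a quick sketch (a CACHE_SIZE of 0 means NOCACHE; the threshold of 20 is an arbitrary assumption):
    SELECT sequence_owner, sequence_name, cache_size
    FROM   dba_sequences
    WHERE  cache_size < 20
    ORDER  BY cache_size;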

  • 2 Node RAC abnormal behaviour

    Platform: "HP-UX 11.23 64-bit"
    Database: "10.2.0.4 64-bit"
    RAC: 2 Node RAC setup
    Our RAC setup has been done properly and RAC is working fine with load balancing, i.e. clients are getting connections on both instances. BUT the issue I am facing with my RAC setup is high-availability testing. When I send a reboot signal to "Node-2" while "Node-1" is up, I observe, and receive complaints from clients, that they have lost their connection with the database, and also that no new connections are being allowed. When I look at the alert log of "Node-1" I see the following abnormal messages reported in it:
    List of nodes:
    0 1
    Global Resource Directory frozen
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Tue Aug 9 04:02:15 2011
    LMS 2: 0 GCS shadows cancelled, 0 closed
    Tue Aug 9 04:02:15 2011
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Tue Aug 9 04:02:15 2011
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Tue Aug 9 04:02:15 2011
    LMS 1: 1908 GCS shadows traversed, 1076 replayed
    Tue Aug 9 04:02:15 2011
    LMS 2: 1911 GCS shadows traversed, 1086 replayed
    Tue Aug 9 04:02:15 2011
    LMS 0: 1899 GCS shadows traversed, 1164 replayed
    Tue Aug 9 04:02:15 2011
    Submitted all GCS remote-cache requests
    Post SMON to start 1st pass IR
    Fix write in gcs resources
    Reconfiguration complete
    Tue Aug 9 04:02:16 2011
    ARCH shutting down
    ARC2: Archival stopped
    Tue Aug 9 04:02:21 2011
    Redo thread 2 internally enabled
    Tue Aug 9 04:02:35 2011
    Reconfiguration started (old inc 4, new inc 6)
    List of nodes:
    0
    Global Resource Directory frozen
    * dead instance detected - domain 0 invalid = TRUE
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Tue Aug 9 04:02:35 2011
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Tue Aug 9 04:02:35 2011
    LMS 2: 0 GCS shadows cancelled, 0 closed
    Tue Aug 9 04:02:35 2011
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Post SMON to start 1st pass IR
    Tue Aug 9 04:02:35 2011
    Instance recovery: looking for dead threads
    Tue Aug 9 04:02:35 2011
    Beginning instance recovery of 1 threads
    Tue Aug 9 04:02:35 2011
    LMS 1: 1908 GCS shadows traversed, 0 replayed
    Tue Aug 9 04:02:35 2011
    LMS 2: 1907 GCS shadows traversed, 0 replayed
    Tue Aug 9 04:02:35 2011
    LMS 0: 1899 GCS shadows traversed, 0 replayed
    Tue Aug 9 04:02:35 2011
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    Reconfiguration complete
    Tue Aug 9 04:02:37 2011
    parallel recovery started with 11 processes
    Tue Aug 9 04:02:37 2011
    Started redo application at
    Thread 2: logseq 6, block 2, scn 1837672332
    Tue Aug 9 04:02:37 2011
    Errors in file /u01/app/oracle/product/10.2.0/db/admin/BAF/bdump/baf1_smon_10253.trc:
    ORA-00600: internal error code, arguments: [kcratr2_onepass], [], [], [], [], [], [], []
    Tue Aug 9 04:02:38 2011
    Errors in file /u01/app/oracle/product/10.2.0/db/admin/BAF/bdump/baf1_smon_10253.trc:
    ORA-00600: internal error code, arguments: [kcratr2_onepass], [], [], [], [], [], [], []
    Tue Aug 9 04:02:38 2011
    Errors in file /u01/app/oracle/product/10.2.0/db/admin/BAF/bdump/baf1_smon_10253.trc:
    ORA-00600: internal error code, arguments: [kcratr2_onepass], [], [], [], [], [], [], []
    SMON: terminating instance due to error 600
    Tue Aug 9 04:02:38 2011
    Dump system state for local instance only
    System State dumped to trace file /u01/app/oracle/product/10.2.0/db/admin/BAF/bdump/baf1_diag_10229.trc
    Tue Aug 9 04:02:38 2011
    Instance terminated by SMON, pid = 10253
    Tue Aug 9 04:04:09 2011
    Starting ORACLE instance (normal)
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Interface type 1 lan3 192.168.1.0 configured from OCR for use as a cluster interconnect
    Interface type 1 lan2 172.20.21.0 configured from OCR for use as a public interface
    Picked latch-free SCN scheme 3
    Autotune of undo retention is turned off.
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    ksdpec: called for event 13740 prior to event group initialization
    Starting up ORACLE RDBMS Version: 10.2.0.4.0.
    System parameters with non-default values:
    processes = 300
    sessions = 335
    timed_statistics = TRUE
    Kindly help me to get rid of this issue. I am waiting for a quick and helpful response from the gurus on the forum. Thanks in advance.
    Regards,

    if above were really 100% correct, you would not be here posting about errors!
    Definitely, but these situations could become the cause of new bugs, couldn't they?
    I don't know what is real & what is unnecessary obfuscation.
    Which part of the thread did you not understand?
    It is not a good idea to have a subtraction sign/dash character as part of an object/host name; i.e. "Node-1".
    "Node-1" is not the real hostname; it is only used here for clarity. The actual hostnames are "sdupn101" for node 1 and "sdupn102" for node 2.
    ORA-00600/ORA-07445/ORA-03113 = Oracle bug => search on Metalink and/or call Oracle support.
    Newbie may be my status on this forum, but I do know the etiquette of using forums and support blogs. I searched but unfortunately did not find any matching solution.
    Anyway, I will update you once I find a solution, so that you can assist someone else in the future.

  • Oracle Rac - Targetting clients to particular nodes?

    We have a deployment case and want to find out the best practices.
    Currently there is a two node RAC setup.
    We have an application with two components: one side is a high-write, minor-read component, and the other side is mostly reads with minor writes. Each component uses multiple connections, and there is a logical separation of which tables each one writes to.
    The currently proposed solution is to target the high-write component at one node's instance, while distributing the other component, which does a lot of reading and some writing, across both RAC instances.
    The question is: what is the best paradigm for this? Is there any documentation on when to target components at single instances versus letting RAC do its own distribution? There are different user accounts for the components that do the high-volume writing versus the ones that mostly select.
    Thanks.

    Hi,
    The question is what is the best paradigm for this: Any documentation of when to target components to single instances versus letting RAC do its own distribution/etc. There are different user accounts for the components that do the high volume writing versus the selecting/etc.
    I'm assuming you're using version 10g or later; Oracle RAC 9i does not have this feature.
    You can direct connections to nodes that have characteristics in common workload using Oracle Services.
    To manage workloads or a group of applications, you can define services that you assign to a particular application or to a subset of an application's operations. You can also group work by type under services.
    Oracle recommends that all users who share a service have the same service level requirements. You can define specific characteristics for services and each service can be a separate unit of work. There are many options that you can take advantage of when using services. Although you do not have to implement these options, using them helps optimize application performance.
    When you define a service, you define which instances normally support that service. These are known as the PREFERRED instances. You can also define other instances to support a service if the service's preferred instance fails. These are known as AVAILABLE instances.
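    As an illustration only (the database, instance and service names below are made up, and the syntax shown is the pre-11.2 srvctl form), defining such services could look roughly like this:
    # High-write work pinned to instance 1, with instance 2 as a failover target
    srvctl add service -d orcl -s oltp_rw -r orcl1 -a orcl2
    srvctl start service -d orcl -s oltp_rw
    # Read-mostly work spread across both instances
    srvctl add service -d orcl -s report_ro -r orcl1,orcl2
    srvctl start service -d orcl -s report_ro
    Clients then connect through a TNS alias whose CONNECT_DATA names the service (SERVICE_NAME = oltp_rw or report_ro), so each component lands on the instances you intended.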
    Services are integrated with Resource Manager, which enables you to restrict the resources that are used by the users who connect with a service in an instance. The Resource Manager enables you to map a consumer group to a service so that users who connect with the service are members of the specified consumer group.
    When you use a service (11.1 or later) and execute a SQL statement in parallel, the parallel processes only run on the instances that offer the service with which you originally connected to the database. This is the default behavior. It does not affect other parallel operations such as parallel recovery or the processing of GV$ queries. To override this behavior, set a value for the PARALLEL_INSTANCE_GROUP initialization parameter.
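    A hedged sketch of that override (the group name 'REPORTING' is just an example; on 11.1 and later it can be a service name):
    -- Limit parallel execution for this session to instances in the named group/service
    ALTER SESSION SET parallel_instance_group = 'REPORTING';
    -- Or set it for one instance of the cluster
    ALTER SYSTEM SET parallel_instance_group = 'REPORTING' SCOPE=BOTH SID='orcl1';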
    There is much more to learn about how it works than about how to configure it.
    Understanding how it works is essential to configure the services properly.
    http://download.oracle.com/docs/cd/B28359_01/rac.111/b28254/hafeats.htm#CHDGEBED
    http://www.ardentperf.com/pub/schneider-services.pdf
    Any questions just ask.
    Regards,
    Levi Pereira

  • HP service guard and RAC or dataguard

    Must HP Serviceguard be used with RAC on an HP-UX server?
    How can HP Serviceguard be used with Data Guard?
    Will you please clarify?

    HP Serviceguard is a clustering/high-availability solution that protects you against hardware failure. Oracle Parallel Server, the precursor to RAC, required HP Serviceguard when implemented on the HP-UX platform. With RAC, Oracle bundles its own clusterware, and Serviceguard is no longer a mandatory prerequisite. If Serviceguard is installed, Oracle Clusterware will delegate some of the cluster responsibilities to it.
    Data Guard, on the other hand, is a disaster recovery solution that protects you against a site disaster or a whole data center outage. That said, it is not uncommon to see Data Guard set up between two servers within the same data center to protect against hardware failure or for reporting purposes.
    How would you use Data Guard with Serviceguard? On the production site, you can use Serviceguard to fail the database over between two nodes in case there is a hardware failure. You can do the same for the DR site.

  • RAC 10.2.0.5 ASM on Red Hat 5 64-bit keeps restarting itself

    I have a RAC database on hand (version 10.2.0.5 64-bit, OS Red Hat 5 64-bit) that went down in September. Checking the alert logs, alert_asm.log showed "IO Failed" errors, and alert_orcl.log reported "ORA-00204: error in reading (block 35, # blocks 1) of control file". After a colleague restarted it, it recovered, but not long afterwards the alert log again showed messages such as:
    Reconfiguration started (old inc 16, new inc 17)
    List of nodes:
    0
    Global Resource Directory frozen
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    After watching it for a while, the instance keeps going down and restarting by itself.
    Now that this has landed on me I want to sort it out, but I don't know where to start. Please advise. Thank you.
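    One possible first check (a minimal sketch, not from the original thread; it assumes the diskgroup DATA and ASMLib disk ORCL:VOL1 that appear in the logs below) is to look at the disk and diskgroup state from the ASM instances:
    -- Run against +ASM1 and +ASM2
    SELECT name, state, total_mb, free_mb
    FROM   v$asm_diskgroup;
    SELECT group_number, disk_number, name, path, header_status, mode_status, state
    FROM   v$asm_disk
    ORDER  BY group_number, disk_number;
    If VOL1 shows a bad header_status or disappears intermittently, the repeated "IO Failed" warnings would suggest a storage-path problem (multipath/HBA/ASMLib) rather than a database problem.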
    The alert_asm.log of the two nodes is as follows:
    asm1:
    NOTE: ASMB process exiting due to lack of ASM file activity for 12 seconds
    Wed Sep 12 13:13:25 CST 2012
    WARNING: IO Failed. au:43 diskname:ORCL:VOL1
         rq:0x2ab86feb6f88 buffer:0x627ed000 au_offset(bytes):720896 iosz:4096 operation:1
         status:2
    NOTE: cache initiating offline of disk 0 group 1
    WARNING: process 6933 initiating offline of disk 0.3915955288 (VOL1) with mask 0x3 in group 1
    WARNING: Disk 0 in group 1 in mode: 0x7,state: 0x2 will be taken offline
    NOTE: PST update: grp = 1, dsk = 0, mode = 0x6
    Wed Sep 12 13:13:25 CST 2012
    ERROR: too many offline disks in PST (grp 1)
    Wed Sep 12 13:13:25 CST 2012
    ERROR: PST-initiated MANDATORY DISMOUNT of group DATA
    Wed Sep 12 13:13:25 CST 2012
    WARNING: Disk 0 in group 1 in mode: 0x7,state: 0x2 was taken offline
    Wed Sep 12 13:13:25 CST 2012
    NOTE: halting all I/Os to diskgroup DATA
    NOTE: active pin found: 0x0x65faf748
    NOTE: active pin found: 0x0x65faf8a8
    Wed Sep 12 13:13:26 CST 2012
    NOTE: cache dismounting group 1/0xB8984CA8 (DATA)
    Wed Sep 12 13:13:27 CST 2012
    kjbdomdet send to node 1
    detach from dom 1, sending detach message to node 1
    Wed Sep 12 13:13:27 CST 2012
    Dirty detach reconfiguration started (old inc 16, new inc 16)
    List of nodes:
    0 1
    Global Resource Directory partially frozen for dirty detach
    * dirty detach - domain 1 invalid = TRUE
    116 GCS resources traversed, 0 cancelled
    6104 GCS resources on freelist, 6124 on array, 6124 allocated
    Dirty Detach Reconfiguration complete
    Wed Sep 12 13:13:27 CST 2012
    WARNING: dirty detached from domain 1
    Wed Sep 12 13:13:27 CST 2012
    NOTE: PST enabling heartbeating (grp 1)
    Wed Sep 12 13:13:27 CST 2012
    SUCCESS: diskgroup DATA was dismounted
    Wed Sep 12 13:13:27 CST 2012
    WARNING: PST-initiated MANDATORY DISMOUNT of group DATA not performed - group not mounted
    Wed Sep 12 13:13:27 CST 2012
    Errors in file /u01/app/oracle/admin/+ASM/bdump/+asm1_b001_7494.trc:
    ORA-15001: diskgroup "DATA" does not exist or is not mounted
    Wed Sep 12 13:13:28 CST 2012
    freeing rdom 1
    Received dirty detach msg from node 1 for dom 1
    Wed Sep 12 14:53:41 CST 2012
    Reconfiguration started (old inc 16, new inc 17)
    List of nodes:
    0
    Global Resource Directory frozen
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Sep 12 14:53:41 CST 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Post SMON to start 1st pass IR
    Wed Sep 12 14:53:41 CST 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Sep 12 14:53:41 CST 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    Reconfiguration complete
    Wed Sep 12 14:53:51 CST 2012
    Shutting down instance (abort)
    License high water mark = 4
    Instance terminated by USER, pid = 14969
    Wed Sep 12 14:56:38 CST 2012
    Starting ORACLE instance (normal)
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Interface type 1 eth1 10.185.3.0 configured from OCR for use as a cluster interconnect
    Interface type 1 eth0 10.185.3.0 configured from OCR for use as a public interface
    Picked latch-free SCN scheme 3
    Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/db_1/dbs/arch
    Autotune of undo retention is turned off.
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    ksdpec: called for event 13740 prior to event group initialization
    Starting up ORACLE RDBMS Version: 10.2.0.5.0.
    System parameters with non-default values:
    large_pool_size = 12582912
    instance_type = asm
    cluster_database = TRUE
    instance_number = 1
    remote_login_passwordfile= EXCLUSIVE
    background_dump_dest = /u01/app/oracle/admin/+ASM/bdump
    user_dump_dest = /u01/app/oracle/admin/+ASM/udump
    core_dump_dest = /u01/app/oracle/admin/+ASM/cdump
    asm_diskstring = ORCL:VOL*
    asm_diskgroups = DATA
    Cluster communication is configured to use the following interface(s) for this instance
    10.185.3.77
    Wed Sep 12 14:56:39 CST 2012
    cluster interconnect IPC version:Oracle UDP/IP (generic)
    IPC Vendor 1 proto 2
    PMON started with pid=2, OS id=7327
    LMON started with pid=5, OS id=7333
    PSP0 started with pid=4, OS id=7331
    DIAG started with pid=3, OS id=7329
    LMD0 started with pid=6, OS id=7335
    LMS0 started with pid=7, OS id=7337
    MMAN started with pid=8, OS id=7341
    DBW0 started with pid=9, OS id=7343
    LGWR started with pid=10, OS id=7345
    CKPT started with pid=11, OS id=7347
    SMON started with pid=12, OS id=7349
    RBAL started with pid=13, OS id=7351
    GMON started with pid=14, OS id=7353
    Wed Sep 12 14:56:40 CST 2012
    lmon registered with NM - instance id 1 (internal mem no 0)
    Wed Sep 12 14:56:40 CST 2012
    Reconfiguration started (old inc 0, new inc 2)
    ASM instance
    List of nodes:
    0 1
    Global Resource Directory frozen
    Communication channels reestablished
    * allocate domain 1, invalid = TRUE
    * domain 1 valid = 1 according to instance 1
    Wed Sep 12 14:56:40 CST 2012
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Sep 12 14:56:40 CST 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Wed Sep 12 14:56:40 CST 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Sep 12 14:56:40 CST 2012
    Submitted all GCS remote-cache requests
    Post SMON to start 1st pass IR
    Fix write in gcs resources
    Reconfiguration complete
    LCK0 started with pid=15, OS id=7360
    Wed Sep 12 14:56:41 CST 2012
    SQL> ALTER DISKGROUP ALL MOUNT
    Wed Sep 12 14:56:41 CST 2012
    NOTE: cache registered group DATA number=1 incarn=0xdd6857ac
    Wed Sep 12 14:56:41 CST 2012
    Loaded ASM Library - Generic Linux, version 2.0.4 (KABI_V2) library for asmlib interface
    Wed Sep 12 14:56:41 CST 2012
    NOTE: Hbeat: instance not first (grp 1)
    NOTE: cache opening disk 0 of grp 1: VOL1 label:VOL1
    Wed Sep 12 14:56:41 CST 2012
    NOTE: F1X0 found on disk 0 fcn 0.0
    NOTE: cache mounting (not first) group 1/0xDD6857AC (DATA)
    Wed Sep 12 14:56:41 CST 2012
    kjbdomatt send to node 1
    Wed Sep 12 14:56:42 CST 2012
    NOTE: attached to recovery domain 1
    Wed Sep 12 14:56:42 CST 2012
    NOTE: LGWR attempting to mount thread 2 for disk group 1
    NOTE: LGWR mounted thread 2 for disk group 1
    NOTE: opening chunk 2 at fcn 0.131392 ABA
    NOTE: seq=41 blk=1148
    Wed Sep 12 14:56:42 CST 2012
    NOTE: cache mounting group 1/0xDD6857AC (DATA) succeeded
    SUCCESS: diskgroup DATA was mounted
    Wed Sep 12 14:56:43 CST 2012
    NOTE: recovering COD for group 1/0xdd6857ac (DATA)
    SUCCESS: completed COD recovery for group 1/0xdd6857ac (DATA)
    Wed Sep 12 14:56:45 CST 2012
    Starting background process ASMB
    ASMB started with pid=17, OS id=7454
    Wed Sep 12 14:56:55 CST 2012
    NOTE: ASMB process exiting due to lack of ASM file activity for 12 seconds
    asm2:
    NOTE: ASMB process exiting due to lack of ASM file activity for 12 seconds
    Received dirty detach msg from node 0 for dom 1
    Wed Sep 12 13:13:30 CST 2012
    Dirty detach reconfiguration started (old inc 16, new inc 16)
    List of nodes:
    0 1
    Global Resource Directory partially frozen for dirty detach
    * dirty detach - domain 1 invalid = TRUE
    14 GCS resources traversed, 0 cancelled
    6104 GCS resources on freelist, 6124 on array, 6124 allocated
    99 GCS shadows traversed, 0 replayed
    Dirty Detach Reconfiguration complete
    Wed Sep 12 13:13:30 CST 2012
    NOTE: SMON starting instance recovery for group 1 (mounted)
    Wed Sep 12 13:13:30 CST 2012
    WARNING: IO Failed. au:0 diskname:ORCL:VOL1
         rq:0x2b88463fb990 buffer:0x2b884670ca00 au_offset(bytes):0 iosz:4096 operation:0
         status:2
    WARNING: IO Failed. au:0 diskname:ORCL:VOL1
         rq:0x2b88463fb990 buffer:0x2b884670ca00 au_offset(bytes):0 iosz:4096 operation:0
         status:2
    WARNING: IO Failed. au:4 diskname:ORCL:VOL1
         rq:0xe4372e0 buffer:0x6045f000 au_offset(bytes):0 iosz:4096 operation:0
         status:2
    WARNING: cache failed to read gn 1 fn 3 blk 0 count 1 from disk 0
    ERROR: cache failed to read fn=3 blk=0 from disk(s): 0
    ORA-15081: failed to submit an I/O operation to a disk
    NOTE: cache initiating offline of disk 0 group 1
    WARNING: process 6999 initiating offline of disk 0.3915955111 (VOL1) with mask 0x3 in group 1
    NOTE: PST update: grp = 1, dsk = 0, mode = 0x6
    Wed Sep 12 13:13:30 CST 2012
    ERROR: too many offline disks in PST (grp 1)
    Wed Sep 12 13:13:30 CST 2012
    ERROR: PST-initiated MANDATORY DISMOUNT of group DATA
    Wed Sep 12 13:13:30 CST 2012
    WARNING: Disk 0 in group 1 in mode: 0x7,state: 0x2 was taken offline
    Wed Sep 12 13:13:30 CST 2012
    NOTE: halting all I/Os to diskgroup DATA
    NOTE: active pin found: 0x0x65faf748
    Wed Sep 12 13:13:30 CST 2012
    Abort recovery for domain 1
    Wed Sep 12 13:13:30 CST 2012
    NOTE: cache dismounting group 1/0xB8984B57 (DATA)
    Wed Sep 12 13:13:31 CST 2012
    kjbdomdet send to node 0
    detach from dom 1, sending detach message to node 0
    Wed Sep 12 13:13:31 CST 2012
    Dirty detach reconfiguration started (old inc 16, new inc 16)
    List of nodes:
    0 1
    Global Resource Directory partially frozen for dirty detach
    * dirty detach - domain 1 invalid = TRUE
    99 GCS resources traversed, 0 cancelled
    6104 GCS resources on freelist, 6124 on array, 6124 allocated
    Dirty Detach Reconfiguration complete
    Wed Sep 12 13:13:31 CST 2012
    freeing rdom 1
    Wed Sep 12 13:13:31 CST 2012
    WARNING: dirty detached from domain 1
    Wed Sep 12 13:13:31 CST 2012
    SUCCESS: diskgroup DATA was dismounted
    Wed Sep 12 13:13:31 CST 2012
    WARNING: PST-initiated MANDATORY DISMOUNT of group DATA not performed - group not mounted
    Wed Sep 12 13:13:31 CST 2012
    Errors in file /u01/app/oracle/admin/+ASM/bdump/+asm2_b001_17967.trc:
    ORA-15001: diskgroup "DATA" does not exist or is not mounted
    Wed Sep 12 14:53:43 CST 2012
    Shutting down instance (abort)
    License high water mark = 4
    Instance terminated by USER, pid = 25013
    Wed Sep 12 14:56:02 CST 2012
    Starting ORACLE instance (normal)
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Interface type 1 eth1 10.185.3.0 configured from OCR for use as a cluster interconnect
    Interface type 1 eth0 10.185.3.0 configured from OCR for use as a public interface
    Picked latch-free SCN scheme 3
    Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/db_1/dbs/arch
    Autotune of undo retention is turned off.
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    ksdpec: called for event 13740 prior to event group initialization
    Starting up ORACLE RDBMS Version: 10.2.0.5.0.
    System parameters with non-default values:
    large_pool_size = 12582912
    instance_type = asm
    cluster_database = TRUE
    instance_number = 2
    remote_login_passwordfile= EXCLUSIVE
    background_dump_dest = /u01/app/oracle/admin/+ASM/bdump
    user_dump_dest = /u01/app/oracle/admin/+ASM/udump
    core_dump_dest = /u01/app/oracle/admin/+ASM/cdump
    asm_diskstring = ORCL:VOL*
    asm_diskgroups = DATA
    Cluster communication is configured to use the following interface(s) for this instance
    10.185.3.79
    Wed Sep 12 14:56:03 CST 2012
    cluster interconnect IPC version:Oracle UDP/IP (generic)
    IPC Vendor 1 proto 2
    LMON started with pid=5, OS id=7171
    PSP0 started with pid=4, OS id=7169
    DIAG started with pid=3, OS id=7167
    PMON started with pid=2, OS id=7165
    LMD0 started with pid=6, OS id=7173
    LMS0 started with pid=7, OS id=7175
    MMAN started with pid=8, OS id=7179
    DBW0 started with pid=9, OS id=7181
    LGWR started with pid=10, OS id=7183
    CKPT started with pid=11, OS id=7185
    SMON started with pid=12, OS id=7187
    RBAL started with pid=13, OS id=7189
    GMON started with pid=14, OS id=7192
    Wed Sep 12 14:56:04 CST 2012
    lmon registered with NM - instance id 2 (internal mem no 1)
    Wed Sep 12 14:56:04 CST 2012
    Reconfiguration started (old inc 0, new inc 1)
    ASM instance
    List of nodes:
    1
    Global Resource Directory frozen
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Sep 12 14:56:04 CST 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Post SMON to start 1st pass IR
    Wed Sep 12 14:56:04 CST 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Sep 12 14:56:04 CST 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    Reconfiguration complete
    LCK0 started with pid=15, OS id=7198
    Wed Sep 12 14:56:05 CST 2012
    SQL> ALTER DISKGROUP ALL MOUNT
    Wed Sep 12 14:56:05 CST 2012
    NOTE: cache registered group DATA number=1 incarn=0xdd684d18
    Wed Sep 12 14:56:05 CST 2012
    Loaded ASM Library - Generic Linux, version 2.0.4 (KABI_V2) library for asmlib interface
    Wed Sep 12 14:56:05 CST 2012
    NOTE: Hbeat: instance first (grp 1)
    Wed Sep 12 14:56:10 CST 2012
    NOTE: start heartbeating (grp 1)
    NOTE: cache opening disk 0 of grp 1: VOL1 label:VOL1
    Wed Sep 12 14:56:10 CST 2012
    NOTE: F1X0 found on disk 0 fcn 0.0
    NOTE: cache mounting (first) group 1/0xDD684D18 (DATA)
    * allocate domain 1, invalid = TRUE
    Wed Sep 12 14:56:10 CST 2012
    NOTE: attached to recovery domain 1
    Wed Sep 12 14:56:10 CST 2012
    NOTE: starting recovery of thread=1 ckpt=40.10147 group=1
    NOTE: starting recovery of thread=2 ckpt=40.1147 group=1
    NOTE: advancing ckpt for thread=2 ckpt=40.1147
    NOTE: advancing ckpt for thread=1 ckpt=40.10159
    NOTE: cache recovered group 1 to fcn 0.227416
    Wed Sep 12 14:56:10 CST 2012
    NOTE: LGWR attempting to mount thread 1 for disk group 1
    NOTE: LGWR mounted thread 1 for disk group 1
    NOTE: opening chunk 1 at fcn 0.227416 ABA
    NOTE: seq=41 blk=10160
    Wed Sep 12 14:56:10 CST 2012
    NOTE: cache mounting group 1/0xDD684D18 (DATA) succeeded
    SUCCESS: diskgroup DATA was mounted
    Wed Sep 12 14:56:13 CST 2012
    NOTE: recovering COD for group 1/0xdd684d18 (DATA)
    SUCCESS: completed COD recovery for group 1/0xdd684d18 (DATA)
    Wed Sep 12 14:56:16 CST 2012
    Starting background process ASMB
    ASMB started with pid=17, OS id=7358
    Wed Sep 12 14:56:26 CST 2012
    NOTE: ASMB process exiting due to lack of ASM file activity for 9 seconds
    Wed Sep 12 14:56:40 CST 2012
    Reconfiguration started (old inc 1, new inc 2)
    List of nodes:
    0 1
    Global Resource Directory frozen
    Communication channels reestablished
    * domain 1 valid = 1 according to instance 0
    Wed Sep 12 14:56:40 CST 2012
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Sep 12 14:56:40 CST 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Wed Sep 12 14:56:40 CST 2012
    LMS 0: 98 GCS shadows traversed, 0 replayed
    Wed Sep 12 14:56:40 CST 2012
    Submitted all GCS remote-cache requests
    Post SMON to start 1st pass IR
    Fix write in gcs resources
    Reconfiguration complete

    The alert_orcl.log of the two nodes is as follows:
    Instance 1:
    Wed Sep 12 13:13:25 CST 2012
    NOTE:Waiting for all pending writes to complete before de-registering: grpnum 1
    Wed Sep 12 13:13:25 CST 2012
    Errors in file /u01/app/oracle/admin/orcl/bdump/orcl1_lmon_32575.trc:
    ORA-00202: control file: '+DATA/orcl/controlfile/current.260.789223249'
    ORA-15078: ASM diskgroup was forcibly dismounted
    Wed Sep 12 13:13:25 CST 2012
    Errors in file /u01/app/oracle/admin/orcl/bdump/orcl1_lmon_32575.trc:
    ORA-00204: error in reading (block 35, # blocks 1) of control file
    ORA-00202: control file: '+DATA/orcl/controlfile/current.260.789223249'
    ORA-15078: ASM diskgroup was forcibly dismounted
    Wed Sep 12 13:13:25 CST 2012
    LMON: terminating instance due to error 204
    Wed Sep 12 13:13:26 CST 2012
    System state dump is made for local instance
    System State dumped to trace file /u01/app/oracle/admin/orcl/bdump/orcl1_diag_32571.trc
    Wed Sep 12 13:13:26 CST 2012
    Shutting down instance (abort)
    License high water mark = 30
    Wed Sep 12 13:13:27 CST 2012
    Trace dumping is performing id=[cdmp_20120912131326]
    Wed Sep 12 13:13:27 CST 2012
    Instance terminated by LMON, pid = 32575
    Wed Sep 12 13:13:31 CST 2012
    Instance terminated by USER, pid = 7518
    Wed Sep 12 14:56:46 CST 2012
    Starting ORACLE instance (normal)
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Interface type 1 eth1 10.185.3.0 configured from OCR for use as a cluster interconnect
    Interface type 1 eth0 10.185.3.0 configured from OCR for use as a public interface
    Picked latch-free SCN scheme 3
    Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/db_1/dbs/arch
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    ksdpec: called for event 13740 prior to event group initialization
    Starting up ORACLE RDBMS Version: 10.2.0.5.0.
    System parameters with non-default values:
    processes = 500
    sessions = 555
    sga_max_size = 10485760000
    __shared_pool_size = 1811939328
    __large_pool_size = 16777216
    __java_pool_size = 33554432
    __streams_pool_size = 0
    spfile = +DATA/orcl/spfileorcl.ora
    sga_target = 10485760000
    control_files = +DATA/orcl/controlfile/current.260.789223249
    db_block_size = 8192
    __db_cache_size = 8489271296
    db_keep_cache_size = 117440512
    compatible = 10.2.0.5.0
    db_file_multiblock_read_count= 16
    cluster_database = TRUE
    cluster_database_instances= 2
    db_create_file_dest = +DATA
    thread = 1
    instance_number = 1
    undo_management = AUTO
    undo_tablespace = UNDOTBS1
    remote_login_passwordfile= EXCLUSIVE
    db_domain =
    dispatchers =
    local_listener = (ADDRESS = (PROTOCOL = TCP)(HOST = 10.185.3.80)(PORT = 1521))
    remote_listener = LISTENERS_ORCL
    job_queue_processes = 10
    background_dump_dest = /u01/app/oracle/admin/orcl/bdump
    user_dump_dest = /u01/app/oracle/admin/orcl/udump
    core_dump_dest = /u01/app/oracle/admin/orcl/cdump
    audit_file_dest = /u01/app/oracle/admin/orcl/adump
    db_name = orcl
    open_cursors = 300
    pga_aggregate_target = 770703360
    Cluster communication is configured to use the following interface(s) for this instance
    10.185.3.77
    Wed Sep 12 14:56:46 CST 2012
    cluster interconnect IPC version:Oracle UDP/IP (generic)
    IPC Vendor 1 proto 2
    PMON started with pid=2, OS id=7468
    DIAG started with pid=3, OS id=7470
    PSP0 started with pid=4, OS id=7479
    LMON started with pid=5, OS id=7481
    LMD0 started with pid=6, OS id=7484
    LMS0 started with pid=7, OS id=7486
    LMS1 started with pid=8, OS id=7490
    MMAN started with pid=9, OS id=7494
    DBW0 started with pid=10, OS id=7496
    LGWR started with pid=11, OS id=7498
    CKPT started with pid=12, OS id=7500
    SMON started with pid=13, OS id=7502
    RECO started with pid=14, OS id=7504
    CJQ0 started with pid=15, OS id=7506
    MMON started with pid=16, OS id=7508
    MMNL started with pid=17, OS id=7510
    Wed Sep 12 14:56:48 CST 2012
    lmon registered with NM - instance id 1 (internal mem no 0)
    Wed Sep 12 14:56:49 CST 2012
    Reconfiguration started (old inc 0, new inc 4)
    List of nodes:
    0 1
    Global Resource Directory frozen
    * allocate domain 0, invalid = TRUE
    Communication channels reestablished
    * domain 0 valid according to instance 1
    * domain 0 valid = 1 according to instance 1
    Wed Sep 12 14:56:49 CST 2012
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Sep 12 14:56:49 CST 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Wed Sep 12 14:56:49 CST 2012
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Wed Sep 12 14:56:49 CST 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Sep 12 14:56:49 CST 2012
    LMS 1: 0 GCS shadows traversed, 0 replayed
    Wed Sep 12 14:56:49 CST 2012
    Submitted all GCS remote-cache requests
    Post SMON to start 1st pass IR
    Fix write in gcs resources
    Reconfiguration complete
    LCK0 started with pid=18, OS id=7546
    Wed Sep 12 14:56:50 CST 2012
    ALTER DATABASE MOUNT
    Wed Sep 12 14:56:50 CST 2012
    Starting background process ASMB
    ASMB started with pid=20, OS id=7552
    Starting background process RBAL
    RBAL started with pid=21, OS id=7556
    Loaded ASM Library - Generic Linux, version 2.0.4 (KABI_V2) library for asmlib interface
    Wed Sep 12 14:56:53 CST 2012
    SUCCESS: diskgroup DATA was mounted
    Wed Sep 12 14:56:57 CST 2012
    Setting recovery target incarnation to 2
    Wed Sep 12 14:56:57 CST 2012
    Successful mount of redo thread 1, with mount id 1321558804
    Wed Sep 12 14:56:57 CST 2012
    Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
    Completed: ALTER DATABASE MOUNT
    Wed Sep 12 14:56:58 CST 2012
    ALTER DATABASE OPEN
    Picked broadcast on commit scheme to generate SCNs
    Wed Sep 12 14:56:58 CST 2012
    Thread 1 opened at log sequence 354
    Current log# 2 seq# 354 mem# 0: +DATA/orcl/onlinelog/group_2.262.789223251
    Successful open of redo thread 1
    Wed Sep 12 14:56:58 CST 2012
    MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
    Wed Sep 12 14:56:58 CST 2012
    SMON: enabling cache recovery
    Wed Sep 12 14:56:59 CST 2012
    Successfully onlined Undo Tablespace 1.
    Wed Sep 12 14:56:59 CST 2012
    SMON: enabling tx recovery
    Wed Sep 12 14:56:59 CST 2012
    Database Characterset is AL32UTF8
    Opening with internal Resource Manager plan
    replication_dependency_tracking turned off (no async multimaster replication found)
    Starting background process QMNC
    QMNC started with pid=25, OS id=7618
    Wed Sep 12 14:57:00 CST 2012
    Completed: ALTER DATABASE OPEN
    Instance 2:
    Wed Sep 12 13:13:30 CST 2012
    SUCCESS: diskgroup DATA was dismounted
    SUCCESS: diskgroup DATA was dismounted
    Wed Sep 12 13:13:30 CST 2012
    Errors in file /u01/app/oracle/admin/orcl/bdump/orcl2_lmon_2673.trc:
    ORA-00202: control file: '+DATA/orcl/controlfile/current.260.789223249'
    ORA-15078: ASM diskgroup was forcibly dismounted
    Wed Sep 12 13:13:30 CST 2012
    Errors in file /u01/app/oracle/admin/orcl/bdump/orcl2_ckpt_2691.trc:
    ORA-00206: error in writing (block 4, # blocks 1) of control file
    ORA-00202: control file: '+DATA/orcl/controlfile/current.260.789223249'
    ORA-15078: ASM diskgroup was forcibly dismounted
    Wed Sep 12 13:13:30 CST 2012
    Errors in file /u01/app/oracle/admin/orcl/bdump/orcl2_ckpt_2691.trc:
    ORA-00221: error on write to control file
    ORA-00206: error in writing (block 4, # blocks 1) of control file
    ORA-00202: control file: '+DATA/orcl/controlfile/current.260.789223249'
    ORA-15078: ASM diskgroup was forcibly dismounted
    Wed Sep 12 13:13:30 CST 2012
    CKPT: terminating instance due to error 221
    Wed Sep 12 13:13:30 CST 2012
    Errors in file /u01/app/oracle/admin/orcl/bdump/orcl2_lmon_2673.trc:
    ORA-00204: error in reading (block 35, # blocks 1) of control file
    ORA-00202: control file: '+DATA/orcl/controlfile/current.260.789223249'
    ORA-15078: ASM diskgroup was forcibly dismounted
    Wed Sep 12 13:13:31 CST 2012
    Shutting down instance (abort)
    License high water mark = 29
    Wed Sep 12 13:13:32 CST 2012
    System state dump is made for local instance
    System State dumped to trace file /u01/app/oracle/admin/orcl/bdump/orcl2_diag_2669.trc
    Wed Sep 12 13:13:33 CST 2012
    Trace dumping is performing id=[cdmp_20120912131330]
    Wed Sep 12 13:13:36 CST 2012
    Instance terminated by CKPT, pid = 2691
    Wed Sep 12 13:13:41 CST 2012
    Instance terminated by USER, pid = 18002
    Wed Sep 12 14:56:16 CST 2012
    Starting ORACLE instance (normal)
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Interface type 1 eth1 10.185.3.0 configured from OCR for use as a cluster interconnect
    Interface type 1 eth0 10.185.3.0 configured from OCR for use as a public interface
    Picked latch-free SCN scheme 3
    Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/db_1/dbs/arch
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    ksdpec: called for event 13740 prior to event group initialization
    Starting up ORACLE RDBMS Version: 10.2.0.5.0.
    System parameters with non-default values:
    processes = 500
    sessions = 555
    sga_max_size = 10485760000
    __shared_pool_size = 1962934272
    __large_pool_size = 16777216
    __java_pool_size = 16777216
    __streams_pool_size = 0
    spfile = +DATA/orcl/spfileorcl.ora
    sga_target = 10485760000
    control_files = +DATA/orcl/controlfile/current.260.789223249
    db_block_size = 8192
    __db_cache_size = 8355053568
    db_keep_cache_size = 117440512
    compatible = 10.2.0.5.0
    db_file_multiblock_read_count= 16
    cluster_database = TRUE
    cluster_database_instances= 2
    db_create_file_dest = +DATA
    thread = 2
    instance_number = 2
    undo_management = AUTO
    undo_tablespace = UNDOTBS2
    remote_login_passwordfile= EXCLUSIVE
    db_domain =
    dispatchers =
    local_listener = (ADDRESS = (PROTOCOL = TCP)(HOST = 10.185.3.81)(PORT = 1521))
    remote_listener = LISTENERS_ORCL
    job_queue_processes = 10
    background_dump_dest = /u01/app/oracle/admin/orcl/bdump
    user_dump_dest = /u01/app/oracle/admin/orcl/udump
    core_dump_dest = /u01/app/oracle/admin/orcl/cdump
    audit_file_dest = /u01/app/oracle/admin/orcl/adump
    db_name = orcl
    open_cursors = 300
    pga_aggregate_target = 770703360
    Cluster communication is configured to use the following interface(s) for this instance
    10.185.3.79
    Wed Sep 12 14:56:16 CST 2012
    cluster interconnect IPC version:Oracle UDP/IP (generic)
    IPC Vendor 1 proto 2
    PMON started with pid=2, OS id=7366
    DIAG started with pid=3, OS id=7368
    PSP0 started with pid=4, OS id=7370
    LMON started with pid=5, OS id=7372
    LMD0 started with pid=6, OS id=7374
    LMS0 started with pid=7, OS id=7376
    LMS1 started with pid=8, OS id=7380
    MMAN started with pid=9, OS id=7384
    DBW0 started with pid=10, OS id=7386
    LGWR started with pid=11, OS id=7388
    CKPT started with pid=12, OS id=7390
    SMON started with pid=13, OS id=7392
    RECO started with pid=14, OS id=7394
    CJQ0 started with pid=15, OS id=7396
    MMON started with pid=16, OS id=7398
    MMNL started with pid=17, OS id=7401
    Wed Sep 12 14:56:19 CST 2012
    lmon registered with NM - instance id 2 (internal mem no 1)
    Wed Sep 12 14:56:19 CST 2012
    Reconfiguration started (old inc 0, new inc 2)
    List of nodes:
    1
    Global Resource Directory frozen
    * allocate domain 0, invalid = TRUE
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Sep 12 14:56:19 CST 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Wed Sep 12 14:56:19 CST 2012
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Post SMON to start 1st pass IR
    Wed Sep 12 14:56:19 CST 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Sep 12 14:56:19 CST 2012
    LMS 1: 0 GCS shadows traversed, 0 replayed
    Wed Sep 12 14:56:19 CST 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    Reconfiguration complete
    LCK0 started with pid=18, OS id=7418
    Wed Sep 12 14:56:20 CST 2012
    ALTER DATABASE MOUNT
    Wed Sep 12 14:56:20 CST 2012
    This instance was first to mount
    Wed Sep 12 14:56:20 CST 2012
    Starting background process ASMB
    ASMB started with pid=20, OS id=7424
    Starting background process RBAL
    RBAL started with pid=21, OS id=7428
    Loaded ASM Library - Generic Linux, version 2.0.4 (KABI_V2) library for asmlib interface
    Wed Sep 12 14:56:23 CST 2012
    SUCCESS: diskgroup DATA was mounted
    Wed Sep 12 14:56:27 CST 2012
    Setting recovery target incarnation to 2
    Wed Sep 12 14:56:27 CST 2012
    Successful mount of redo thread 2, with mount id 1321558804
    Wed Sep 12 14:56:27 CST 2012
    Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
    Completed: ALTER DATABASE MOUNT
    Wed Sep 12 14:56:27 CST 2012
    ALTER DATABASE OPEN
    This instance was first to open
    Wed Sep 12 14:56:27 CST 2012
    Beginning crash recovery of 2 threads
    parallel recovery started with 7 processes
    Wed Sep 12 14:56:28 CST 2012
    Started redo scan
    Wed Sep 12 14:56:28 CST 2012
    Completed redo scan
    717 redo blocks read, 25 data blocks need recovery
    Wed Sep 12 14:56:28 CST 2012
    Started redo application at
    Thread 1: logseq 353, block 70791
    Thread 2: logseq 401, block 88381
    Wed Sep 12 14:56:28 CST 2012
    Recovery of Online Redo Log: Thread 1 Group 1 Seq 353 Reading mem 0
    Mem# 0: +DATA/orcl/onlinelog/group_1.261.789223251
    Wed Sep 12 14:56:28 CST 2012
    Recovery of Online Redo Log: Thread 2 Group 3 Seq 401 Reading mem 0
    Mem# 0: +DATA/orcl/onlinelog/group_3.265.789223275
    Wed Sep 12 14:56:28 CST 2012
    Completed redo application
    Wed Sep 12 14:56:28 CST 2012
    Completed crash recovery at
    Thread 1: logseq 353, block 71419, scn 18394186
    Thread 2: logseq 401, block 88470, scn 18386240
    25 data blocks read, 25 data blocks written, 717 redo blocks read
    Wed Sep 12 14:56:29 CST 2012
    Thread 1 advanced to log sequence 354 (thread recovery)
    Picked broadcast on commit scheme to generate SCNs
    Wed Sep 12 14:56:29 CST 2012
    Thread 2 advanced to log sequence 402 (thread open)
    Thread 2 opened at log sequence 402
    Current log# 4 seq# 402 mem# 0: +DATA/orcl/onlinelog/group_4.266.789223275
    Successful open of redo thread 2
    Wed Sep 12 14:56:29 CST 2012
    MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
    Wed Sep 12 14:56:29 CST 2012
    SMON: enabling cache recovery
    Wed Sep 12 14:56:30 CST 2012
    Successfully onlined Undo Tablespace 5.
    Wed Sep 12 14:56:30 CST 2012
    SMON: enabling tx recovery
    Wed Sep 12 14:56:30 CST 2012
    Database Characterset is AL32UTF8
    Opening with internal Resource Manager plan
    replication_dependency_tracking turned off (no async multimaster replication found)
    Starting background process QMNC
    QMNC started with pid=30, OS id=7508
    Wed Sep 12 14:56:32 CST 2012
    Completed: ALTER DATABASE OPEN
    Wed Sep 12 14:56:49 CST 2012
    Reconfiguration started (old inc 2, new inc 4)
    List of nodes:
    0 1
    Global Resource Directory frozen
    Communication channels reestablished
    * domain 0 valid = 1 according to instance 0
    Wed Sep 12 14:56:49 CST 2012
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Sep 12 14:56:49 CST 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Wed Sep 12 14:56:49 CST 2012
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Wed Sep 12 14:56:49 CST 2012
    LMS 1: 2192 GCS shadows traversed, 1025 replayed
    Wed Sep 12 14:56:49 CST 2012
    LMS 0: 2135 GCS shadows traversed, 1082 replayed
    Wed Sep 12 14:56:49 CST 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    Reconfiguration complete

  • Rollback (Tx Recovery) and Roll Forward (Cache Recovery)

    Hi Guys,
    I have some doubts after going through these links:
    Difference between redo logs and undo tablespace, and the Oracle DBA Admin Guide (E25494-02).
    1) Redo logs contain committed and uncommitted data. Do they also contain before-image data, or only change vectors that point to the undo segments?
    The doc says: redo entries record data that you can use to reconstruct all changes made to the database, including the undo segments.
    But the forum link above says that they contain the change vectors only.
    2) How is a database crash (abort) recovered?
    I know it first rolls forward and then rolls backward. At which point does the undo tablespace come into the picture? Are the undo tablespace segments rebuilt from the redo logs? Please help with this; I get really confused.
    I tried to test a few scenarios but was not able to reach a conclusion:
    First test:
    1) Run a transaction (updated almost 29000 rows).
    2) Abort the database.
    3) Change the undo tablespace name in the init file to a dummy value.
    The database failed with the following error:
    SMON: enabling cache recovery
    Errors in file d:\install\pracledb\diag\rdbms\dba\dba\trace\dba_ora_5592.trc:
    ORA-30012: undo tablespace 'UNDOTBS11' does not exist or of wrong type
    Errors in file d:\install\pracledb\diag\rdbms\dba\dba\trace\dba_ora_5592.trc:
    ORA-30012: undo tablespace 'UNDOTBS11' does not exist or of wrong type
    Error 30012 happened during db open, shutting down database
    USER (ospid: 5592): terminating the instance due to error 30012
    Second test:
    1) Create a second undo tablespace.
    2) Update the same number of rows.
    3) Shut abort in another session.
    4) Modify the pfile with the new undotbs2 (no spfile).
    5) Start the database up; it started:
    Beginning crash recovery of 1 threads
    parallel recovery started with 2 processes
    Started redo scan
    Completed redo scan
    read 7971 KB redo, 1272 data blocks need recovery
    Started redo application at
    Thread 1: logseq 28, block 3
    Recovery of Online Redo Log: Thread 1 Group 1 Seq 28 Reading mem 0
    Mem# 0: D:\INSTALL\PRACLEDB\ORADATA\DBA\REDO01.LOG
    Completed redo application of 6.72MB
    Completed crash recovery at
    Thread 1: logseq 28, block 15945, scn 1487147
    1272 data blocks read, 1272 data blocks written, 7971 redo k-bytes read
    Thread 1 advanced to log sequence 29 (thread open)
    Thread 1 opened at log sequence 29
    Current log# 2 seq# 29 mem# 0: D:\INSTALL\PRACLEDB\ORADATA\DBA\REDO02.LOG
    Successful open of redo thread 1
    MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
    SMON: enabling cache recovery
    Successfully onlined Undo Tablespace 5.
    Verifying file header compatibility for 11g tablespace encryption..
    Verifying 11g file header compatibility for tablespace encryption completed
    Sun Oct 28 18:11:03 2012
    SMON: enabling tx recovery
    Database Characterset is WE8MSWIN1252
    No Resource Manager plan active
    SMON: Parallel transaction recovery tried
    Third test:
    1) Update the same number of rows.
    2) Change the UNDOTBS to the new undo tablespace.
    3) Rename the old datafile.
    4) The database startup failed with this error:
    ALTER DATABASE OPEN
    Errors in file d:\install\pracledb\diag\rdbms\dba\dba\trace\dba_dbw0_3960.trc:
    ORA-01157: cannot identify/lock data file 10 - see DBWR trace file
    ORA-01110: data file 10: 'D:\INSTALL\PRACLEDB\ORADATA\DBA\UNDOTBS2.DBF'
    ORA-27041: unable to open file
    OSD-04002: unable to open file
    O/S-Error: (OS 2) The system cannot find the file specified.
    Errors in file d:\install\pracledb\diag\rdbms\dba\dba\trace\dba_ora_4552.trc:
    ORA-01157: cannot identify/lock data file 10 - see DBWR trace file
    ORA-01110: data file 10: 'D:\INSTALL\PRACLEDB\ORADATA\DBA\UNDOTBS2.DBF'
    ORA-1157 signalled during: ALTER DATABASE OPEN...
    Sun Oct 28 18:20:04 2012
    Checker run found 1 new persistent data failures
    Please help me understand this.

    Sourabh85 wrote:
    Hi Hemant,
    Thank you very much, really very helpful. One more question:
    Why does Oracle do the work twice? First populate the undo blocks and then do the rollback? Why not recover the uncommitted transactions directly from the redo logs?
    It is not doubling the work. Oracle updates the undo blocks with the old data, and since undo blocks are just like the other data blocks, any change made to them is also logged as a change vector in the redo log files. This means that there would be an undo-block change vector recorded in the redo log file before the change vector of the data block that contains the data changed by your statement. This is what is meant when it is said that the redo log also contains the old images. The idea is to perform recovery, when needed, in the same sequence (based on the SCNs) using everything from the redo log files. This would include recovery of the undo datafiles as well, in case they are lost too!
    HTH
    Aman....
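    A small sketch (not from the thread) that makes this point visible: watch the session's 'redo size' statistic around an update and its rollback; the undo-block changes and the rollback itself both generate redo. The table and column names are placeholders.
    -- Redo generated by this session so far
    SELECT n.name, s.value
    FROM   v$mystat s JOIN v$statname n ON n.statistic# = s.statistic#
    WHERE  n.name = 'redo size';
    UPDATE test_tab SET val = val + 1 WHERE id <= 1000;  -- data-block AND undo-block change vectors go to redo
    SELECT n.name, s.value
    FROM   v$mystat s JOIN v$statname n ON n.statistic# = s.statistic#
    WHERE  n.name = 'redo size';                         -- value jumps
    ROLLBACK;                                            -- applying the undo is itself logged
    SELECT n.name, s.value
    FROM   v$mystat s JOIN v$statname n ON n.statistic# = s.statistic#
    WHERE  n.name = 'redo size';                         -- value increases again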
