Parallel recovery on DR on RAC

Hi all,
We have a 2-node DR standby for a 2-node RAC primary. I just wanted to know whether it is possible to do recovery from both standby nodes in parallel, i.e. standby node 1 applies the archives from primary node 1 and standby node 2 applies the archives from primary node 2.
If this is possible, how is it done, and if it is not, why not?
Regards
ID

Hello;
Have you looked at:
D.1.1 Setting Up a Multi-Instance Primary with a Single-Instance Standby
Data Guard Concepts and Administration 11g Release 2 (11.2) E25608-04
If yes, what would parallel recovery give you?
I have never tried this setup, but the main issue I would see is that the second standby instance would not be available
if something happened to the first standby instance.
By using only one instance for apply, it's much simpler to keep track of the redo coming from the two primary instances.
If you don't use the standby as a reader database, it does not have a lot of load on it. Generally, if you check the alert
log, there are several minutes between the "waiting for log..." messages.
Great question!
Best Regards
mseberg
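
For reference, single-instance redo apply on the standby is typically started and checked like this (a minimal sketch using the standard managed-recovery commands, run on the standby's apply instance):

-- Start real-time apply on one standby instance only
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

-- Confirm the single MRP0 process is applying redo arriving from both primary threads
SELECT process, thread#, sequence#, status
FROM   v$managed_standby
WHERE  process IN ('MRP0', 'RFS');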

Similar Messages

  • Should I specify the Parallel parameter for an non-RAC database?

    The Oracle documentation states the following:
    "The Oracle Database 10g Release 2 database controls and balances all parallel operations, based upon available resources, request priorities and actual system load." This suggests that Oracle can optimize the degree of parallelism automatically.
    Should I specify the Parallel parameter for a non-RAC database? Most of the transactions are small OLTP.

    What parallel parameter are you talking about?
    Generally, you may benefit from parallelization in a very similar manner on RAC as on a single-instance system. In both cases it is not sufficient to simply change the value of an initialization parameter to achieve parallelization of queries, DDL or DML.
    Kind regards
    Uwe
    http://uhesse.wordpress.com
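
    To illustrate Uwe's point that parallelism is requested per object or per statement rather than through an init parameter alone, a minimal sketch (table name and degree are placeholders):
    -- Give the table a default degree of parallelism ...
    ALTER TABLE big_fact PARALLEL 4;
    -- ... or request parallelism per statement with a hint
    SELECT /*+ PARALLEL(f, 4) */ COUNT(*)
    FROM   big_fact f;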

  • Number of parallel recovery processes in standby

    Hi,
    How do I find the number of parallel recovery processes that the standby is started with, i.e. MRP parallelism?

    user13179227 wrote:
    Hi,
    How do I find the number of parallel recovery processes that the standby is started with, i.e. MRP parallelism?

    What is "MRP parallel"?
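
    Assuming the question is about how many recovery slave processes managed recovery (MRP) started, a rough sketch of where to look on the standby; the 'PR%' slave-name pattern is an assumption about how parallel media recovery slaves are named, so treat it as illustrative:
    -- Background processes that look like the apply coordinator or its recovery slaves
    SELECT pname, pid, spid
    FROM   v$process
    WHERE  pname LIKE 'MRP%' OR pname LIKE 'PR%';
    -- The apply coordinator itself and its current state
    SELECT process, status, thread#, sequence#
    FROM   v$managed_standby
    WHERE  process LIKE 'MRP%';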

  • Multi-table INSERT with PARALLEL hint on 2 node RAC

    A multi-table INSERT statement with parallelism set to 5 works fine and spawns multiple parallel
    servers to execute it. It's just that it sticks to only one instance of a 2-node RAC. The code I
    used is given below.
    create table t1 ( x int );
    create table t2 ( x int );
    insert /*+ APPEND parallel(t1,5) parallel (t2,5) */
    when (dummy='X') then into t1(x) values (y)
    when (dummy='Y') then into t2(x) values (y)
    select dummy, 1 y from dual;
    I can see multiple sessions using the query below, but only on one instance. This happens not
    only for the above statement but also for statements where real tables (tables with more
    than 20 million records) are used.
    select p.server_name,ps.sid,ps.qcsid,ps.inst_id,ps.qcinst_id,degree,req_degree,
    sql.sql_text
    from Gv$px_process p, Gv$sql sql, Gv$session s , gv$px_session ps
    WHERE p.sid = s.sid
    and p.serial# = s.serial#
    and p.sid = ps.sid
    and p.serial# = ps.serial#
    and s.sql_address = sql.address
    and s.sql_hash_value = sql.hash_value
    and qcsid=945
    Won't parallel servers be spawned across instances for multi-table insert with parallelism on RAC?
    Thanks,
    Mahesh

    Please take a look at these 2 articles below
    http://christianbilien.wordpress.com/2007/09/12/strategies-for-rac-inter-instance-parallelized-queries-part-12/
    http://christianbilien.wordpress.com/2007/09/14/strategies-for-parallelized-queries-across-rac-instances-part-22/
    thanks
    http://swervedba.wordpress.com
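
    Building on the articles above, a quick sketch for checking where the PX slaves of a given coordinator actually ran, and which 10g parameters can restrict slave placement (:qcsid is the coordinator SID, as in the query in the question):
    -- How many PX slaves ran on each instance for this coordinator
    SELECT inst_id, COUNT(*) AS px_slaves
    FROM   gv$px_session
    WHERE  qcsid = :qcsid
    GROUP  BY inst_id;
    -- SQL*Plus: parameters that can pin slaves to an instance group
    SHOW PARAMETER instance_groups
    SHOW PARAMETER parallel_instance_group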

  • Transaction Recovery within an Oracle RAC environment

    Good evening everyone.
    I need some help with Oracle 11gR1 RAC transaction-level recovery issues. Here's the scenario.
    We have a three (3) node RAC cluster running Oracle 11g R1. The Web UI portion of the application connects through WLS 9.2.3 with connection pooling set. We also have a command-line/SQL*Developer component that uses a TNSNAMES file that allows for both failover and load balancing. Within either the UI or the command-line portion of the application, a user can run a process which invokes one or more PL/SQL packages. The exact location of the physical connection to the database depends on which server is chosen by either the connection pooling or the TNSNAMES.ORA load-balancing option.
    In the normal world, the process executes and all is good. The status of the execution of this process is updated by the packages once completed. The problem we are encountering is when an Oracle instance fails. Here's where I need some help. For application-level (transaction-level) recovery, the database instances are first recovered by the database background processes, and then users must determine which processes were in flight and either re-execute them (if restart processing is part of the process) or remove any changes and restart from scratch. Given that the database instance does not record which processes are "in flight", it is the responsibility of the application to perform its own recovery processing. Is this still true?
    If an instance fails, are "in flight" transactions/connections moved to other instances in the Grid/RAC environment? I don't think this is possible, but I don't remember if this was accomplished through a combination of application and database server features that provide feedback to each other. How is the underlying application notified of the change if such an issue occurs? I remember something similar to this in older versions of Oracle but I cannot remember what it was called.
    Any help or guidance would be great as our client is being extremely difficult in pressing this issue.
    Thanks in advance
    Stephen Karniotis
    Project Architect - Compuware
    [email protected]
    (248) 408-2918

    You have not indicated whether you are using TAF or FCF ... that would be the first place to start.
    My recommendation would be to let Oracle roll back the database changes and have the application resubmit the most recent work.
    If the application knows what it did since the last COMMIT, then you should be fine, with the possible exception of variables stored
    in packages. Depending on packages retaining values is an issue best solved with PRAGMA SERIALLY_REUSABLE ... in other words,
    by not using the retention feature.
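
    For the package-state point above, a minimal sketch of a serially reusable package spec (the package and variable names are made up for illustration):
    CREATE OR REPLACE PACKAGE app_state AS
      PRAGMA SERIALLY_REUSABLE;
      g_last_batch_id NUMBER;  -- state is reset between server calls rather than retained
    END app_state;
    /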

  • How to write Parallel DML in 2 node RAC Cluster

    Any ideas on how to write a DML that will run on a two node cluster in parallel? I would like to scale a DML statement within a RAC environment. Thanks

    Check out [this article|http://www.oracle.com/technology/pub/articles/conlon_rac.html].
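
    The article aside, the basic pattern for statement-level parallel DML looks roughly like this (a sketch; table names and the degree are placeholders):
    -- Parallel DML must be enabled at the session level first
    ALTER SESSION ENABLE PARALLEL DML;
    INSERT /*+ APPEND PARALLEL(sales_hist, 4) */ INTO sales_hist
    SELECT /*+ PARALLEL(s, 4) */ *
    FROM   sales s;
    COMMIT;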

  • Which difference parallel database and RAC database

    Hi Experts,
    I saw some documents about parallel databases and RAC databases.
    My boss confused these two products.
    What is the difference between a parallel database and a RAC database?
    Is a parallel database an "old RAC"?
    Thanks
    Jim

    RAC is the new name, with many additional features, for the old deprecated Oracle Parallel Server (OPS).
    OPS was available up to 8i; since 9i it has been called RAC.
    Regards

  • SMON: Parallel transaction recovery tried

    Hi,
    I got the following messages in a trace file in BDUMP. Around this time my process was always getting blocked while updating a table. Do these messages mean anything critical? Here is the trace:
    Dump file d:\oracle\admin\usmdb\bdump\usmdb_smon_1564.trc
    Sun Mar 07 03:17:27 2004
    ORACLE V9.2.0.1.0 - Production vsnsta=0
    vsnsql=12 vsnxtr=3
    Windows 2000 Version 5.0 Service Pack 3, CPU type 586
    Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production
    With the Partitioning, OLAP and Oracle Data Mining options
    JServer Release 9.2.0.1.0 - Production
    Windows 2000 Version 5.0 Service Pack 3, CPU type 586
    Instance name: usmdb
    Redo thread mounted by this instance: 1
    Oracle process number: 6
    Windows thread id: 1564, image: ORACLE.EXE
    *** 2004-03-07 03:17:27.000
    *** SESSION ID:(5.1) 2004-03-07 03:17:27.000
    *** 2004-03-07 03:17:27.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 03:58:28.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 04:38:38.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 04:39:23.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 05:20:16.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 06:01:25.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 06:42:27.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 07:23:28.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 08:04:33.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 08:45:31.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 09:26:36.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 10:07:42.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 10:48:46.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 11:29:41.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 12:10:56.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 12:51:52.000
    SMON: Parallel transaction recovery tried
    *** 2004-03-07 13:32:55.000
    SMON: Parallel transaction recovery tried
    Thanks,
    Tuhin

    That would occur because a large transaction (or transactions) had been killed or interrupted while the instance was running (or when the instance was shut down with ABORT). SMON takes over the job of "cleanup" and may use parallel recovery. You should be able to monitor the recovery in the V$FAST_START_TRANSACTIONS view.
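
    A sketch of the kind of queries that show that progress (V$FAST_START_SERVERS is the companion view listing the parallel rollback slaves):
    -- Dead transactions being rolled back and how far along they are
    SELECT usn, state, undoblocksdone, undoblockstotal
    FROM   v$fast_start_transactions;
    -- Parallel rollback slaves currently in use
    SELECT state, COUNT(*)
    FROM   v$fast_start_servers
    GROUP  BY state;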

  • Can not start Rac database

    Hi,
    Oracle RAC database 10.2.0.3/RedHat4 with 2 nodes.
    In the beginning we had an error ORA-600 [12803], so only SYS could connect to the database. I found note 1026653.6, which says that we need to create the AUDSES$ sequence, but before that we have to restart the database.
    When we stopped the database we got another ORA-600 and it is impossible to start it!
    Here is a copy of our alert file:
    Picked latch-free SCN scheme 2
    Autotune of undo retention is turned on.
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    ksdpec: called for event 13740 prior to event group initialization
    Starting up ORACLE RDBMS Version: 10.2.0.3.0.
    System parameters with non-default values:
    processes = 300
    sessions = 335
    sga_max_size = 524288000
    __shared_pool_size = 310378496
    __large_pool_size = 4194304
    __java_pool_size = 8388608
    __streams_pool_size = 8388608
    spfile = +DATA/osista/spfileosista.ora
    nls_language = FRENCH
    nls_territory = FRANCE
    nls_length_semantics = CHAR
    sga_target = 524288000
    control_files = DATA/osista/controlfile/control01.ctl, DATA/osista/controlfile/control02.ctl
    db_block_size = 8192
    __db_cache_size = 184549376
    compatible = 10.2.0.3.0
    log_archive_dest_1 = LOCATION=USE_DB_RECOVERY_FILE_DEST
    db_file_multiblock_read_count= 16
    cluster_database = TRUE
    cluster_database_instances= 2
    db_create_file_dest = +DATA
    db_recovery_file_dest = +FLASH
    db_recovery_file_dest_size= 68543315968
    thread = 2
    instance_number = 2
    undo_management = AUTO
    undo_tablespace = UNDOTBS2
    undo_retention = 29880
    remote_login_passwordfile= EXCLUSIVE
    db_domain =
    dispatchers = (PROTOCOL=TCP) (SERVICE=OSISTAXDB)
    local_listener = (address=(protocol=tcp)(port=1521)(host=132.147.160.243))
    remote_listener = LISTENERS_OSISTA
    job_queue_processes = 10
    background_dump_dest = /oracle/product/admin/OSISTA/bdump
    user_dump_dest = /oracle/product/admin/OSISTA/udump
    core_dump_dest = /oracle/product/admin/OSISTA/cdump
    audit_file_dest = /oracle/product/admin/OSISTA/adump
    db_name = OSISTA
    open_cursors = 300
    pga_aggregate_target = 104857600
    aq_tm_processes = 1
    Cluster communication is configured to use the following interface(s) for this instance
    172.16.0.2
    Wed Jun 13 11:04:30 2012
    cluster interconnect IPC version:Oracle UDP/IP (generic)
    IPC Vendor 1 proto 2
    PMON started with pid=2, OS id=8560
    DIAG started with pid=3, OS id=8562
    PSP0 started with pid=4, OS id=8566
    LMON started with pid=5, OS id=8570
    LMD0 started with pid=6, OS id=8574
    LMS0 started with pid=7, OS id=8576
    LMS1 started with pid=8, OS id=8580
    MMAN started with pid=9, OS id=8584
    DBW0 started with pid=10, OS id=8586
    LGWR started with pid=11, OS id=8588
    CKPT started with pid=12, OS id=8590
    SMON started with pid=13, OS id=8592
    RECO started with pid=14, OS id=8594
    CJQ0 started with pid=15, OS id=8596
    MMON started with pid=16, OS id=8598
    Wed Jun 13 11:04:31 2012
    starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
    MMNL started with pid=17, OS id=8600
    Wed Jun 13 11:04:31 2012
    starting up 1 shared server(s) ...
    Wed Jun 13 11:04:31 2012
    lmon registered with NM - instance id 2 (internal mem no 1)
    Wed Jun 13 11:04:31 2012
    Reconfiguration started (old inc 0, new inc 2)
    List of nodes:
    1
    Global Resource Directory frozen
    * allocate domain 0, invalid = TRUE
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Jun 13 11:04:31 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Wed Jun 13 11:04:31 2012
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Post SMON to start 1st pass IR
    Wed Jun 13 11:04:31 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:04:31 2012
    LMS 1: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:04:31 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    Reconfiguration complete
    LCK0 started with pid=20, OS id=8877
    Wed Jun 13 11:04:43 2012
    alter database mount
    Wed Jun 13 11:04:43 2012
    This instance was first to mount
    Wed Jun 13 11:04:43 2012
    Starting background process ASMB
    ASMB started with pid=25, OS id=10068
    Starting background process RBAL
    RBAL started with pid=26, OS id=10072
    Wed Jun 13 11:04:47 2012
    SUCCESS: diskgroup DATA was mounted
    Wed Jun 13 11:04:51 2012
    Setting recovery target incarnation to 1
    Wed Jun 13 11:04:52 2012
    Successful mount of redo thread 2, with mount id 3005749259
    Wed Jun 13 11:04:52 2012
    Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
    Completed: alter database mount
    Wed Jun 13 11:05:06 2012
    alter database open
    Wed Jun 13 11:05:06 2012
    This instance was first to open
    Wed Jun 13 11:05:06 2012
    Beginning crash recovery of 1 threads
    parallel recovery started with 2 processes
    Wed Jun 13 11:05:07 2012
    Started redo scan
    Wed Jun 13 11:05:07 2012
    Completed redo scan
    61 redo blocks read, 4 data blocks need recovery
    Wed Jun 13 11:05:07 2012
    Started redo application at
    Thread 1: logseq 7924, block 3, scn 506098125
    Wed Jun 13 11:05:07 2012
    Recovery of Online Redo Log: Thread 1 Group 2 Seq 7924 Reading mem 0
    Mem# 0: +DATA/osista/onlinelog/group_2.372.742132543
    Wed Jun 13 11:05:07 2012
    Completed redo application
    Wed Jun 13 11:05:07 2012
    Completed crash recovery at
    Thread 1: logseq 7924, block 64, scn 506118186
    4 data blocks read, 4 data blocks written, 61 redo blocks read
    Switch log for thread 1 to sequence 7925
    Picked broadcast on commit scheme to generate SCNs
    db_recovery_file_dest_size of 65368 MB is 0.61% used. This is a
    user-specified limit on the amount of space that will be used by this
    database for recovery-related files, and does not reflect the amount of
    space available in the underlying filesystem or ASM diskgroup.
    SUCCESS: diskgroup FLASH was mounted
    SUCCESS: diskgroup FLASH was dismounted
    Thread 1 advanced to log sequence 7926
    SUCCESS: diskgroup FLASH was mounted
    SUCCESS: diskgroup FLASH was dismounted
    Thread 1 advanced to log sequence 7927
    Wed Jun 13 11:05:11 2012
    LGWR: STARTING ARCH PROCESSES
    ARC0 started with pid=31, OS id=12747
    Wed Jun 13 11:05:11 2012
    ARC0: Archival started
    ARC1: Archival started
    LGWR: STARTING ARCH PROCESSES COMPLETE
    ARC1 started with pid=32, OS id=12749
    Wed Jun 13 11:05:12 2012
    Thread 2 opened at log sequence 7176
    Current log# 4 seq# 7176 mem# 0: +DATA/osista/onlinelog/group_4.289.742134597
    Wed Jun 13 11:05:12 2012
    ARC1: Becoming the 'no FAL' ARCH
    ARC1: Becoming the 'no SRL' ARCH
    Wed Jun 13 11:05:12 2012
    Successful open of redo thread 2
    Wed Jun 13 11:05:12 2012
    MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
    Wed Jun 13 11:05:12 2012
    ARC0: Becoming the heartbeat ARCH
    Wed Jun 13 11:05:12 2012
    SMON: enabling cache recovery
    Wed Jun 13 11:05:15 2012
    Successfully onlined Undo Tablespace 20.
    Wed Jun 13 11:05:15 2012
    SMON: enabling tx recovery
    Wed Jun 13 11:05:15 2012
    Database Characterset is AL32UTF8
    Wed Jun 13 11:05:16 2012
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_9174.trc:
    ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
    Wed Jun 13 11:05:16 2012
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_9174.trc:
    ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
    Error 600 happened during db open, shutting down database
    USER: terminating instance due to error 600
    Instance terminated by USER, pid = 9174
    ORA-1092 signalled during: alter database open...
    Wed Jun 13 11:06:16 2012
    Starting ORACLE instance (normal)
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Interface type 1 eth0 172.16.0.0 configured from OCR for use as a cluster interconnect
    Interface type 1 bond0 132.147.160.0 configured from OCR for use as a public interface
    Picked latch-free SCN scheme 2
    Autotune of undo retention is turned on.
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    ksdpec: called for event 13740 prior to event group initialization
    Starting up ORACLE RDBMS Version: 10.2.0.3.0.
    System parameters with non-default values:
    processes = 300
    sessions = 335
    sga_max_size = 524288000
    __shared_pool_size = 314572800
    __large_pool_size = 4194304
    __java_pool_size = 8388608
    __streams_pool_size = 8388608
    spfile = +DATA/osista/spfileosista.ora
    nls_language = FRENCH
    nls_territory = FRANCE
    nls_length_semantics = CHAR
    sga_target = 524288000
    control_files = DATA/osista/controlfile/control01.ctl, DATA/osista/controlfile/control02.ctl
    db_block_size = 8192
    __db_cache_size = 180355072
    compatible = 10.2.0.3.0
    log_archive_dest_1 = LOCATION=USE_DB_RECOVERY_FILE_DEST
    db_file_multiblock_read_count= 16
    cluster_database = TRUE
    cluster_database_instances= 2
    db_create_file_dest = +DATA
    db_recovery_file_dest = +FLASH
    db_recovery_file_dest_size= 68543315968
    thread = 2
    instance_number = 2
    undo_management = AUTO
    undo_tablespace = UNDOTBS2
    undo_retention = 29880
    remote_login_passwordfile= EXCLUSIVE
    db_domain =
    dispatchers = (PROTOCOL=TCP) (SERVICE=OSISTAXDB)
    local_listener = (address=(protocol=tcp)(port=1521)(host=132.147.160.243))
    remote_listener = LISTENERS_OSISTA
    job_queue_processes = 10
    background_dump_dest = /oracle/product/admin/OSISTA/bdump
    user_dump_dest = /oracle/product/admin/OSISTA/udump
    core_dump_dest = /oracle/product/admin/OSISTA/cdump
    audit_file_dest = /oracle/product/admin/OSISTA/adump
    db_name = OSISTA
    open_cursors = 300
    pga_aggregate_target = 104857600
    aq_tm_processes = 1
    Cluster communication is configured to use the following interface(s) for this instance
    172.16.0.2
    Wed Jun 13 11:06:16 2012
    cluster interconnect IPC version:Oracle UDP/IP (generic)
    IPC Vendor 1 proto 2
    PMON started with pid=2, OS id=18682
    DIAG started with pid=3, OS id=18684
    PSP0 started with pid=4, OS id=18695
    LMON started with pid=5, OS id=18704
    LMD0 started with pid=6, OS id=18721
    LMS0 started with pid=7, OS id=18735
    LMS1 started with pid=8, OS id=18753
    MMAN started with pid=9, OS id=18767
    DBW0 started with pid=10, OS id=18788
    LGWR started with pid=11, OS id=18796
    CKPT started with pid=12, OS id=18799
    SMON started with pid=13, OS id=18801
    RECO started with pid=14, OS id=18803
    CJQ0 started with pid=15, OS id=18805
    MMON started with pid=16, OS id=18807
    Wed Jun 13 11:06:17 2012
    starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
    MMNL started with pid=17, OS id=18809
    Wed Jun 13 11:06:17 2012
    starting up 1 shared server(s) ...
    Wed Jun 13 11:06:17 2012
    lmon registered with NM - instance id 2 (internal mem no 1)
    Wed Jun 13 11:06:17 2012
    Reconfiguration started (old inc 0, new inc 2)
    List of nodes:
    1
    Global Resource Directory frozen
    * allocate domain 0, invalid = TRUE
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Jun 13 11:06:18 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Wed Jun 13 11:06:18 2012
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Post SMON to start 1st pass IR
    Wed Jun 13 11:06:18 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:06:18 2012
    LMS 1: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:06:18 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    Reconfiguration complete
    LCK0 started with pid=20, OS id=18816
    Wed Jun 13 11:06:18 2012
    ALTER DATABASE MOUNT
    Wed Jun 13 11:06:18 2012
    This instance was first to mount
    Wed Jun 13 11:06:18 2012
    Reconfiguration started (old inc 2, new inc 4)
    List of nodes:
    0 1
    Wed Jun 13 11:06:18 2012
    Starting background process ASMB
    Wed Jun 13 11:06:18 2012
    Global Resource Directory frozen
    Communication channels reestablished
    ASMB started with pid=22, OS id=18913
    Starting background process RBAL
    * domain 0 valid = 0 according to instance 0
    Wed Jun 13 11:06:18 2012
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Jun 13 11:06:18 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Wed Jun 13 11:06:18 2012
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Wed Jun 13 11:06:18 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:06:18 2012
    LMS 1: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:06:18 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    RBAL started with pid=23, OS id=18917
    Reconfiguration complete
    Wed Jun 13 11:06:22 2012
    SUCCESS: diskgroup DATA was mounted
    Wed Jun 13 11:06:26 2012
    Setting recovery target incarnation to 1
    Wed Jun 13 11:06:26 2012
    Successful mount of redo thread 2, with mount id 3005703530
    Wed Jun 13 11:06:26 2012
    Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
    Completed: ALTER DATABASE MOUNT
    Wed Jun 13 11:06:27 2012
    ALTER DATABASE OPEN
    This instance was first to open
    Wed Jun 13 11:06:27 2012
    Beginning crash recovery of 1 threads
    parallel recovery started with 2 processes
    Wed Jun 13 11:06:27 2012
    Started redo scan
    Wed Jun 13 11:06:27 2012
    Completed redo scan
    61 redo blocks read, 4 data blocks need recovery
    Wed Jun 13 11:06:28 2012
    Started redo application at
    Thread 2: logseq 7176, block 3
    Wed Jun 13 11:06:28 2012
    Recovery of Online Redo Log: Thread 2 Group 4 Seq 7176 Reading mem 0
    Mem# 0: +DATA/osista/onlinelog/group_4.289.742134597
    Wed Jun 13 11:06:28 2012
    Completed redo application
    Wed Jun 13 11:06:28 2012
    Completed crash recovery at
    Thread 2: logseq 7176, block 64, scn 506138248
    4 data blocks read, 4 data blocks written, 61 redo blocks read
    Picked broadcast on commit scheme to generate SCNs
    Wed Jun 13 11:06:28 2012
    LGWR: STARTING ARCH PROCESSES
    ARC0 started with pid=28, OS id=19692
    Wed Jun 13 11:06:28 2012
    ARC0: Archival started
    ARC1: Archival started
    LGWR: STARTING ARCH PROCESSES COMPLETE
    ARC1 started with pid=29, OS id=19695
    Wed Jun 13 11:06:28 2012
    Thread 2 advanced to log sequence 7177
    Thread 2 opened at log sequence 7177
    Current log# 3 seq# 7177 mem# 0: +DATA/osista/onlinelog/group_3.291.742134597
    Successful open of redo thread 2
    Wed Jun 13 11:06:28 2012
    MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
    Wed Jun 13 11:06:28 2012
    ARC0: Becoming the 'no FAL' ARCH
    ARC0: Becoming the 'no SRL' ARCH
    Wed Jun 13 11:06:28 2012
    ARC1: Becoming the heartbeat ARCH
    Wed Jun 13 11:06:28 2012
    SMON: enabling cache recovery
    Wed Jun 13 11:06:28 2012
    db_recovery_file_dest_size of 65368 MB is 0.61% used. This is a
    user-specified limit on the amount of space that will be used by this
    database for recovery-related files, and does not reflect the amount of
    space available in the underlying filesystem or ASM diskgroup.
    SUCCESS: diskgroup FLASH was mounted
    SUCCESS: diskgroup FLASH was dismounted
    Wed Jun 13 11:06:31 2012
    Successfully onlined Undo Tablespace 20.
    Wed Jun 13 11:06:31 2012
    SMON: enabling tx recovery
    Wed Jun 13 11:06:31 2012
    Database Characterset is AL32UTF8
    Wed Jun 13 11:06:31 2012
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_19596.trc:
    ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
    Wed Jun 13 11:06:32 2012
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_19596.trc:
    ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
    Error 600 happened during db open, shutting down database
    USER: terminating instance due to error 600
    Instance terminated by USER, pid = 19596
    ORA-1092 signalled during: ALTER DATABASE OPEN...
    Wed Jun 13 11:11:35 2012
    Starting ORACLE instance (normal)
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Interface type 1 eth0 172.16.0.0 configured from OCR for use as a cluster interconnect
    Interface type 1 bond0 132.147.160.0 configured from OCR for use as a public interface
    Picked latch-free SCN scheme 2
    Autotune of undo retention is turned on.
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    ksdpec: called for event 13740 prior to event group initialization
    Starting up ORACLE RDBMS Version: 10.2.0.3.0.
    System parameters with non-default values:
    processes = 300
    sessions = 335
    sga_max_size = 524288000
    __shared_pool_size = 318767104
    __large_pool_size = 4194304
    __java_pool_size = 8388608
    __streams_pool_size = 8388608
    spfile = +DATA/osista/spfileosista.ora
    nls_language = FRENCH
    nls_territory = FRANCE
    nls_length_semantics = CHAR
    sga_target = 524288000
    control_files = DATA/osista/controlfile/control01.ctl, DATA/osista/controlfile/control02.ctl
    db_block_size = 8192
    __db_cache_size = 176160768
    compatible = 10.2.0.3.0
    log_archive_dest_1 = LOCATION=USE_DB_RECOVERY_FILE_DEST
    db_file_multiblock_read_count= 16
    cluster_database = TRUE
    cluster_database_instances= 2
    db_create_file_dest = +DATA
    db_recovery_file_dest = +FLASH
    db_recovery_file_dest_size= 68543315968
    thread = 2
    instance_number = 2
    undo_management = AUTO
    undo_tablespace = UNDOTBS2
    undo_retention = 29880
    remote_login_passwordfile= EXCLUSIVE
    db_domain =
    dispatchers = (PROTOCOL=TCP) (SERVICE=OSISTAXDB)
    local_listener = (address=(protocol=tcp)(port=1521)(host=132.147.160.243))
    remote_listener = LISTENERS_OSISTA
    job_queue_processes = 10
    background_dump_dest = /oracle/product/admin/OSISTA/bdump
    user_dump_dest = /oracle/product/admin/OSISTA/udump
    core_dump_dest = /oracle/product/admin/OSISTA/cdump
    audit_file_dest = /oracle/product/admin/OSISTA/adump
    db_name = OSISTA
    open_cursors = 300
    pga_aggregate_target = 104857600
    aq_tm_processes = 1
    Cluster communication is configured to use the following interface(s) for this instance
    172.16.0.2
    Wed Jun 13 11:11:35 2012
    cluster interconnect IPC version:Oracle UDP/IP (generic)
    IPC Vendor 1 proto 2
    PMON started with pid=2, OS id=16101
    DIAG started with pid=3, OS id=16103
    PSP0 started with pid=4, OS id=16105
    LMON started with pid=5, OS id=16107
    LMD0 started with pid=6, OS id=16110
    LMS0 started with pid=7, OS id=16112
    LMS1 started with pid=8, OS id=16116
    MMAN started with pid=9, OS id=16120
    DBW0 started with pid=10, OS id=16132
    LGWR started with pid=11, OS id=16148
    CKPT started with pid=12, OS id=16169
    SMON started with pid=13, OS id=16185
    RECO started with pid=14, OS id=16203
    CJQ0 started with pid=15, OS id=16219
    MMON started with pid=16, OS id=16227
    Wed Jun 13 11:11:36 2012
    starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
    MMNL started with pid=17, OS id=16229
    Wed Jun 13 11:11:36 2012
    starting up 1 shared server(s) ...
    Wed Jun 13 11:11:36 2012
    lmon registered with NM - instance id 2 (internal mem no 1)
    Wed Jun 13 11:11:36 2012
    Reconfiguration started (old inc 0, new inc 2)
    List of nodes:
    1
    Global Resource Directory frozen
    * allocate domain 0, invalid = TRUE
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Jun 13 11:11:36 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Wed Jun 13 11:11:36 2012
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Post SMON to start 1st pass IR
    Wed Jun 13 11:11:36 2012
    LMS 1: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:11:36 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Jun 13 11:11:36 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    Reconfiguration complete
    LCK0 started with pid=20, OS id=16235
    Wed Jun 13 11:11:37 2012
    ALTER DATABASE MOUNT
    Wed Jun 13 11:11:37 2012
    This instance was first to mount
    Wed Jun 13 11:11:37 2012
    Starting background process ASMB
    ASMB started with pid=22, OS id=16343
    Starting background process RBAL
    RBAL started with pid=23, OS id=16347
    Wed Jun 13 11:11:44 2012
    SUCCESS: diskgroup DATA was mounted
    Wed Jun 13 11:11:49 2012
    Setting recovery target incarnation to 1
    Wed Jun 13 11:11:49 2012
    Successful mount of redo thread 2, with mount id 3005745065
    Wed Jun 13 11:11:49 2012
    Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
    Completed: ALTER DATABASE MOUNT
    Wed Jun 13 11:22:25 2012
    alter database open
    This instance was first to open
    Wed Jun 13 11:22:26 2012
    Beginning crash recovery of 1 threads
    parallel recovery started with 2 processes
    Wed Jun 13 11:22:26 2012
    Started redo scan
    Wed Jun 13 11:22:26 2012
    Completed redo scan
    61 redo blocks read, 4 data blocks need recovery
    Wed Jun 13 11:22:26 2012
    Started redo application at
    Thread 1: logseq 7927, block 3
    Wed Jun 13 11:22:26 2012
    Recovery of Online Redo Log: Thread 1 Group 1 Seq 7927 Reading mem 0
    Mem# 0: +DATA/osista/onlinelog/group_1.283.742132543
    Wed Jun 13 11:22:26 2012
    Completed redo application
    Wed Jun 13 11:22:26 2012
    Completed crash recovery at
    Thread 1: logseq 7927, block 64, scn 506178382
    4 data blocks read, 4 data blocks written, 61 redo blocks read
    Switch log for thread 1 to sequence 7928
    Picked broadcast on commit scheme to generate SCNs
    Wed Jun 13 11:22:27 2012
    LGWR: STARTING ARCH PROCESSES
    ARC0 started with pid=31, OS id=13010
    Wed Jun 13 11:22:27 2012
    ARC0: Archival started
    ARC1: Archival started
    LGWR: STARTING ARCH PROCESSES COMPLETE
    ARC1 started with pid=32, OS id=13033
    Wed Jun 13 11:22:27 2012
    Thread 2 opened at log sequence 7178
    Current log# 4 seq# 7178 mem# 0: +DATA/osista/onlinelog/group_4.289.742134597
    Successful open of redo thread 2
    Wed Jun 13 11:22:27 2012
    MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
    Wed Jun 13 11:22:27 2012
    ARC0: Becoming the 'no FAL' ARCH
    ARC0: Becoming the 'no SRL' ARCH
    Wed Jun 13 11:22:27 2012
    ARC1: Becoming the heartbeat ARCH
    Wed Jun 13 11:22:27 2012
    SMON: enabling cache recovery
    Wed Jun 13 11:22:30 2012
    db_recovery_file_dest_size of 65368 MB is 0.61% used. This is a
    user-specified limit on the amount of space that will be used by this
    database for recovery-related files, and does not reflect the amount of
    space available in the underlying filesystem or ASM diskgroup.
    SUCCESS: diskgroup FLASH was mounted
    SUCCESS: diskgroup FLASH was dismounted
    Wed Jun 13 11:22:31 2012
    Successfully onlined Undo Tablespace 20.
    Wed Jun 13 11:22:31 2012
    SMON: enabling tx recovery
    Wed Jun 13 11:22:32 2012
    Database Characterset is AL32UTF8
    Wed Jun 13 11:22:32 2012
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_11751.trc:
    ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
    Wed Jun 13 11:22:33 2012
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_11751.trc:
    ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
    Error 600 happened during db open, shutting down database
    USER: terminating instance due to error 600
    Instance terminated by USER, pid = 11751
    ORA-1092 signalled during: alter database open...
    regards,

    Hi;
    Errors in file /oracle/product/admin/OSISTA/udump/osista2_ora_9174.trc:
    Did you check the trace file?
    ORA-00600: code d'erreur interne, arguments : [kokiasg1], [], [], [], [], [], [], []
    You are getting an Oracle internal error (ORA-600), which means you may need to work with the Oracle support team. Please see the note below; if it does not help, I suggest you log an SR:
    Troubleshoot an ORA-600 or ORA-7445 Error Using the Error Lookup Tool [ID 153788.1]
    For future RAC issues, please use Forum Home » High Availability » RAC, ASM & Clusterware Installation, which is the dedicated RAC forum.
    Regards
    Helios
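
    As a side note on the original ORA-600 [12803] symptom: since the note the poster refers to involves the SYS.AUDSES$ sequence, it may be worth confirming whether that sequence actually exists before going further (a hedged sketch only):
    -- Check whether the AUDSES$ sequence is present
    SELECT sequence_owner, sequence_name, last_number
    FROM   dba_sequences
    WHERE  sequence_name = 'AUDSES$';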

  • How unhealthy is this RAC?

    Here is the contents of v$system_event.
    Are these events
    EVENT     TOTAL_WAITS     TIME_WAITED     AVERAGE_WAIT
    enq: TX - index contention     40564851     214701526     5.29
    enq: TX - row lock contention     188846     12454614     65.95
    enq: SQ - contention     141971     70568     0.5
    cause for concern? The full dump follows:
    EVENT     TOTAL_WAITS     TIME_WAITED     AVERAGE_WAIT
    SQL*Net message to client     6015051449     607254     0
    SQL*Net message from client     6015048542     178177969892     29.62
    gcs remote message     2948555287     2633481757     0.89
    CGS wait for IPC msg     1517805027     634397     0
    db file sequential read     1500615188     816364485     0.54
    ges remote message     1247679701     1407300224     1.13
    gc cr multi block request     778432813     9913464     0.01
    gc current block 2-way     747852637     38030616     0.05
    db file scattered read     709428365     460939295     0.65
    rdbms ipc message     708473316     37650068633     53.14
    gc buffer busy acquire     671285134     1033621285     1.54
    PX Deq: reap credit     667784615     484449     0
    gcs log flush sync     592376026     171712257     0.29
    gc cr block 2-way     530861847     19607062     0.04
    library cache pin     437937120     15126237     0.03
    log file sync     379523272     797193932     2.1
    DIAG idle wait     359607166     2822108755     7.85
    log file parallel write     351225436     259263769     0.74
    LNS ASYNC end of log     350170653     1398410516     3.99
    LNS wait on SENDREQ     321652621     3209301     0.01
    PX qref latch     297396661     94308     0
    read by other session     289140108     148440270     0.51
    buffer deadlock     163505781     983055     0.01
    gc current block busy     119223825     467716658     3.92
    PX Deq: Table Q Normal     117332841     23574867     0.2
    ksxr poll remote instances     110480324     90333     0
    buffer busy waits     106938153     19933900     0.19
    direct path read     93429599     108427028     1.16
    SQL*Net more data from client     86471785     23026529     0.27
    gc current grant busy     84978157     28215346     0.33
    control file sequential read     82646297     23694583     0.29
    PX Deq Credit: send blkd     78641669     9569299     0.12
    latch: cache buffers chains     74218671     690277     0.01
    gc current grant 2-way     72557796     1920419     0.03
    library cache: mutex X     71106697     75993     0
    DFS lock handle     70722498     2716407     0.04
    gc cr grant 2-way     64558237     1633004     0.03
    PX Deq: Execution Msg     61706261     314222076     5.09
    gc cr block busy     61469863     119850802     1.95
    library cache lock     52428649     3773354     0.07
    PX Deq: Slave Session Stats     48040224     1886805     0.04
    db file parallel read     46415188     118467902     2.55
    IPC send completion sync     46250594     965101     0.02
    enq: TX - index contention     40564851     214701526     5.29
    PX Deq: Execute Reply     39689685     17243721     0.43
    gc buffer busy release     36976909     242714774     6.56
    SQL*Net more data to client     36627952     44167     0
    PX Deq: Msg Fragment     30501244     343397     0.01
    rdbms ipc reply     29725302     1352370     0.05
    RMAN backup & recovery I/O     28824547     37722662     1.31
    reliable message     27892263     3082134     0.11
    PX Idle Wait     27356097     4651277341     170.03
    ASM file metadata operation     25098749     8850323     0.35
    gc object scan     22705857     7485     0
    db file parallel write     19896252     52152606     2.62
    latch: ges resource hash list     19336183     427451     0.02
    enq: PS - contention     19143961     707455     0.04
    PX Deq: Parse Reply     19093356     895799     0.05
    gc cr disk read     17816846     448909     0.03
    ASM background timer     16101806     1383957874     85.95
    PX Deq: Slave Join Frag     16044789     233149     0.01
    wait for unread message on broadcast channel     15056320     1413552546     93.88
    cursor: mutex X     13435193     24140     0
    KJC: Wait for msg sends to complete     13268497     11397     0
    PX Deq: Signal ACK RSG     13214824     101941     0.01
    KSV master wait     13206286     4235645     0.32
    direct path read temp     12617694     5487608     0.43
    PX Deq Credit: need buffer     11675868     879967     0.08
    row cache lock     11536185     398216     0.03
    PX Deq Credit: Session Stats     9480862     78910     0.01
    SQL*Net message to dblink     9312894     1538     0
    SQL*Net message from dblink     9312894     6279631     0.67
    control file parallel write     7760982     11854435     1.53
    pmon timer     7558889     1412576090     186.88
    PX Deq: Join ACK     7548816     498931     0.07
    gc current multi block request     6035173     155898     0.03
    PING     5706961     1413230267     247.63
    enq: XR - database force logging     4662671     198813     0.04
    class slave wait     4561877     7097429006     1555.81
    Streams AQ: waiting for messages in the queue     4495828     1543411682     343.3
    SQL*Net more data from dblink     3696582     444575     0.12
    LGWR wait for redo copy     3655353     17840     0
    log file sequential read     3387305     6610414     1.95
    Log archive I/O     2990486     276772     0.09
    SQL*Net break/reset to client     2971976     2385935     0.8
    direct path write temp     2839390     2522114     0.89
    Space Manager: slave idle wait     2827526     1412987186     499.73
    latch: shared pool     2808517     298150     0.11
    latch: gc element     2421717     24688     0.01
    SGA: MMAN sleep for component shrink     2336447     2458094     1.05
    latch: enqueue hash chains     2279645     15435     0.01
    latch free     2089418     78732     0.04
    gc current split     2044784     1864009     0.91
    PX Deq: Signal ACK EXT     1976164     19263     0.01
    enq: FB - contention     1473469     61036     0.04
    cursor: pin S wait on X     1313129     1464789     1.12
    SQL*Net more data to dblink     1232891     986     0
    Streams AQ: RAC qmn coordinator idle wait     1211300     788     0
    enq: HW - contention     1175390     2077008     1.77
    latch: session allocation     1167768     21883     0.02
    Streams AQ: qmn coordinator idle wait     1144699     1412546634     1233.99
    Streams AQ: qmn slave idle wait     1031585     2227183681     2158.99
    lock deadlock retry     962937     4698     0
    enq: CF - contention     956154     609647     0.64
    latch: cache buffers lru chain     902764     37552     0.04
    latch: object queue header operation     817911     27717     0.03
    global enqueue expand wait     768633     654105     0.85
    Data file init write     756191     329758     0.44
    latch: gcs resource hash     647021     4147     0.01
    local write wait     603007     286191     0.47
    latch: row cache objects     599358     6453     0.01
    ges lmd/lmses to freeze in rcfg - mrcvr     481759     156345     0.32
    shared server idle wait     471190     1413238589     2999.3
    enq: RF - DG Broker Current File ID     469833     23209     0.05
    smon timer     432383     1411851085     3265.28
    SGA: allocation forcing component growth     363333     379008     1.04
    gc current retry     341104     1121252     3.29
    enq: RF - synch: DG Broker metadata     319143     588290     1.84
    enq: PG - contention     313659     14830     0.05
    enq: TT - contention     260134     11207172     43.08
    enq: KO - fast object checkpoint     236745     820808     3.47
    dispatcher timer     236637     1413242481     5972.2
    direct path write     231382     191008     0.83
    cursor: pin S     229011     394     0
    Streams AQ: waiting for time management or cleanup tasks     199981     1413148548     7066.41
    enq: TX - row lock contention     188846     12454614     65.95
    enq: TX - allocate ITL entry     153703     54252     0.35
    enq: SQ - contention     141971     70568     0.5
    ksdxexeother     141885     56     0
    latch: redo allocation     138912     1858     0.01
    recovery area: computing applied logs     126415     45925     0.36
    gc current block congested     126318     21768     0.17
    resmgr:cpu quantum     123074     151384     1.23
    jobq slave wait     120678     35574221     294.79
    Datapump dump file I/O     90431     9127     0.1
    ges inquiry response     89402     4041     0.05
    os thread startup     83809     222586     2.66
    cr request retry     80062     71896     0.9
    PX Deq: Table Q Sample     79665     133402     1.67
    gc cr block congested     79026     14792     0.19
    gc cr failure     77521     25019     0.32
    enq: WF - contention     73983     825388     11.16
    enq: TQ - TM contention     72871     3319     0.05
    lock escalate retry     65714     1574     0.02
    buffer exterminate     59775     64919     1.09
    fbar timer     47136     1413183353     29980.98
    log file switch completion     46911     452097     9.64
    recovery area: computing obsolete files     45699     8547     0.19
    enq: US - contention     40401     8805     0.22
    enq: TM - contention     39149     5435032     138.83
    library cache load lock     36311     382575     10.54
    kjbdrmcvtq lmon drm quiesce: ping completion     31668     47443     1.5
    enq: TD - KTF dump entries     31468     1424     0.05
    enq: RO - fast object reuse     28422     31772     1.12
    parallel recovery slave wait for change     27558     3163     0.11
    name-service call wait     23694     181533     7.66
    control file single write     22375     1624     0.07
    kksfbc child completion     21239     106926     5.03
    PX Deq: Table Q qref     19325     245     0.01
    enq: TX - contention     18805     113253     6.02
    latch: messages     17203     181     0.01
    enq: RS - prevent file delete     16913     1013     0.06
    enq: RS - prevent aging list update     15682     642     0.04
    PX Deq: Table Q Get Keys     14322     42935     3
    gc current grant congested     14292     2192     0.15
    cursor: mutex S     13285     8     0
    log file single write     13164     5371     0.41
    latch: undo global data     12649     178     0.01
    kksfbc research     11894     12680     1.07
    parallel recovery slave idle wait     11193     5872     0.52
    wait list latch free     11026     11794     1.07
    enq: CT - state     11001     417     0.04
    latch: checkpoint queue latch     10526     132     0.01
    enq: PE - contention     10506     1139     0.11
    ARCH wait on SENDREQ     9957     216480     21.74
    gc cr grant congested     9465     1584     0.17
    wait for scn ack     9377     3155     0.34
    enq: TA - contention     8856     324     0.04
    log buffer space     8777     89323     10.18
    enq: TK - Auto Task Serialization     8542     343     0.04
    enq: DR - contention     7842     323     0.04
    process diagnostic dump     7707     2072     0.27
    JOX Jit Process Sleep     7612     11286431     1482.72
    enq: TC - contention     7357     340817     46.33
    ges global resource directory to be frozen     7140     12299     1.72
    enq: CO - master slave det     6850     312     0.05
    enq: JS - job run lock - synchronize     6704     397     0.06
    gcs drm freeze in enter server mode     6542     40742     6.23
    enq: TS - contention     5959     89332     14.99
    ARCH wait for archivelog lock     5600     36     0.01
    PX Nsq: PQ load info query     5377     104798     19.49
    db file single write     5373     3452     0.64
    gc remaster     5315     50625     9.52
    latch: parallel query alloc buffer     4939     1906     0.39
    enq: TO - contention     4799     143     0.03
    enq: AF - task serialization     4395     161     0.04
    enq: PI - contention     4251     163     0.04
    ges2 LMON to wake up LMD - mrcvr     4210     28     0.01
    enq: DL - contention     3889     239     0.06
    kjctssqmg: quick message send wait     3408     22     0.01
    LNS wait on DETACH     3275     741     0.23
    ksfd: async disk IO     3274     1     0
    LNS wait on ATTACH     3273     51940     15.87
    ARCH wait on DETACH     3231     714     0.22
    ARCH wait on ATTACH     3226     43238     13.4
    enq: BR - file shrink     2787     116     0.04
    write complete waits     2631     1070     0.41
    enq: MD - contention     2596     67     0.03
    enq: WL - contention     2198     266518     121.25
    single-task message     2098     25896     12.34
    enq: OD - Serializing DDLs     2054     66     0.03
    resmgr:internal state change     2001     14735     7.36
    ARCH wait on c/f tx acquire 2     1751     175230     100.07
    enq: WR - contention     1636     69     0.04
    latch: cache buffer handles     1610     29     0.02
    statement suspended, wait error to be cleared     1497     22626     15.11
    Streams AQ: qmn coordinator waiting for slave to start     1214     678966     559.28
    enq: PD - contention     1182     33     0.03
    JS kgl get object wait     1096     4922     4.49
    undo segment extension     1070     10065     9.41
    PL/SQL lock timer     949     8739819     9209.5
    enq: AE - lock     937     28     0.03
    LGWR-LNS wait on channel     832     913     1.1
    ges DFS hang analysis phase 2 acks     816     495     0.61
    latch: redo writing     729     9     0.01
    gc quiesce     665     564     0.85
    enq: JS - queue lock     482     2111     4.38
    PX Deq: Test for credit     442     13     0.03
    enq: SS - contention     386     274     0.71
    recovery area: computing dropped files     328     1400     4.27
    recovery area: computing backed up files     328     496     1.51
    ksdxexeotherwait     279     10592     37.97
    log switch/archive     250     137570     550.28
    gc domain validation     223     39964     179.21
    auto-sqltune: wait graph update     195     96514     494.95
    wait for a undo record     170     1214     7.14
    parallel recovery coord send blocked     168     4     0.02
    enq: JS - wdw op     168     3741     22.27
    enq: KT - contention     165     5     0.03
    switch logfile command     156     6290     40.32
    gcs resource directory to be unfrozen     149     12839     86.17
    Data Guard Broker Wait     139     10906     78.46
    enq: SK - contention     129     4     0.03
    enq: JS - job recov lock     128     4     0.03
    gc cr block lost     125     6772     54.17
    virtual circuit wait     122     3     0.03
    ges LMON to get to FTDONE      100     187     1.87
    enq: CU - contention     80     242     3.02
    enq: JQ - contention     78     7     0.09
    cursor: pin X     73     83     1.14
    parallel recovery coord wait for reply     70     510     7.29
    PX Deq: Txn Recovery Start     67     3436     51.29
    SQL*Net break/reset to dblink     60     0     0
    gc current block lost     57     2869     50.33
    ges LMD suspend for testing event     51     709     13.89
    inactive session     46     4550     98.91
    recovery read     45     5     0.11
    JS kill job wait     41     3548     86.53
    enq: AS - service activation     40     1     0.03
    enq: TL - contention     35     2     0.05
    enq: UL - contention     34     524     15.42
    gcs enter server mode     33     1559     47.24
    wait for stopper event to be increased     30     218     7.27
    enq: TQ - DDL contention     24     300     12.52
    enq: MR - contention     21     1     0.03
    ges reconfiguration to start     20     54     2.72
    ges enter server mode     20     502     25.08
    buffer latch     18     1337     74.26
    enq: SR - contention     18     1     0.05
    Streams: RAC waiting for inter instance ack     18     3748     208.21
    enq: PR - contention     17     46     2.72
    kupp process wait     16     166     10.39
    checkpoint completed     15     3678     245.19
    PX Deque wait     14     68     4.87
    enq: BF - allocation contention     14     1     0.08
    enq: XL - fault extent map     14     51     3.66
    enq: FU - contention     14     17     1.18
    enq: TH - metric threshold evaluation     13     114     8.78
    enq: MW - contention     12     0     0.04
    enq: DD - contention     10     0     0.04
    process terminate     8     41     5.08
    ges cgs registration     8     151     18.9
    buffer resize     7     0     0
    ktm: instance recovery     7     698     99.66
    LNS wait on LGWR     6     0     0
    ASM background starting     6     381     63.43
    gc cr block 3-way     5     0     0.08
    enq: PV - syncstart     5     9     1.74
    Global transaction acquire instance locks     4     4     1.09
    enq: RS - read alert level     4     0     0.04
    LGWR wait on LNS     3     0     0
    gc recovery     3     540     179.85
    Streams AQ: enqueue blocked on low memory     3     544     181.2
    DBWR range invalidation sync     3     17     5.83
    enq: DM - contention     3     0     0.03
    enq: RF - FSFO Observer Heartbeat     3     0     0.03
    enq: JS - q mem clnup lck     3     0     0
    DG Broker configuration file I/O     2     0     0
    enq: RC - Result Cache: Contention     2     493     246.6
    enq: KM - contention     2     0     0.03
    enq: RT - contention     2     0     0.04
    instance state change     2     0     0.12
    kkdlgon     2     10     5.11
    enq: TQ - INI contention     2     292     146.07
    enq: JS - contention     2     0     0
    ARCH wait for netserver start     1     400     400.02
    log file switch (checkpoint incomplete)     1     3     3.44
    JS coord start wait     1     50     50.09
    ges lmd and pmon to attach     1     1     1.26
    wait for tmc2 to complete     1     3     3.03
    control file heartbeat     1     400     400.02
    enq: SW - contention     1     0     0.04
    enq: PW - perwarm status in dbw0     1     0     0.09
    enq: FS - contention     1     0     0.04
    enq: XR - quiesce database     1     0     0.04
    enq: RS - write alert level     1     0     0.02
    enq: CN - race with init     1     0     0.03
    enq: FE - contention     1     4     3.77
    Wait for shrink lock2     1     10     10.03
    enq: IA - contention     1     0     0.02
    enq: RF - atomicity     1     0     0.05
    enq: RF - synchronization: aifo master     1     0     0.02
    enq: RF - RF - Database Automatic Disable     1     0     0.06
    enq: WP - contention     1     0     0.02
    enq: TB - SQL Tuning Base Cache Load     1     0     0.05
    enq: JS - evt notify     1     0     0.02

    Text can be formatted by tagging the beginning and end of the block of text with the code tag. When cutting and pasting text such as execution plans, excerpts from AWR reports, etc., it will maintain spacing and formatting, and make for much easier reading.
    As to the content, well, dumping the contents of v$system_event is pretty close to useless.
    As to the first three events you listed, 'enq: TX - index contention', 'enq: TX - row lock contention', 'enq: SQ - contention', well, all of those are easily tunable.
    First, for 'enq: SQ - contention', check your sequences. Do you have any NOCACHE sequences? Or sequences with small caches?
    As for 'enq: TX - row lock contention', well that's fairly self-explanatory. You have multiple sessions trying to lock the same row in the same table at the same time.
    Last, 'enq: TX - index contention', that's non-row level contention on an index. For example, if you have a unique index, insert a row w/ column value 1, but don't commit, then try to insert that same value from another session.
    But, before you do any of that, I think you need to clearly understand where the bottlenecks are. Try taking some AWR snapshots, about 5 minutes apart, when you're having performance problems. Look at the AWR report for that 5 minute period. In particular, look at your Top 5 timed events.
    Hope that helps,
    -Mark
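
    For the NOCACHE / small-cache sequence check suggested above, a quick sketch (a CACHE_SIZE of 0 means NOCACHE; the threshold of 20 is an arbitrary assumption):
    SELECT sequence_owner, sequence_name, cache_size
    FROM   dba_sequences
    WHERE  cache_size < 20
    ORDER  BY cache_size;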

  • 2 Node RAC abnormal behaviour

    Platform: "HP-UX 11.23 64-bit"
    Database: "10.2.0.4 64-bit"
    RAC: 2 Node RAC setup
    Our RAC setup has been done properly and RAC is working fine with load balancing, i.e. clients are getting connections on both instances. BUT the issue I am facing with my RAC setup is high-availability testing. When I send a reboot signal to "Node-2" while "Node-1" is up, I observe, and receive complaints from clients, that they have lost their connection with the database, and also that no new connections are being allowed. When I look at the alert log of "Node-1" I see the following abnormal messages reported in it:
    List of nodes:
    0 1
    Global Resource Directory frozen
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Tue Aug 9 04:02:15 2011
    LMS 2: 0 GCS shadows cancelled, 0 closed
    Tue Aug 9 04:02:15 2011
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Tue Aug 9 04:02:15 2011
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Tue Aug 9 04:02:15 2011
    LMS 1: 1908 GCS shadows traversed, 1076 replayed
    Tue Aug 9 04:02:15 2011
    LMS 2: 1911 GCS shadows traversed, 1086 replayed
    Tue Aug 9 04:02:15 2011
    LMS 0: 1899 GCS shadows traversed, 1164 replayed
    Tue Aug 9 04:02:15 2011
    Submitted all GCS remote-cache requests
    Post SMON to start 1st pass IR
    Fix write in gcs resources
    Reconfiguration complete
    Tue Aug 9 04:02:16 2011
    ARCH shutting down
    ARC2: Archival stopped
    Tue Aug 9 04:02:21 2011
    Redo thread 2 internally enabled
    Tue Aug 9 04:02:35 2011
    Reconfiguration started (old inc 4, new inc 6)
    List of nodes:
    0
    Global Resource Directory frozen
    * dead instance detected - domain 0 invalid = TRUE
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Tue Aug 9 04:02:35 2011
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Tue Aug 9 04:02:35 2011
    LMS 2: 0 GCS shadows cancelled, 0 closed
    Tue Aug 9 04:02:35 2011
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Post SMON to start 1st pass IR
    Tue Aug 9 04:02:35 2011
    Instance recovery: looking for dead threads
    Tue Aug 9 04:02:35 2011
    Beginning instance recovery of 1 threads
    Tue Aug 9 04:02:35 2011
    LMS 1: 1908 GCS shadows traversed, 0 replayed
    Tue Aug 9 04:02:35 2011
    LMS 2: 1907 GCS shadows traversed, 0 replayed
    Tue Aug 9 04:02:35 2011
    LMS 0: 1899 GCS shadows traversed, 0 replayed
    Tue Aug 9 04:02:35 2011
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    Reconfiguration complete
    Tue Aug 9 04:02:37 2011
    parallel recovery started with 11 processes
    Tue Aug 9 04:02:37 2011
    Started redo application at
    Thread 2: logseq 6, block 2, scn 1837672332
    Tue Aug 9 04:02:37 2011
    Errors in file /u01/app/oracle/product/10.2.0/db/admin/BAF/bdump/baf1_smon_10253.trc:
    ORA-00600: internal error code, arguments: [kcratr2_onepass], [], [], [], [], [], [], []
    Tue Aug 9 04:02:38 2011
    Errors in file /u01/app/oracle/product/10.2.0/db/admin/BAF/bdump/baf1_smon_10253.trc:
    ORA-00600: internal error code, arguments: [kcratr2_onepass], [], [], [], [], [], [], []
    Tue Aug 9 04:02:38 2011
    Errors in file /u01/app/oracle/product/10.2.0/db/admin/BAF/bdump/baf1_smon_10253.trc:
    ORA-00600: internal error code, arguments: [kcratr2_onepass], [], [], [], [], [], [], []
    SMON: terminating instance due to error 600
    Tue Aug 9 04:02:38 2011
    Dump system state for local instance only
    System State dumped to trace file /u01/app/oracle/product/10.2.0/db/admin/BAF/bdump/baf1_diag_10229.trc
    Tue Aug 9 04:02:38 2011
    Instance terminated by SMON, pid = 10253
    Tue Aug 9 04:04:09 2011
    Starting ORACLE instance (normal)
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Interface type 1 lan3 192.168.1.0 configured from OCR for use as a cluster interconnect
    Interface type 1 lan2 172.20.21.0 configured from OCR for use as a public interface
    Picked latch-free SCN scheme 3
    Autotune of undo retention is turned off.
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    ksdpec: called for event 13740 prior to event group initialization
    Starting up ORACLE RDBMS Version: 10.2.0.4.0.
    System parameters with non-default values:
    processes = 300
    sessions = 335
    timed_statistics = TRUE
    Kindly help me to get rid of this issue. I am waiting for a quick and helpful response from the gurus on the forum. Thanks in advance.
    Regards,

    if above were really 100% correct, you would not be here posting about errors!
    Definitely, but these situations could become the cause of new bugs, couldn't they?
    I don't know what is real & what is unnecessary obfuscation.
    Which part of the thread did you not understand?
    It is not a good idea to have a subtraction sign/dash character as part of an object/host name; i.e. "Node-1".
    "Node-1" is not the real hostname; it is only used here for clarity. The actual hostnames are "sdupn101" for node 1 and "sdupn102" for node 2.
    ORA-00600/ORA-07445/ORA-03113 = Oracle bug => search on Metalink and/or call Oracle support.
    Newbie may be my status on this forum, but I do know the etiquette of using forums and support blogs. I searched but unfortunately did not find any matching solution.
    Anyway, I will update you once I find a solution, so that you can assist someone else in the future.

  • Oracle Rac - Targetting clients to particular nodes?

    We have a deployment case and want to find out the best practices.
    Currently there is a two node RAC setup.
    We have an application with two components: one side is a high-write, minor-read component, and the other side is mostly reads with minor writes. Each component uses multiple connections, and there is a logical separation of which tables each one writes to.
    The currently proposed solution is to target the high-write component at one node's instance, while distributing the other component, which does a lot of reading and some writing, across both RAC instances.
    The question is: what is the best paradigm for this? Is there any documentation on when to target components at single instances versus letting RAC do its own distribution? There are different user accounts for the components that do the high-volume writing versus the ones that mostly select.
    Thanks.

    Hi,
    The question is what is the best paradigm for this: Any documentation of when to target components to single instances versus letting RAC do its own distribution/etc. There are different user accounts for the components that do the high volume writing versus the selecting/etc.
    I'm assuming you're using version 10g or later; Oracle RAC 9i does not have this feature.
    You can direct connections to nodes that have characteristics in common workload using Oracle Services.
    To manage workloads or a group of applications, you can define services that you assign to a particular application or to a subset of an application's operations. You can also group work by type under services.
    Oracle recommends that all users who share a service have the same service level requirements. You can define specific characteristics for services and each service can be a separate unit of work. There are many options that you can take advantage of when using services. Although you do not have to implement these options, using them helps optimize application performance.
    When you define a service, you define which instances normally support that service. These are known as the PREFERRED instances. You can also define other instances to support a service if the service's preferred instance fails. These are known as AVAILABLE instances.
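    As an illustration only (the database, instance and service names below are made up, and the syntax shown is the pre-11.2 srvctl form), defining such services could look roughly like this:
    # High-write work pinned to instance 1, with instance 2 as a failover target
    srvctl add service -d orcl -s oltp_rw -r orcl1 -a orcl2
    srvctl start service -d orcl -s oltp_rw
    # Read-mostly work spread across both instances
    srvctl add service -d orcl -s report_ro -r orcl1,orcl2
    srvctl start service -d orcl -s report_ro
    Clients then connect through a TNS alias whose CONNECT_DATA names the service (SERVICE_NAME = oltp_rw or report_ro), so each component lands on the instances you intended.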
    Services are integrated with Resource Manager, which enables you to restrict the resources that are used by the users who connect with a service in an instance. The Resource Manager enables you to map a consumer group to a service so that users who connect with the service are members of the specified consumer group.
    When you use a service (11.1 or later) and execute a SQL statement in parallel, the parallel processes only run on the instances that offer the service with which you originally connected to the database. This is the default behavior. It does not affect other parallel operations such as parallel recovery or the processing of GV$ queries. To override this behavior, set a value for the PARALLEL_INSTANCE_GROUP initialization parameter.
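    A hedged sketch of that override (the group name 'REPORTING' is just an example; on 11.1 and later it can be a service name):
    -- Limit parallel execution for this session to instances in the named group/service
    ALTER SESSION SET parallel_instance_group = 'REPORTING';
    -- Or set it for one instance of the cluster
    ALTER SYSTEM SET parallel_instance_group = 'REPORTING' SCOPE=BOTH SID='orcl1';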
    There is much more to learn about how it works than about how to configure it.
    Understanding how it works is essential to configure the services properly.
    http://download.oracle.com/docs/cd/B28359_01/rac.111/b28254/hafeats.htm#CHDGEBED
    http://www.ardentperf.com/pub/schneider-services.pdf
    Any questions just ask.
    Regards,
    Levi Pereira

  • HP service guard and RAC or dataguard

    Must HP Serviceguard be used with RAC on an HP-UX server?
    How can HP Serviceguard be used with Data Guard?
    Will you please clarify?

    HP Serviceguard is a clustering/high-availability solution that protects you against hardware failure. Oracle Parallel Server, the precursor to RAC, required HP Serviceguard when implemented on the HP-UX platform. With RAC, Oracle bundles its own clusterware, and Serviceguard is no longer a mandatory prerequisite. If Serviceguard is installed, Oracle Clusterware will delegate some of the cluster responsibilities to it.
    Data Guard, on the other hand, is a disaster recovery solution that protects you against a site disaster or a whole data center outage. That said, it is not uncommon to see Data Guard set up between two servers within the same data center to protect against hardware failure or for reporting purposes.
    How would you use Data Guard with Serviceguard? On the production site, you can use Serviceguard to fail the database over between two nodes in case there is a hardware failure. You can do the same for the DR site.

  • RAC 10.2.0.5 ASM on Red Hat 5 64-bit keeps restarting itself

    I have a RAC database on hand (version 10.2.0.5 64-bit, OS Red Hat 5 64-bit) that went down in September. Checking the alert logs, alert_asm.log showed "IO Failed" errors, and alert_orcl.log reported "ORA-00204: error in reading (block 35, # blocks 1) of control file". After a colleague restarted it, it recovered, but not long afterwards the alert log again showed messages such as:
    Reconfiguration started (old inc 16, new inc 17)
    List of nodes:
    0
    Global Resource Directory frozen
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    After watching it for a while, the instance keeps going down and restarting by itself.
    Now that this has landed on me I want to sort it out, but I don't know where to start. Please advise. Thank you.
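    One possible first check (a minimal sketch, not from the original thread; it assumes the diskgroup DATA and ASMLib disk ORCL:VOL1 that appear in the logs below) is to look at the disk and diskgroup state from the ASM instances:
    -- Run against +ASM1 and +ASM2
    SELECT name, state, total_mb, free_mb
    FROM   v$asm_diskgroup;
    SELECT group_number, disk_number, name, path, header_status, mode_status, state
    FROM   v$asm_disk
    ORDER  BY group_number, disk_number;
    If VOL1 shows a bad header_status or disappears intermittently, the repeated "IO Failed" warnings would suggest a storage-path problem (multipath/HBA/ASMLib) rather than a database problem.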
    The alert_asm.log of the two nodes is as follows:
    asm1:
    NOTE: ASMB process exiting due to lack of ASM file activity for 12 seconds
    Wed Sep 12 13:13:25 CST 2012
    WARNING: IO Failed. au:43 diskname:ORCL:VOL1
         rq:0x2ab86feb6f88 buffer:0x627ed000 au_offset(bytes):720896 iosz:4096 operation:1
         status:2
    NOTE: cache initiating offline of disk 0 group 1
    WARNING: process 6933 initiating offline of disk 0.3915955288 (VOL1) with mask 0x3 in group 1
    WARNING: Disk 0 in group 1 in mode: 0x7,state: 0x2 will be taken offline
    NOTE: PST update: grp = 1, dsk = 0, mode = 0x6
    Wed Sep 12 13:13:25 CST 2012
    ERROR: too many offline disks in PST (grp 1)
    Wed Sep 12 13:13:25 CST 2012
    ERROR: PST-initiated MANDATORY DISMOUNT of group DATA
    Wed Sep 12 13:13:25 CST 2012
    WARNING: Disk 0 in group 1 in mode: 0x7,state: 0x2 was taken offline
    Wed Sep 12 13:13:25 CST 2012
    NOTE: halting all I/Os to diskgroup DATA
    NOTE: active pin found: 0x0x65faf748
    NOTE: active pin found: 0x0x65faf8a8
    Wed Sep 12 13:13:26 CST 2012
    NOTE: cache dismounting group 1/0xB8984CA8 (DATA)
    Wed Sep 12 13:13:27 CST 2012
    kjbdomdet send to node 1
    detach from dom 1, sending detach message to node 1
    Wed Sep 12 13:13:27 CST 2012
    Dirty detach reconfiguration started (old inc 16, new inc 16)
    List of nodes:
    0 1
    Global Resource Directory partially frozen for dirty detach
    * dirty detach - domain 1 invalid = TRUE
    116 GCS resources traversed, 0 cancelled
    6104 GCS resources on freelist, 6124 on array, 6124 allocated
    Dirty Detach Reconfiguration complete
    Wed Sep 12 13:13:27 CST 2012
    WARNING: dirty detached from domain 1
    Wed Sep 12 13:13:27 CST 2012
    NOTE: PST enabling heartbeating (grp 1)
    Wed Sep 12 13:13:27 CST 2012
    SUCCESS: diskgroup DATA was dismounted
    Wed Sep 12 13:13:27 CST 2012
    WARNING: PST-initiated MANDATORY DISMOUNT of group DATA not performed - group not mounted
    Wed Sep 12 13:13:27 CST 2012
    Errors in file /u01/app/oracle/admin/+ASM/bdump/+asm1_b001_7494.trc:
    ORA-15001: diskgroup "DATA" does not exist or is not mounted
    Wed Sep 12 13:13:28 CST 2012
    freeing rdom 1
    Received dirty detach msg from node 1 for dom 1
    Wed Sep 12 14:53:41 CST 2012
    Reconfiguration started (old inc 16, new inc 17)
    List of nodes:
    0
    Global Resource Directory frozen
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Sep 12 14:53:41 CST 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Post SMON to start 1st pass IR
    Wed Sep 12 14:53:41 CST 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Sep 12 14:53:41 CST 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    Reconfiguration complete
    Wed Sep 12 14:53:51 CST 2012
    Shutting down instance (abort)
    License high water mark = 4
    Instance terminated by USER, pid = 14969
    Wed Sep 12 14:56:38 CST 2012
    Starting ORACLE instance (normal)
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Interface type 1 eth1 10.185.3.0 configured from OCR for use as a cluster interconnect
    Interface type 1 eth0 10.185.3.0 configured from OCR for use as a public interface
    Picked latch-free SCN scheme 3
    Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/db_1/dbs/arch
    Autotune of undo retention is turned off.
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    ksdpec: called for event 13740 prior to event group initialization
    Starting up ORACLE RDBMS Version: 10.2.0.5.0.
    System parameters with non-default values:
    large_pool_size = 12582912
    instance_type = asm
    cluster_database = TRUE
    instance_number = 1
    remote_login_passwordfile= EXCLUSIVE
    background_dump_dest = /u01/app/oracle/admin/+ASM/bdump
    user_dump_dest = /u01/app/oracle/admin/+ASM/udump
    core_dump_dest = /u01/app/oracle/admin/+ASM/cdump
    asm_diskstring = ORCL:VOL*
    asm_diskgroups = DATA
    Cluster communication is configured to use the following interface(s) for this instance
    10.185.3.77
    Wed Sep 12 14:56:39 CST 2012
    cluster interconnect IPC version:Oracle UDP/IP (generic)
    IPC Vendor 1 proto 2
    PMON started with pid=2, OS id=7327
    LMON started with pid=5, OS id=7333
    PSP0 started with pid=4, OS id=7331
    DIAG started with pid=3, OS id=7329
    LMD0 started with pid=6, OS id=7335
    LMS0 started with pid=7, OS id=7337
    MMAN started with pid=8, OS id=7341
    DBW0 started with pid=9, OS id=7343
    LGWR started with pid=10, OS id=7345
    CKPT started with pid=11, OS id=7347
    SMON started with pid=12, OS id=7349
    RBAL started with pid=13, OS id=7351
    GMON started with pid=14, OS id=7353
    Wed Sep 12 14:56:40 CST 2012
    lmon registered with NM - instance id 1 (internal mem no 0)
    Wed Sep 12 14:56:40 CST 2012
    Reconfiguration started (old inc 0, new inc 2)
    ASM instance
    List of nodes:
    0 1
    Global Resource Directory frozen
    Communication channels reestablished
    * allocate domain 1, invalid = TRUE
    * domain 1 valid = 1 according to instance 1
    Wed Sep 12 14:56:40 CST 2012
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Sep 12 14:56:40 CST 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Wed Sep 12 14:56:40 CST 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Sep 12 14:56:40 CST 2012
    Submitted all GCS remote-cache requests
    Post SMON to start 1st pass IR
    Fix write in gcs resources
    Reconfiguration complete
    LCK0 started with pid=15, OS id=7360
    Wed Sep 12 14:56:41 CST 2012
    SQL> ALTER DISKGROUP ALL MOUNT
    Wed Sep 12 14:56:41 CST 2012
    NOTE: cache registered group DATA number=1 incarn=0xdd6857ac
    Wed Sep 12 14:56:41 CST 2012
    Loaded ASM Library - Generic Linux, version 2.0.4 (KABI_V2) library for asmlib interface
    Wed Sep 12 14:56:41 CST 2012
    NOTE: Hbeat: instance not first (grp 1)
    NOTE: cache opening disk 0 of grp 1: VOL1 label:VOL1
    Wed Sep 12 14:56:41 CST 2012
    NOTE: F1X0 found on disk 0 fcn 0.0
    NOTE: cache mounting (not first) group 1/0xDD6857AC (DATA)
    Wed Sep 12 14:56:41 CST 2012
    kjbdomatt send to node 1
    Wed Sep 12 14:56:42 CST 2012
    NOTE: attached to recovery domain 1
    Wed Sep 12 14:56:42 CST 2012
    NOTE: LGWR attempting to mount thread 2 for disk group 1
    NOTE: LGWR mounted thread 2 for disk group 1
    NOTE: opening chunk 2 at fcn 0.131392 ABA
    NOTE: seq=41 blk=1148
    Wed Sep 12 14:56:42 CST 2012
    NOTE: cache mounting group 1/0xDD6857AC (DATA) succeeded
    SUCCESS: diskgroup DATA was mounted
    Wed Sep 12 14:56:43 CST 2012
    NOTE: recovering COD for group 1/0xdd6857ac (DATA)
    SUCCESS: completed COD recovery for group 1/0xdd6857ac (DATA)
    Wed Sep 12 14:56:45 CST 2012
    Starting background process ASMB
    ASMB started with pid=17, OS id=7454
    Wed Sep 12 14:56:55 CST 2012
    NOTE: ASMB process exiting due to lack of ASM file activity for 12 seconds
    asm2:
    NOTE: ASMB process exiting due to lack of ASM file activity for 12 seconds
    Received dirty detach msg from node 0 for dom 1
    Wed Sep 12 13:13:30 CST 2012
    Dirty detach reconfiguration started (old inc 16, new inc 16)
    List of nodes:
    0 1
    Global Resource Directory partially frozen for dirty detach
    * dirty detach - domain 1 invalid = TRUE
    14 GCS resources traversed, 0 cancelled
    6104 GCS resources on freelist, 6124 on array, 6124 allocated
    99 GCS shadows traversed, 0 replayed
    Dirty Detach Reconfiguration complete
    Wed Sep 12 13:13:30 CST 2012
    NOTE: SMON starting instance recovery for group 1 (mounted)
    Wed Sep 12 13:13:30 CST 2012
    WARNING: IO Failed. au:0 diskname:ORCL:VOL1
         rq:0x2b88463fb990 buffer:0x2b884670ca00 au_offset(bytes):0 iosz:4096 operation:0
         status:2
    WARNING: IO Failed. au:0 diskname:ORCL:VOL1
         rq:0x2b88463fb990 buffer:0x2b884670ca00 au_offset(bytes):0 iosz:4096 operation:0
         status:2
    WARNING: IO Failed. au:4 diskname:ORCL:VOL1
         rq:0xe4372e0 buffer:0x6045f000 au_offset(bytes):0 iosz:4096 operation:0
         status:2
    WARNING: cache failed to read gn 1 fn 3 blk 0 count 1 from disk 0
    ERROR: cache failed to read fn=3 blk=0 from disk(s): 0
    ORA-15081: failed to submit an I/O operation to a disk
    NOTE: cache initiating offline of disk 0 group 1
    WARNING: process 6999 initiating offline of disk 0.3915955111 (VOL1) with mask 0x3 in group 1
    NOTE: PST update: grp = 1, dsk = 0, mode = 0x6
    Wed Sep 12 13:13:30 CST 2012
    ERROR: too many offline disks in PST (grp 1)
    Wed Sep 12 13:13:30 CST 2012
    ERROR: PST-initiated MANDATORY DISMOUNT of group DATA
    Wed Sep 12 13:13:30 CST 2012
    WARNING: Disk 0 in group 1 in mode: 0x7,state: 0x2 was taken offline
    Wed Sep 12 13:13:30 CST 2012
    NOTE: halting all I/Os to diskgroup DATA
    NOTE: active pin found: 0x0x65faf748
    Wed Sep 12 13:13:30 CST 2012
    Abort recovery for domain 1
    Wed Sep 12 13:13:30 CST 2012
    NOTE: cache dismounting group 1/0xB8984B57 (DATA)
    Wed Sep 12 13:13:31 CST 2012
    kjbdomdet send to node 0
    detach from dom 1, sending detach message to node 0
    Wed Sep 12 13:13:31 CST 2012
    Dirty detach reconfiguration started (old inc 16, new inc 16)
    List of nodes:
    0 1
    Global Resource Directory partially frozen for dirty detach
    * dirty detach - domain 1 invalid = TRUE
    99 GCS resources traversed, 0 cancelled
    6104 GCS resources on freelist, 6124 on array, 6124 allocated
    Dirty Detach Reconfiguration complete
    Wed Sep 12 13:13:31 CST 2012
    freeing rdom 1
    Wed Sep 12 13:13:31 CST 2012
    WARNING: dirty detached from domain 1
    Wed Sep 12 13:13:31 CST 2012
    SUCCESS: diskgroup DATA was dismounted
    Wed Sep 12 13:13:31 CST 2012
    WARNING: PST-initiated MANDATORY DISMOUNT of group DATA not performed - group not mounted
    Wed Sep 12 13:13:31 CST 2012
    Errors in file /u01/app/oracle/admin/+ASM/bdump/+asm2_b001_17967.trc:
    ORA-15001: diskgroup "DATA" does not exist or is not mounted
    Wed Sep 12 14:53:43 CST 2012
    Shutting down instance (abort)
    License high water mark = 4
    Instance terminated by USER, pid = 25013
    Wed Sep 12 14:56:02 CST 2012
    Starting ORACLE instance (normal)
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Interface type 1 eth1 10.185.3.0 configured from OCR for use as a cluster interconnect
    Interface type 1 eth0 10.185.3.0 configured from OCR for use as a public interface
    Picked latch-free SCN scheme 3
    Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/db_1/dbs/arch
    Autotune of undo retention is turned off.
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    ksdpec: called for event 13740 prior to event group initialization
    Starting up ORACLE RDBMS Version: 10.2.0.5.0.
    System parameters with non-default values:
    large_pool_size = 12582912
    instance_type = asm
    cluster_database = TRUE
    instance_number = 2
    remote_login_passwordfile= EXCLUSIVE
    background_dump_dest = /u01/app/oracle/admin/+ASM/bdump
    user_dump_dest = /u01/app/oracle/admin/+ASM/udump
    core_dump_dest = /u01/app/oracle/admin/+ASM/cdump
    asm_diskstring = ORCL:VOL*
    asm_diskgroups = DATA
    Cluster communication is configured to use the following interface(s) for this instance
    10.185.3.79
    Wed Sep 12 14:56:03 CST 2012
    cluster interconnect IPC version:Oracle UDP/IP (generic)
    IPC Vendor 1 proto 2
    LMON started with pid=5, OS id=7171
    PSP0 started with pid=4, OS id=7169
    DIAG started with pid=3, OS id=7167
    PMON started with pid=2, OS id=7165
    LMD0 started with pid=6, OS id=7173
    LMS0 started with pid=7, OS id=7175
    MMAN started with pid=8, OS id=7179
    DBW0 started with pid=9, OS id=7181
    LGWR started with pid=10, OS id=7183
    CKPT started with pid=11, OS id=7185
    SMON started with pid=12, OS id=7187
    RBAL started with pid=13, OS id=7189
    GMON started with pid=14, OS id=7192
    Wed Sep 12 14:56:04 CST 2012
    lmon registered with NM - instance id 2 (internal mem no 1)
    Wed Sep 12 14:56:04 CST 2012
    Reconfiguration started (old inc 0, new inc 1)
    ASM instance
    List of nodes:
    1
    Global Resource Directory frozen
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Sep 12 14:56:04 CST 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Post SMON to start 1st pass IR
    Wed Sep 12 14:56:04 CST 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Sep 12 14:56:04 CST 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    Reconfiguration complete
    LCK0 started with pid=15, OS id=7198
    Wed Sep 12 14:56:05 CST 2012
    SQL> ALTER DISKGROUP ALL MOUNT
    Wed Sep 12 14:56:05 CST 2012
    NOTE: cache registered group DATA number=1 incarn=0xdd684d18
    Wed Sep 12 14:56:05 CST 2012
    Loaded ASM Library - Generic Linux, version 2.0.4 (KABI_V2) library for asmlib interface
    Wed Sep 12 14:56:05 CST 2012
    NOTE: Hbeat: instance first (grp 1)
    Wed Sep 12 14:56:10 CST 2012
    NOTE: start heartbeating (grp 1)
    NOTE: cache opening disk 0 of grp 1: VOL1 label:VOL1
    Wed Sep 12 14:56:10 CST 2012
    NOTE: F1X0 found on disk 0 fcn 0.0
    NOTE: cache mounting (first) group 1/0xDD684D18 (DATA)
    * allocate domain 1, invalid = TRUE
    Wed Sep 12 14:56:10 CST 2012
    NOTE: attached to recovery domain 1
    Wed Sep 12 14:56:10 CST 2012
    NOTE: starting recovery of thread=1 ckpt=40.10147 group=1
    NOTE: starting recovery of thread=2 ckpt=40.1147 group=1
    NOTE: advancing ckpt for thread=2 ckpt=40.1147
    NOTE: advancing ckpt for thread=1 ckpt=40.10159
    NOTE: cache recovered group 1 to fcn 0.227416
    Wed Sep 12 14:56:10 CST 2012
    NOTE: LGWR attempting to mount thread 1 for disk group 1
    NOTE: LGWR mounted thread 1 for disk group 1
    NOTE: opening chunk 1 at fcn 0.227416 ABA
    NOTE: seq=41 blk=10160
    Wed Sep 12 14:56:10 CST 2012
    NOTE: cache mounting group 1/0xDD684D18 (DATA) succeeded
    SUCCESS: diskgroup DATA was mounted
    Wed Sep 12 14:56:13 CST 2012
    NOTE: recovering COD for group 1/0xdd684d18 (DATA)
    SUCCESS: completed COD recovery for group 1/0xdd684d18 (DATA)
    Wed Sep 12 14:56:16 CST 2012
    Starting background process ASMB
    ASMB started with pid=17, OS id=7358
    Wed Sep 12 14:56:26 CST 2012
    NOTE: ASMB process exiting due to lack of ASM file activity for 9 seconds
    Wed Sep 12 14:56:40 CST 2012
    Reconfiguration started (old inc 1, new inc 2)
    List of nodes:
    0 1
    Global Resource Directory frozen
    Communication channels reestablished
    * domain 1 valid = 1 according to instance 0
    Wed Sep 12 14:56:40 CST 2012
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Sep 12 14:56:40 CST 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Wed Sep 12 14:56:40 CST 2012
    LMS 0: 98 GCS shadows traversed, 0 replayed
    Wed Sep 12 14:56:40 CST 2012
    Submitted all GCS remote-cache requests
    Post SMON to start 1st pass IR
    Fix write in gcs resources
    Reconfiguration complete

    The alert_orcl.log of the two nodes is as follows:
    Instance 1:
    Wed Sep 12 13:13:25 CST 2012
    NOTE:Waiting for all pending writes to complete before de-registering: grpnum 1
    Wed Sep 12 13:13:25 CST 2012
    Errors in file /u01/app/oracle/admin/orcl/bdump/orcl1_lmon_32575.trc:
    ORA-00202: control file: '+DATA/orcl/controlfile/current.260.789223249'
    ORA-15078: ASM diskgroup was forcibly dismounted
    Wed Sep 12 13:13:25 CST 2012
    Errors in file /u01/app/oracle/admin/orcl/bdump/orcl1_lmon_32575.trc:
    ORA-00204: error in reading (block 35, # blocks 1) of control file
    ORA-00202: control file: '+DATA/orcl/controlfile/current.260.789223249'
    ORA-15078: ASM diskgroup was forcibly dismounted
    Wed Sep 12 13:13:25 CST 2012
    LMON: terminating instance due to error 204
    Wed Sep 12 13:13:26 CST 2012
    System state dump is made for local instance
    System State dumped to trace file /u01/app/oracle/admin/orcl/bdump/orcl1_diag_32571.trc
    Wed Sep 12 13:13:26 CST 2012
    Shutting down instance (abort)
    License high water mark = 30
    Wed Sep 12 13:13:27 CST 2012
    Trace dumping is performing id=[cdmp_20120912131326]
    Wed Sep 12 13:13:27 CST 2012
    Instance terminated by LMON, pid = 32575
    Wed Sep 12 13:13:31 CST 2012
    Instance terminated by USER, pid = 7518
    Wed Sep 12 14:56:46 CST 2012
    Starting ORACLE instance (normal)
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Interface type 1 eth1 10.185.3.0 configured from OCR for use as a cluster interconnect
    Interface type 1 eth0 10.185.3.0 configured from OCR for use as a public interface
    Picked latch-free SCN scheme 3
    Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/db_1/dbs/arch
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    ksdpec: called for event 13740 prior to event group initialization
    Starting up ORACLE RDBMS Version: 10.2.0.5.0.
    System parameters with non-default values:
    processes = 500
    sessions = 555
    sga_max_size = 10485760000
    __shared_pool_size = 1811939328
    __large_pool_size = 16777216
    __java_pool_size = 33554432
    __streams_pool_size = 0
    spfile = +DATA/orcl/spfileorcl.ora
    sga_target = 10485760000
    control_files = +DATA/orcl/controlfile/current.260.789223249
    db_block_size = 8192
    __db_cache_size = 8489271296
    db_keep_cache_size = 117440512
    compatible = 10.2.0.5.0
    db_file_multiblock_read_count= 16
    cluster_database = TRUE
    cluster_database_instances= 2
    db_create_file_dest = +DATA
    thread = 1
    instance_number = 1
    undo_management = AUTO
    undo_tablespace = UNDOTBS1
    remote_login_passwordfile= EXCLUSIVE
    db_domain =
    dispatchers =
    local_listener = (ADDRESS = (PROTOCOL = TCP)(HOST = 10.185.3.80)(PORT = 1521))
    remote_listener = LISTENERS_ORCL
    job_queue_processes = 10
    background_dump_dest = /u01/app/oracle/admin/orcl/bdump
    user_dump_dest = /u01/app/oracle/admin/orcl/udump
    core_dump_dest = /u01/app/oracle/admin/orcl/cdump
    audit_file_dest = /u01/app/oracle/admin/orcl/adump
    db_name = orcl
    open_cursors = 300
    pga_aggregate_target = 770703360
    Cluster communication is configured to use the following interface(s) for this instance
    10.185.3.77
    Wed Sep 12 14:56:46 CST 2012
    cluster interconnect IPC version:Oracle UDP/IP (generic)
    IPC Vendor 1 proto 2
    PMON started with pid=2, OS id=7468
    DIAG started with pid=3, OS id=7470
    PSP0 started with pid=4, OS id=7479
    LMON started with pid=5, OS id=7481
    LMD0 started with pid=6, OS id=7484
    LMS0 started with pid=7, OS id=7486
    LMS1 started with pid=8, OS id=7490
    MMAN started with pid=9, OS id=7494
    DBW0 started with pid=10, OS id=7496
    LGWR started with pid=11, OS id=7498
    CKPT started with pid=12, OS id=7500
    SMON started with pid=13, OS id=7502
    RECO started with pid=14, OS id=7504
    CJQ0 started with pid=15, OS id=7506
    MMON started with pid=16, OS id=7508
    MMNL started with pid=17, OS id=7510
    Wed Sep 12 14:56:48 CST 2012
    lmon registered with NM - instance id 1 (internal mem no 0)
    Wed Sep 12 14:56:49 CST 2012
    Reconfiguration started (old inc 0, new inc 4)
    List of nodes:
    0 1
    Global Resource Directory frozen
    * allocate domain 0, invalid = TRUE
    Communication channels reestablished
    * domain 0 valid according to instance 1
    * domain 0 valid = 1 according to instance 1
    Wed Sep 12 14:56:49 CST 2012
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Sep 12 14:56:49 CST 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Wed Sep 12 14:56:49 CST 2012
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Wed Sep 12 14:56:49 CST 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Sep 12 14:56:49 CST 2012
    LMS 1: 0 GCS shadows traversed, 0 replayed
    Wed Sep 12 14:56:49 CST 2012
    Submitted all GCS remote-cache requests
    Post SMON to start 1st pass IR
    Fix write in gcs resources
    Reconfiguration complete
    LCK0 started with pid=18, OS id=7546
    Wed Sep 12 14:56:50 CST 2012
    ALTER DATABASE MOUNT
    Wed Sep 12 14:56:50 CST 2012
    Starting background process ASMB
    ASMB started with pid=20, OS id=7552
    Starting background process RBAL
    RBAL started with pid=21, OS id=7556
    Loaded ASM Library - Generic Linux, version 2.0.4 (KABI_V2) library for asmlib interface
    Wed Sep 12 14:56:53 CST 2012
    SUCCESS: diskgroup DATA was mounted
    Wed Sep 12 14:56:57 CST 2012
    Setting recovery target incarnation to 2
    Wed Sep 12 14:56:57 CST 2012
    Successful mount of redo thread 1, with mount id 1321558804
    Wed Sep 12 14:56:57 CST 2012
    Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
    Completed: ALTER DATABASE MOUNT
    Wed Sep 12 14:56:58 CST 2012
    ALTER DATABASE OPEN
    Picked broadcast on commit scheme to generate SCNs
    Wed Sep 12 14:56:58 CST 2012
    Thread 1 opened at log sequence 354
    Current log# 2 seq# 354 mem# 0: +DATA/orcl/onlinelog/group_2.262.789223251
    Successful open of redo thread 1
    Wed Sep 12 14:56:58 CST 2012
    MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
    Wed Sep 12 14:56:58 CST 2012
    SMON: enabling cache recovery
    Wed Sep 12 14:56:59 CST 2012
    Successfully onlined Undo Tablespace 1.
    Wed Sep 12 14:56:59 CST 2012
    SMON: enabling tx recovery
    Wed Sep 12 14:56:59 CST 2012
    Database Characterset is AL32UTF8
    Opening with internal Resource Manager plan
    replication_dependency_tracking turned off (no async multimaster replication found)
    Starting background process QMNC
    QMNC started with pid=25, OS id=7618
    Wed Sep 12 14:57:00 CST 2012
    Completed: ALTER DATABASE OPEN
    Instance 2:
    Wed Sep 12 13:13:30 CST 2012
    SUCCESS: diskgroup DATA was dismounted
    SUCCESS: diskgroup DATA was dismounted
    Wed Sep 12 13:13:30 CST 2012
    Errors in file /u01/app/oracle/admin/orcl/bdump/orcl2_lmon_2673.trc:
    ORA-00202: control file: '+DATA/orcl/controlfile/current.260.789223249'
    ORA-15078: ASM diskgroup was forcibly dismounted
    Wed Sep 12 13:13:30 CST 2012
    Errors in file /u01/app/oracle/admin/orcl/bdump/orcl2_ckpt_2691.trc:
    ORA-00206: error in writing (block 4, # blocks 1) of control file
    ORA-00202: control file: '+DATA/orcl/controlfile/current.260.789223249'
    ORA-15078: ASM diskgroup was forcibly dismounted
    Wed Sep 12 13:13:30 CST 2012
    Errors in file /u01/app/oracle/admin/orcl/bdump/orcl2_ckpt_2691.trc:
    ORA-00221: error on write to control file
    ORA-00206: error in writing (block 4, # blocks 1) of control file
    ORA-00202: control file: '+DATA/orcl/controlfile/current.260.789223249'
    ORA-15078: ASM diskgroup was forcibly dismounted
    Wed Sep 12 13:13:30 CST 2012
    CKPT: terminating instance due to error 221
    Wed Sep 12 13:13:30 CST 2012
    Errors in file /u01/app/oracle/admin/orcl/bdump/orcl2_lmon_2673.trc:
    ORA-00204: error in reading (block 35, # blocks 1) of control file
    ORA-00202: control file: '+DATA/orcl/controlfile/current.260.789223249'
    ORA-15078: ASM diskgroup was forcibly dismounted
    Wed Sep 12 13:13:31 CST 2012
    Shutting down instance (abort)
    License high water mark = 29
    Wed Sep 12 13:13:32 CST 2012
    System state dump is made for local instance
    System State dumped to trace file /u01/app/oracle/admin/orcl/bdump/orcl2_diag_2669.trc
    Wed Sep 12 13:13:33 CST 2012
    Trace dumping is performing id=[cdmp_20120912131330]
    Wed Sep 12 13:13:36 CST 2012
    Instance terminated by CKPT, pid = 2691
    Wed Sep 12 13:13:41 CST 2012
    Instance terminated by USER, pid = 18002
    Wed Sep 12 14:56:16 CST 2012
    Starting ORACLE instance (normal)
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Interface type 1 eth1 10.185.3.0 configured from OCR for use as a cluster interconnect
    Interface type 1 eth0 10.185.3.0 configured from OCR for use as a public interface
    Picked latch-free SCN scheme 3
    Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/db_1/dbs/arch
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    ksdpec: called for event 13740 prior to event group initialization
    Starting up ORACLE RDBMS Version: 10.2.0.5.0.
    System parameters with non-default values:
    processes = 500
    sessions = 555
    sga_max_size = 10485760000
    __shared_pool_size = 1962934272
    __large_pool_size = 16777216
    __java_pool_size = 16777216
    __streams_pool_size = 0
    spfile = +DATA/orcl/spfileorcl.ora
    sga_target = 10485760000
    control_files = +DATA/orcl/controlfile/current.260.789223249
    db_block_size = 8192
    __db_cache_size = 8355053568
    db_keep_cache_size = 117440512
    compatible = 10.2.0.5.0
    db_file_multiblock_read_count= 16
    cluster_database = TRUE
    cluster_database_instances= 2
    db_create_file_dest = +DATA
    thread = 2
    instance_number = 2
    undo_management = AUTO
    undo_tablespace = UNDOTBS2
    remote_login_passwordfile= EXCLUSIVE
    db_domain =
    dispatchers =
    local_listener = (ADDRESS = (PROTOCOL = TCP)(HOST = 10.185.3.81)(PORT = 1521))
    remote_listener = LISTENERS_ORCL
    job_queue_processes = 10
    background_dump_dest = /u01/app/oracle/admin/orcl/bdump
    user_dump_dest = /u01/app/oracle/admin/orcl/udump
    core_dump_dest = /u01/app/oracle/admin/orcl/cdump
    audit_file_dest = /u01/app/oracle/admin/orcl/adump
    db_name = orcl
    open_cursors = 300
    pga_aggregate_target = 770703360
    Cluster communication is configured to use the following interface(s) for this instance
    10.185.3.79
    Wed Sep 12 14:56:16 CST 2012
    cluster interconnect IPC version:Oracle UDP/IP (generic)
    IPC Vendor 1 proto 2
    PMON started with pid=2, OS id=7366
    DIAG started with pid=3, OS id=7368
    PSP0 started with pid=4, OS id=7370
    LMON started with pid=5, OS id=7372
    LMD0 started with pid=6, OS id=7374
    LMS0 started with pid=7, OS id=7376
    LMS1 started with pid=8, OS id=7380
    MMAN started with pid=9, OS id=7384
    DBW0 started with pid=10, OS id=7386
    LGWR started with pid=11, OS id=7388
    CKPT started with pid=12, OS id=7390
    SMON started with pid=13, OS id=7392
    RECO started with pid=14, OS id=7394
    CJQ0 started with pid=15, OS id=7396
    MMON started with pid=16, OS id=7398
    MMNL started with pid=17, OS id=7401
    Wed Sep 12 14:56:19 CST 2012
    lmon registered with NM - instance id 2 (internal mem no 1)
    Wed Sep 12 14:56:19 CST 2012
    Reconfiguration started (old inc 0, new inc 2)
    List of nodes:
    1
    Global Resource Directory frozen
    * allocate domain 0, invalid = TRUE
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Sep 12 14:56:19 CST 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Wed Sep 12 14:56:19 CST 2012
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Post SMON to start 1st pass IR
    Wed Sep 12 14:56:19 CST 2012
    LMS 0: 0 GCS shadows traversed, 0 replayed
    Wed Sep 12 14:56:19 CST 2012
    LMS 1: 0 GCS shadows traversed, 0 replayed
    Wed Sep 12 14:56:19 CST 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    Reconfiguration complete
    LCK0 started with pid=18, OS id=7418
    Wed Sep 12 14:56:20 CST 2012
    ALTER DATABASE MOUNT
    Wed Sep 12 14:56:20 CST 2012
    This instance was first to mount
    Wed Sep 12 14:56:20 CST 2012
    Starting background process ASMB
    ASMB started with pid=20, OS id=7424
    Starting background process RBAL
    RBAL started with pid=21, OS id=7428
    Loaded ASM Library - Generic Linux, version 2.0.4 (KABI_V2) library for asmlib interface
    Wed Sep 12 14:56:23 CST 2012
    SUCCESS: diskgroup DATA was mounted
    Wed Sep 12 14:56:27 CST 2012
    Setting recovery target incarnation to 2
    Wed Sep 12 14:56:27 CST 2012
    Successful mount of redo thread 2, with mount id 1321558804
    Wed Sep 12 14:56:27 CST 2012
    Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
    Completed: ALTER DATABASE MOUNT
    Wed Sep 12 14:56:27 CST 2012
    ALTER DATABASE OPEN
    This instance was first to open
    Wed Sep 12 14:56:27 CST 2012
    Beginning crash recovery of 2 threads
    parallel recovery started with 7 processes
    Wed Sep 12 14:56:28 CST 2012
    Started redo scan
    Wed Sep 12 14:56:28 CST 2012
    Completed redo scan
    717 redo blocks read, 25 data blocks need recovery
    Wed Sep 12 14:56:28 CST 2012
    Started redo application at
    Thread 1: logseq 353, block 70791
    Thread 2: logseq 401, block 88381
    Wed Sep 12 14:56:28 CST 2012
    Recovery of Online Redo Log: Thread 1 Group 1 Seq 353 Reading mem 0
    Mem# 0: +DATA/orcl/onlinelog/group_1.261.789223251
    Wed Sep 12 14:56:28 CST 2012
    Recovery of Online Redo Log: Thread 2 Group 3 Seq 401 Reading mem 0
    Mem# 0: +DATA/orcl/onlinelog/group_3.265.789223275
    Wed Sep 12 14:56:28 CST 2012
    Completed redo application
    Wed Sep 12 14:56:28 CST 2012
    Completed crash recovery at
    Thread 1: logseq 353, block 71419, scn 18394186
    Thread 2: logseq 401, block 88470, scn 18386240
    25 data blocks read, 25 data blocks written, 717 redo blocks read
    Wed Sep 12 14:56:29 CST 2012
    Thread 1 advanced to log sequence 354 (thread recovery)
    Picked broadcast on commit scheme to generate SCNs
    Wed Sep 12 14:56:29 CST 2012
    Thread 2 advanced to log sequence 402 (thread open)
    Thread 2 opened at log sequence 402
    Current log# 4 seq# 402 mem# 0: +DATA/orcl/onlinelog/group_4.266.789223275
    Successful open of redo thread 2
    Wed Sep 12 14:56:29 CST 2012
    MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
    Wed Sep 12 14:56:29 CST 2012
    SMON: enabling cache recovery
    Wed Sep 12 14:56:30 CST 2012
    Successfully onlined Undo Tablespace 5.
    Wed Sep 12 14:56:30 CST 2012
    SMON: enabling tx recovery
    Wed Sep 12 14:56:30 CST 2012
    Database Characterset is AL32UTF8
    Opening with internal Resource Manager plan
    replication_dependency_tracking turned off (no async multimaster replication found)
    Starting background process QMNC
    QMNC started with pid=30, OS id=7508
    Wed Sep 12 14:56:32 CST 2012
    Completed: ALTER DATABASE OPEN
    Wed Sep 12 14:56:49 CST 2012
    Reconfiguration started (old inc 2, new inc 4)
    List of nodes:
    0 1
    Global Resource Directory frozen
    Communication channels reestablished
    * domain 0 valid = 1 according to instance 0
    Wed Sep 12 14:56:49 CST 2012
    Master broadcasted resource hash value bitmaps
    Non-local Process blocks cleaned out
    Wed Sep 12 14:56:49 CST 2012
    LMS 0: 0 GCS shadows cancelled, 0 closed
    Wed Sep 12 14:56:49 CST 2012
    LMS 1: 0 GCS shadows cancelled, 0 closed
    Set master node info
    Submitted all remote-enqueue requests
    Dwn-cvts replayed, VALBLKs dubious
    All grantable enqueues granted
    Wed Sep 12 14:56:49 CST 2012
    LMS 1: 2192 GCS shadows traversed, 1025 replayed
    Wed Sep 12 14:56:49 CST 2012
    LMS 0: 2135 GCS shadows traversed, 1082 replayed
    Wed Sep 12 14:56:49 CST 2012
    Submitted all GCS remote-cache requests
    Fix write in gcs resources
    Reconfiguration complete

  • Rollback (Tx Recovery) and Roll Forward (Cache Recovery)

    Hi Guys,
    I have some doubts after going through these links:
    Difference between redo logs and undo tablespace, and the Oracle DBA Admin Guide (E25494-02).
    1) Redo logs contain committed and uncommitted data. Do they also contain before-image data, or only change vectors that point to the undo segments?
    The doc says: redo entries record data that you can use to reconstruct all changes made to the database, including the undo segments.
    But the forum link above says that they contain the change vectors only.
    2) How is a database crash (abort) recovered?
    I know it first rolls forward and then rolls backward. At which point does the undo tablespace come into the picture? Are the undo tablespace segments rebuilt from the redo logs? Please help with this; I get really confused.
    I tried to test a few scenarios but was not able to reach a conclusion:
    First test:
    1) Run a transaction (updated almost 29000 rows).
    2) Abort the database.
    3) Change the undo tablespace name in the init file to a dummy value.
    The database failed with the following error:
    SMON: enabling cache recovery
    Errors in file d:\install\pracledb\diag\rdbms\dba\dba\trace\dba_ora_5592.trc:
    ORA-30012: undo tablespace 'UNDOTBS11' does not exist or of wrong type
    Errors in file d:\install\pracledb\diag\rdbms\dba\dba\trace\dba_ora_5592.trc:
    ORA-30012: undo tablespace 'UNDOTBS11' does not exist or of wrong type
    Error 30012 happened during db open, shutting down database
    USER (ospid: 5592): terminating the instance due to error 30012
    Second test:
    1) Create a second undo tablespace.
    2) Update the same number of rows.
    3) Shut abort in another session.
    4) Modify the pfile with the new undotbs2 (no spfile).
    5) Start the database up; it started:
    Beginning crash recovery of 1 threads
    parallel recovery started with 2 processes
    Started redo scan
    Completed redo scan
    read 7971 KB redo, 1272 data blocks need recovery
    Started redo application at
    Thread 1: logseq 28, block 3
    Recovery of Online Redo Log: Thread 1 Group 1 Seq 28 Reading mem 0
    Mem# 0: D:\INSTALL\PRACLEDB\ORADATA\DBA\REDO01.LOG
    Completed redo application of 6.72MB
    Completed crash recovery at
    Thread 1: logseq 28, block 15945, scn 1487147
    1272 data blocks read, 1272 data blocks written, 7971 redo k-bytes read
    Thread 1 advanced to log sequence 29 (thread open)
    Thread 1 opened at log sequence 29
    Current log# 2 seq# 29 mem# 0: D:\INSTALL\PRACLEDB\ORADATA\DBA\REDO02.LOG
    Successful open of redo thread 1
    MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
    SMON: enabling cache recovery
    Successfully onlined Undo Tablespace 5.
    Verifying file header compatibility for 11g tablespace encryption..
    Verifying 11g file header compatibility for tablespace encryption completed
    Sun Oct 28 18:11:03 2012
    SMON: enabling tx recovery
    Database Characterset is WE8MSWIN1252
    No Resource Manager plan active
    SMON: Parallel transaction recovery tried
    Third test:
    1) Update the same number of rows.
    2) Change the UNDOTBS to the new undo tablespace.
    3) Rename the old datafile.
    4) The database startup failed with this error:
    ALTER DATABASE OPEN
    Errors in file d:\install\pracledb\diag\rdbms\dba\dba\trace\dba_dbw0_3960.trc:
    ORA-01157: cannot identify/lock data file 10 - see DBWR trace file
    ORA-01110: data file 10: 'D:\INSTALL\PRACLEDB\ORADATA\DBA\UNDOTBS2.DBF'
    ORA-27041: unable to open file
    OSD-04002: unable to open file
    O/S-Error: (OS 2) The system cannot find the file specified.
    Errors in file d:\install\pracledb\diag\rdbms\dba\dba\trace\dba_ora_4552.trc:
    ORA-01157: cannot identify/lock data file 10 - see DBWR trace file
    ORA-01110: data file 10: 'D:\INSTALL\PRACLEDB\ORADATA\DBA\UNDOTBS2.DBF'
    ORA-1157 signalled during: ALTER DATABASE OPEN...
    Sun Oct 28 18:20:04 2012
    Checker run found 1 new persistent data failures
    Please help me understand this.

    Sourabh85 wrote:
    Hi Hemant,
    Thank you very much, really very helpful. One more question:
    Why does Oracle do the work twice? First populate the undo blocks and then do the rollback? Why not recover the uncommitted transactions directly from the redo logs?
    It is not doubling the work. Oracle updates the undo blocks with the old data, and since undo blocks are just like the other data blocks, any change made to them is also logged as a change vector in the redo log files. This means that there would be an undo-block change vector recorded in the redo log file before the change vector of the data block that contains the data changed by your statement. This is what is meant when it is said that the redo log also contains the old images. The idea is to perform recovery, when needed, in the same sequence (based on the SCNs) using everything from the redo log files. This would include recovery of the undo datafiles as well, in case they are lost too!
    HTH
    Aman....
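    A small sketch (not from the thread) that makes this point visible: watch the session's 'redo size' statistic around an update and its rollback; the undo-block changes and the rollback itself both generate redo. The table and column names are placeholders.
    -- Redo generated by this session so far
    SELECT n.name, s.value
    FROM   v$mystat s JOIN v$statname n ON n.statistic# = s.statistic#
    WHERE  n.name = 'redo size';
    UPDATE test_tab SET val = val + 1 WHERE id <= 1000;  -- data-block AND undo-block change vectors go to redo
    SELECT n.name, s.value
    FROM   v$mystat s JOIN v$statname n ON n.statistic# = s.statistic#
    WHERE  n.name = 'redo size';                         -- value jumps
    ROLLBACK;                                            -- applying the undo is itself logged
    SELECT n.name, s.value
    FROM   v$mystat s JOIN v$statname n ON n.statistic# = s.statistic#
    WHERE  n.name = 'redo size';                         -- value increases again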
