Stretch (campus) cluster and ASM bug 6656309
Hi all,
We have applied patch 6656309 in our test environment and we are thinking
of applying it in production soon as well.
If anyone:
- has a stretch (campus) 10.2.0.3 RAC cluster,
- is using ASM with storages in (at least) two different buildings,
- is aware of bug 6656309 (and related 6360485, 6747864 for example),
- plans to install the patch (it has been released for a few platforms),
- wants to share their thoughts here,
they are really welcome.
Thanks
Oscar Armanini
Please note that the patch applies to the database executable(s), not the ASM ones, but I prefer to post it in the ASM forum because the end result of the bug is that ASM does not declare the disks missing when one storage system is lost.
10gR2 has some limitations/issues in the case of an extended cluster (read about the re-silvering process in 10g). We put an extended cluster on Solaris, but we didn't use failure groups for the same reason;
we used Sun Cluster to mirror the disks at OS level.
We implemented an 11g extended cluster on Windows and it works like a charm. Of course it has its own limitations (e.g. placing the 3rd voting disk on Windows).
Anyway, 11g is the best option for an extended cluster so far, if possible.
Similar Messages
-
After bringing the cluster and ASM up, I'm not able to start the RDBMS instance.
I am not able to bring the database up.
The cluster and ASM are up and running.
ORA-01078: failure in processing system parameters
LRM-00109: could not open parameter file '/oracle/app/oracle/product/11.1.0/db_1/dbs/initTMISC11G.ora'
That is where the confusion is: I could not find it under $ORACLE_HOME/dbs.
However, the log file indicates that it is using an spfile located at:
spfile = +DATA_TIER/dintw10g/parameterfile/spfile_dintw10g.ora
I guess it now checks only the dbs directory.
Could you please tell me how I can bring it up now?
Just for information: the ASM instance is up. -
Cluster and ASM services don't start automatically
Hi,
I have configured a 2-node 10g RAC environment (rac, rac2) on RHEL 4 through VMware.
However, the cluster and ASM services do not come up automatically after a server reboot; I have to bring them up manually with the srvctl command.
[root@rac2 bin]# ./crs_stat -t
Name Type Target State Host
ora....SM1.asm application ONLINE UNKNOWN rac
ora....AC.lsnr application ONLINE UNKNOWN rac
ora.rac.gsd application ONLINE UNKNOWN rac
ora.rac.ons application ONLINE UNKNOWN rac
ora.rac.vip application ONLINE ONLINE rac
ora....SM2.asm application ONLINE UNKNOWN rac2
ora....C2.lsnr application ONLINE UNKNOWN rac2
ora.rac2.gsd application ONLINE UNKNOWN rac2
ora.rac2.ons application ONLINE UNKNOWN rac2
ora.rac2.vip application ONLINE ONLINE rac2
Could someone please guide me as to how these services can be brought up automatically on every reboot?
Normally all services start automatically; I am not sure how you installed and configured your RAC.
Here i am giving all options.
Option1:-
Re: CRS auto-start
Option2:-
See the below links.
http://jaffardba.blogspot.com/2009/03/how-to-startup-rac-database-services.html
http://www.dannorris.com/2009/03/12/start-database-services-automatically-after-instance-startup/
Option3:-
If you simply want to start the services upon server reboot, put the srvctl
commands in the rc.local script under the /etc directory.
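For example, /etc/rc.local might contain something like the following. This is a sketch only: the Oracle home path, node name, database name and SID below are placeholders for your environment, not values from this thread.

```
# /etc/rc.local additions (run as root at boot) -- example paths/SIDs only
su - oracle -c '/u01/app/oracle/product/10.2.0/db_1/bin/srvctl start asm -n rac'
su - oracle -c '/u01/app/oracle/product/10.2.0/db_1/bin/srvctl start instance -d orcl -i orcl1'
```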
Hope this solves your issue.
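For monitoring, the `crs_stat -t` output above can also be checked with a small script that flags resources whose state doesn't match their target. This is a toy sketch, not an Oracle utility, and it assumes whitespace-separated fields (real `crs_stat -t` output is column-aligned):

```python
# Toy checker for `crs_stat -t`-style output: report resources whose
# TARGET is ONLINE but whose STATE is not.
SAMPLE = """\
ora....SM1.asm application ONLINE UNKNOWN rac
ora.rac.vip application ONLINE ONLINE rac
ora.rac2.gsd application ONLINE UNKNOWN rac2
"""

def failed_resources(text):
    """Return names of resources whose state differs from their target."""
    bad = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip headers/blank lines
        name, _rtype, target, state, _host = parts[:5]
        if target == "ONLINE" and state != "ONLINE":
            bad.append(name)
    return bad

print(failed_resources(SAMPLE))  # → ['ora....SM1.asm', 'ora.rac2.gsd']
```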
Regards
-
Using ASM in a cluster, and new snapshot feature
Hi,
I'm currently studying ASM and trying to find some answers.
From what I understand and have experienced so far, ASM can only be used as a local storage solution; in other words, it cannot be accessed over the network. Is this correct?
How does the RDBMS database connect to the ASM instance? Which process or what type of connection does it use? It apparently does not use the listener, although the instance name is part of the database file path. How does this work, please?
How does ASM work in a cluster environment? How does each node in a cluster connect to it?
As of 11g Release 2, ASM provides a snapshot feature. I assume this can be used for backups, but then each database should use its own disk group, and I will still need to use ALTER DATABASE BEGIN BACKUP, correct?
Thanks!
Markus Waldorf wrote:
Hi,
I'm currently studying ASM and trying to find some answers.
From what I understand and experienced so far, ASM can only be used as a local storage solution. In other words it cannot be used by network access. Is this correct?
Well, you are missing the point that it depends entirely on the architecture you are going to use. If you use ASM on a single node, it is available right there. If installed for a RAC system, an ASM instance runs on each node of the cluster and manages the storage, which lives on the shared disks. The ASMB process is responsible for exchanging messages and pushing responses back to the RDBMS instance.
How is the RDBMS database connecting to the ASM instance? Which process or what type of connection is it using? It's apparently not using listener, although the instance name is part of the database file path. How does this work please?
A listener is not needed, Markus, since its job is to create server processes, which is NOT the job of the ASM instance. The ASM instance connects the client database to itself as soon as the first request comes from that database to perform any operation on a disk group. As mentioned above, ASMB then carries the request/response traffic.
How does ASM work in a cluster environment? How does each node in a cluster connect to it?
Each node has its own ASM instance running locally. In the case of RAC, the ASM SID is +ASMn, where n is 1, 2, ..., up to the number of nodes in the cluster.
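As a trivial illustration of that naming scheme (a hypothetical helper, not an Oracle API):

```python
# Sketch: in RAC, each node runs a local ASM instance whose SID is +ASMn.
def asm_sids(node_count):
    """Return the expected ASM SIDs for an n-node cluster."""
    return ["+ASM%d" % n for n in range(1, node_count + 1)]

print(asm_sids(3))  # three-node cluster → ['+ASM1', '+ASM2', '+ASM3']
```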
As of 11g release 2, ASM provides a snapshot feature. I assume this can be used for the purpose of backup, but then each database should use its own diskgroup, and I will still need to use alter database begin backup, correct?
You are probably referring to the ACFS snapshot feature of 11.2 ASM. It does not take a backup of a disk group; it is more like an OS-level backup of a mount point created on ASM's ACFS. Oracle provides this feature so that you can back up, say, an Oracle home running on an ACFS mount point, and in case of an OS-level failure (for example, someone deletes a folder from that mount point) you can get it back from the ACFS snapshot. For a disk group, the only backup available is the metadata backup (and restore), and even that does NOT bring the database's data back for you. For database-level backups you would still need RMAN.
HTH
Aman.... -
Hi,
I have noticed following 'strange' behaviour of Oracle Restart and ASM.
starting position:
-bash-3.2 $ crsctl status resource -t
NAME TARGET STATE SERVER STATE_DETAILS
Local Resources
ora.DATA.dg
ONLINE ONLINE oracle-restart
ora.LISTENERASM.lsnr
ONLINE ONLINE oracle-restart
ora.asm
ONLINE ONLINE oracle-restart Started
Cluster Resources
ora.cssd
1 ONLINE ONLINE oracle-restart
ora.diskmon
1 ONLINE ONLINE oracle-restart
step 1:
-bash-3.2 $ srvctl stop asm
-bash-3.2 $ srvctl stop diskgroup -g data
-bash-3.2 $ srvctl disable diskgroup -g data
step 2:
via sqlplus start ASM instance
SQL> startup
ASM instance started
Total System Global Area 283930624 bytes
Fixed Size 2212656 bytes
Variable Size 256552144 bytes
ASM Cache 25165824 bytes
ASM diskgroups mounted
ASM diskgroups volume enabled
SQL> select * from v$asm_diskgroup;
(one row, line-wrapped output reassembled)
GROUP_NUMBER: 1              NAME: DATA
SECTOR_SIZE: 512             BLOCK_SIZE: 4096
ALLOCATION_UNIT_SIZE: 1048576
STATE: MOUNTED               TYPE: EXTERN
TOTAL_MB: 10236              FREE_MB: 10177
HOT_USED_MB: 0               COLD_USED_MB: 59
REQUIRED_MIRROR_FREE_MB: 0   USABLE_FILE_MB: 10177
OFFLINE_DISKS: 0             V: N
COMPATIBILITY: 11.2.0.0.0    DATABASE_COMPATIBILITY: 10.1.0.0.0
-bash-3.2 $ crsctl status resource -t
NAME TARGET STATE SERVER STATE_DETAILS
Local Resources
ora.DATA.dg
OFFLINE OFFLINE oracle-restart <== funny !!!
ora.LISTENERASM.lsnr
ONLINE ONLINE oracle-restart
ora.asm
ONLINE ONLINE oracle-restart Started
Cluster Resources
ora.cssd
1 ONLINE ONLINE oracle-restart
ora.diskmon
1 ONLINE ONLINE oracle-restart
Is this behaviour a 'feature' or a bug?
Anyone had similar experience?
thanks,
goran
Hi,
The asm resource depends on the diskgroup resource ... if the diskgroup resource is not available and crsctl status shows it offline, I would expect asm to also be shown (and brought) offline, since they are dependent.
What is the point of managing resources via srvctl if it doesn't take care of dependencies? To me that's wrong.
ora.asm is the ASM instance.
ora.*.dg is a disk group.
ora.*.dg depends on ora.asm, not the other way around.
I can have more than one disk group and want only one disk group disabled, so I need the ASM instance (ora.asm) online.
Important:
If you shut down the database with SQL*Plus, Oracle Restart does not interpret this as a database failure and does not attempt to restart the database.
Similarly, if you shut down the Oracle ASM instance with SQL*Plus or ASMCMD, Oracle Restart does not attempt to restart it.
An important difference between starting a component with SRVCTL and starting it with SQL*Plus (or another utility) is the following:
When you start a component with SRVCTL, any components on which this component depends are automatically started first, and in the proper order.
When you start a component with SQL*Plus (or another utility), other components in the dependency chain are not automatically started; you must ensure that any components on which this component depends are started.
Oracle Restart also manages the weak dependency between database instances and the Oracle Net listener (the listener): When a database instance is started, Oracle Restart attempts to start the listener. If the listener startup fails, then the database is still started. If the listener later fails, Oracle Restart does not shut down and restart any database instances.
It makes no sense for Oracle Restart to shut down the whole environment (all databases) because the listener is down.
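The SRVCTL behaviour described above (components a resource depends on are started first, in the proper order) is essentially a topological walk of the dependency graph. A toy model in Python; the resource names and the dependency graph are illustrative examples, not Oracle code:

```python
# Toy model of SRVCTL's dependency-ordered startup (not Oracle code):
# starting a component first starts everything it depends on.
DEPS = {  # component -> components it depends on (example graph)
    "ora.orcl.db": ["ora.asm", "ora.LISTENER.lsnr"],
    "ora.asm": ["ora.cssd"],
    "ora.LISTENER.lsnr": [],
    "ora.cssd": [],
}

def start_order(component, deps, started=None):
    """Return the order in which components would be started."""
    if started is None:
        started = []
    for dep in deps.get(component, []):
        start_order(dep, deps, started)
    if component not in started:
        started.append(component)
    return started

print(start_order("ora.orcl.db", DEPS))
# → ['ora.cssd', 'ora.asm', 'ora.LISTENER.lsnr', 'ora.orcl.db']
```

Starting a component directly with SQL*Plus skips this walk, which is exactly why the manual `startup` above left ora.DATA.dg OFFLINE in the resource table.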
Regards,
Levi Pereira -
Oracle RAC and ASM for SAP in SLES
Hello,
We are installing our SAP ERP with Oracle RAC 11.2.0.2 on SLES 11 x86_64. According to Note 527843 (Oracle RAC support in the SAP environment), this is supported for SAP, but the ASM Cluster File System must be set up as a requirement.
But when Oracle Grid is installed, ACFS installation returns the following error:
ACFS-9459: ADVM/ACFS is not supported on this OS version: 'sles-release-11.1-1.152'
So Oracle Grid has been installed without ACFS, and we will not be able to use Oracle Clusterware to provide high availability for SAP.
As this is supposed to be a feasible combination (Oracle RAC 11.2.0.2 with ASM on SLES 11 SP1, Netweaver 7.0), I'm wondering if it is a bug, a lack of documentation or something I'm missing.
Kind regards,
Fermí
Read this Oracle note:
ACFS not supported on certain platforms [ID 1075058.1]
oracle@node1:~/app/oracle/product/grid/log/node1> tail -1 alertesiha.log
[client(14083)]CRS-10001:ADVM/ACFS is not supported on SUSE
Then these can be ignored, since ACFS/ADVM may not be supported on your platform. As of this writing, Sun, AIX and SuSE 10 are all supported in 11.2.0.2 only; HP-UX, SuSE 11, RHEL 6.0 and the Oracle UEK kernel (2.6.32-100*) are not yet supported in 11.2.0.2. Keep in mind that this does not prevent your Grid Infrastructure stack (Clusterware and ASM) from starting; it is just an informational message on these platforms.
Extract from SAP Note:
RAC 11.2.0.2 (x86 & x86_64 only):
Oracle Clusterware 11.2.0.2 + ASM/ACFS 11.2.0.2 (currently only for SLES10, RHEL5, OL 5.x (without UEK))
Thanks
Srikanth M -
Some errors in the cluster alert log, but the cluster and database work normally.
Hi everybody,
I use RAC + ASM: RAC has two nodes and ASM has three disk groups (+DATA, +CRS, +FRA). The database is 11gR2 and ASM uses ASMLib, not raw devices.
When I start the cluster, or it auto-starts after an OS reboot, the cluster alert log contains errors such as:
ERROR: failed to establish dependency between database rac and diskgroup resource ora.DATA.dg
[ohasd(7964)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
I do not know what causes these errors, but the cluster and database start and run normally.
I do not know whether these errors will affect the service.
Thanks, everybody!
Anyone have the same question?
-
11g CRS and ASM, 10.2.0.4 database, how to set up RMAN
Hi all, I am fairly green when it comes to RAC, so please bear with me. I have a Linux 2-node RAC environment; the servers are lux148 and lux149. The 2-node cluster is running 11g CRS and ASM, and the database is 10.2.0.4. It had to be set up this way in order for IBM DataStage to work. The database name is fictrp0; the instances are fictrp01 (lux148) and fictrp02 (lux149). I am trying to register the database with an RMAN catalog. I am under the impression that I need to register the database (fictrp0), not the instances (fictrp01 and fictrp02), with the RMAN catalog. Is this correct? So I log into lux148 and set my environment so that ORACLE_SID=FICTRP0. From the command line I issue the following:
fictrp0:/u01/app/oracle> rman target / catalog rman102/[email protected]
The command returns the following:
Recovery Manager: Release 10.2.0.4.0 - Production on Wed Oct 29 14:05:50 2008
Copyright (c) 1982, 2007, Oracle. All rights reserved.
connected to target database (not started)
connected to recovery catalog database
Is this normal "connected to target database (not started)"? I was expecting to see a DBID=FICTRP0 for the target. If this is typical, how will RMAN know the ID of the database? Should I be trying to register an instance perhaps such as FICTRP01? If so do I need to register both instances (FICTRP01 and FICTRP02)?
Bottom line: I am very confused about how RAC and RMAN work together. Any help would be greatly appreciated.
You don't need to register the instances; you register only the database, and only the database has a DBID.
-
Solaris 10 and Hitachi LUN mapping with Oracle 10g RAC and ASM?
Hi all,
I am working on an Oracle 10g RAC and ASM installation with Sun E6900 servers attached to a Hitachi SAN for shared storage, with Sun Solaris 10 as the server OS. We are using Oracle 10g Release 2 (10.2.0.3) RAC Clusterware
for the clustering software, raw devices for shared storage, and the Veritas VxFS 4.1 filesystem.
My question is this:
How do I map the raw devices and LUNs on the Hitachi SAN to Solaris 10 OS and Oracle 10g RAC ASM?
I am aware that with an Oracle 10g RAC and ASM instance, one needs to configure the ASM instance initialization parameter file to set the asm_diskstring setting to recognize the LUNs that are presented to the host.
I know that Sun Solaris 10 uses /dev/rdsk/CwTxDySz naming convention at the OS level for disks. However, how would I map this to Oracle 10g ASM settings?
I cannot find this critical piece of information ANYWHERE!!!!
Thanks for your help!
You don't seem to state categorically that you are using Solaris Cluster, so I'll assume it, since this is mainly a forum about Solaris Cluster (and IMHO, Solaris Cluster with Clusterware is better than Clusterware on its own).
Clusterware has to see the same device names from all cluster nodes. This is why Solaris Cluster (SC) is a positive benefit over Clusterware because SC provides an automatically managed, consistent name space. Clusterware on its own forces you to manage either the symbolic links (or worse mknods) to create a consistent namespace!
So, given the SC consistent namespace, you simply add the raw devices into the ASM configuration, i.e. /dev/did/rdsk/dXsY. If you are using Solaris Volume Manager, you would use /dev/md/<setname>/rdsk/dXXX, and if you were using CVM/VxVM you would use /dev/vx/rdsk/<dg_name>/<dev_name>.
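Concretely, the ASM instance's asm_diskstring would then point at that namespace. A sketch only; the slice number below is an example, adjust it to the slices you actually presented:

```
# init+ASM1.ora fragment (example values -- adjust device paths)
*.instance_type='asm'
# Solaris Cluster DID devices:
*.asm_diskstring='/dev/did/rdsk/d*s6'
# or, with Solaris Volume Manager: '/dev/md/<setname>/rdsk/d*'
```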
Of course, if you genuinely are using Clusterware on its own, then you have somewhat of a management issue! ... time to think about installing SC?
Tim
--- -
Hi All,
I am getting an error message in the ASM alert log saying "NOTE: ASMB process exiting due to lack of ASM file activity".
This leads to frequent crashes of node 1. Please check the detailed errors below and suggest a solution.
Thu Mar 24 07:05:11 2011
LMD0 (ospid: 32493) has not called a wait for 94 secs.
GES: System Load is HIGH.
GES: Current load is 55.87 and high load threshold is 20.00
Thu Mar 24 07:06:32 2011
LMD0 (ospid: 32493) has not called a wait for 174 secs.
GES: System Load is HIGH.
GES: Current load is 71.23 and high load threshold is 20.00
Thu Mar 24 07:06:36 2011
Trace dumping is performing id=[cdmp_20110324070635]
Thu Mar 24 07:07:49 2011
Trace dumping is performing id=[cdmp_20110324070635]
Thu Mar 24 07:08:16 2011
Waiting for clusterware split-brain resolution
Thu Mar 24 07:18:17 2011
Errors in file /u01/app/oracle/diag/asm/asm/+ASM1/trace/+ASM1_lmon_32484.trc (incident=60073):
ORA-29740: evicted by member 1, group incarnation 120
Incident details in: /u01/app/oracle/diag/asm/asm/+ASM1/incident/incdir_60073/+ASM1_lmon_32484_i60073.trc
Thu Mar 24 07:18:19 2011
Trace dumping is performing id=[cdmp_20110324071819]
Errors in file /u01/app/oracle/diag/asm/asm/+ASM1/trace/+ASM1_lmon_32484.trc:
ORA-29740: evicted by member 1, group incarnation 120
LMON (ospid: 32484): terminating the instance due to error 29740
System state dump is made for local instance
System State dumped to trace file /u01/app/oracle/diag/asm/asm/+ASM1/trace/+ASM1_diag_32459.trc
Trace dumping is performing id=[cdmp_20110324071820]
Instance terminated by LMON, pid = 32484
Thu Mar 24 07:18:31 2011
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Interface type 1 eth1 172.20.223.0 configured from OCR for use as a cluster interconnect
Interface type 1 eth0 172.20.222.0 configured from OCR for use as a public interface
Picked latch-free SCN scheme 2
Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/product/11.1.0/asm_1/dbs/arch
Autotune of undo retention is turned on.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up ORACLE RDBMS Version: 11.1.0.7.0.
Using parameter settings in server-side pfile /u01/app/oracle/product/11.1.0/asm_1/dbs/initASM1.ora
System parameters with non-default values:
large_pool_size = 12M
instance_type = "asm"
cluster_database = TRUE
instance_number = 1
asm_diskstring = "ORCL:*"
asm_diskgroups = "REDO01"
asm_diskgroups = "REDO02"
asm_diskgroups = "DATA"
asm_diskgroups = "RECOVERY"
diagnostic_dest = "/u01/app/oracle"
Cluster communication is configured to use the following interface(s) for this instance
172.20.223.25
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
Thu Mar 24 07:18:36 2011
PMON started with pid=2, OS id=23120
Thu Mar 24 07:18:36 2011
VKTM started with pid=3, OS id=23123 at elevated priority
VKTM running at (20)ms precision
Thu Mar 24 07:18:36 2011
DIAG started with pid=4, OS id=23127
Thu Mar 24 07:18:37 2011
PING started with pid=5, OS id=23129
Thu Mar 24 07:18:37 2011
PSP0 started with pid=6, OS id=23131
Thu Mar 24 07:18:37 2011
DIA0 started with pid=7, OS id=23133
Thu Mar 24 07:18:37 2011
LMON started with pid=8, OS id=23135
Thu Mar 24 07:18:37 2011
LMD0 started with pid=9, OS id=23137
Thu Mar 24 07:18:37 2011
LMS0 started with pid=10, OS id=23148 at elevated priority
Thu Mar 24 07:18:37 2011
MMAN started with pid=11, OS id=23152
Thu Mar 24 07:18:38 2011
DBW0 started with pid=12, OS id=23170
Thu Mar 24 07:18:38 2011
LGWR started with pid=13, OS id=23176
Thu Mar 24 07:18:38 2011
CKPT started with pid=14, OS id=23218
Thu Mar 24 07:18:38 2011
SMON started with pid=15, OS id=23224
Thu Mar 24 07:18:38 2011
RBAL started with pid=16, OS id=23237
Thu Mar 24 07:18:38 2011
GMON started with pid=17, OS id=23239
lmon registered with NM - instance id 1 (internal mem no 0)
Reconfiguration started (old inc 0, new inc 124)
ASM instance
List of nodes:
0 1 2
Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
Communication channels reestablished
* allocate domain 1, invalid = TRUE
* allocate domain 2, invalid = TRUE
* allocate domain 3, invalid = TRUE
* allocate domain 4, invalid = TRUE
* domain 0 valid = 1 according to instance 1
* domain 1 valid = 1 according to instance 1
* domain 2 valid = 1 according to instance 1
* domain 3 valid = 1 according to instance 1
* domain 4 valid = 1 according to instance 1
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
LMS 0: 0 GCS shadows traversed, 0 replayed
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
Thu Mar 24 07:18:40 2011
LCK0 started with pid=18, OS id=23277
ORACLE_BASE from environment = /u01/app/oracle
Thu Mar 24 07:18:41 2011
SQL> ALTER DISKGROUP ALL MOUNT
NOTE: cache registered group DATA number=1 incarn=0xf7063e39
NOTE: cache began mount (not first) of group DATA number=1 incarn=0xf7063e39
NOTE: cache registered group RECOVERY number=2 incarn=0xf7063e3a
NOTE: cache began mount (not first) of group RECOVERY number=2 incarn=0xf7063e3a
NOTE: cache registered group REDO01 number=3 incarn=0xf7163e3b
NOTE: cache began mount (not first) of group REDO01 number=3 incarn=0xf7163e3b
NOTE: cache registered group REDO02 number=4 incarn=0xf7163e3c
NOTE: cache began mount (not first) of group REDO02 number=4 incarn=0xf7163e3c
NOTE:Loaded lib: /opt/oracle/extapi/32/asm/orcl/1/libasm.so
NOTE: Assigning number (1,0) to disk (ORCL:ASM_DATA1)
NOTE: Assigning number (1,1) to disk (ORCL:ASM_DATA2)
NOTE: Assigning number (2,0) to disk (ORCL:ASM_RECO1)
NOTE: Assigning number (3,0) to disk (ORCL:ASM_LOG1)
NOTE: Assigning number (4,0) to disk (ORCL:ASM_LOG2)
kfdp_query(): 5
kfdp_queryBg(): 5
NOTE: cache opening disk 0 of grp 1: DATA1 label:ASM_DATA1
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache opening disk 1 of grp 1: DATA2 label:ASM_DATA2
NOTE: cache mounting (not first) group 1/0xF7063E39 (DATA)
kjbdomatt send to node 1
kjbdomatt send to node 2
NOTE: attached to recovery domain 1
NOTE: LGWR attempting to mount thread 1 for diskgroup 1
NOTE: LGWR mounted thread 1 for disk group 1
NOTE: opening chunk 1 at fcn 0.10794571 ABA
NOTE: seq=81 blk=1313
NOTE: cache mounting group 1/0xF7063E39 (DATA) succeeded
NOTE: cache ending mount (success) of group DATA number=1 incarn=0xf7063e39
kfdp_query(): 6
kfdp_queryBg(): 6
NOTE: cache opening disk 0 of grp 2: RECO1 label:ASM_RECO1
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache mounting (not first) group 2/0xF7063E3A (RECOVERY)
kjbdomatt send to node 1
kjbdomatt send to node 2
NOTE: attached to recovery domain 2
NOTE: LGWR attempting to mount thread 1 for diskgroup 2
NOTE: LGWR mounted thread 1 for disk group 2
NOTE: opening chunk 1 at fcn 0.10436377 ABA
NOTE: seq=48 blk=4298
NOTE: cache mounting group 2/0xF7063E3A (RECOVERY) succeeded
NOTE: cache ending mount (success) of group RECOVERY number=2 incarn=0xf7063e3a
kfdp_query(): 7
kfdp_queryBg(): 7
NOTE: cache opening disk 0 of grp 3: LOG1 label:ASM_LOG1
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache mounting (not first) group 3/0xF7163E3B (REDO01)
kjbdomatt send to node 1
kjbdomatt send to node 2
NOTE: attached to recovery domain 3
NOTE: LGWR attempting to mount thread 1 for diskgroup 3
NOTE: LGWR mounted thread 1 for disk group 3
NOTE: opening chunk 1 at fcn 0.229332 ABA
NOTE: seq=30 blk=10690
NOTE: cache mounting group 3/0xF7163E3B (REDO01) succeeded
NOTE: cache ending mount (success) of group REDO01 number=3 incarn=0xf7163e3b
kfdp_query(): 8
kfdp_queryBg(): 8
NOTE: cache opening disk 0 of grp 4: LOG2 label:ASM_LOG2
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache mounting (not first) group 4/0xF7163E3C (REDO02)
kjbdomatt send to node 1
kjbdomatt send to node 2
NOTE: attached to recovery domain 4
NOTE: LGWR attempting to mount thread 1 for diskgroup 4
NOTE: LGWR mounted thread 1 for disk group 4
NOTE: opening chunk 1 at fcn 0.225880 ABA
NOTE: seq=30 blk=10556
NOTE: cache mounting group 4/0xF7163E3C (REDO02) succeeded
NOTE: cache ending mount (success) of group REDO02 number=4 incarn=0xf7163e3c
kfdp_query(): 9
kfdp_queryBg(): 9
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 1
SUCCESS: diskgroup DATA was mounted
kfdp_query(): 10
kfdp_queryBg(): 10
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 2
SUCCESS: diskgroup RECOVERY was mounted
kfdp_query(): 11
kfdp_queryBg(): 11
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 3
SUCCESS: diskgroup REDO01 was mounted
kfdp_query(): 12
kfdp_queryBg(): 12
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 4
SUCCESS: diskgroup REDO02 was mounted
SUCCESS: ALTER DISKGROUP ALL MOUNT
Thu Mar 24 08:26:28 2011
Starting background process ASMB
Thu Mar 24 08:26:28 2011
ASMB started with pid=20, OS id=9597
NOTE: ASMB process exiting due to lack of ASM file activity for 5 seconds
Thu Mar 24 08:27:39 2011
Starting background process ASMB
Thu Mar 24 08:27:39 2011
ASMB started with pid=25, OS id=10735
NOTE: ASMB process exiting due to lack of ASM file activity for 5 seconds
[oracle@qa1crmrac1 trace]$ tail -1500 alert_ASM1.log
kfdp_query(): 9
kfdp_queryBg(): 9
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 1
SUCCESS: diskgroup DATA was mounted
kfdp_query(): 10
kfdp_queryBg(): 10
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 2
SUCCESS: diskgroup RECOVERY was mounted
kfdp_query(): 11
kfdp_queryBg(): 11
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 3
SUCCESS: diskgroup REDO01 was mounted
kfdp_query(): 12
kfdp_queryBg(): 12
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 4
SUCCESS: diskgroup REDO02 was mounted
SUCCESS: ALTER DISKGROUP ALL MOUNT
Thu Mar 24 08:26:28 2011
Starting background process ASMB
Thu Mar 24 08:26:28 2011
ASMB started with pid=20, OS id=9597
NOTE: ASMB process exiting due to lack of ASM file activity for 5 seconds
Thu Mar 24 08:27:39 2011
Starting background process ASMB
Thu Mar 24 08:27:39 2011
ASMB started with pid=25, OS id=10735
NOTE: ASMB process exiting due to lack of ASM file activity for 5 seconds
Do I need to set the compatible parameter?
Regards,
Vish
It looks to me like your server is absolutely buried, and ASM may just be an innocent bystander. What is going on in the database when this happens? Also, run sar samples at 30-second intervals up to the point when this happens to see what is going on. It's overhead, but you need to find what is causing the problem on the server(s).
Are you swapping? -
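The swapping question above can be answered with a quick check; this is a minimal sketch assuming a Linux host with /proc/meminfo (with the sysstat package installed, `sar -W 30 120` samples the same swap activity every 30 seconds, matching the advice above):

```shell
# Minimal swap check: the ASMB "lack of ASM file activity" timeouts in
# the log above are often a symptom of a buried, swapping server rather
# than an ASM problem.  Reads /proc/meminfo directly, so it works even
# where sysstat/sar is not installed.  Values are in kB.
swap_total=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)
swap_free=$(awk  '/^SwapFree:/  {print $2}' /proc/meminfo)
swap_used=$((swap_total - swap_free))
echo "swap used: ${swap_used} kB of ${swap_total} kB"
```

If swap usage climbs in step with the ASMB timeouts in the alert log, the server, not ASM, is the place to look first.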
Patching Strategy for CRS and ASM homes
I'm fairly new to RAC/ASM and haven't performed any patch set upgrades yet. Back in the simple days when I wanted to apply a patch set to a database, say from 10.2.0.4 to 10.2.0.5, I would create a brand new Oracle home ahead of time and apply the patch set to it. I'd name my homes like this:
/opt/oracle/product/10.2.0.4/db1
/opt/oracle/product/10.2.0.5/db1
During the maintenance window I would change /etc/oratab to point the database to the new 10.2.0.5 and complete the database upgrade scripts. The advantages of this strategy:
1 - Less risk when installing the software, as nothing uses the new home yet. If something goes wrong in the install, no big deal: research the problem and try again without being under the stress of a defined maintenance window.
2 - No need to back up the old home for back-out purposes.
3 - Less time required for the database to be down during the actual patch window, since the Oracle Installer does not need to run.
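The /etc/oratab switch described above can be sketched as a one-line edit; the SID and paths below are illustrative, and the demo runs against a scratch copy so the real /etc/oratab is untouched:

```shell
# Demo of the out-of-place patchset switch: repoint the SID entry from
# the old home to the new, pre-patched one.  For real use, point
# ORATAB at /etc/oratab during the maintenance window.
ORATAB=/tmp/oratab.demo
echo "ORCL:/opt/oracle/product/10.2.0.4/db1:Y" > "$ORATAB"

OLD_HOME=/opt/oracle/product/10.2.0.4/db1
NEW_HOME=/opt/oracle/product/10.2.0.5/db1
sed -i "s|:${OLD_HOME}:|:${NEW_HOME}:|" "$ORATAB"

cat "$ORATAB"   # ORCL:/opt/oracle/product/10.2.0.5/db1:Y
```

Using `|` as the sed delimiter avoids escaping the slashes in the Oracle home paths.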
Now with CRS and ASM, is there a way to pre-stage a new home for those, but not have them "active" to the node until later during the maintenance window?
For ASM, it seems like it would be possible to treat it the same way as the database and simply update the ASM SID in /etc/oratab
+ASM1:/opt/oracle/product/10.2.0.5/asm1
but I'm not totally confident in that as I'm afraid the CRS home may already have references to the ASM home in the cluster registry.
For CRS, it seems like the home is pretty well hard-wired into the node startup scripts and installing a brand new CRS home will probably disrupt the running CRS home.
Any thoughts about this?
Hi,
user5448593 wrote:
I'm fairly new to RAC/ASM and haven't performed any patch set upgrades yet. Back in the simple days when I wanted to apply a patch set to a database, say from 10.2.0.4 to 10.2.0.5, I would create a brand new Oracle home ahead of time and apply the patch set to it.
Now with CRS and ASM, is there a way to pre-stage a new home for those, but not have them "active" to the node until later during the maintenance window?
Although you have not mentioned the version you are actually on, this is quite a topical question and dilemma.
Starting with 11.2, only "out-of-place" patchset upgrades are supported for Grid Infrastructure.
>
For ASM, it seems like it would be possible to treat the same way as database and simply update ASM SID in /etc/oratab
+ASM1:/opt/oracle/product/10.2.0.5/asm1
but I'm not totally confident in that as I'm afraid the CRS home may already have references to the ASM home in the cluster registry.
For CRS, it seems like the home is pretty well hard-wired into the node startup scripts and installing a brand new CRS home will probably disrupt the running CRS home.
Any thoughts about this?
As of 11gR2, ASM is part of the Grid Infrastructure, so it runs from the same home, and it is not recommended to separate them (although you can).
By the way, what is your upgrade path? It would be easier to answer your questions if we knew that, as there have been quite a few enhancements and changes in the upgrade/patching process from 10g to 11g (even between 11gR1 and 11gR2).
Regards,
Jozsef -
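On the OCR worry raised in the question: in 10gR2 the ASM registration in the cluster registry can be repointed with `srvctl modify asm` rather than by editing /etc/oratab alone. A hypothetical runbook generator is sketched below; node name, instance name, and path are illustrative, and the exact flags should be verified with `srvctl modify asm -h` on your release:

```shell
# Print (do not run) the commands that repoint the OCR-registered ASM
# home to the pre-staged, patched one.  All names below are examples;
# review the generated runbook before the maintenance window.
NODE=rac1
ASM_SID=+ASM1
NEW_ASM_HOME=/opt/oracle/product/10.2.0.5/asm1
cat <<EOF
srvctl stop asm -n ${NODE}
srvctl modify asm -n ${NODE} -i ${ASM_SID} -o ${NEW_ASM_HOME}
srvctl start asm -n ${NODE}
EOF
```

Generating and reviewing the commands first keeps the actual downtime window short, in the spirit of the out-of-place strategy described above.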
Campus cluster with storage replication
Hi all..
we are planning to implement a campus cluster with storage replication over a distance of 4 km using the remote mirror feature of the Sun StorageTek 6140.
The primary storage (the one where the quorum resides) and the replicated secondary storage will be in separate sites, interconnected with dedicated single-mode fiber.
The nodes of the cluster will use the primary storage, and the data from the primary will be replicated to the secondary storage using remote mirror.
Now, if the primary storage fails completely, how can the cluster continue operating with the secondary storage? What is the procedure? What does the initial configuration look like?
Regards..
S
Hi,
a high level overview with a list of restrictions can be found here:
http://docs.sun.com/app/docs/doc/819-2971/6n57mi28m?q=TrueCopy&a=view
More details how to set this up can be found at:
http://docs.sun.com/app/docs/doc/819-2971/6n57mi28r?a=view
The basic setup would be to have 2 nodes, 2 storage boxes, and TrueCopy between the 2 boxes, but no cross-cabling. The HAStoragePlus resource, being part of a service resource group, would use a device that had been "cldevice replicate"-ed by the administrator, so that the "same" device can be used on both nodes.
I am not sure how a failover is triggered if the primary storage box fails. But due to the "replication" step mentioned above, SC knows how to reconfigure the replication in the case of a failover.
Unfortunately, due to lack of HDS storage in my own lab, I was not able to test this setup; so this is all theory.
Regards
Hartmut
PS: Keep in mind that the only replication technology integrated into SC today is HDS TrueCopy. If you're thinking of doing manual failovers anyway, you could have a look at the Sun Cluster Geographic Edition, which is more of a disaster-recovery-style configuration that combines 2 or more clusters and is able to fail over resource groups including replication; this product already supports more replication technologies and will support even more in the future. Have a look at http://docsview.sfbay.sun.com/app/docs/coll/1191.3
Clone multi-node RAC and ASM to single node
Hi everyone.
I need to clone a system with 3 application servers and a 2-node Oracle RAC database with ASM to a single node. The operating system is RHEL 5.
I have looked at some Metalink notes but couldn't find anything. I found notes on going from multi-node to single-node, but nothing on RAC to non-RAC.
The eBS version is 11.5.10.2 and the database version is 10.2.0.3.
Is this clone possible?
Thanks very much.
Hi User,
Please follow the links below and see if they are helpful:
EBS R12 with RAC and non-RAC
Re: RAC to single instance ebs R12
Re: Clone Oracle Apps 11.5.10.2 RacDB to Non-RAC DB
Re: CLONING R12 RAC to NON RAC CLONING giving error RMAN-05517 temporary file
Migrating the DB-Tier (DB and CM) to a Two-Node non-RAC Cluster
Also check:
http://www.oracle.com/technology/pub/articles/chan_sing2rac_install.html
Regards
Helios -
Reinstall DB and ASM using existing RAW devices in RAC
Hi,
We have two database servers in a cluster environment, DB1 and DB2, using a CX300 (SAN) as the storage device. Recently we upgraded the OS on DB1 from RHEL 3 to RHEL 4, while DB2 is still running RHEL 3. Due to some application problems we want DB1 rolled back to RHEL 3.
DB1 has the ASM1 instance and DB2 has the ASM2 instance running, and similarly SID1 and SID2 on each of them.
Since we want to roll back to RHEL 3, it will be a clean install on DB1. My problem is that I have never done this kind of reinstallation of the DB and ASM using existing raw devices.
Can someone send me instructions and steps on how to do the reinstall without disturbing RAC, DB2, and the data on the raw/CX300 (SAN) devices?
I am basically a system admin, not a full Oracle DBA. I will be grateful for your help.
Thanks,
Shiva
It means that before proceeding, a proper database backup must be taken; then the installation has to be started from scratch. RHEL 3 is certified with 10gR1 and 10gR2; you should be aware of the patchsets available for your Oracle version.
I suggest you read the Clusterware and RDBMS installation guides:
Oracle® Database Release Notes
10g Release 2 (10.2) for Linux x86
B15659-03
Oracle® Database Oracle Clusterware and Oracle Real Application Clusters Installation Guide
10g Release 2 (10.2) for Linux
Part Number B14203-08
Installing Oracle RAC 10g Release 2 on Linux x86
~ Madrid -
High availability without RAC DB on windows cluster and multiple DB instanc
Is it possible to achieve high availability with two Oracle databases on Windows 2003 cluster servers and a shared SAN, without installing and configuring a RAC database and ASM?
Can we use Veritas, Symantec, or any other tool to get this done?
What are the options available to achieve this?
Appreciate the response.
Thanks
Noor
Please, no double postings; this will not get you answers faster...
For answer see here:
HA Oracle DB Clustering requirement without RAC DB on shared SAN Server?