Advice on Oracle RAC Listener Setup

Hi Forum
I am installing two databases on a 2-node RAC environment.
We have SCAN set up, listening on port 1521.
I want the two databases I create to have separate listener ports, i.e.
database1 to listen on port 1531
database2 to listen on port 1532
We are using Grid Infrastructure 11.2.0.3, and the same version for the databases.
Is there a best practice or process that I can follow to do this?
Thanks in advance.

The best practice is to keep the common SCAN listener (since it already has redundancy). For the local listeners, you are free to choose a separate port for each database or to let them share the same port (the default configuration). You can change the local ports at any time, even after a default installation, so there is nothing to worry about here. Just decide on the SCAN port early, before handing connection details to users: everything behind SCAN is transparent to them, but the SCAN port itself must be communicated in advance.
Regards
Tushar
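To illustrate the split described above, here is a minimal Python sketch. The SCAN host name, VIP host name and service names are hypothetical placeholders; only the ports (1521, 1531, 1532) come from the question. It just builds strings, so treat it as a planning aid rather than a configuration tool.

```python
# Hypothetical SCAN host; the SCAN port 1521 is taken from the question.
SCAN_HOST = "rac-scan.example.com"
SCAN_PORT = 1521

# Node-local listener ports the poster wants, per database.
LOCAL_PORTS = {"database1": 1531, "database2": 1532}


def client_connect_string(service_name: str) -> str:
    """EZConnect-style string: clients only ever see the SCAN host and port."""
    return f"//{SCAN_HOST}:{SCAN_PORT}/{service_name}"


def local_listener_address(vip_host: str, database: str) -> str:
    """Oracle Net address of the node-local listener for one database."""
    port = LOCAL_PORTS[database]
    return f"(ADDRESS=(PROTOCOL=TCP)(HOST={vip_host})(PORT={port}))"


if __name__ == "__main__":
    for db in LOCAL_PORTS:
        print(db, client_connect_string(db), local_listener_address("node1-vip", db))
```

The client string never changes when a local port changes, which is why only the SCAN port needs to be agreed with users up front.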

Similar Messages

Oracle RAC listener password protection

Dear Gurus,
We have a 2-node RAC setup on 11gR2 and, as part of hardening, we wish to set a password for the listener.
Can someone please guide us on how to set a password on a listener that is registered with CRS? What would be the impact, if any?
Also, there are two things that should be noted:
1) We are not using the SCAN feature.
2) The listener we created should be owned by the oracle user, but all listeners are being started by grid.
Node 1 -
ps -ef | grep -i tns
root 125 2 0 Oct30 ? 00:00:00 [netns]
ora11g 35141 73510 0 12:50 pts/0 00:00:00 grep -i tns
grid 41763 1 0 Nov04 ? 00:00:05 /u01/app/11.2.0/grid/bin/tnslsnr LISTENER -inherit
grid 49634 1 0 Nov04 ? 00:00:05 /u01/app/ora11g/product/11.2.0/db_1/bin/tnslsnr LISTENER_REMCORP1 -inherit
Node 2 -
ps -ef | grep -i tns
root 125 2 0 Oct30 ? 00:00:00 [netns]
ora11g 33783 33742 0 12:50 pts/1 00:00:00 grep -i tns
grid 49817 1 0 Nov04 ? 00:00:05 /u01/app/11.2.0/grid/bin/tnslsnr LISTENER -inherit
grid 56446 1 0 Nov04 ? 00:00:05 /u01/app/ora11g/product/11.2.0/db_1/bin/tnslsnr LISTENER_REMCORP2 -inherit
Regards,
Nikhil Mehta.
Edited by: 905267 on Nov 6, 2012 1:13 AM

Thanks for your reply Vlethakula.
When I run the commands from the Grid/ASM home, lsnrctl reports that it could not find the service name (TNS-01101), whereas status works from the Oracle home. Stopping the listener from the Oracle home gives a TNS-01190 error.
remedy-ebu-db1*+ASM1:/home/grid>lsnrctl
LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 06-NOV-2012 18:20:00
Copyright (c) 1991, 2011, Oracle. All rights reserved.
Welcome to LSNRCTL, type "help" for information.
LSNRCTL> set current_listener LISTENER_REMCORP1
Current Listener is LISTENER_REMCORP1
LSNRCTL> stop LISTENER_REMCORP1
TNS-01101: Could not find service name
LSNRCTL> stop LISTENER_REMCORP1
TNS-01101: Could not find service name
LSNRCTL> status
TNS-01101: Could not find service name
LSNRCTL> exit
remedy-ebu-db1*+ASM1:/home/grid>su - ora11
su: user ora11 does not exist
remedy-ebu-db1*+ASM1:/home/grid>su - ora11g
Password:
remedy-ebu-db1*REMCORP1:/home/ora11g>lsnrctl
LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 07-NOV-2012 09:18:52
Copyright (c) 1991, 2011, Oracle. All rights reserved.
Welcome to LSNRCTL, type "help" for information.
LSNRCTL> set current_listener LISTENER_REMCORP1
Current Listener is LISTENER_REMCORP1
LSNRCTL> status
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=remedy-vip-ebu-db1)(PORT=1526)(IP=FIRST)))
STATUS of the LISTENER
Alias LISTENER_REMCORP1
Version TNSLSNR for Linux: Version 11.2.0.3.0 - Production
Start Date 04-NOV-2012 14:56:49
Uptime 2 days 18 hr. 22 min. 17 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /u01/app/ora11g/product/11.2.0/db_1/network/admin/listener.ora
Listener Log File /u01/app/ora11g/product/11.2.0/db_1/log/diag/tnslsnr/remedy-ebu-db1/listener_remcorp1/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=121.244.255.54)(PORT=1526)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=121.244.255.50)(PORT=1526)))
Services Summary...
Service "REMCORP" has 2 instance(s).
Instance "REMCORP1", status READY, has 1 handler(s) for this service...
Instance "REMCORP2", status READY, has 1 handler(s) for this service...
Service "REMCORPXDB" has 2 instance(s).
Instance "REMCORP1", status READY, has 1 handler(s) for this service...
Instance "REMCORP2", status READY, has 1 handler(s) for this service...
The command completed successfully
LSNRCTL> stop
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=remedy-vip-ebu-db1)(PORT=1526)(IP=FIRST)))
TNS-01190: The user is not authorized to execute the requested listener command
LSNRCTL>
Regards,
Nikhil Mehta.
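The ps listings in this thread already hint at why the stop fails: the listener binary lives in the ora11g home, but the process runs as grid, and with "Local OS Authentication" it is normally only the OS user that started the listener who may administer it. A minimal Python sketch, assuming standard `ps -ef` column order, pulls out the owner of each tnslsnr process (the function name is my own, not an Oracle tool):

```python
def listener_processes(ps_output: str):
    """Return (os_user, binary_path, listener_name) for each tnslsnr process.

    Assumes `ps -ef` columns: UID PID PPID C STIME TTY TIME CMD [ARGS...].
    """
    found = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) < 9:
            continue  # too short to be "tnslsnr <NAME> ..."
        command = fields[7]
        if command.endswith("/tnslsnr"):
            found.append((fields[0], command, fields[8]))
    return found


SAMPLE = """\
root 125 2 0 Oct30 ? 00:00:00 [netns]
ora11g 35141 73510 0 12:50 pts/0 00:00:00 grep -i tns
grid 41763 1 0 Nov04 ? 00:00:05 /u01/app/11.2.0/grid/bin/tnslsnr LISTENER -inherit
grid 49634 1 0 Nov04 ? 00:00:05 /u01/app/ora11g/product/11.2.0/db_1/bin/tnslsnr LISTENER_REMCORP1 -inherit
"""

if __name__ == "__main__":
    for user, binary, name in listener_processes(SAMPLE):
        print(f"{name}: runs as {user} from {binary}")
```

If the owner printed here is not the user running lsnrctl, a TNS-01190 on stop is the expected outcome rather than a fault.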

Oracle RAC listener password

Hi Guys,
We have a 2-node RAC setup on Oracle 10g (10.2.0.4) and we wish to set a password on the listener, which is registered with CRS.
Can someone please guide us on how to set a password on a listener that is registered with CRS?
What would be the impact, if any?
Any help is appreciated.
Regards,
Milan

http://docs.oracle.com/cd/B19306_01/network.102/b14213/lsnrctl.htm#CIHEFEDH
Just FYI, from 10g onward we have this by default:
lsnrctl status
Alias                     LISTENER
Version                   TNSLSNR for Solaris: Version 11.2.0.3.0 - Production
Start Date                29-MAR-2012 12:11:31
Uptime                    5 days 0 hr. 46 min. 19 sec
Trace Level               off
Security                  ON: Local OS Authentication     <<--------------see this
SNMP                      OFF
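If you want to check the "Security" line across several listeners without reading each status by eye, something like the following Python sketch could do it. It assumes the column-aligned layout shown above (label, then two or more spaces, then value); the function name and the sample text are only for illustration.

```python
import re


def parse_status_header(status_text: str) -> dict:
    """Map each header label of `lsnrctl status` output to its value.

    Assumes label and value are separated by a run of two or more spaces.
    """
    fields = {}
    for line in status_text.splitlines():
        parts = re.split(r"\s{2,}", line.strip(), maxsplit=2)
        if len(parts) >= 2 and parts[0]:
            fields[parts[0]] = parts[1]
    return fields


SAMPLE = """\
Alias                     LISTENER
Version                   TNSLSNR for Solaris: Version 11.2.0.3.0 - Production
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
"""

if __name__ == "__main__":
    header = parse_status_header(SAMPLE)
    print("OS authentication:", header.get("Security", "").startswith("ON: Local OS"))
```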

Linux set up for Oracle RAC (real application cluster)

Hi Guys,
I am working as an Oracle DBA.
I am very curious to know the basics of RAC setup at the OS level.
Can anyone provide useful guidelines for this?
I know the steps at the OS level in theory, but I have never done the setup that precedes an Oracle RAC installation myself.
I want to increase my knowledge on things like:
--how we share storage.
--how we set up the network (private & virtual IP) and how we can check that the NICs are working.
--and other required things.
I will appreciate your help, and also anyone sharing their personal experience.
Thanks in advance.

[email protected] wrote:
Want to increase knowledge on things like:
Here are very basic answers to very complex questions, from a pure Linux perspective, running an open source stack and an untainted kernel.
--how we share storage.
Using multipath, which should ship with most 2.6 kernels. The kernel sees the shared storage LUNs as SCSI devices; multipath does the rest (and ASM can use a multipath device directly).
On the physical layer, a typical setup (on a RAC node) uses an HBA PCI card that runs fibre connections into a SAN switch. You can also use InfiniBand (IB) as the I/O layer (as Oracle's Exadata database machine does). In this case the servers use HCA PCI cards and run IB cables into the switch, and the storage array likewise runs an IB cable into the switch.
--how we set up network (private & virtual IP) and how can check working of NIC's.
That depends on the architecture chosen for the Interconnect. Typical choices are GigE or InfiniBand (IB). Oracle's Exadata database machine (RAC) uses IB, as already mentioned (and it is also our preferred Interconnect technology).
With IB you would use the OFED driver stack and have a range of ib.. commands available. These can be used to configure IP over IB (IPoIB) for use as an IP-based Interconnect, bond NICs, check a port's status, and so on.
--and other required things.
As both Daniel and Hans indicated, you are asking quite complex questions that would require a manual (if not several) to be written in response. So it is best to refer to the manuals and OTN material available.
Also, if you and your company are serious about using RAC, then you should make use of Oracle's RAC Assurance group to assist you. They will provide you with starter kit information for the o/s selected. They will check every single configuration parameter afterwards and deliver a comprehensive report on what's wrong, what works and what doesn't, along with the recommended changes that need to be made.

Adding Standalone listener Oracle RAC

Dear Experts
We have an Oracle RAC setup in our organization, and we now also need to do streaming between our RAC server and another Oracle server used for public reports. We installed another network interface card in one of our Oracle RAC servers and connected it directly to the other server, but we are not able to add a listener for that interface. I manually entered the listener configuration in "listener.ora" and also added it to CRS using "srvctl add listener". srvctl starts the listener properly, but when I check its status using "lsnrctl status <listener_name>", it shows that the listener does not support any services.
Your help will really be appreciated.

Dear P
Thanks for the prompt reply. My listener for RAC is working fine, but the standalone listener for one node on a specific interface is not working. I added the listener using the "srvctl add listener" command and it starts successfully, but it does not support any services. See the output of lsnrctl status below.
[oracle@mangla ~]$ lsnrctl status listener_mangla_priv2
LSNRCTL for Linux: Version 11.1.0.6.0 - Production on 08-OCT-2010 13:01:35
Copyright (c) 1991, 2007, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=mangla-priv2)(PORT=1522)))
STATUS of the LISTENER
Alias listener_mangla_priv2
Version TNSLSNR for Linux: Version 11.1.0.6.0 - Production
Start Date 08-OCT-2010 12:35:54
Uptime 0 days 0 hr. 25 min. 41 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /u01/app/oracle/product/network/admin/listener.ora
Listener Log File /u01/app/oracle/diag/tnslsnr/mangla/listener_mangla_priv2/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.1)(PORT=1522)))
The listener supports no services
The command completed successfully
[oracle@mangla ~]$ lsnrctl status listener_mangla
LSNRCTL for Linux: Version 11.1.0.6.0 - Production on 08-OCT-2010 13:03:07
Copyright (c) 1991, 2007, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=mangla-vip)(PORT=1521)(IP=FIRST)))
STATUS of the LISTENER
Alias LISTENER_MANGLA
Version TNSLSNR for Linux: Version 11.1.0.6.0 - Production
Start Date 08-OCT-2010 08:14:41
Uptime 0 days 4 hr. 48 min. 26 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /u01/app/oracle/product/network/admin/listener.ora
Listener Log File /u01/app/oracle/diag/tnslsnr/mangla/listener_mangla/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.0.11)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.0.211)(PORT=1521)))
Services Summary...
Service "SYS$STRMADMIN.STREAMS_CAPTURE_CB_Q.PCBA" has 1 instance(s).
Instance "pcba1", status READY, has 1 handler(s) for this service...
Service "SYS$STRMADMIN.STREAMS_CAPTURE_GLB_Q.PCBA" has 1 instance(s).
Instance "pcba1", status READY, has 1 handler(s) for this service...
Service "SYS$STRMADMIN.STREAMS_CAPTURE_Q.PCBA" has 1 instance(s).
Instance "pcba1", status READY, has 1 handler(s) for this service...
Service "pcba" has 2 instance(s).
Instance "pcba1", status READY, has 1 handler(s) for this service...
Instance "pcba2", status READY, has 2 handler(s) for this service...
Service "pcbaXDB" has 2 instance(s).
Instance "pcba1", status READY, has 1 handler(s) for this service...
Instance "pcba2", status READY, has 1 handler(s) for this service...
Service "pcba_XPT" has 2 instance(s).
Instance "pcba1", status READY, has 1 handler(s) for this service...
Instance "pcba2", status READY, has 2 handler(s) for this service...
The command completed successfully
[oracle@mangla ~]$ crs_stat -t
Name           Type         Target  State   Host
ora....LA.lsnr application  ONLINE  ONLINE  mangla
ora.mangla.gsd application  ONLINE  ONLINE  mangla
ora....v2.lsnr application  ONLINE  ONLINE  mangla
ora.mangla.ons application  ONLINE  ONLINE  mangla
ora.mangla.vip application  ONLINE  ONLINE  mangla
ora.pcba.db    application  ONLINE  ONLINE  mangla
ora....a1.inst application  ONLINE  ONLINE  tarbela
ora....a2.inst application  ONLINE  ONLINE  mangla
ora....LA.lsnr application  ONLINE  ONLINE  tarbela
ora....ela.gsd application  ONLINE  ONLINE  tarbela
ora....ela.ons application  ONLINE  ONLINE  tarbela
ora....ela.vip application  ONLINE  ONLINE  tarbela
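The two status outputs above differ in exactly one way that matters: one has a "Services Summary" and the other says "The listener supports no services". A listener started by CRS is only up; instances still have to register with it (dynamically, or via a static entry in listener.ora). As a rough sketch, assuming the status text format shown in this thread, the following Python lists what a listener actually serves; the function name is hypothetical.

```python
import re

# Matches lines such as: Service "pcba" has 2 instance(s).
SERVICE_RE = re.compile(r'^\s*Service "([^"]+)" has \d+ instance\(s\)\.', re.MULTILINE)


def registered_services(status_text: str) -> list:
    """Return the service names listed in `lsnrctl status` output."""
    return SERVICE_RE.findall(status_text)


WORKING = """\
Services Summary...
Service "pcba" has 2 instance(s).
Instance "pcba1", status READY, has 1 handler(s) for this service...
Service "pcbaXDB" has 2 instance(s).
The command completed successfully
"""

EMPTY = """\
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.1)(PORT=1522)))
The listener supports no services
The command completed successfully
"""

if __name__ == "__main__":
    for name, text in (("listener_mangla", WORKING), ("listener_mangla_priv2", EMPTY)):
        services = registered_services(text)
        print(name, services if services else "no services registered")
```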

Trying to change Oracle listener port 1521 to non-default port on Oracle RAC

Could somebody please help me with the process of changing the Oracle listener port from 1521 to a non-default port in an Oracle RAC environment? I have a total of four instances.
Regards.

Please read the documentation on the LOCAL_LISTENER parameter carefully; you shouldn't put just a hostname there.
Another way to do it is to statically register the database SID with the listener. You do this in the listener.ora file; please read the documentation carefully. Otherwise you can use the netca utility, which can create the configuration properly for you.
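To make the LOCAL_LISTENER point concrete, here is a small Python sketch that builds a full address and the matching ALTER SYSTEM statement for one instance. The host name, port 1525 and instance name are hypothetical placeholders, and the helper names are my own. Note that LOCAL_LISTENER may also hold a tnsnames.ora alias, which this simple check cannot tell apart from a bare hostname.

```python
import re

# A literal listener address needs at least a HOST and a PORT inside ADDRESS.
ADDRESS_RE = re.compile(
    r"\(ADDRESS\s*=.*\(HOST\s*=\s*[^)]+\).*\(PORT\s*=\s*\d+\)", re.IGNORECASE
)


def looks_like_full_address(value: str) -> bool:
    """True if value is a literal (ADDRESS=...(HOST=...)(PORT=...)) string."""
    return bool(ADDRESS_RE.search(value))


def build_local_listener(host: str, port: int) -> str:
    """Build a literal listener address for LOCAL_LISTENER."""
    return f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port}))"


def alter_system_statement(address: str, instance: str) -> str:
    """SQL to set LOCAL_LISTENER for one RAC instance."""
    return f"ALTER SYSTEM SET local_listener='{address}' SCOPE=BOTH SID='{instance}';"


if __name__ == "__main__":
    address = build_local_listener("node1-vip", 1525)  # hypothetical host and port
    print(looks_like_full_address("node1-vip"))  # bare hostname (or an alias)
    print(alter_system_statement(address, "PROD1"))  # hypothetical instance name
```

Each of the four instances would get its own statement with its own node's host name.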

Oracle RAC installation failover

Hi,
I have an Oracle RAC installation with 2 nodes, with the data stored on a shared OCFS partition. I had a client test the connection using a JDBC string for RAC failover. I tried shutting down one of the nodes in the RAC installation, and the client could not connect to the Oracle cluster database for the next 5 to 10 minutes.
I understand that the client should fail over to the next available listener (on the next connection retry) if the node it is currently connected to has failed. Is there any configuration I should change to make failover faster?
Thanks for any advice.

Hi,
Server side failover is arranged by setting the remote_listener parameter.
Client side failover is set by using T(ransparent) A(pplication) F(ailover) (9i and higher)
or F(ast)C(onnection)F(ailover). Both are documented in the Net administrators manual for the version you didn't care to mention.
As far as I know, TAF is not supported by the JDBC thin driver (it needs the OCI driver), while FCF is supported by the thin driver from 10g onwards.
Sybrand Bakker
Senior Oracle DBA
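For the connect-time part of client-side failover, the connect descriptor has to list an address for every node. A minimal Python sketch of such a descriptor follows; the host names, port and service name are hypothetical, and in a RAC setup the hosts would normally be the node VIP names rather than the physical host names. It only assembles the string; it does not cover TAF or FCF.

```python
def failover_descriptor(hosts, port, service_name):
    """Build an Oracle Net descriptor with connect-time failover enabled."""
    addresses = "".join(
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port}))" for host in hosts
    )
    return (
        "(DESCRIPTION="
        f"(ADDRESS_LIST=(LOAD_BALANCE=ON)(FAILOVER=ON){addresses})"
        f"(CONNECT_DATA=(SERVICE_NAME={service_name})))"
    )


def jdbc_thin_url(descriptor):
    """Wrap a descriptor in a JDBC thin URL."""
    return "jdbc:oracle:thin:@" + descriptor


if __name__ == "__main__":
    # Hypothetical VIP host names and service name.
    descriptor = failover_descriptor(["node1-vip", "node2-vip"], 1521, "orcl")
    print(jdbc_thin_url(descriptor))
```

With FAILOVER=ON the client tries the next address when one listener cannot be reached, which is the behaviour the original poster expected on retry.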

How to configure sun application server 8.2 for Oracle RAC 10g

Hello,
We have numerous boxes running the Sun platform application server 8.2 and 2 boxes running enterprise version 8.2, all connecting to a 4-node Oracle RAC 10g Release 2 database. We have the system up and working. The application servers connect to the database just fine, and the apps have no problems querying, inserting, etc. However, when we do failover testing for situations where a node or nodes of the Oracle RAC database go down, the application server does not obtain new valid connections.
Our configuration is this: OracleDataSource for the data source, table validation turned on with a valid table, ONS configuration set in the properties, connectionCache enabled, and fastconnectionfailover enabled in the properties as well. We have the long Oracle RAC URL, with load balancing turned on, set as the database URL. We have the checkbox checked to fail all connections on any failure.
ONS is configured properly within the database, because we have a Java application that runs outside the application server and uses all the same settings described above (only set manually in our code for the OracleDataSource). This application works seamlessly when DB nodes are shut down. We can shut down all but one node and it keeps humming along without skipping a beat. Start up one of the others, kill the last node, and it still hums along nicely without skipping a beat.
We'd really like to get the applications running in the application server to work the same way. Any help would be greatly appreciated. We've tried every combination of configuration settings in the application server that we can think of, and it never works. I am tempted to rip the database connection pool out of the application server and configure it manually in the code, but we are using entity beans and this is the much easier approach, if it will work. It has come down to the question of whether Sun application server actually works with Oracle RAC for connection failover.

Hi,
We are also facing a similar exception. Here is the error we get when a node fails on RAC.
[#|2007-11-11T12:43:53.685+0000|WARNING|sun-appserver-ee8.1_02|javax.enterprise.system.core.transaction|_ThreadID=38;|JTS5041: The resource manager is doing work outside a global transaction
oracle.jdbc.xa.OracleXAException
at oracle.jdbc.xa.OracleXAResource.checkError(OracleXAResource.java:1270)
at oracle.jdbc.xa.client.OracleXAResource.start(OracleXAResource.java:318)
at com.sun.gjc.spi.XAResourceImpl.start(XAResourceImpl.java:184)
at com.sun.jts.jta.TransactionState.startAssociation(TransactionState.java:258)
at com.sun.jts.jta.TransactionImpl.enlistResource(TransactionImpl.java:181)
at com.sun.enterprise.distributedtx.J2EETransaction.enlistResource(J2EETransaction.java:397)
at com.sun.enterprise.distributedtx.J2EETransactionManagerImpl.enlistResource(J2EETransactionManagerImpl.java:312)
at com.sun.enterprise.distributedtx.J2EETransactionManagerOpt.enlistResource(J2EETransactionManagerOpt.java:114)
at com.sun.enterprise.resource.ResourceManagerImpl.registerResource(ResourceManagerImpl.java:113)
at com.sun.enterprise.resource.ResourceManagerImpl.enlistResource(ResourceManagerImpl.java:71)
at com.sun.enterprise.resource.PoolManagerImpl.getResource(PoolManagerImpl.java:176)
at com.sun.enterprise.connectors.ConnectionManagerImpl.internalGetConnection(ConnectionManagerImpl.java:268)
at com.sun.enterprise.connectors.ConnectionManagerImpl.allocateConnection(ConnectionManagerImpl.java:193)
at com.sun.enterprise.connectors.ConnectionManagerImpl.allocateConnection(ConnectionManagerImpl.java:122)
at com.sun.gjc.spi.DataSource.getConnection(DataSource.java:70)
at com.syntegra.nasp.etp.dax.DBManager.getConnection(DBManager.java:192)
at com.syntegra.nasp.etp.dax.DBManager.createDBCommand(DBManager.java:241)
at com.syntegra.nasp.etp.dax.DBManager.createDBCommand(DBManager.java:251)
at com.syntegra.nasp.etp.dax.sp.SPS_PRESCRIPTION_GUID_PROC.getCommand(SPS_PRESCRIPTION_GUID_PROC.java:31)
at com.syntegra.nasp.etp.dax.sp.SPS_PRESCRIPTION_GUID_PROC.execute(SPS_PRESCRIPTION_GUID_PROC.java:23)
at com.syntegra.nasp.etp.dax.PrescriptionBaseDataMapper.loadPresciptionByGUID(PrescriptionBaseDataMapper.java:203)
at com.syntegra.nasp.etp.model.PrescriptionBase.findByPrescriptionGUID(PrescriptionBase.java:176)
at com.syntegra.nasp.etp.messages.PatientPrescriptionReleaseRequest.execute(PatientPrescriptionReleaseRequest.java:120)
at com.syntegra.nasp.etp.service.ETPSLBean.processMessage(ETPSLBean.java:159)
at sun.reflect.GeneratedMethodAccessor97.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:324)
at com.sun.enterprise.security.SecurityUtil.invoke(SecurityUtil.java:147)
at com.sun.ejb.containers.EJBLocalObjectInvocationHandler.invoke(EJBLocalObjectInvocationHandler.java:128)
at $Proxy6.processMessage(Unknown Source)
at com.syntegra.nasp.etp.listener.RequestListener.onRequest(RequestListener.java:204)
at com.syntegra.spine.csf.consumer.mdb.CSFListenerRegisteringConsumer.onRequest(CSFListenerRegisteringConsumer.java:54)
at com.syntegra.spine.csf.consumer.mdb.CSFConsumerBase.invokeListener(CSFConsumerBase.java:267)
at com.syntegra.spine.csf.consumer.mdb.CSFConsumerBase.processMessage(CSFConsumerBase.java:180)
at com.syntegra.spine.csf.consumer.mdb.CSFConsumerBase.onMessage(CSFConsumerBase.java:102)
at sun.reflect.GeneratedMethodAccessor96.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:324)
at com.sun.enterprise.security.SecurityUtil$2.run(SecurityUtil.java:153)
at java.security.AccessController.doPrivileged(Native Method)
at com.sun.enterprise.security.application.EJBSecurityManager.doAsPrivileged(EJBSecurityManager.java:955)
at com.sun.enterprise.security.SecurityUtil.invoke(SecurityUtil.java:158)
at com.sun.ejb.containers.MessageBeanContainer.deliverMessage(MessageBeanContainer.java:956)
at com.sun.ejb.containers.MessageBeanListenerImpl.deliverMessage(MessageBeanListenerImpl.java:42)
at com.sun.enterprise.connectors.inflow.MessageEndpointInvocationHandler.invoke(MessageEndpointInvocationHandler.java:130)
at $Proxy9.onMessage(Unknown Source)
at com.sun.genericra.inbound.DeliveryHelper.deliverMessage(DeliveryHelper.java:183)
at com.sun.genericra.inbound.DeliveryHelper.deliver(DeliveryHelper.
Regards
Selvan.

Problem extending an Oracle RAC 10g to a new node

Hi everyone
I need to extend an Oracle RAC, but I have problems when I add a new node. My current environment is:
1) Oracle Grid Infrastructure 11gR2 - 11.2.0.3 (upgraded from Clusterware 10gR2 + ASM 10gR2)
2) Oracle RAC Database - 10.2.0.5
(all on a single node)
The first problem came when I executed the "root.sh" script on the new node, because the script called the old Clusterware home (/oracle/product/10.2.0/crshome). I edited the file and changed this path to /oracle/gridbase/product/11.2.0/gridhome (the current GI home). Then I executed the script.
Next, I tried to extend the RAC through DBCA, but when I choose the new node and click the "Next" button, the following error appears:
"The nodes "[rstatbdbpm02]" are not part of the cluster. Make sure clusterware is active on these nodes before proceeding"
However, when I execute the "crsctl" command to view the status of the cluster, the result is correct:
[oracle@rstatbdbpm01] /home/oracle > crsctl status res -t
NAME TARGET STATE SERVER STATE_DETAILS
Local Resources
ora.DATA.dg
ONLINE ONLINE rstatbdbpm01
ONLINE ONLINE rstatbdbpm02
ora.LISTENER.lsnr
ONLINE ONLINE rstatbdbpm01
ONLINE ONLINE rstatbdbpm02
ora.asm
ONLINE ONLINE rstatbdbpm01 Started
ONLINE ONLINE rstatbdbpm02 Started
ora.gsd
OFFLINE OFFLINE rstatbdbpm01
OFFLINE OFFLINE rstatbdbpm02
ora.net1.network
ONLINE ONLINE rstatbdbpm01
ONLINE ONLINE rstatbdbpm02
ora.ons
ONLINE ONLINE rstatbdbpm01
ONLINE ONLINE rstatbdbpm02
ora.registry.acfs
ONLINE ONLINE rstatbdbpm01
ONLINE ONLINE rstatbdbpm02
Cluster Resources
ora.BDBPM.BDBPM1.inst
1 ONLINE ONLINE rstatbdbpm01
ora.BDBPM.BPMVEH.BDBPM1.srv
1 ONLINE ONLINE rstatbdbpm01
ora.BDBPM.BPMVEH.cs
1 ONLINE ONLINE rstatbdbpm01
ora.BDBPM.db
1 ONLINE ONLINE rstatbdbpm01
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rstatbdbpm02
ora.LISTENER_SCAN2.lsnr
1 ONLINE ONLINE rstatbdbpm02
ora.LISTENER_SCAN3.lsnr
1 ONLINE ONLINE rstatbdbpm01
ora.cvu
1 ONLINE ONLINE rstatbdbpm01
ora.oc4j
1 ONLINE ONLINE rstatbdbpm01
ora.rstatbdbpm01.vip
1 ONLINE ONLINE rstatbdbpm01
ora.rstatbdbpm02.vip
1 ONLINE ONLINE rstatbdbpm02
ora.scan1.vip
1 ONLINE ONLINE rstatbdbpm02
ora.scan2.vip
1 ONLINE ONLINE rstatbdbpm02
ora.scan3.vip
1 ONLINE ONLINE rstatbdbpm01
[oracle@rstatbdbpm01] /home/oracle >
Any ideas about this problem, please?
Thanks,
Luis

Hi,
Please check the DBCA trace logs; they will show which command is being run to check the status of the cluster.
Generally, the first checks should be on the inventory for the RDBMS home and the Grid home, and on making sure no Oracle-related variable is set in the environment.
Regards,
Sharma

Oracle RAC 10.2G reboots node every 45 minutes

Hello:
- We have installed Oracle RAC 10.2G for Solaris X86 ( 64 bit ).
- On one node, there are no issues. But the other node ( I think )
is being rebooted by CRS every 45 minutes or so.
- Is this issue caused by some misconfiguration I did during the install ?
- Or is there a patch available to fix this ?
- Has anyone else encountered this problem ?
Thanks
jlem

Hello:
- I re-installed Oracle RAC. The nodes were only rebooted once so far.
So, the second install may be ok. If not, I have provided answers to the first email reply.
- Any help given is most welcome. In meantime, I will continue searching the oracle forums
for solutions.
- My environment is:
- both nodes are running under vmware ESX server version 3.0.1
- the shared storage for OCR and Voting Disk is a raw shared device under vmware
- both nodes are using Solaris X86 5.10 update 5
- Oracle version is: 10.2.0.3 ( patched from version 10.2.0.1 )
- My public network configuration is:
node 1:
e1000g0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 10.20.1.74 netmask ffff0000 broadcast 10.20.255.255
ether 0:c:29:3a:45:a9
e1000g0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 10.20.1.77 netmask ffff0000 broadcast 10.20.255.255
node 2:
e1000g0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 10.20.1.75 netmask ffff0000 broadcast 10.20.255.255
ether 0:c:29:2b:db:90
e1000g0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 10.20.1.78 netmask ffff0000 broadcast 10.20.255.255
- My private network configuration is:
node 1:
e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 192.168.0.1 netmask ffffff00 broadcast 192.168.0.255
ether 0:c:29:3a:45:b3
node 2:
e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 192.168.0.2 netmask ffffff00 broadcast 192.168.0.255
ether 0:c:29:2b:db:9a
- My storage solution is:
- 3 virtual shared SCSI hard disks ( each 500 MB in size )
- My log files are:
- /var/adm/messages
- doesn't report much only the following:
Nov 12 10:57:05 saucer nfs4cbd[328]: [ID 867284 daemon.notice] nfsv4 cannot determine local hostname binding for transport tcp6 - delegations will not be available on this transport
Nov 12 10:57:21 saucer savecore: [ID 570001 auth.error] reboot after panic: forced crash dump initiated at user request
Nov 12 10:57:21 saucer savecore: [ID 748169 auth.error] saving system crash dump in /var/crash/saucer/*.2
Nov 12 10:57:41 saucer root: [ID 702911 user.error] Oracle Cluster Ready Services disabled by administrator.
Nov 12 10:57:54 saucer rootnex: [ID 349649 kern.info] xsvc0 at root
Nov 12 10:57:54 saucer genunix: [ID 936769 kern.info] xsvc0 is /xsvc
- ocssd.log file for node1 indicates that node2 was evicted after being flagged as "impending reconfig". Details are:
[    CSSD]2008-11-12 10:55:43.700 [15] >TRACE: clssnmPollingThread: node saucer (2) is impending reconfig
[    CSSD]2008-11-12 10:55:43.700 [15] >WARNING: clssnmPollingThread: node saucer (2) at 90% heartbeat fatal, eviction in 0.973 seconds
[    CSSD]2008-11-12 10:55:44.679 [15] >TRACE: clssnmPollingThread: node saucer (2) is impending reconfig
[    CSSD]2008-11-12 10:55:44.679 [15] >TRACE: clssnmPollingThread: Eviction started for node saucer (2), flags 0x000d, state 3, wt4c 0
[    CSSD]2008-11-12 10:55:44.690 [17] >TRACE: clssnmDoSyncUpdate: Initiating sync 3
[    CSSD]2008-11-12 10:55:44.690 [17] >TRACE: clssnmDoSyncUpdate: diskTimeout set to (27000)ms
[    CSSD]2008-11-12 10:55:44.691 [17] >TRACE: clssnmSetupAckWait: Ack message type (11)
[    CSSD]2008-11-12 10:55:44.691 [17] >TRACE: clssnmSetupAckWait: node(1) is ALIVE
[    CSSD]2008-11-12 10:55:44.691 [17] >TRACE: clssnmSetupAckWait: node(2) is ALIVE
[    CSSD]2008-11-12 10:55:44.691 [17] >TRACE: clssnmSendSync: syncSeqNo(3)
- node2 ocssd.log does not indicate the problem. See below for details:
[    CSSD]2008-11-12 10:52:34.731 [11] >TRACE: clssgmClientConnectMsg: Connect from con(da8410) proc(dab900) pid() proto(10:2:1:1)
[    CSSD]2008-11-12 10:53:37.305 [11] >TRACE: clssgmClientConnectMsg: Connect from con(da8410) proc(dab900) pid() proto(10:2:1:1)
[    CSSD]2008-11-12 10:54:40.515 [11] >TRACE: clssgmClientConnectMsg: Connect from con(da8410) proc(dab900) pid() proto(10:2:1:1)
[    CSSD]2008-11-12 11:18:09.997 >USER: Oracle Database 10g CSS Release 10.2.0.3.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
[    CSSD]2008-11-12 11:18:09.997 >USER: CSS daemon log for node saucer, number 2, in cluster crs
[ clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=saucerDBG_CSSD))
[    CSSD]2008-11-12 11:18:10.016 [1] >TRACE: clssscmain: local-only set to false
[    CSSD]2008-11-12 11:18:10.031 [1] >TRACE: clssnmReadNodeInfo: added node 1 (flying) to cluster
[    CSSD]2008-11-12 11:18:10.042 [1] >TRACE: clssnmReadNodeInfo: added node 2 (saucer) to cluster
[    CSSD]2008-11-12 11:18:10.057 [5] >TRACE: clssnm_skgxnmon: skgxn init failed
[    CSSD]2008-11-12 11:18:10.057 [1] >TRACE: clssnm_skgxnonline: Using vacuous skgxn monitor
- ORACLE VERIFY: cluvfy was run on node2 resulting with the following:
bash-3.00$ ./cluvfy comp ocr -n all -verbose
Verifying OCR integrity
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations.
Uniqueness check for OCR device passed.
Checking the version of OCR...
OCR of correct Version "2" exists.
Checking data integrity of OCR...
Data integrity check for OCR passed.
OCR integrity check passed.
Verification of OCR integrity was successful.
bash-3.00$
Thanks
jlem
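Since the reboots recur, it may help to pull every heartbeat warning out of ocssd.log and line the timestamps up against the reboot times. Below is a minimal Python sketch for that. It assumes the log line format shown in this post and that wrapped lines have been rejoined first; the pattern and function name are my own, not part of any Oracle tooling.

```python
import re

# Matches CSSD warnings such as:
# "... node saucer (2) at 90% heartbeat fatal, eviction in 0.973 seconds"
EVICTION_RE = re.compile(
    r"\[\s*CSSD\](?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}).*"
    r"node (?P<node>\S+) \((?P<num>\d+)\) at (?P<pct>\d+)% heartbeat fatal, "
    r"eviction in (?P<secs>\d+(?:\.\d+)?) seconds"
)


def eviction_warnings(log_text: str) -> list:
    """Return one dict per heartbeat warning found in ocssd.log text."""
    warnings = []
    for line in log_text.splitlines():
        match = EVICTION_RE.search(line)
        if match:
            warnings.append({
                "timestamp": match.group("ts"),
                "node": match.group("node"),
                "percent": int(match.group("pct")),
                "seconds_left": float(match.group("secs")),
            })
    return warnings


SAMPLE = (
    "[    CSSD]2008-11-12 10:55:43.700 [15] >TRACE: clssnmPollingThread: "
    "node saucer (2) is impending reconfig\n"
    "[    CSSD]2008-11-12 10:55:43.700 [15] >WARNING: clssnmPollingThread: "
    "node saucer (2) at 90% heartbeat fatal, eviction in 0.973 seconds\n"
)

if __name__ == "__main__":
    for warning in eviction_warnings(SAMPLE):
        print(warning)
```

Warnings that consistently precede each reboot would point at missed network heartbeats on the private interconnect rather than at the OCR, which cluvfy reports as healthy.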

Oracle RAC: CRS fails to start

This two-node RAC serves as a Data Guard standby database.
Versions: Red Linux 5.6, Oracle 10.2.0.3.0
node1->$ crsctl check crs
CSS appears healthy
Cannot communicate with CRS
EVM appears healthy
node1->$ crsctl query css votedisk
0. 0 /dev/raw/raw1
located 1 votedisk(s).
node1->$ ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 497744
Used space (kbytes) : 3820
Available space (kbytes) : 493924
ID : 1682116375
Device/File Name : /dev/raw/raw4
Device/File integrity check succeeded
Device/File not configured
Cluster registry integrity check succeeded
# ./oifcfg getif
eth0 10.17.19.0 global cluster_interconnect
eth1 172.17.19.0 global public
# /etc/init.d/init.crs start
node1->$ ps -ef|grep crs
root 5083 1 0 15:10 ? 00:00:00 /bin/su -l oracle -c sh -c 'ulimit -c unlimited; cd /app/oracle/product/10.2.0/crs_1/log/node1/evmd; exec /app/oracle/product/10.2.0/crs_1/bin/evmd '
oracle 17459 4769 0 16:09 pts/1 00:00:00 grep crs
oracle 26397 5083 0 15:51 ? 00:00:00 /app/oracle/product/10.2.0/crs_1/bin/evmd.bin
root 26619 26370 0 15:51 ? 00:00:00 /bin/su -l oracle -c /bin/sh -c 'cd /app/oracle/product/10.2.0/crs_1/log/node1/cssd/oclsomon; ulimit -c unlimited; /app/oracle/product/10.2.0/crs_1/bin/oclsomon || exit $?'
oracle 26626 26619 0 15:51 ? 00:00:00 /bin/sh -c cd /app/oracle/product/10.2.0/crs_1/log/node1/cssd/oclsomon; ulimit -c unlimited; /app/oracle/product/10.2.0/crs_1/bin/oclsomon || exit $?
oracle 26672 26626 0 15:51 ? 00:00:00 /app/oracle/product/10.2.0/crs_1/bin/oclsomon.bin
oracle 26691 26371 0 15:51 ? 00:00:00 /app/oracle/product/10.2.0/crs_1/bin/ocssd.bin
oracle 27094 26397 0 15:51 ? 00:00:00 /app/oracle/product/10.2.0/crs_1/bin/evmlogger.bin -o /app/oracle/product/10.2.0/crs_1/evm/log/evmlogger.info -l /app/oracle/product/10.2.0/crs_1/evm/log/evmlogger.log
Partial contents of alertnode1.log:
2012-11-13 15:51:07.152
[cssd(26691)]CRS-1605:CSSD voting file is online: /dev/raw/raw1. Details in /app/oracle/product/10.2.0/crs_1/log/node1/cssd/ocssd.log.
2012-11-13 15:51:08.084
[cssd(26691)]CRS-1601:CSSD Reconfiguration complete. Active nodes are node1 node2 .
2012-11-13 15:51:08.320
[evmd(26397)]CRS-1401:EVMD started on node node1.
Contents of ocssd.log:
[    CSSD]2012-11-13 15:51:05.037 >USER: Oracle Database 10g CSS Release 10.2.0.3.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
[    CSSD]2012-11-13 15:51:05.037 >USER: CSS daemon log for node node1, number 1, in cluster crs
[    CSSD]2012-11-13 15:51:05.040 [2246605696] >TRACE: clssscmain: local-only set to false
[ clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=node1DBG_CSSD))
[    CSSD]2012-11-13 15:51:05.065 [2246605696] >TRACE: clssnmReadNodeInfo: added node 1 (node1) to cluster
[    CSSD]2012-11-13 15:51:05.074 [2246605696] >TRACE: clssnmReadNodeInfo: added node 2 (node2) to cluster
[    CSSD]2012-11-13 15:51:05.077 [1120115008] >TRACE: clssnm_skgxnmon: skgxn init failed
[    CSSD]2012-11-13 15:51:05.077 [2246605696] >TRACE: clssnm_skgxnonline: Using vacuous skgxn monitor
[    CSSD]2012-11-13 15:51:05.079 [2246605696] >TRACE: clssnmNMInitialize: misscount set to (60), impending reconfig threshold set to (56000)
[    CSSD]2012-11-13 15:51:05.079 [2246605696] >TRACE: clssnmNMInitialize: diskShortTimeout set to (57000)ms
[    CSSD]2012-11-13 15:51:05.080 [2246605696] >TRACE: clssnmNMInitialize: diskLongTimeout set to (200000)ms
[    CSSD]2012-11-13 15:51:05.082 [2246605696] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0//dev/raw/raw1)
[    CSSD]2012-11-13 15:51:05.082 [1120115008] >TRACE: clssnmvDPT: spawned for disk 0 (/dev/raw/raw1)
[    CSSD]2012-11-13 15:51:07.127 [1120115008] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/raw/raw1)
[    CSSD]2012-11-13 15:51:07.153 [1130604864] >TRACE: clssnmvKillBlockThread: spawned for disk 0 (/dev/raw/raw1) initial sleep interval (1000)ms
[    CSSD]2012-11-13 15:51:07.161 [2246605696] >TRACE: clssnmFatalInit: fatal mode enabled
[    CSSD]2012-11-13 15:51:07.161 [1151584576] >TRACE: clssnmconnect: connecting to node 1, flags 0x0001, connector 1
[    CSSD]2012-11-13 15:51:07.161 [1120115008] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(12) wrtcnt(78619) LATS(1830084) Disk lastSeqNo(78619)
[    CSSD]2012-11-13 15:51:07.162 [1151584576] >TRACE: clssnmClusterListener: Listening on (ADDRESS=(PROTOCOL=tcp)(HOST=node1-priv)(PORT=49895))
[    CSSD]2012-11-13 15:51:07.162 [1151584576] >TRACE: clssnmconnect: connecting to node 0, flags 0x0000, connector 1
[    CSSD]2012-11-13 15:51:07.162 [1151584576] >TRACE: clssnmClusterListener: Probing node 2, con (0x2aaaac10c320)
[    CSSD]2012-11-13 15:51:07.171 [1162074432] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1))
[    CSSD]2012-11-13 15:51:07.171 [1162074432] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_node1_crs))
[    CSSD]2012-11-13 15:51:07.172 [1193544000] >TRACE: clssgmPeerListener: Listening on (ADDRESS=(PROTOCOL=tcp)(DEV=19)(HOST=10.17.19.20)(PORT=18701))
[    CSSD]2012-11-13 15:51:07.198 [1151584576] >TRACE: clssnmConnComplete: connected to node 2 (con 0x2aaaac163b50), state 3 birth 0, unique 1352712566/1352712566 prevConuni(0)
[    CSSD]2012-11-13 15:51:07.673 [1204033856] >TRACE: clssnmPollingThread: Connection complete
[    CSSD]2012-11-13 15:51:07.673 [1214523712] >TRACE: clssnmSendingThread: Connection complete
[    CSSD]2012-11-13 15:51:07.673 [1225013568] >TRACE: clssnmRcfgMgrThread: Connection complete
[    CSSD]2012-11-13 15:51:08.003 [1151584576] >TRACE: clssnmHandleSync: Acknowledging sync: src[2] srcName[node2] seq[45] sync[12]
[    CSSD]2012-11-13 15:51:08.003 [1151584576] >TRACE: clssnmHandleSync: diskTimeout set to (57000)ms
[    CSSD]2012-11-13 15:51:08.003 [1151584576] >TRACE: clssnmSendVoteInfo: node(2) syncSeqNo(12)
[    CSSD]2012-11-13 15:51:08.004 [1151584576] >TRACE: clssnmUpdateNodeState: node 0, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
[    CSSD]2012-11-13 15:51:08.004 [1151584576] >TRACE: clssnmDeactivateNode: node 0 () left cluster
[    CSSD]2012-11-13 15:51:08.004 [1151584576] >TRACE: clssnmUpdateNodeState: node 1, state (1/2) unique (1352793064/1352793064) prevConuni(0) birth (0/12) (old/new)
[    CSSD]2012-11-13 15:51:08.004 [1151584576] >TRACE: clssnmUpdateNodeState: node 2, state (4/3) unique (1352712566/1352712566) prevConuni(0) birth (0/1) (old/new)
[    CSSD]2012-11-13 15:51:08.004 [1151584576] >USER: clssnmHandleUpdate: SYNC(12) from node(2) completed
[    CSSD]2012-11-13 15:51:08.004 [1151584576] >USER: clssnmHandleUpdate: NODE 1 (node1) IS ACTIVE MEMBER OF CLUSTER
[    CSSD]2012-11-13 15:51:08.004 [1151584576] >USER: clssnmHandleUpdate: NODE 2 (node2) IS ACTIVE MEMBER OF CLUSTER
[    CSSD]2012-11-13 15:51:08.004 [1151584576] >TRACE: clssnmHandleUpdate: diskTimeout set to (200000)ms
[    CSSD]2012-11-13 15:51:08.081 [2246605696] >USER: NMEVENT_SUSPEND [00][00][00][00]
[    CSSD]2012-11-13 15:51:08.081 [1235503424] >TRACE: clssgmReconfigThread: started for reconfig (12)
[    CSSD]2012-11-13 15:51:08.081 [1235503424] >USER: NMEVENT_RECONFIG [00][00][00][06]
[    CSSD]2012-11-13 15:51:08.081 [1235503424] >TRACE: clssgmEstablishConnections: 2 nodes in cluster incarn 12
[    CSSD]2012-11-13 15:51:08.082 [1193544000] >TRACE: clssgmInitialRecv: (0xd9ae050) accepted a new connection from node 2 born at 1 active (2, 2), vers (10,3,1,2)
[    CSSD]2012-11-13 15:51:08.082 [1193544000] >TRACE: clssgmInitialRecv: conns done (2/2)
[    CSSD]2012-11-13 15:51:08.082 [1235503424] >TRACE: clssgmEstablishMasterNode: MASTER for 12 is node(2) birth(1)
[    CSSD]2012-11-13 15:51:08.082 [1235503424] >TRACE: clssgmChangeMasterNode: requeued 0 RPCs
[    CSSD]2012-11-13 15:51:08.083 [1193544000] >TRACE: clssgmHandleDBDone(): src/dest (2/65535) size(72) incarn 12
[    CSSD]CLSS-3000: reconfiguration successful, incarnation 12 with 2 nodes
[    CSSD]CLSS-3001: local node number 1, master node number 2
[    CSSD]2012-11-13 15:51:08.084 [1235503424] >TRACE: clssgmReconfigThread: completed for reconfig(12), with status(1)
[    CSSD]2012-11-13 15:51:08.268 [1162074432] >TRACE: clssgmClientConnectMsg: Connect from con(0xd9b4d50) proc(0xd9b9d50) pid() proto(10:2:1:1)
[    CSSD]2012-11-13 15:51:08.268 [1193544000] >TRACE: clssgmCommonAddMember: clsomon joined (1/0x1000000/#CSS_CLSSOMON)
[    CSSD]2012-11-13 15:51:08.269 [1162074432] >TRACE: clssgmClientConnectMsg: Connect from con(0xd9b7910) proc(0xd9ba0a0) pid() proto(10:2:1:1)
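I checked the OCR, the voting disk, storage, the network, and the raw device permissions and found no problems. Sometimes running /etc/init.d/init.crs start even causes the server to reboot. The log is as follows:
/var/log/messages at the time of the reboot: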
Nov 13 15:51:03 node1 logger: Cluster Ready Services completed waiting on dependencies.
Nov 13 15:51:03 node1 logger: Cluster Ready Services completed waiting on dependencies.
Nov 13 16:10:54 node1 auditd[3667]: Audit daemon rotating log files
Nov 13 16:49:14 node1 auditd[3667]: Audit daemon rotating log files
Nov 13 16:50:37 node1 root: Cluster Ready Services completed waiting on dependencies.
Nov 13 16:52:07 node1 logger: Oracle CSS family monitor shutting down. 3
Nov 13 16:52:07 node1 root: Oracle CRSD 5797 set to stop
Nov 13 16:52:07 node1 root: Oracle CRSD 5797 shutdown completed
Nov 13 16:52:07 node1 root: Oracle EVMD set to stop
Nov 13 16:52:07 node1 root: Oracle CSSD being stopped
Nov 13 16:52:17 node1 root: Oracle CSSD being stopped
Nov 13 16:52:27 node1 root: Oracle EVMD set to stop
Nov 13 16:52:45 node1 root: Oracle CSSD being stopped
Nov 13 17:03:14 node1 root: Oracle CRSD 5797 set to stop
Nov 13 17:03:14 node1 root: Oracle CRSD 5797 shutdown completed
Nov 13 17:03:14 node1 root: Oracle EVMD set to stop
Nov 13 17:03:14 node1 root: Oracle CSSD being stopped
Nov 13 17:03:26 node1 root: Oracle Cluster Ready Services starting by user request.
Nov 13 17:03:35 node1 logger: Cluster Ready Services completed waiting on dependencies.
Nov 13 17:03:36 node1 logger: Oracle CSSD shell script failure. Duplicate CSSD.
Nov 13 17:03:36 node1 kernel: md: stopping all md devices.
Nov 13 17:21:49 node1 syslogd 1.4.1: restart.
Nov 13 17:21:49 node1 kernel: klogd 1.4.1, log source = /proc/kmsg started.
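After "Nov 13 17:03:36 node1 logger: Oracle CSSD shell script failure. Duplicate CSSD." appears, the server reboots.
I searched online and found many similar problems. Other users' failures to start CRS mostly fall into a few categories:
1. Incorrect permissions on /tmp
2. Solved by deleting the files under /var/tmp/.oracle and restarting
3. Network interface configuration problems shown by oifcfg
In my case, however, all three of these are fine. My problem is similar to this one: http://www.itpub.net/thread-1330782-1-1.html
What is causing this problem?
Post edited by user1738965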

I shut down node 1 and restarted node 2.
Still nothing is written to crsd.log.
Partial alertnode2.log:
2012-11-14 12:57:53.568
[cssd(10296)]CRS-1605:CSSD voting file is online: /dev/raw/raw1. Details in /app/oracle/product/10.2.0/crs_1/log/node2/cssd/ocssd.log.
2012-11-14 13:01:13.616
[cssd(10296)]CRS-1601:CSSD Reconfiguration complete. Active nodes are node2 .
2012-11-14 13:01:13.776
[evmd(10080)]CRS-1401:EVMD started on node node2.
Partial ocssd.log:
[    CSSD]2012-11-14 12:57:51.475 >USER: Oracle Database 10g CSS Release 10.2.0.3.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
[ clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=node2DBG_CSSD))
[    CSSD]2012-11-14 12:57:51.475 >USER: CSS daemon log for node node2, number 2, in cluster crs
[    CSSD]2012-11-14 12:57:51.482 [1618381696] >TRACE: clssscmain: local-only set to false
[    CSSD]2012-11-14 12:57:51.496 [1618381696] >TRACE: clssnmReadNodeInfo: added node 1 (node1) to cluster
[    CSSD]2012-11-14 12:57:51.500 [1618381696] >TRACE: clssnmReadNodeInfo: added node 2 (node2) to cluster
[    CSSD]2012-11-14 12:57:51.503 [1105389888] >TRACE: clssnm_skgxnmon: skgxn init failed
[    CSSD]2012-11-14 12:57:51.503 [1618381696] >TRACE: clssnm_skgxnonline: Using vacuous skgxn monitor
[    CSSD]2012-11-14 12:57:51.505 [1618381696] >TRACE: clssnmNMInitialize: misscount set to (60), impending reconfig threshold set to (56000)
[    CSSD]2012-11-14 12:57:51.505 [1618381696] >TRACE: clssnmNMInitialize: diskShortTimeout set to (57000)ms
[    CSSD]2012-11-14 12:57:51.506 [1618381696] >TRACE: clssnmNMInitialize: diskLongTimeout set to (200000)ms
[    CSSD]2012-11-14 12:57:51.508 [1618381696] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0//dev/raw/raw1)
[    CSSD]2012-11-14 12:57:51.508 [1105389888] >TRACE: clssnmvDPT: spawned for disk 0 (/dev/raw/raw1)
[    CSSD]2012-11-14 12:57:53.552 [1105389888] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/raw/raw1)
[    CSSD]2012-11-14 12:57:53.575 [1128057152] >TRACE: clssnmvKillBlockThread: spawned for disk 0 (/dev/raw/raw1) initial sleep interval (1000)ms
[    CSSD]2012-11-14 12:57:53.587 [1105389888] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(15) wrtcnt(59321) LATS(3927324) Disk lastSeqNo(59321)
[    CSSD]2012-11-14 12:57:53.589 [1618381696] >TRACE: clssnmFatalInit: fatal mode enabled
[    CSSD]2012-11-14 12:57:53.589 [1149036864] >TRACE: clssnmconnect: connecting to node 2, flags 0x0001, connector 1
[    CSSD]2012-11-14 12:57:53.590 [1149036864] >TRACE: clssnmClusterListener: Listening on (ADDRESS=(PROTOCOL=tcp)(HOST=node2-priv)(PORT=49895))
[    CSSD]2012-11-14 12:57:53.590 [1149036864] >TRACE: clssnmconnect: connecting to node 0, flags 0x0000, connector 1
[    CSSD]2012-11-14 12:57:53.590 [1149036864] >TRACE: clssnmconnect: connecting to node 1, flags 0x0001, connector 0
[    CSSD]2012-11-14 12:57:53.595 [1149036864] >TRACE: clsc_send_msg: (0x108cf430) NS err (12571, 12560), transport (530, 111, 0)
[    CSSD]2012-11-14 12:57:53.600 [1159526720] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_2))
[    CSSD]2012-11-14 12:57:53.600 [1159526720] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_node2_crs))
[    CSSD]2012-11-14 12:57:53.601 [1190996288] >TRACE: clssgmPeerListener: Listening on (ADDRESS=(PROTOCOL=tcp)(DEV=19)(HOST=10.17.19.21)(PORT=52492))
[    CSSD]2012-11-14 12:57:53.601 [1201486144] >TRACE: clssnmPollingThread: Connection complete
[    CSSD]2012-11-14 12:57:53.601 [1211976000] >TRACE: clssnmSendingThread: Connection complete
[    CSSD]2012-11-14 12:57:53.601 [1222465856] >TRACE: clssnmRcfgMgrThread: Connection complete
[    CSSD]2012-11-14 12:58:00.616 [1222465856] >TRACE: clssnmRcfgMgrThread: Local Join
[    CSSD]2012-11-14 12:58:00.616 [1222465856] >TRACE: clssnmDoSyncUpdate: Initiating sync 1
[    CSSD]2012-11-14 12:58:00.616 [1222465856] >TRACE: clssnmDoSyncUpdate: diskTimeout set to (57000)ms
[    CSSD]2012-11-14 12:58:00.616 [1222465856] >TRACE: clssnmSetupAckWait: Ack message type (11)
[    CSSD]2012-11-14 12:58:00.616 [1222465856] >TRACE: clssnmSetupAckWait: node(2) is ALIVE
[    CSSD]2012-11-14 12:58:00.616 [1222465856] >TRACE: clssnmSendSync: syncSeqNo(1)
[    CSSD]2012-11-14 12:58:00.616 [1222465856] >TRACE: clssnmWaitForAcks: Ack message type(11), ackCount(1)
[    CSSD]2012-11-14 12:58:00.616 [1149036864] >TRACE: clssnmHandleSync: Acknowledging sync: src[2] srcName[node2] seq[1] sync[1]
[    CSSD]2012-11-14 12:58:00.616 [1149036864] >TRACE: clssnmHandleSync: diskTimeout set to (57000)ms
[    CSSD]2012-11-14 12:58:00.616 [1222465856] >TRACE: clssnmWaitForAcks: done, msg type(11)
[    CSSD]2012-11-14 12:58:00.616 [1222465856] >TRACE: clssnmSetupAckWait: Ack message type (13)
[    CSSD]2012-11-14 12:58:00.616 [1222465856] >TRACE: clssnmSetupAckWait: node(2) is ACTIVE
[    CSSD]2012-11-14 12:58:00.616 [1222465856] >TRACE: clssnmSendVote: syncSeqNo(1)
[    CSSD]2012-11-14 12:58:00.616 [1222465856] >TRACE: clssnmWaitForAcks: Ack message type(13), ackCount(1)
[    CSSD]2012-11-14 12:58:00.616 [1149036864] >TRACE: clssnmSendVoteInfo: node(2) syncSeqNo(1)
[    CSSD]2012-11-14 12:58:00.616 [1222465856] >TRACE: clssnmWaitForAcks: done, msg type(13)
[    CSSD]2012-11-14 12:58:00.616 [1222465856] >TRACE: clssnmCheckDskInfo: Checking disk info...
[    CSSD]2012-11-14 12:58:00.616 [1222465856] >TRACE: clssnmCheckDskInfo: diskTimeout set to (200000)ms
[    CSSD]2012-11-14 12:58:00.616 [1222465856] >TRACE: clssnmCheckDskInfo: node(1) timeout(7030) state_network(0) state_disk(3) misstime(3934354)
[    CSSD]2012-11-14 12:58:00.671 [1618381696] >USER: NMEVENT_SUSPEND [00][00][00][00]
[    CSSD]2012-11-14 12:58:01.616 [1222465856] >TRACE: clssnmCheckDskInfo: node(1) timeout(8030) state_network(0) state_disk(3) misstime(3934354)
[    CSSD]2012-11-14 12:58:02.618 [1222465856] >TRACE: clssnmCheckDskInfo: node(1) timeout(9030) state_network(0) state_disk(3) misstime(3935354)
[    CSSD]2012-11-14 12:58:03.619 [1222465856] >TRACE: clssnmCheckDskInfo: node(1) timeout(10030) state_network(0) state_disk(3) misstime(3936354)
[    CSSD]2012-11-14 12:58:04.620 [1222465856] >TRACE: clssnmCheckDskInfo: node(1) timeout(11030) state_network(0) state_disk(3) misstime(3937354)
[    CSSD]2012-11-14 12:58:05.620 [1222465856] >TRACE: clssnmCheckDskInfo: node(1) timeout(12030) state_network(0) state_disk(3) misstime(3938354)
[    CSSD]2012-11-14 12:58:06.621 [1222465856] >TRACE: clssnmCheckDskInfo: node(1) timeout(13030) state_network(0) state_disk(3) misstime(3939364)
[    CSSD]2012-11-14 12:58:07.622 [1222465856] >TRACE: clssnmCheckDskInfo: node(1) timeout(14030) state_network(0) state_disk(3) misstime(3940364)
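(There are a lot of these clssnmCheckDskInfo entries in the middle and they exceed the reply length limit, so some of them have been removed here.)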
[    CSSD]2012-11-14 13:01:11.813 [1222465856] >TRACE: clssnmCheckDskInfo: node(1) timeout(198200) state_network(0) state_disk(3) misstime(4124704)
[    CSSD]2012-11-14 13:01:12.814 [1222465856] >TRACE: clssnmCheckDskInfo: node(1) timeout(199200) state_network(0) state_disk(3) misstime(4125704)
[    CSSD]2012-11-14 13:01:13.615 [1222465856] >TRACE: clssnmEvict: Start
[    CSSD]2012-11-14 13:01:13.615 [1222465856] >TRACE: clssnmWaitOnEvictions: Start
[    CSSD]2012-11-14 13:01:13.615 [1222465856] >TRACE: clssnmSetupAckWait: Ack message type (15)
[    CSSD]2012-11-14 13:01:13.615 [1222465856] >TRACE: clssnmSetupAckWait: node(2) is ACTIVE
[    CSSD]2012-11-14 13:01:13.615 [1222465856] >TRACE: clssnmSendUpdate: syncSeqNo(1)
[    CSSD]2012-11-14 13:01:13.615 [1222465856] >TRACE: clssnmWaitForAcks: Ack message type(15), ackCount(1)
[    CSSD]2012-11-14 13:01:13.615 [1149036864] >TRACE: clssnmUpdateNodeState: node 0, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
[    CSSD]2012-11-14 13:01:13.615 [1149036864] >TRACE: clssnmDeactivateNode: node 0 () left cluster
[    CSSD]2012-11-14 13:01:13.615 [1149036864] >TRACE: clssnmUpdateNodeState: node 1, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
[    CSSD]2012-11-14 13:01:13.615 [1149036864] >TRACE: clssnmDeactivateNode: node 1 (node1) left cluster
[    CSSD]2012-11-14 13:01:13.615 [1149036864] >TRACE: clssnmUpdateNodeState: node 2, state (2/2) unique (1352869071/1352869071) prevConuni(0) birth (1/1) (old/new)
[    CSSD]2012-11-14 13:01:13.615 [1149036864] >USER: clssnmHandleUpdate: SYNC(1) from node(2) completed
[    CSSD]2012-11-14 13:01:13.615 [1149036864] >USER: clssnmHandleUpdate: NODE 2 (node2) IS ACTIVE MEMBER OF CLUSTER
[    CSSD]2012-11-14 13:01:13.615 [1149036864] >TRACE: clssnmHandleUpdate: diskTimeout set to (200000)ms
[    CSSD]2012-11-14 13:01:13.615 [1222465856] >TRACE: clssnmWaitForAcks: done, msg type(15)
[    CSSD]2012-11-14 13:01:13.615 [1222465856] >TRACE: clssnmDoSyncUpdate: Sync Complete!
[    CSSD]2012-11-14 13:01:13.615 [1232955712] >TRACE: clssgmReconfigThread: started for reconfig (1)
[    CSSD]2012-11-14 13:01:13.615 [1232955712] >USER: NMEVENT_RECONFIG [00][00][00][04]
[    CSSD]2012-11-14 13:01:13.615 [1232955712] >TRACE: clssgmEstablishConnections: 1 nodes in cluster incarn 1
[    CSSD]2012-11-14 13:01:13.616 [1190996288] >TRACE: clssgmPeerListener: connects done (1/1)
[    CSSD]2012-11-14 13:01:13.616 [1232955712] >TRACE: clssgmEstablishMasterNode: MASTER for 1 is node(2) birth(1)
[    CSSD]2012-11-14 13:01:13.616 [1232955712] >TRACE: clssgmChangeMasterNode: requeued 0 RPCs
[    CSSD]2012-11-14 13:01:13.616 [1232955712] >TRACE: clssgmMasterCMSync: Synchronizing group/lock status
[    CSSD]2012-11-14 13:01:13.616 [1232955712] >TRACE: clssgmMasterSendDBDone: group/lock status synchronization complete
[    CSSD]CLSS-3000: reconfiguration successful, incarnation 1 with 1 nodes
[    CSSD]CLSS-3001: local node number 2, master node number 2
[    CSSD]2012-11-14 13:01:13.616 [1232955712] >TRACE: clssgmReconfigThread: completed for reconfig(1), with status(1)
[    CSSD]2012-11-14 13:01:13.732 [1159526720] >TRACE: clssgmClientConnectMsg: Connect from con(0x10a0b9a0) proc(0x10a10980) pid() proto(10:2:1:1)
[    CSSD]2012-11-14 13:01:13.732 [1159526720] >TRACE: clssgmClientConnectMsg: Connect from con(0x10a0e540) proc(0x10a10c50) pid() proto(10:2:1:1)
[    CSSD]2012-11-14 13:01:13.733 [1159526720] >TRACE: clssgmCommonAddMember: clsomon joined (2/0x1000000/#CSS_CLSSOMON)
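
One way to read the node2 excerpt above: the clssnmCheckDskInfo timeout for node 1 climbs by about one second per line until it approaches diskLongTimeout (200000 ms), and only then does clssnmEvict start. That is an interpretation of the posted log, not a confirmed diagnosis. A minimal Python sketch that pulls those two numbers out of an ocssd.log, assuming the line layout shown in the excerpt (the function name and the regular expressions are made up for illustration):

```python
import re

# Patterns match the line layout seen in the posted ocssd.log excerpt only.
CHECK_RE = re.compile(r"clssnmCheckDskInfo: node\((\d+)\) timeout\((\d+)\)")
LONG_TIMEOUT_RE = re.compile(r"diskLongTimeout set to \((\d+)\)ms")


def summarize_disk_wait(lines):
    """Return (disk_long_timeout_ms, {node_number: last_reported_timeout_ms})."""
    long_timeout = None
    last_seen = {}
    for line in lines:
        m = LONG_TIMEOUT_RE.search(line)
        if m:
            long_timeout = int(m.group(1))
            continue
        m = CHECK_RE.search(line)
        if m:
            # Keep only the most recent timeout value reported for each node.
            last_seen[int(m.group(1))] = int(m.group(2))
    return long_timeout, last_seen


# Lines copied from the excerpt above.
SAMPLE = [
    "[    CSSD]2012-11-14 12:57:51.506 [1618381696] >TRACE: clssnmNMInitialize: diskLongTimeout set to (200000)ms",
    "[    CSSD]2012-11-14 12:58:00.616 [1222465856] >TRACE: clssnmCheckDskInfo: node(1) timeout(7030) state_network(0) state_disk(3) misstime(3934354)",
    "[    CSSD]2012-11-14 13:01:12.814 [1222465856] >TRACE: clssnmCheckDskInfo: node(1) timeout(199200) state_network(0) state_disk(3) misstime(4125704)",
]

if __name__ == "__main__":
    limit, waits = summarize_disk_wait(SAMPLE)
    for node, waited in sorted(waits.items()):
        print(f"node {node}: last reported timeout {waited} ms (diskLongTimeout {limit} ms)")
```

Run against the full log, this would show how close each node came to the limit before eviction started.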

Oracle RAC / Logical Data Guard causing network problems on VMware

We have a VMware 5.0 cluster across 12 blades (6 per chassis) running a mixture of Red Hat and Windows 2008 R2 VMs. The Red Hat boxes are two two-node Oracle RAC clusters (primary and secondary), plus Apache web servers and JBoss application servers. The Windows servers are for AV/DC/Management/Monitoring.
The problem is intermittent network connectivity problems on random Windows and Red Hat boxes when the Oracle RAC builds up archive logs and then ships and applies them to the secondary nodes. It happens between ESX nodes on different blades in the same chassis, across chassis, and even when all RAC nodes are on the same ESX host.
We are using NFS, Oracle 11g and Red Hat 6.2.
Sorry if this info is a bit vague, I'm not an Oracle expert! :-)
thanks,
Dave

Hi,
1.) The calculation for standby redo logs is:
(Max number of log files per thread (instance) + 1) * Max number of threads (instances)
So if you have 4 redo log groups on your primary (which is 2 redo log groups per instance), then it ends up:
(2 + 1) * 2 = 6
So actually you will only need 6 standby redo logs, not 8. But 2 more do no harm.
Your primary will need exactly the same number (6, or in your case 8), which is 3 per thread/instance, or in your case 4.
2.) The SID list in the listener.ora is a listing of the SIDs the listener is listening for. It is not the listener name.
Hence it is not "lsnrctl guard_dgmgrl start" but only "lsnrctl LISTENER start", and since LISTENER is the default, "lsnrctl start" would be sufficient.
However, since this is Grid Infrastructure with the listener running out of the ASM home, be sure to set your environment to the GI home, not to DB_HOME, for the listener.ora entries, but to DB_HOME for the tnsnames.ora entries necessary for Data Guard.
And since the listener is running under clusterware, you should use "srvctl stop listener" and "srvctl start listener".
Last but not least, the SID entries for Data Guard have to use DGMGRL, not dgmgrl.
3.) Here is the whitepaper you are looking for:
www.oracle.com/goto/maa
Also for client failover best practices.
(Here the direct link to the RAC whitepaper):
http://www.oracle.com/technetwork/database/features/availability/maa-wp-10g-racprimarysingleinstance-131970.pdf
However, since this is 10g, you should combine this with the 11g RAC standby paper (e.g. SCAN listener setup).
Sebastian
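
The standby redo log arithmetic from point 1 can be written out as a small Python sketch. The function name is made up for illustration; the formula is the one quoted in the reply, (log groups per thread + 1) * number of threads:

```python
def standby_redo_logs(max_groups_per_thread: int, threads: int) -> int:
    """Number of standby redo logs: (groups per thread + 1) * threads."""
    if max_groups_per_thread < 1 or threads < 1:
        raise ValueError("both arguments must be at least 1")
    return (max_groups_per_thread + 1) * threads


if __name__ == "__main__":
    # 2 redo log groups per instance on a 2-instance RAC, as in the reply.
    total = standby_redo_logs(2, 2)
    print(total)       # 6
    print(total // 2)  # 3 standby redo logs per thread
```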

Oracle RAC scalability (is it linearly scalable up to 20-30 nodes?)

We are looking at Oracle RAC as a data storage solution for the following performance requirements.
The application generates resources, which we want to store in a database and make searchable.
A resource has 2 parts: data and meta.
Data contains textual/binary data such as txt/html/doc/excel/pdf/image files etc.
Meta contains 30-40 different properties describing the data.
Average resource size is 10K.
Required insertion speed for such resources (data + meta): 2Gbps (30K resources/second).
We also want indexing on data and meta.
We used a single Oracle database and created a resource table with 40 columns for the meta properties and one BLOB column for the data.
The performance achieved is 100Mbps insertion speed (on a normal machine).
To get to 2Gbps insertion speed we are thinking of using Oracle RAC to scale up. Maybe 20 nodes are required to reach 2Gbps.
My question is: does Oracle RAC provide close to linear scalability up to 20-30 nodes or not?
The key requirement is to achieve an insertion speed of up to 2Gbps.
The high availability of Oracle RAC would be an added advantage for us, but the key concern here is scalability, not fault tolerance.
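
The figures in this question can be sanity-checked with a short Python sketch. It assumes 10K means 10,000 bytes and that every node sustains the single-machine rate of 100Mbps. The efficiency factor is a hypothetical knob for less-than-linear scaling, not a measured RAC number:

```python
import math


def required_gbps(avg_resource_bytes: int, resources_per_second: int) -> float:
    """Raw payload rate in gigabits per second (1 Gb = 1e9 bits)."""
    return avg_resource_bytes * 8 * resources_per_second / 1e9


def nodes_needed(target_mbps: float, per_node_mbps: float, efficiency: float = 1.0) -> int:
    """Nodes required if each node contributes per_node_mbps * efficiency."""
    return math.ceil(target_mbps / (per_node_mbps * efficiency))


if __name__ == "__main__":
    # 10K resources at 30K resources/second.
    print(required_gbps(10_000, 30_000))
    # Perfectly linear scaling from 100Mbps to 2Gbps.
    print(nodes_needed(2000, 100))
    # Hypothetical 80% scaling efficiency per added node.
    print(nodes_needed(2000, 100, efficiency=0.8))
```

Under these assumptions the payload alone is about 2.4Gbps, slightly above the stated 2Gbps, and 20 nodes is the best case. Any loss in scaling efficiency pushes the node count higher.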

> Now we are not using Oracle partitioning because it is slow when we define a domain index (even when the index is local, is not synced in real time, and index maintenance is off); it maintains a pending queue, which slows down the insertion process.
Hmm.. I'm using partitioning extensively for mass parallel inserts and it is a lot cleaner than individual tables (requiring dynamic SQL), and I have not seen any performance issues.
Can you elaborate on what issues you have seen?
> Our main and key concern is to achieve an insertion speed of 2Gbps initially, and it should be scalable up to 10Gbps.
May I ask what data you are collecting? This volume sounds a bit extreme. Is it something like collecting/sniffing UDP/TCP packets?
> 1. If we have, say, I/O and network bandwidth available, will Oracle RAC be capable of consuming this available I/O and network bandwidth by adding more nodes?
Yes. Remember that each cluster node has its own set of local platform resources, including a pipe to the shared storage (such as an I/O fibre channel).
What will cause an impact? Anything that will impact a single insert process will also impact that process across a cluster. E.g. two processes attempting to insert a row with the same PK: only one can succeed, and thus one will be blocked by the other. Bitmap indexing is another example, as a lock on a bitmap "slot" locks the index data for a number of rows, any of which may currently be being updated by other processes. Etc.
How is this resolved in a cluster? As the processes are not local to the same platform, IPC cannot be used. Thus it means the Interconnect has to be used. This will be slower than IPC.
> 2. We also heard from HP that the Interconnect between nodes will be a bottleneck for us, but my question is: in the above scenario, where there is only insertion and searching and no updating in the system, will the RAC Interconnect become a bottleneck just for heartbeat messaging?
The Interconnect need not be a bottleneck. Besides, the Interconnect is a fundamental cog in the share-everything cluster machine. It is not a "Bad Thing". So I question what HP is saying - for me to accept such advice, it needs to be backed up with hard technical facts.
If Interconnect is such an issue according to HP, just what do they recommend you use to scale your system with? Let me guess - some very expensive and very complex HP product? A superdome perhaps?
> 3. Indexing time is much longer, even when we create the index for one hour of data in one shot. Can we dedicate a few nodes in RAC just to indexing?
That is what I'm doing: running up to 100+ PQ processes to do the index builds and rebuilds.
> 4. When I use the CREATE TABLE AS SELECT command to transfer the same amount of data from one table to another, it takes only 30 seconds, and when I use direct path uploading it takes 3 minutes.
Make sure that your performance comparisons are valid. What do you mean by "direct path uploading"?
Remember that CTAS is disk-to-disk I/O via the SGA buffer cache. There is no "client side" involved. Nothing external. Not even a PGA buffer area. No pushing data from a client process via IPC to an Oracle server process.
If your benchmark includes pushing data from a client, even from a PL/SQL process, that will be slower than a CTAS - always.
When dealing with such large volumes, the "traditional RDBMS" approach needs to be carefully considered. Every single constraint, every single index, every single trigger results in a tiny overhead that becomes a very large overhead given the data volumes.
Data management also plays a crucial role. Unless you can manage the data, you cannot effectively insert such huge volumes, process those volumes and query those volumes.
I see RAC and partitioning and PL/SQL server side processing as crucial ingredients to make this work.

Install Oracle RAC 10g (10.2.0.1) on HP-UX B.11.31 U ia64 failed

Hi All
I am installing Oracle RAC 10g 10.2.0.1 on HP-UX B.11.31 U ia64 but the installation cannot complete.
hosts file
#Public IPs
10.144.1.111 spgdb01
10.144.1.112 spgdb02
#Private IPs
10.144.2.2 spgdb01p
10.144.2.3 spgdb02p
#Virtual IPs
10.144.1.113 spgdb01v
10.144.1.114 spgdb02v
I ran the installation with runInstaller without errors. The copy and link steps were OK. When I run root.sh, it cannot complete, as follows:
Checking to see if Oracle CRS stack is already configured
Checking to see if any 9i GSD is up
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/oracle/product/10.2.0' is not owned by root
WARNING: directory '/oracle/product' is not owned by root
WARNING: directory '/oracle' is not owned by root
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 0: spgdb01 spgdb01p spgdb01
node 1: spgdb02 spgdb02p spgdb02
Creating OCR keys for user 'root', privgrp 'sys'..
Operation successful.
Now formatting voting device: /ora/crs/votedisk01
waitpid(-1, 0x7fffdf50, WUNTRACED) .................................................................................................... [sleeping]
Now formatting voting device: /oracle/oradata1/crs/votedisk02
Now formatting voting device: /oracle/oradata2/crs/votedisk03
Format of 3 voting devices complete.
Startup will be queued to init within 30 seconds.
====================
I waited for 10 minutes but it still did not complete.
Additionally, in the log from runInstaller I got:
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2011-04-28_12-13-31AM. Please wait ...-bash-4.2$ Oracle Universal Installer, Version 10.2.0.1.0 Production
Copyright (C) 1999, 2005, Oracle. All rights reserved.
Private Interconnect : null
Private Interconnect : null
Private Interconnect : null
Private Interconnect : null
Please help me fix this issue.
Thank you

I had this problem and resolved it by transporting the file to the installation server with the correct ftp datatype (binary).
On page 54 of the install guide (..Server\Oracle_Business_Intelligence\doc\doc\bi.1013\b31765.pdf) that comes with the installation files, there is an instruction to make sure that any ftp activity is done in binary.
This may not have occurred with the license.xml file if you use a tool which offers the "feature" of automatic datatype recognition.
Hope this helps.

How to create a wallet in oracle RAC environment

How do I create a wallet in an Oracle RAC environment?
While running the following command: alter system set encryption key identified by "thalesdata4";
I am getting the error message "cannot auto create wallet" or "failed to open wallet".
Please suggest the correct way to create a wallet in a RAC environment.
Thanks
Sudhir

Hi,
Please refer to the following note for a detailed explanation:
Master Note for SSL Configuration in Fusion Middleware 11g [ID 1218695.1]
regards
