Node reboot

Hello
This was my exam question last week.
"b" and "e" are definetely correct but not sure about the last one.
Which three actions would be helpful in determining the cause of a node reboot ?
a-)determining the time of the node reboot by using the update command and subtracting the uptime from the current system time
b-)looking for messages such as "ORACLE CSSD failure". Rebooting the cluster integrity in /var/log/messages
c-)using crsctl command to view tracing information
d-)inspecting the ocssd log for "Begin Dump" or "End Dump" messages
e-)inspecting the database alert log for reboot messages

Hi;
The correct answer is ABE.
Regards,
Helios
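
For reference, here is a minimal sketch of what the helpful checks (a, b and e) look like in practice on a Linux node; the alert log path below is hypothetical and depends on your version and ORACLE_BASE layout:

# (a) reboot time = current system time minus uptime (who -b prints the boot time directly)
date; uptime; who -b
# (b) CSSD-initiated reboot messages in the syslog
grep -i "Oracle CSSD failure" /var/log/messages
# (e) reboot/startup messages in the database alert log (example path)
grep -i "Starting ORACLE instance" /u01/app/oracle/admin/orcl/bdump/alert_orcl1.log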

Similar Messages

  • SC 3.2 Solaris 10 x86. When one node reboots, the other one does also

    I configured a two-node cluster with an EMC CLARiiON SAN (RAID 6) for holding a zpool and for use as the quorum device.
    When one node goes down, the other one does also.
    There seems to be a problem with the quorum.
    I cannot understand or figure out what actually goes wrong.
    When starting up:
    Booting as part of a cluster
    NOTICE: CMM: Node cnode01 (nodeid = 1) with votecount = 1 added.
    NOTICE: CMM: Node cnode02 (nodeid = 2) with votecount = 1 added.
    NOTICE: CMM: Quorum device 1 (/dev/did/rdsk/d1s2) added; votecount = 1, bitmask of nodes with configured paths = 0x3.
    NOTICE: clcomm: Adapter nge3 constructed
    NOTICE: clcomm: Adapter nge2 constructed
    NOTICE: CMM: Node cnode01: attempting to join cluster.
    NOTICE: nge3: link down
    NOTICE: nge2: link down
    NOTICE: nge3: link up 1000Mbps Full-Duplex
    NOTICE: nge2: link up 1000Mbps Full-Duplex
    NOTICE: nge3: link down
    NOTICE: nge2: link down
    NOTICE: nge3: link up 1000Mbps Full-Duplex
    NOTICE: nge2: link up 1000Mbps Full-Duplex
    NOTICE: CMM: Node cnode02 (nodeid: 2, incarnation #: 1248284052) has become reachable.
    NOTICE: clcomm: Path cnode01:nge2 - cnode02:nge2 online
    NOTICE: clcomm: Path cnode01:nge3 - cnode02:nge3 online
    NOTICE: CMM: Cluster has reached quorum.
    NOTICE: CMM: Node cnode01 (nodeid = 1) is up; new incarnation number = 1248284001.
    NOTICE: CMM: Node cnode02 (nodeid = 2) is up; new incarnation number = 1248284052.
    NOTICE: CMM: Cluster members: cnode01 cnode02.
    NOTICE: CMM: node reconfiguration #1 completed.
    NOTICE: CMM: Node cnode01: joined cluster.
    ip: joining multicasts failed (18) on clprivnet0 - will use link layer broadcasts for multicast
    /dev/rdsk/c2t0d0s5 is clean
    Reading ZFS config: done.
    obtaining access to all attached disks
    cnode01 console login:
    Then this on the second node:
    Booting as part of a cluster
    NOTICE: CMM: Node cnode01 (nodeid = 1) with votecount = 1
    NOTICE: CMM: Node cnode02 (nodeid = 2) with votecount = 1
    NOTICE: CMM: Quorum device 1 (/dev/did/rdsk/d1s2) added; votecount = 1, bitmask of nodes with configured paths = 0x3.
    NOTICE: clcomm: Adapter nge3 constructed
    NOTICE: clcomm: Adapter nge2 constructed
    NOTICE: CMM: Node cnode02: attempting to join cluster.
    NOTICE: CMM: Node cnode01 (nodeid: 1, incarnation #: 1248284001) has become reachable.
    NOTICE: clcomm: Path cnode02:nge2 - cnode01:nge2 online
    NOTICE: clcomm: Path cnode02:nge3 - cnode01:nge3 online
    WARNING: CMM: Issuing a NULL Preempt failed on quorum device /dev/did/rdsk/d1s2 with error 2.
    NOTICE: CMM: Cluster has reached quorum.
    NOTICE: CMM: Node cnode01 (nodeid = 1) is up; new incarnation number = 1248284001.
    NOTICE: CMM: Node cnode02 (nodeid = 2) is up; new incarnation number = 1248284052.
    NOTICE: CMM: Cluster members: cnode01 cnode02.
    NOTICE: CMM: node reconfiguration #1 completed.
    NOTICE: CMM: Node cnode02: joined cluster.
    NOTICE: CCR: Waiting for repository synchronization to finish.
    WARNING: CMM: Issuing a NULL Preempt failed on quorum device /dev/did/rdsk/d1s2 with error 2.
    ip: joining multicasts failed (18) on clprivnet0 - will use link layer broadcasts for multicast
    /dev/rdsk/c2t0d0s5 is clean
    Reading ZFS config: done.
    obtaining access to all attached disks
    cnode02 console login:
    But when the first node reboots, this message appears on the second node:
    Jul 22 19:24:48 cnode02 genunix: [ID 936769 kern.info] devinfo0 is /pseudo/devinfo@0
    Jul 22 19:30:57 cnode02 nge: [ID 812601 kern.notice] NOTICE: nge3: link down
    Jul 22 19:30:57 cnode02 nge: [ID 812601 kern.notice] NOTICE: nge2: link down
    Jul 22 19:30:59 cnode02 nge: [ID 812601 kern.notice] NOTICE: nge3: link up 1000Mbps Full-Duplex
    Jul 22 19:31:00 cnode02 nge: [ID 812601 kern.notice] NOTICE: nge2: link up 1000Mbps Full-Duplex
    Jul 22 19:31:06 cnode02 genunix: [ID 489438 kern.notice] NOTICE: clcomm: Path cnode02:nge2 - cnode01:nge2 being drained
    Jul 22 19:31:06 cnode02 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x0
    Jul 22 19:31:06 cnode02 genunix: [ID 489438 kern.notice] NOTICE: clcomm: Path cnode02:nge3 - cnode01:nge3 being drained
    Jul 22 19:31:11 cnode02 nge: [ID 812601 kern.notice] NOTICE: nge3: link down
    Jul 22 19:31:12 cnode02 genunix: [ID 414208 kern.warning] WARNING: QUORUM_GENERIC: quorum preempt error in CMM: Error 5 --- QUORUM_GENERIC Tkown ioctl failed on quorum device /dev/did/rdsk/d1s2.
    Jul 22 19:31:12 cnode02 cl_dlpitrans: [ID 624622 kern.notice] Notifying cluster that this node is panicking
    Jul 22 19:31:12 cnode02 unix: [ID 836849 kern.notice]
    Jul 22 19:31:12 cnode02 ^Mpanic[cpu3]/thread=ffffffff8b5c06e0:
    Jul 22 19:31:12 cnode02 genunix: [ID 265925 kern.notice] CMM: Cluster lost operational quorum; aborting.
    Jul 22 19:31:12 cnode02 unix: [ID 100000 kern.notice]
    Jul 22 19:31:12 cnode02 genunix: [ID 655072 kern.notice] fffffe8002651b40 genunix:vcmn_err+13 ()
    Jul 22 19:31:12 cnode02 genunix: [ID 655072 kern.notice] fffffe8002651b50 cl_runtime:__1cZsc_syslog_msg_log_no_args6FpviipkcpnR__va_list_element__nZsc_syslog_msg_status_enum__+24 ()
    Jul 22 19:31:12 cnode02 genunix: [ID 655072 kern.notice] fffffe8002651c30 cl_runtime:__1cCosNsc_syslog_msgDlog6MiipkcE_nZsc_syslog_msg_status_enum__+9d ()
    Jul 22 19:31:12 cnode02 genunix: [ID 655072 kern.notice] fffffe8002651e20 cl_haci:__1cOautomaton_implbAstate_machine_qcheck_state6M_nVcmm_automaton_event_t__+3bc ()
    Jul 22 19:31:12 cnode02 genunix: [ID 655072 kern.notice] fffffe8002651e60 cl_haci:__1cIcmm_implStransitions_thread6M_v_+de ()
    Jul 22 19:31:12 cnode02 genunix: [ID 655072 kern.notice] fffffe8002651e70 cl_haci:__1cIcmm_implYtransitions_thread_start6Fpv_v_+b ()
    Jul 22 19:31:12 cnode02 genunix: [ID 655072 kern.notice] fffffe8002651ed0 cl_orb:cllwpwrapper+106 ()
    Jul 22 19:31:13 cnode02 genunix: [ID 655072 kern.notice] fffffe8002651ee0 unix:thread_start+8 ()
    Jul 22 19:31:13 cnode02 unix: [ID 100000 kern.notice]
    Jul 22 19:31:13 cnode02 genunix: [ID 672855 kern.notice] syncing file systems...
    Jul 22 19:31:13 cnode02 genunix: [ID 733762 kern.notice] 1
    Jul 22 19:31:34 cnode02 last message repeated 20 times
    Jul 22 19:31:35 cnode02 genunix: [ID 622722 kern.notice] done (not all i/o completed)
    Jul 22 19:31:36 cnode02 genunix: [ID 111219 kern.notice] dumping to /dev/dsk/c2t0d0s1, offset 3436511232, content: kernel
    Jul 22 19:31:45 cnode02 genunix: [ID 409368 kern.notice] ^M100% done: 136950 pages dumped, compression ratio 4.77,
    Jul 22 19:31:45 cnode02 genunix: [ID 851671 kern.notice] dump succeeded
    Jul 22 19:33:18 cnode02 genunix: [ID 540533 kern.notice] ^M

    Hi,
    the problem lies in the error message around the quorum device. The SC documentation, specifically the Sun Cluster Error Messages Guide at http://docs.sun.com/app/docs/doc/820-4681 explains this as follows:
    414208 QUORUM_GENERIC: quorum preempt error in CMM: Error %d --- QUORUM_GENERIC Tkown ioctl failed on quorum device %s.
    Description:
    This node encountered an error when issuing a QUORUM_GENERIC Take Ownership operation on a quorum device. This error indicates that the node was unsuccessful in preempting keys from the quorum device, and the partition to which it belongs was preempted. If a cluster is divided into two or more disjoint subclusters, one of these must survive as the operational cluster. The surviving cluster forces the other subclusters to abort by gathering enough votes to grant it majority quorum. This action is called "preemption of the losing subclusters".
    Solution:
    Other related messages identify the quorum device where the error occurred. If an EACCES error occurs, the QUORUM_GENERIC command might have failed because of the SCSI3 keys on the quorum device. Scrub the SCSI3 keys off the quorum device and reboot the preempted nodes.
    You should try to follow this advice. I would propose to choose a different QD before trying to do this, if you have one available. Is it possible that this LUN has been in use by a different cluster?
    To scrub SCSI3 keys you should use the scsi command in /usr/cluster/lib/sc: ./scsi -c inkeys -d <device> to check for the existence of keys, and ./scsi -c scrub -d <device> to remove any SCSI3 keys.
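    For example, against the quorum device from your log (a sketch; run as root, and only on a device whose keys you are sure may be scrubbed):
    # cd /usr/cluster/lib/sc
    # ./scsi -c inkeys -d /dev/did/rdsk/d1s2     (list any SCSI3 keys present)
    # ./scsi -c scrub -d /dev/did/rdsk/d1s2      (remove all SCSI3 keys)
    Afterwards, reboot the preempted node.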
    Regards
    Hartmut

  • RAC node reboots from time to time

    Hi %,
    We have a problem with our RAC: it's a three-node RAC on SLES9, 64-bit. One node reboots from time to time. We found nothing in any log file, except this in /var/log/messages of node 1:
    "Feb 21 14:58:02 pmg-db1 kernel: o2net: connection to node pmg-db2 (num 1) at 192.168.0.2:7777 has been idle for 10 seconds, shutting it down."
    Has anyone had a similar problem? Or does anyone have an idea?
    regards
    Andreas

    Sorry, no /var/log/dmesg.
    Perhaps I should add another detail: the third node was added after the two-node RAC had run for several months. First we had the reboot problem with this third node. We found out that the interconnect was connected to a 100 Mbit module of the switch and not to a 1000 Mbit module. We changed this a few days ago, but now the second node rebooted, and it is connected at 1000 Mbit/s.
    And did I mention that we use 10.2.0.2?
    regards
    Andreas

  • Operational Quorum and both nodes rebooting.

    I've experienced an issue: when I rip out the SCSI cables to the shared storage (and the quorum device), both nodes panic and reboot. Is this expected behavior?
    It seems understandable that the active node reboots, because it lost the disk path and quorum device. But should the stand-by node reboot too?

    No problem.
    It's running S10 update 4 w/ SC 3.2.
    3120 JBOD attached to two T2000's, two-node cluster.
    I'm wondering if the stand-by node didn't see the quorum device when the active node's SCSI cables were pulled.
    We pulled the stand-by node's SCSI cables and reconnected them prior to pulling the active node's. The difference was that the stand-by node's /var/adm/messages log was filled with expected messages regarding a missing disk. The cables were re-attached to the stand-by and then yanked out of the active node. This is when both nodes panicked.

  • Secondary Node Rebooted instead of falling to Ok prompt

    Hi all,
    We need to get system backup for our clustered DB, before and after our maintenance work.
    We have the following configuration:
    Node #1 and Node #2
    Solaris 8
    SunCluster 3.0
    Oracle 9.3.4
    VxVM 3.2
    Before issuing cluster shutdown command, I verified which node is primary.
    #scstat
    I issued scshutdown -y -i0 on the primary node; the secondary node rebooted instead of halting to the {ok} prompt. (The primary server successfully fell to the ok prompt.)
    When I checked on the logs on the secondary node.
    May 16 08:18:41 SC[SUNW.HAStoragePlus,ttmapd-rg,tmsstor-res,hastorageplus_prenet_start_private]: Global device path /dev/vx/rdsk/tms_usr_dg01/bak_redo11_vol is not recognized as a device group or a device special file.
    May 16 08:18:42 SC[SUNW.LogicalHostname,ttmapd-rg,tmslhost3-res,hafoip_start]: pnm_init: RPC: Rpcbind failure - RPC: Unable to receive
    May 16 08:18:42 SC[SUNW.LogicalHostname,ttmapd-rg,tmslhost0-res,hafoip_start]: pnm_init: RPC: Rpcbind failure - RPC: Unable to receive
    May 16 08:18:42 SC[SUNW.LogicalHostname,ttmapd-rg,tmslhost3-res,hafoip_start]: Failed to validate NAFO group name <nafo0> nafo errorcode <5>.
    May 16 08:18:42 SC[SUNW.LogicalHostname,ttmapd-rg,tmslhost0-res,hafoip_start]: Failed to validate NAFO group name <nafo1> nafo errorcode <5>.
    May 16 08:18:42 SC[SUNW.LogicalHostname,ttmapd-rg,tmslhost0-res,hafoip_stop]: pnm_init: RPC: Rpcbind failure - RPC: Unable to receive
    May 16 08:18:42 SC[SUNW.LogicalHostname,ttmapd-rg,tmslhost0-res,hafoip_stop]: Failed to validate NAFO group name <nafo1> nafo errorcode <5>.
    May 16 08:18:42 SC[SUNW.LogicalHostname,ttmapd-rg,tmslhost3-res,hafoip_stop]: pnm_init: RPC: Rpcbind failure - RPC: Unable to receive
    May 16 08:18:42 SC[SUNW.LogicalHostname,ttmapd-rg,tmslhost3-res,hafoip_stop]: Failed to validate NAFO group name <nafo0> nafo errorcode <5>.
    Has anyone encountered this error before?
    Thank you in advance.
    Regards,
    Rachele

    scshutdown issues a shutdown command on both nodes. Here is the procedure:
    -failover your resourcegroup to node_2
    root@node_2#scswitch -z -g oracle -h node_2
    you can check the status of the resource group with scstat
    -on the node which you want to backup:
    root@node_1#init 0
    ok boot -sx
    s = single usermode
    x = outside of the cluster
    Once you're in single usermode, start your backup
    If you wish to avoid a bunch of logging on node_2, you can always disable node_1, or set it to maintenance state, from node_2 with scconf:
    root@node_2#scconf -q node=node_1,maintstate
    (make sure you know what you're doing here)
    to reboot node_1 and join it into the cluster:
    root@node_1#umount -a
    root@node_1#sync
    root@node_1#reboot
    ok boot (if auto-boot? is set to false)
    root@node_2#scconf -q node=node_1,reset
    that's it
    cheers,
    Kim

  • Both cluster node reboot

    There is a two-node cluster running an Oracle RAC DB. Yesterday both nodes rebooted at the same time (less than a few seconds apart). We don't know whether it was caused by Oracle CRS or the server itself.
    Here is the log:
    /var/log/messages in node 1
    Dec 8 15:14:38 dc01locs01 kernel: 493 [RAIDarray.mpp]dcsgswsst6140:1:0:2 Cmnd failed-retry the same path. vcmnd SN 18469446 pdev H3:C0:T0:L2 0x02/0x04/0x01 0x08000002 mpp_status:1
    Dec 8 15:14:38 dc01locs01 kernel: 493 [RAIDarray.mpp]dcsgswsst6140:1:0:2 Cmnd failed-retry the same path. vcmnd SN 18469448 pdev H3:C0:T0:L2 0x02/0x04/0x01 0x08000002 mpp_status:1
    Dec 8 15:17:20 dc01locs01 syslogd 1.4.1: restart.
    Dec 8 15:17:20 dc01locs01 kernel: klogd 1.4.1, log source = /proc/kmsg started.
    Dec 8 15:17:20 dc01locs01 kernel: Linux version 2.6.18-128.7.1.0.1.el5 ([email protected]) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-44)) #1 SMP Mon Aug 24 14:07:09 EDT 2009
    Dec 8 15:17:20 dc01locs01 kernel: Command line: ro root=/dev/vg00/root rhgb quiet crashkernel=128M@16M
    Dec 8 15:17:20 dc01locs01 kernel: BIOS-provided physical RAM map:
    ocssd.log in node 1
    [    CSSD]2009-12-08 15:14:33.467 1134680384 >TRACE: clssgmDispatchCMXMSG: msg type(13) src(2) dest(1) size(123) tag(00000000) incarnation(148585637)
    [    CSSD]2009-12-08 15:14:33.468 1134680384 >TRACE: clssgmHandleDataInvalid: grock HB+ASM, member 2 node 2, birth 1
    [    CSSD]2009-12-08 15:19:00.217 >USER: Copyright 2009, Oracle version 11.1.0.7.0
    [    CSSD]2009-12-08 15:19:00.217 >USER: CSS daemon log for node dc01locs01, number 1, in cluster ocsprodrac
    [  clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=dc01locs01DBG_CSSD))
    [    CSSD]2009-12-08 15:19:00.235 1995774848 >TRACE: clssscmain: Cluster GUID is 79db6803afc7df32ffd952110f22702c
    [    CSSD]2009-12-08 15:19:00.239 1995774848 >TRACE: clssscmain: local-only set to false
    /var/log/messages in node 2
    Dec 8 15:14:38 dc01locs02 kernel: 493 [RAIDarray.mpp]dcsgswsst6140:1:0:2 Cmnd failed-retry the same path. vcmnd SN 18561465 pdev H3:C0:T0:L2 0x02/0x04/0x01 0x08000002 mpp_status:1
    Dec 8 15:14:38 dc01locs02 kernel: 493 [RAIDarray.mpp]dcsgswsst6140:1:0:2 Cmnd failed-retry the same path. vcmnd SN 18561463 pdev H3:C0:T0:L2 0x02/0x04/0x01 0x08000002 mpp_status:1
    Dec 8 15:17:14 dc01locs02 syslogd 1.4.1: restart.
    Dec 8 15:17:14 dc01locs02 kernel: klogd 1.4.1, log source = /proc/kmsg started.
    Dec 8 15:17:14 dc01locs02 kernel: Linux version 2.6.18-128.7.1.0.1.el5 ([email protected]) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-44)) #1 SMP Mon Aug 24 14:07:09 EDT 2009
    Dec 8 15:17:14 dc01locs02 kernel: Command line: ro root=/dev/vg00/root rhgb quiet crashkernel=128M@16M
    Dec 8 15:17:14 dc01locs02 kernel: BIOS-provided physical RAM map:
    ocssd.log in node 2
    [    CSSD]2009-12-08 15:14:35.450 1264081216 >TRACE: clssgmExecuteClientRequest: Received data update request from client (0x2aaaac065a00), type 1
    [    CSSD]2009-12-08 15:14:36.909 1127713088 >TRACE: clssgmDispatchCMXMSG: msg type(13) src(1) dest(1) size(123) tag(00000000) incarnation(148585637)
    [    CSSD]2009-12-08 15:14:36.909 1127713088 >TRACE: clssgmHandleDataInvalid: grock HB+ASM, member 1 node 1, birth 0
    [    CSSD]2009-12-08 15:18:55.047 >USER: Copyright 2009, Oracle version 11.1.0.7.0
    [  clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=dc01locs02DBG_CSSD))
    [    CSSD]2009-12-08 15:18:55.047 >USER: CSS daemon log for node dc01locs02, number 2, in cluster ocsprodrac
    [    CSSD]2009-12-08 15:18:55.071 3628915584 >TRACE: clssscmain: Cluster GUID is 79db6803afc7df32ffd952110f22702c
    [    CSSD]2009-12-08 15:18:55.077 3628915584 >TRACE: clssscmain: local-only set to false

    Hi!
    I suppose this seems easy: you have a device at '[RAIDarray.mpp]dcsgswsst6140:1:0:2' (a RAID array perhaps?) which failed. Logically, all servers connected to this RAID went down at the same time.
    This seems to be no Oracle problem. Good luck!

  • Oracle Cluster Node Reboots Abruptly

    One of our RAC 11gR2 cluster nodes rebooted abruptly. We found the following error in the Grid home alert log file and the ocssd.log file:
    [cssd(6014)]CRS-1611:Network communication with node mumchora12 (1) missing for 75% of timeout interval.  Removal of this node from cluster in 6.190 seconds
    We need to find the root cause for this node reboot. Kindly assist.
    OS Version : RHEL 5.8
    GRID : 11.2.0.2
    Database : 11.2.0.2.10

    Hi,
    Looking at the logs, it seems to be a private interconnect problem. I would suggest you refer to a nice Metalink doc on the same issue:
    Node reboot or eviction: How to check if your private interconnect CRS can transmit network heartbeats [ID 1445075.1]
    Hope it will help you to identify the root cause of the node eviction. A few quick first checks are sketched below.
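    A few quick first checks (a sketch, assuming standard 11.2 Grid tooling; adapt node names and IPs to your setup):
    $ oifcfg getif                        (confirm which interface is the cluster_interconnect)
    $ ping -i 1 <peer-private-ip>         (watch for drops across the private network)
    $ grep CRS-161 $GRID_HOME/log/`hostname`/alert`hostname`.log   (earlier heartbeat warnings)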
    Thanks

  • Cluster node reboots after network failure

    hi all,
    The Sun Cluster 3.1 8/05 with 2 nodes (E2900) was working fine, without any errors in sccheck.
    Yesterday one node rebooted reporting a network failure; the errors in the messages file are:
    Jan 17 08:00:36 PRD in.mpathd[221]: [ID 594170 daemon.error] NIC failure detected on ce0 of group sc_ipmp0
    Jan 17 08:00:36 PRD Cluster.PNM: [ID 890413 daemon.notice] sc_ipmp0: state transition from OK to DOWN.
    Jan 17 08:00:47 PRD Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource PROD status on node PRD change to R_FM_DEGRADED
    Jan 17 08:00:47 PRD Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource PROD status msg on node PRD change to <IPMP Failure.>
    Jan 17 08:00:50 PRD Cluster.RGM.rgmd: [ID 529407 daemon.notice] resource group CFS state on node PRD change to RG_PENDING_OFFLINE
    Jan 17 08:00:50 PRD Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource PROD state on node PRD change to R_MON_STOPPING
    Jan 17 08:00:50 PRD Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <hafoip_monitor_stop> for resource <PROD>, resource group <CFS>, timeout <300> seconds
    Jan 17 08:00:50 PRD Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <hafoip_monitor_stop> completed successfully for resource <PROD>, resource group <CFS>, time used: 0% of timeout <300 seconds>
    Jan 17 08:00:50 PRD Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource PROD state on node PRD change to R_ONLINE_UNMON
    Jan 17 08:00:50 PRD Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource PROD state on node PRD change to R_STOPPING
    Jan 17 08:00:50 PRD Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <hafoip_stop> for resource <PROD>, resource group <CFS>, timeout <300> seconds
    Jan 17 08:00:50 PRD Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource PROD status on node PRD change to R_FM_UNKNOWN
    Jan 17 08:00:50 PRD Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource PROD status msg on node PRD change to <Stopping>
    Jan 17 08:00:51 PRD ip: [ID 678092 kern.notice] TCP_IOC_ABORT_CONN: local = 172.016.005.025:0, remote = 000.000.000.000:0, start = -2, end = 6
    Jan 17 08:00:51 PRD ip: [ID 302654 kern.notice] TCP_IOC_ABORT_CONN: aborted 53 connections
    What can be the reason for rebooting?
    Is there any way to avoid this, with only a failover?
    rgds

    What is in that resource group? The cause is probably something with Failover_mode=HARD set. Check the manual reference section for this. The option would be to set the Failover_mode=SOFT.
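    For example (a sketch using the PROD resource from your log; verify the current value before changing it):
    # scrgadm -pvv | grep -i failover_mode        (show the current Failover_mode settings)
    # scrgadm -c -j PROD -y Failover_mode=SOFT    (switch the resource from HARD to SOFT)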
    Tim

  • SC 3.2 nodes reboot when I reboot the first one

    I had created a cluster with two nodes and a quorum (shared file system) between the two nodes, but when I try to reboot one node, the second one reboots too. I have Solaris 10 and Sun Cluster 3.2. The error on the console is:
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000031c498ca848 (ssd25):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000038e49941b19 (ssd26):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000039449941c7d (ssd27):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000039149941bd1 (ssd29):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000039749941cab (ssd30):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005ab794000005ff4a564a48 (ssd47):
    offline or reservation conflict
    Update_drv failed to re-read did.conf file for did driver. Will retry once again.
    Update_drv failed to re-read did.conf file for did driver after 1 retry. Will try devfsadm.
    Devfsadm successfully configured did devices.
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000031c498ca848 (ssd25):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000038e49941b19 (ssd26):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000039449941c7d (ssd27):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000039149941bd1 (ssd29):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000039749941cab (ssd30):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005ab794000005ff4a564a48 (ssd47):
    offline or reservation conflict
    Update_drv failed to re-read did.conf file for did driver. Will retry once again.
    Update_drv failed to re-read did.conf file for did driver after 1 retry. Will try devfsadm.
    Devfsadm successfully configured did devices.
    Mohyi

    A couple more questions:
    - does clq status show that the quorum vote is counted correctly?
    - what kind of storage are you using?
    - are these newly created LUNs that you are using, or is it possible that these have been used before by other hosts or clusters?
    - any interesting error messages in the log files - /var/adm/messages?
    - what is the panic string of the other node that reboots?
    I do not think that the did related message is relevant in this context.
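    To answer the first question yourself, check the quorum votes (a sketch, SC 3.2 command syntax):
    # clquorum status      (votes needed/present and per-device vote counts)
    # clquorum show -v     (detailed quorum device configuration)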

  • When one node reboot other node in RAC

    Hi Friends,
    I faced a situation where one node of a RAC cluster was rebooted by the other node. This happened due to network interconnect link fluctuation.
    Sep 13 16:23:48 kkvs1a su: [ID 810491 auth.crit] 'su admin' failed for wipro1 on /dev/pts/3
    Sep 14 00:22:17 kkvs1a ixgbe: [ID 611667 kern.info] NOTICE: ixgbe3: link down
    Sep 14 00:22:21 kkvs1a ixgbe: [ID 611667 kern.info] NOTICE: ixgbe3: link up, , full duplex
    Sep 14 00:22:31 kkvs1a ixgbe: [ID 611667 kern.info] NOTICE: ixgbe1: link down
    Sep 14 00:22:31 kkvs1a ixgbe: [ID 611667 kern.info] NOTICE: ixgbe3: link down
    /opt/oracle/product/10.2.0/crs/log/node1/alertkk1a.log
    ==============================================
    2013-09-14 00:22:05.180
    [cssd(12561)]CRS-1612:node kk1b (2) at 50% heartbeat fatal, eviction in 14.251 seconds
    2013-09-14 00:22:12.180
    [cssd(12561)]CRS-1611:node kk1b (2) at 75% heartbeat fatal, eviction in 7.251 seconds
    2013-09-14 00:22:13.180
    [cssd(12561)]CRS-1611:node kk1b (2) at 75% heartbeat fatal, eviction in 6.251 seconds
    2013-09-14 00:22:17.179
    [cssd(12561)]CRS-1610:node kk1b (2) at 90% heartbeat fatal, eviction in 2.251 seconds
    2013-09-14 00:22:18.180
    [cssd(12561)]CRS-1610:node kkvs1b (2) at 90% heartbeat fatal, eviction in 1.251 seconds
    This clearly shows that the CSSD of node kkvs1a sent a node eviction message to node kkvs1b.
    I got the following messages on the instance which got rebooted:
    ASM alert log:
    Sat Sep 14 00:22:25 IST 2013
    Error: KGXGN aborts the instance (6)
    Sat Sep 14 00:22:25 IST 2013
    Errors in file /opt/oracle/admin/+ASM/bdump/+asm2_lmon_8527.trc:
    ORA-29702: error occurred in Cluster Group Service operation
    LMON: terminating instance due to error 29702
    A network fluctuation shouldn't cause a reboot like this. Why did Oracle design it this way? Is this a bug? My Oracle version is 10.2.0.5.0.
    Could you tell me the other possible situations in which one RAC instance reboots another RAC instance?

    What you are describing is the expected behaviour: if your interconnect fails, you will have a node eviction. Releases < 11.2.0.2 evict a node by reboot, which can fix the problem: the NIC may come up correctly when the machine restarts. Releases >= 11.2.0.2 can often evict without a reboot. But either way, if your interconnect goes down, a node must be evicted to prevent uncoordinated disc writes.
    If you are interested, you can find some discussion and demos of this in a series of webcasts I've recorded,
    Free Oracle Database Tutorials for Administration and Developers
    If you really don't like this behaviour and the problems are transient, you can try raising the CSS misscount parameter, as sketched below.
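    For example (a sketch; changing misscount is intrusive, so check the My Oracle Support guidance for your exact version first):
    # $CRS_HOME/bin/crsctl get css misscount      (show the current value)
    # $CRS_HOME/bin/crsctl set css misscount 90   (raise it; run as root, typically with CRS down on the other nodes)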
    John Watson
    Oracle Certified Master DBA

  • RAC node rebooting frequently

    Hi all,
    I am working on a two-node RAC environment. One of my RAC nodes is rebooting very frequently. I am using Oracle 10g database and clusterware (10.2.0.1).
    I have checked the OS logs (Linux AS 4) and the RAC-related logs, but was not able to find anything. Posting all the logs; please suggest.

    Hi, I am posting the alert log, OS log and ocssd logs.
    clusterware alert log:
    [crsd(5649)]CRS-1201:CRSD started on node ctmisdb1.
    2012-03-21 09:50:38.188
    [cssd(7490)]CRS-1601:CSSD Reconfiguration complete. Active nodes are ctmisdb1 .
    2012-03-21 09:50:46.726
    [crsd(5649)]CRS-1204:Recovering CRS resources for node ctmisdb2.
    2012-03-21 09:55:21.760
    [cssd(7490)]CRS-1601:CSSD Reconfiguration complete. Active nodes are ctmisdb1 ctmisdb2 .
    2012-03-21 12:07:46.681
    [cssd(7426)]CRS-1605:CSSD voting file is online: /dev/raw/raw2. Details in /u01/app/oracle/product/crs/log/ctmisdb1/cssd/ocssd.log.
    2012-03-21 12:07:50.432
    [cssd(7426)]CRS-1601:CSSD Reconfiguration complete. Active nodes are ctmisdb1 ctmisdb2 .
    2012-03-21 12:07:50.893
    [crsd(5549)]CRS-1012:The OCR service started on node ctmisdb1.
    2012-03-21 12:07:50.942
    [evmd(7304)]CRS-1401:EVMD started on node ctmisdb1.
    2012-03-21 12:07:52.827
    [crsd(5549)]CRS-1201:CRSD started on node ctmisdb1.
    2012-03-21 12:48:41.908
    [cssd(7448)]CRS-1605:CSSD voting file is online: /dev/raw/raw2. Details in /u01/app/oracle/product/crs/log/ctmisdb1/cssd/ocssd.log.
    2012-03-21 12:48:45.741
    [cssd(7448)]CRS-1601:CSSD Reconfiguration complete. Active nodes are ctmisdb1 ctmisdb2 .
    2012-03-21 12:48:49.173
    [crsd(5546)]CRS-1012:The OCR service started on node ctmisdb1.
    2012-03-21 12:48:49.190
    [evmd(7328)]CRS-1401:EVMD started on node ctmisdb1.
    2012-03-21 12:48:50.818
    [crsd(5546)]CRS-1201:CRSD started on node ctmisdb1.
    2012-03-21 13:26:36.398
    [cssd(7343)]CRS-1605:CSSD voting file is online: /dev/raw/raw2. Details in /u01/app/oracle/product/crs/log/ctmisdb1/cssd/ocssd.log.
    2012-03-21 13:26:40.492
    [cssd(7343)]CRS-1601:CSSD Reconfiguration complete. Active nodes are ctmisdb1 ctmisdb2 .
    2012-03-21 13:26:40.939
    [crsd(5542)]CRS-1012:The OCR service started on node ctmisdb1.
    2012-03-21 13:26:40.977
    [evmd(7223)]CRS-1401:EVMD started on node ctmisdb1.
    2012-03-21 13:26:42.772
    [crsd(5542)]CRS-1201:CRSD started on node ctmisdb1.
    node OS log:
    Mar 21 12:06:35 ctmisdb1 rc: Starting readahead: succeeded
    Mar 21 12:06:35 ctmisdb1 messagebus: messagebus startup succeeded
    Mar 21 12:06:36 ctmisdb1 cups-config-daemon: cups-config-daemon startup succeeded
    Mar 21 12:06:36 ctmisdb1 haldaemon: haldaemon startup succeeded
    Mar 21 12:06:37 ctmisdb1 fstab-sync[6267]: removed all generated mount points
    Mar 21 12:06:37 ctmisdb1 fstab-sync[6378]: added mount point /media/cdrecorder for /dev/hde
    Mar 21 12:06:37 ctmisdb1 su(pam_unix)[6323]: session opened for user oracle by (uid=0)
    Mar 21 12:06:37 ctmisdb1 su(pam_unix)[6324]: session opened for user oracle by (uid=0)
    Mar 21 12:06:37 ctmisdb1 su(pam_unix)[6229]: session opened for user oracle by (uid=0)
    Mar 21 12:06:37 ctmisdb1 su(pam_unix)[6229]: session closed for user oracle
    Mar 21 12:06:37 ctmisdb1 su(pam_unix)[6644]: session opened for user oracle by (uid=0)
    Mar 21 12:06:37 ctmisdb1 kernel: matroxfb: cannot set xres to 800, rounded up to 832
    Mar 21 12:06:37 ctmisdb1 last message repeated 2 times
    Mar 21 12:06:41 ctmisdb1 su(pam_unix)[6323]: session closed for user oracle
    Mar 21 12:06:41 ctmisdb1 su(pam_unix)[6644]: session closed for user oracle
    Mar 21 12:06:41 ctmisdb1 su(pam_unix)[6324]: session closed for user oracle
    Mar 21 12:06:41 ctmisdb1 logger: Cluster Ready Services completed waiting on dependencies.
    Mar 21 12:06:41 ctmisdb1 last message repeated 2 times
    Mar 21 12:06:45 ctmisdb1 gdm(pam_unix)[6379]: session opened for user root by (uid=0)
    Mar 21 12:06:46 ctmisdb1 gconfd (root-7052): starting (version 2.8.1), pid 7052 user 'root'
    Mar 21 12:06:47 ctmisdb1 gconfd (root-7052): Resolved address "xml:readonly:/etc/gconf/gconf.xml.mandatory" to a read-only configuration source at position 0
    Mar 21 12:06:47 ctmisdb1 gconfd (root-7052): Resolved address "xml:readwrite:/root/.gconf" to a writable configuration source at position 1
    Mar 21 12:06:47 ctmisdb1 gconfd (root-7052): Resolved address "xml:readonly:/etc/gconf/gconf.xml.defaults" to a read-only configuration source at position 2
    Mar 21 12:06:55 ctmisdb1 gconfd (root-7052): Resolved address "xml:readwrite:/root/.gconf" to a writable configuration source at position 0
    Mar 21 12:07:41 ctmisdb1 su(pam_unix)[5547]: session opened for user oracle by (uid=0)
    Mar 21 12:07:41 ctmisdb1 logger: Running CRSD with TZ =
    Mar 21 12:07:43 ctmisdb1 su(pam_unix)[7399]: session opened for user oracle by (uid=0)
    Mar 21 12:12:49 ctmisdb1 sshd(pam_unix)[15323]: session opened for user root by root(uid=0)
    Mar 21 12:12:57 ctmisdb1 su(pam_unix)[15531]: session opened for user oracle by root(uid=0)
    Mar 21 12:47:05 ctmisdb1 syslogd 1.4.1: restart.
    ocssd log:
    [    CSSD]2012-03-21 11:24:41.045 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800661f0c0) proc(0x8006622560) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 11:24:41.078 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660cfe0) proc(0x800662ba70) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:07:44.564 >USER: Oracle Database 10g CSS Release 10.2.0.1.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
    [  clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=ctmisdb1DBG_CSSD))
    [    CSSD]2012-03-21 12:07:44.564 >USER: CSS daemon log for node ctmisdb1, number 1, in cluster crs
    [    CSSD]2012-03-21 12:07:44.581 [28260544] >TRACE: clssscmain: local-only set to false
    [    CSSD]2012-03-21 12:07:44.603 [28260544] >TRACE: clssnmReadNodeInfo: added node 1 (ctmisdb1) to cluster
    [    CSSD]2012-03-21 12:07:44.621 [28260544] >TRACE: clssnmReadNodeInfo: added node 2 (ctmisdb2) to cluster
    [    CSSD]2012-03-21 12:07:44.627 [72925824] >TRACE: clssnm_skgxnmon: skgxn init failed, rc 1
    [    CSSD]2012-03-21 12:07:44.627 [28260544] >TRACE: clssnm_skgxnonline: Using vacuous skgxn monitor
    [    CSSD]2012-03-21 12:07:44.641 [28260544] >TRACE: clssnmInitNMInfo: misscount set to 60
    [    CSSD]2012-03-21 12:07:44.655 [28260544] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0//dev/raw/raw2)
    [    CSSD]2012-03-21 12:07:46.661 [72925824] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/raw/raw2)
    [    CSSD]2012-03-21 12:07:46.690 [72925824] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(18) wrtcnt(7920) LATS(0) Disk lastSeqNo(7920)
    [    CSSD]2012-03-21 12:07:46.752 [28260544] >TRACE: clssnmFatalInit: fatal mode enabled
    [    CSSD]2012-03-21 12:07:46.752 [94777984] >TRACE: clssnmconnect: connecting to node 1, flags 0x0001, connector 1
    [    CSSD]2012-03-21 12:07:46.753 [94777984] >TRACE: clssnmconnect: connecting to node 0, flags 0x0000, connector 1
    [    CSSD]2012-03-21 12:07:46.753 [94777984] >TRACE: clssnmClusterListener: Probing node(2)
    [    CSSD]2012-03-21 12:07:46.755 [94777984] >TRACE: clssnmConnComplete: connected to node 2 (con 0x8006601040), state 3 birth 0, unique 1332303918/1332303918 prevConuni(0)
    [    CSSD]2012-03-21 12:07:46.756 [106332800] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1))
    [    CSSD]2012-03-21 12:07:46.756 [106332800] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_ctmisdb1_crs))
    [    CSSD]2012-03-21 12:07:46.757 [151810688] >TRACE: clssnmPollingThread: Connection complete
    [    CSSD]2012-03-21 12:07:46.757 [162296448] >TRACE: clssnmSendingThread: Connection complete
    [    CSSD]2012-03-21 12:07:46.757 [172782208] >TRACE: clssnmRcfgMgrThread: Connection complete
    [    CSSD]2012-03-21 12:07:46.757 [172782208] >TRACE: clssnmRcfgMgrThread: Local Join
    [    CSSD]2012-03-21 12:07:46.757 [172782208] >WARNING: clssnmLocalJoinEvent: takeover aborted due to connected but inactive nodes
    [    CSSD]2012-03-21 12:07:47.339 [94777984] >TRACE: clssnmHandleSync: Acknowledging sync: src[2] srcName[ctmisdb2] seq[5] sync[18]
    [    CSSD]2012-03-21 12:07:47.759 [172782208] >TRACE: clssnmRcfgMgrThread: lastleader(2) unique(1332311864)
    [    CSSD]2012-03-21 12:07:48.341 [94777984] >TRACE: clssnmSendVoteInfo: node(2) syncSeqNo(18)
    [    CSSD]2012-03-21 12:07:50.346 [94777984] >TRACE: clssnmUpdateNodeState: node 0, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
    [    CSSD]2012-03-21 12:07:50.346 [94777984] >TRACE: clssnmDeactivateNode: node 0 () left cluster
    [    CSSD]2012-03-21 12:07:50.346 [94777984] >TRACE: clssnmUpdateNodeState: node 1, state (1/2) unique (1332311864/1332311864) prevConuni(0) birth (0/18) (old/new)
    [    CSSD]2012-03-21 12:07:50.346 [94777984] >TRACE: clssnmUpdateNodeState: node 2, state (4/3) unique (1332303918/1332303918) prevConuni(0) birth (0/16) (old/new)
    [    CSSD]2012-03-21 12:07:50.346 [94777984] >USER: clssnmHandleUpdate: SYNC(18) from node(2) completed
    [    CSSD]2012-03-21 12:07:50.346 [94777984] >USER: clssnmHandleUpdate: NODE 1 (ctmisdb1) IS ACTIVE MEMBER OF CLUSTER
    [    CSSD]2012-03-21 12:07:50.346 [94777984] >USER: clssnmHandleUpdate: NODE 2 (ctmisdb2) IS ACTIVE MEMBER OF CLUSTER
    [    CSSD]2012-03-21 12:07:50.429 [28260544] >USER: NMEVENT_SUSPEND [00][00][00][00]
    [    CSSD]2012-03-21 12:07:50.429 [183267968] >TRACE: clssgmReconfigThread: started for reconfig (18)
    [    CSSD]2012-03-21 12:07:50.429 [183267968] >USER: NMEVENT_RECONFIG [00][00][00][06]
    [    CSSD]2012-03-21 12:07:50.429 [183267968] >TRACE: clssgmEstablishConnections: 2 nodes in cluster incarn 18
    [    CSSD]2012-03-21 12:07:50.430 [140255872] >TRACE: clssgmInitialRecv: (0x102a0360) accepted a new connection from node 2 born at 16 active (2, 2), vers (10,3,1,2)
    [    CSSD]2012-03-21 12:07:50.430 [140255872] >TRACE: clssgmInitialRecv: conns done (2/2)
    [    CSSD]2012-03-21 12:07:50.430 [183267968] >TRACE: clssgmEstablishMasterNode: MASTER for 18 is node(2) birth(16)
    [    CSSD]2012-03-21 12:07:50.430 [183267968] >TRACE: clssgmChangeMasterNode: requeued 0 RPCs
    [    CSSD]2012-03-21 12:07:50.432 [140255872] >TRACE: clssgmHandleDBDone(): src/dest (2/65535) size(72) incarn 18
    [    CSSD]CLSS-3000: reconfiguration successful, incarnation 18 with 2 nodes
    [    CSSD]CLSS-3001: local node number 1, master node number 2
    [    CSSD]2012-03-21 12:07:50.433 [183267968] >TRACE: clssgmReconfigThread: completed for reconfig(18), with status(1)
    [    CSSD]2012-03-21 12:07:50.550 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006603bb0) proc(0x8006608b00) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:07:50.551 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x80066066f0) proc(0x8006608d70) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:07:53.569 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660ec70) proc(0x8006611260) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:00.829 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006610990) proc(0x800660de00) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:04.698 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006613030) proc(0x8006612930) pid(8115) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:04.816 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006612950) proc(0x8006613c20) pid(8115) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:04.832 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006612950) proc(0x8006613c20) pid(8115) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:06.615 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006612950) proc(0x8006613c20) pid(8171) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:07.114 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006615960) proc(0x8006616350) pid(8175) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:11.373 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x80066192a0) proc(0x8006619470) pid(8302) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:11.669 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800661bf60) proc(0x800661ee20) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:17.135 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800661bf60) proc(0x800661ee70) pid(8458) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:17.268 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800661fc00) proc(0x80066220d0) pid(8460) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:17.305 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x80066223e0) proc(0x8006625250) pid(8462) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:17.353 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006625560) proc(0x8006628430) pid(8464) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:24.585 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006625560) proc(0x8006628430) pid(8645) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:27.957 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006628740) proc(0x800662b610) pid(8722) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:30.931 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800662cce0) proc(0x800662c860) pid(8801) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:36.400 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800661c5f0) proc(0x800661eb50) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:37.863 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800662f1c0) proc(0x800661eee0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:38.537 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800662f1c0) proc(0x800661d500) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:39.232 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800661bf60) proc(0x800661d500) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:43.085 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x8006611210) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:58.971 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x80066112c0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:09:59.290 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x800660b190) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:10:59.589 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x800660b190) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:11:59.904 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x800660b190) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:13:00.203 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x800660b190) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:13:14.029 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x800660b190) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:14:00.501 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x8006611210) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:15:00.809 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:16:01.117 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:17:01.447 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:01.762 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:39.841 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:42.123 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:42.316 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:42.843 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:42.963 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:43.098 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b260) proc(0x800662bd20) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:44.173 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:44.368 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b260) proc(0x800660b310) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:45.351 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:46.236 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:47.031 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:47.694 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:47.819 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b260) proc(0x800660b310) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:48.103 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:48.327 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b260) proc(0x800660b310) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:48.484 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x8006611210) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:48.758 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:49.529 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:50.509 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:51.060 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:51.558 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:48:39.836 >USER: Oracle Database 10g CSS Release 10.2.0.1.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
    [  clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=ctmisdb1DBG_CSSD))
    [    CSSD]2012-03-21 12:48:39.836 >USER: CSS daemon log for node ctmisdb1, number 1, in cluster crs
    [    CSSD]2012-03-21 12:48:39.849 [28260544] >TRACE: clssscmain: local-only set to false
    [    CSSD]2012-03-21 12:48:39.865 [28260544] >TRACE: clssnmReadNodeInfo: added node 1 (ctmisdb1) to cluster
    [    CSSD]2012-03-21 12:48:39.872 [28260544] >TRACE: clssnmReadNodeInfo: added node 2 (ctmisdb2) to cluster
    [    CSSD]2012-03-21 12:48:39.879 [72925824] >TRACE: clssnm_skgxnmon: skgxn init failed, rc 1
    [    CSSD]2012-03-21 12:48:39.879 [28260544] >TRACE: clssnm_skgxnonline: Using vacuous skgxn monitor
    [    CSSD]2012-03-21 12:48:39.881 [28260544] >TRACE: clssnmInitNMInfo: misscount set to 60
    [    CSSD]2012-03-21 12:48:39.888 [28260544] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0//dev/raw/raw2)
    [    CSSD]2012-03-21 12:48:41.892 [72925824] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/raw/raw2)
    [    CSSD]2012-03-21 12:48:41.915 [72925824] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(20) wrtcnt(10367) LATS(0) Disk lastSeqNo(10367)
    [    CSSD]2012-03-21 12:48:41.959 [28260544] >TRACE: clssnmFatalInit: fatal mode enabled
    [    CSSD]2012-03-21 12:48:41.959 [94777984] >TRACE: clssnmconnect: connecting to node 1, flags 0x0001, connector 1
    [    CSSD]2012-03-21 12:48:41.959 [94777984] >TRACE: clssnmconnect: connecting to node 0, flags 0x0000, connector 1
    [    CSSD]2012-03-21 12:48:41.959 [94777984] >TRACE: clssnmClusterListener: Probing node(2)
    [    CSSD]2012-03-21 12:48:41.961 [94777984] >TRACE: clssnmConnComplete: connected to node 2 (con 0x8006702790), state 3 birth 0, unique 1332303918/1332303918 prevConuni(0)
    [    CSSD]2012-03-21 12:48:41.962 [106332800] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1))
    [    CSSD]2012-03-21 12:48:41.962 [106332800] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_ctmisdb1_crs))
    [    CSSD]2012-03-21 12:48:41.963 [152330880] >TRACE: clssnmPollingThread: Connection complete
    [    CSSD]2012-03-21 12:48:41.963 [162816640] >TRACE: clssnmSendingThread: Connection complete
    [    CSSD]2012-03-21 12:48:41.963 [173302400] >TRACE: clssnmRcfgMgrThread: Connection complete
    [    CSSD]2012-03-21 12:48:41.963 [173302400] >TRACE: clssnmRcfgMgrThread: Local Join
    [    CSSD]2012-03-21 12:48:41.963 [173302400] >WARNING: clssnmLocalJoinEvent: takeover aborted due to connected but inactive nodes
    [    CSSD]2012-03-21 12:48:42.631 [94777984] >TRACE: clssnmHandleSync: Acknowledging sync: src[2] srcName[ctmisdb2] seq[13] sync[20]
    [    CSSD]2012-03-21 12:48:42.965 [173302400] >TRACE: clssnmRcfgMgrThread: lastleader(2) unique(1332314319)
    [    CSSD]2012-03-21 12:48:43.636 [94777984] >TRACE: clssnmSendVoteInfo: node(2) syncSeqNo(20)
    [    CSSD]2012-03-21 12:48:45.640 [94777984] >TRACE: clssnmUpdateNodeState: node 0, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
    [    CSSD]2012-03-21 12:48:45.640 [94777984] >TRACE: clssnmDeactivateNode: node 0 () left cluster
    [    CSSD]2012-03-21 12:48:45.640 [94777984] >TRACE: clssnmUpdateNodeState: node 1, state (1/2) unique (1332314319/1332314319) prevConuni(0) birth (0/20) (old/new)
    [    CSSD]2012-03-21 12:48:45.640 [94777984] >TRACE: clssnmUpdateNodeState: node 2, state (4/3) unique (1332303918/1332303918) prevConuni(0) birth (0/16) (old/new)
    [    CSSD]2012-03-21 12:48:45.640 [94777984] >USER: clssnmHandleUpdate: SYNC(20) from node(2) completed
    [    CSSD]2012-03-21 12:48:45.640 [94777984] >USER: clssnmHandleUpdate: NODE 1 (ctmisdb1) IS ACTIVE MEMBER OF CLUSTER
    [    CSSD]2012-03-21 12:48:45.640 [94777984] >USER: clssnmHandleUpdate: NODE 2 (ctmisdb2) IS ACTIVE MEMBER OF CLUSTER
    [    CSSD]2012-03-21 12:48:45.737 [28260544] >USER: NMEVENT_SUSPEND [00][00][00][00]
    [    CSSD]2012-03-21 12:48:45.738 [183788160] >TRACE: clssgmReconfigThread: started for reconfig (20)
    [    CSSD]2012-03-21 12:48:45.738 [183788160] >USER: NMEVENT_RECONFIG [00][00][00][06]
    [    CSSD]2012-03-21 12:48:45.738 [183788160] >TRACE: clssgmEstablishConnections: 2 nodes in cluster incarn 20
    [    CSSD]2012-03-21 12:48:45.739 [140776064] >TRACE: clssgmInitialRecv: (0x102a0370) accepted a new connection from node 2 born at 16 active (2, 2), vers (10,3,1,2)
    [    CSSD]2012-03-21 12:48:45.739 [140776064] >TRACE: clssgmInitialRecv: conns done (2/2)
    [    CSSD]2012-03-21 12:48:45.739 [183788160] >TRACE: clssgmEstablishMasterNode: MASTER for 20 is node(2) birth(16)
    [    CSSD]2012-03-21 12:48:45.739 [183788160] >TRACE: clssgmChangeMasterNode: requeued 0 RPCs
    [    CSSD]2012-03-21 12:48:45.741 [140776064] >TRACE: clssgmHandleDBDone(): src/dest (2/65535) size(72) incarn 20
    [    CSSD]CLSS-3000: reconfiguration successful, incarnation 20 with 2 nodes
    Please check and help.

  • RAC nodes rebooting

    I'm a newbie trying to implement 11g RAC using Openfiler on Enterprise Linux 5.3.
    I have so far successfully configured Openfiler, created volumes, configured the nodes, and configured OCFS2 and ASM.
    When I rebooted the machines, I first started the Openfiler server and external storage; they start fine and all volumes (devices) come up fine. But when I boot the nodes one after the other, they keep rebooting continuously, one after the other, a couple of minutes apart. I am clueless about how to figure out what the problem is and why this is happening. Has anyone else experienced a similar situation? How can this be resolved?
    I would appreciate any advise or help
    Thanks

    What is the difference in timings on your RAC nodes? Anything > 45 seconds can possibly cause reboots.
    Check your disk timeouts and hangcheck-timer settings; a couple of quick checks are sketched below.
    hth
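    For example (a sketch, assuming Linux RAC nodes as described above):
    $ date                               (run simultaneously on all nodes and compare)
    $ ntpq -p                            (check NTP synchronisation state)
    $ lsmod | grep hangcheck             (confirm hangcheck-timer is loaded)
    $ grep hangcheck /etc/modprobe.conf  (inspect hangcheck_tick / hangcheck_margin)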

  • Node reboot has some problem

    Hi,
    I have an Oracle 10g two-node RAC cluster using ASM. I have a problem when I reboot a node: it doesn't start my database service. It works when I use crsctl stop and then start, but when I restart the complete node, it doesn't start the database service. What parameter needs to be set so that my database service starts immediately when the node restarts, without user intervention?
    Thanking you

    "It doesn't start my database service" - you mean a database service in the database?
    If it errors while starting (at reboot), check ORACLE_HOME/log/<nodename>/racg/*
    Database services depend on the instance (database), so your database has to start first.
    If you stop database services before rebooting the machine, or before a crs stop/start, they might not start automatically.
    Check database service resource:
    Example:
    $ crs_stat | grep NAME\= | grep \.srv | awk -F\= '{print $2}'
    ora.DB.service1.DB1.srv
    $ crs_stat -p ora.DB.service1.DB1.srv | grep AUTO_START
    AUTO_START=restore
    restore = if you stopped the database service manually before the reboot, you must start it manually after that.
    Or check database service configuration
    $ srvctl config service -d DB -s service1 -a -S 1
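    If the service was stopped manually before the reboot, enable and start it so that it comes back on the next restart (a sketch, using the same DB/service1 names as above):
    $ srvctl enable service -d DB -s service1
    $ srvctl start service -d DB -s service1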
    Good Luck

  • DB didn't come up along with crs after node reboot

    Grid Version: 11.2.0.3
    OS: Red Hat Enterprise Linux 5.6
    Node2 of our two-node RAC got rebooted. Upon reboot, CRS and the ASM instance came up, but the DB didn't come up.
    How can I check if DB is linked to CRS startup ?
    How can I enable DB startup upon CRS startup ?

    Hi,
    Check the alert log of the database instance on that node.
    By default, if the Oracle database instance was already started on a node, CRS automatically starts the database instance on the other nodes.
    But if you issue "srvctl stop instance" before the shutdown of the node, the state of this resource will be "shutdown" (i.e. it stays down). The CRS database resource has by default the attribute AUTO_START=restore, which means Oracle CRSD will remember the last state of that resource.
    In this case you must manually issue "srvctl start instance" after startup of the clusterware; but if the database instance was running and you issued "crsctl stop crs", crsd must start the database automatically at clusterware start.
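    To verify (a sketch; replace <dbname> with your database resource name):
    $ crsctl status resource ora.<dbname>.db -p | grep AUTO_START
    AUTO_START=restore
    $ srvctl start database -d <dbname>    (manual start needed after a clean "srvctl stop database")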

  • If using MSSQ, when an Oracle RAC node reboots, clients get a TPEOS error

    Hi, all
    In my Tuxedo application, if we use Single Server, Single Queue mode, then when we reboot any Oracle RAC node our application is OK and clients get correct results. If we use MSSQ (Multiple Servers, Single Queue), the application is also OK while the Oracle RAC nodes are up; but if we reboot any Oracle RAC node, client programs can continue to run and get correct results, yet always get a TPEOS error. In this situation the server can get the client request, but the client cannot get the server reply, only a TPEOS error.
    Our environment is:
    Oracle RAC 10g 10.2.0.4, two instances (rac1, rac2), and two DTP services s1 and s2; TAF for the s1 and s2 services is set to basic
    Tuxedo 10gR3, two nodes, working in MP mode, using XA to access the Oracle RAC database; services are both transactional and non-transactional
    OS is Linux AS4 U5, 64-bit
    the service programs use OCI
    Has anyone encountered this problem?

    Hi, first, thank you.
    In the ULOG file there is only failover information, no other error message; on the client side there are also no other errors.
    When not using MSSQ, the ubb file config is:
    *SERVERS
    DEFAULT:
    CLOPT="-A "
    sinUpdate_server SRVGRP=GROUP11 SRVID=80 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinUpdate_server SRVGRP=GROUP12 SRVID=160 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinCount_server SRVGRP=GROUP11 SRVID=240 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinCount_server SRVGRP=GROUP12 SRVID=320 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinSelect_server SRVGRP=GROUP11 SRVID=360 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinSelect_server SRVGRP=GROUP12 SRVID=400 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinInsert_server SRVGRP=GROUP11 SRVID=520 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinInsert_server SRVGRP=GROUP12 SRVID=560 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinDelete_server SRVGRP=GROUP11 SRVID=600 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinDelete_server SRVGRP=GROUP12 SRVID=640 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinDdl_server SRVGRP=GROUP11 SRVID=700 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinDdl_server SRVGRP=GROUP12 SRVID=740 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    lockselect_server SRVGRP=GROUP11 SRVID=800 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    lockselect_server SRVGRP=GROUP12 SRVID=840 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    #mulup_server SRVGRP=GROUP11 SRVID=1 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    #mulup_server SRVGRP=GROUP12 SRVID=60 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinUpdate_server SRVGRP=GROUP13 SRVID=83 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinUpdate_server SRVGRP=GROUP14 SRVID=164 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinCount_server SRVGRP=GROUP13 SRVID=243 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinCount_server SRVGRP=GROUP14 SRVID=324 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinSelect_server SRVGRP=GROUP13 SRVID=363 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinSelect_server SRVGRP=GROUP14 SRVID=404 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinInsert_server SRVGRP=GROUP13 SRVID=523 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinInsert_server SRVGRP=GROUP14 SRVID=564 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinDelete_server SRVGRP=GROUP13 SRVID=603 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinDelete_server SRVGRP=GROUP14 SRVID=644 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinDdl_server SRVGRP=GROUP13 SRVID=703 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinDdl_server SRVGRP=GROUP14 SRVID=744 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    lockselect_server SRVGRP=GROUP13 SRVID=803 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    lockselect_server SRVGRP=GROUP14 SRVID=844 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    #mulup_server SRVGRP=GROUP13 SRVID=13 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    #mulup_server SRVGRP=GROUP14 SRVID=64 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    WSL SRVGRP=GROUP11 SRVID=1000
    CLOPT="-A -- -n//120.3.8.237:7200 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
    WSL SRVGRP=GROUP12 SRVID=1001
    CLOPT="-A -- -n//120.3.8.238:7200 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
    WSL SRVGRP=GROUP13 SRVID=1003
    CLOPT="-A -- -n//120.3.8.237:7203 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
    WSL SRVGRP=GROUP14 SRVID=1004
    CLOPT="-A -- -n//120.3.8.238:7204 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
    When we use MSSQ, the relevant part of the UBB file is:
    *SERVERS
    DEFAULT:
    CLOPT="-A -p 1,60:1,30"
    sinUpdate_server SRVGRP=GROUP11 SRVID=80 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinUpdate11 REPLYQ=Y
    sinUpdate_server SRVGRP=GROUP12 SRVID=160 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinUpdate12 REPLYQ=Y
    sinCount_server SRVGRP=GROUP11 SRVID=240 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinCount11 REPLYQ=Y
    sinCount_server SRVGRP=GROUP12 SRVID=320 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinCount12 REPLYQ=Y
    sinSelect_server SRVGRP=GROUP11 SRVID=360 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinSelec11 REPLYQ=Y
    sinSelect_server SRVGRP=GROUP12 SRVID=400 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinSelect12 REPLYQ=Y
    sinInsert_server SRVGRP=GROUP11 SRVID=520 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinInsert11 REPLYQ=Y
    sinInsert_server SRVGRP=GROUP12 SRVID=560 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinInsert12 REPLYQ=Y
    sinDelete_server SRVGRP=GROUP11 SRVID=600 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDelete11 REPLYQ=Y
    sinDelete_server SRVGRP=GROUP12 SRVID=640 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDelete12 REPLYQ=Y
    sinDdl_server SRVGRP=GROUP11 SRVID=700 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDdl11 REPLYQ=Y
    sinDdl_server SRVGRP=GROUP12 SRVID=740 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDdl12 REPLYQ=Y
    lockselect_server SRVGRP=GROUP11 SRVID=800 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=lockselect11 REPLYQ=Y
    lockselect_server SRVGRP=GROUP12 SRVID=840 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=lockselect12 REPLYQ=Y
    #mulup_server SRVGRP=GROUP11 SRVID=1 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=mulup11 REPLYQ=Y
    #mulup_server SRVGRP=GROUP12 SRVID=60 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=mulup12 REPLYQ=Y
    sinUpdate_server SRVGRP=GROUP13 SRVID=83 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinUpdate13 REPLYQ=Y
    sinUpdate_server SRVGRP=GROUP14 SRVID=164 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinUpdate14 REPLYQ=Y
    sinCount_server SRVGRP=GROUP13 SRVID=243 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinCount13 REPLYQ=Y
    sinCount_server SRVGRP=GROUP14 SRVID=324 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinCount14 REPLYQ=Y
    sinSelect_server SRVGRP=GROUP13 SRVID=363 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinSelec13 REPLYQ=Y
    sinSelect_server SRVGRP=GROUP14 SRVID=404 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinSelect14 REPLYQ=Y
    sinInsert_server SRVGRP=GROUP13 SRVID=523 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinInsert13 REPLYQ=Y
    sinInsert_server SRVGRP=GROUP14 SRVID=564 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinInsert14 REPLYQ=Y
    sinDelete_server SRVGRP=GROUP13 SRVID=603 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDelete13 REPLYQ=Y
    sinDelete_server SRVGRP=GROUP14 SRVID=644 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDelete14 REPLYQ=Y
    sinDdl_server SRVGRP=GROUP13 SRVID=703 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDdl13 REPLYQ=Y
    sinDdl_server SRVGRP=GROUP14 SRVID=744 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDdl14 REPLYQ=Y
    lockselect_server SRVGRP=GROUP13 SRVID=803 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=lockselect13 REPLYQ=Y
    lockselect_server SRVGRP=GROUP14 SRVID=844 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=lockselect14 REPLYQ=Y
    #mulup_server SRVGRP=GROUP13 SRVID=13 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=mulup13 REPLYQ=Y
    #mulup_server SRVGRP=GROUP14 SRVID=64 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=mulup14 REPLYQ=Y
    WSL SRVGRP=GROUP11 SRVID=1000
    CLOPT="-A -- -n//120.3.8.237:7200 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
    WSL SRVGRP=GROUP12 SRVID=1001
    CLOPT="-A -- -n//120.3.8.238:7200 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
    WSL SRVGRP=GROUP13 SRVID=1003
    CLOPT="-A -- -n//120.3.8.237:7203 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
    WSL SRVGRP=GROUP14 SRVID=1004
    CLOPT="-A -- -n//120.3.8.238:7204 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
    Is there anything wrong with the UBB file above, or are we using MSSQ incorrectly?
    Looking forward to your answer. Thanks.
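
    One quick way to rule out plain syntax problems in the UBB file itself is a check-only compile with tmloadcf (ubbconfig is the assumed file name here):
    $ tmloadcf -n ubbconfig    # syntax check only; does not write the TUXCONFIG binary
    This will not explain the TPEOS behaviour by itself, but it confirms that the MSSQ entries (RQADDR/REPLYQ) are at least syntactically valid.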
