Node Eviction and SGA

Hello All,
We have a 6-node RAC on 10g Release 2 / Windows 2003 64-bit. It was working well in all respects.
About 3 weeks back (3 days before I was to go on vacation) the SA needed to add more power modules, so the entire system (including the SAN) was powered down and then brought back up. The DB machines by themselves have undergone a complete reboot before without any issues. This time it was the entire IT system.
Two days after that, all of a sudden, we started witnessing node eviction issues. Every day one node would get evicted, but the machine would not go down. The typical messages seen were (below is the message from ocssd.log on node 2):
[    CSSD]2008-07-27 16:04:14.605 [5540] >WARNING: clssnmPollingThread: node serv-db01 (1) at 50% heartbeat fatal, eviction in 29.125 seconds
[    CSSD]2008-07-27 16:04:29.605 [5540] >WARNING: clssnmPollingThread: node serv-db01 (1) at 75% heartbeat fatal, eviction in 14.125 seconds
[    CSSD]2008-07-27 16:04:38.606 [5540] >WARNING: clssnmPollingThread: node serv-db01 (1) at 90% heartbeat fatal, eviction in 5.125 seconds
[    CSSD]2008-07-27 16:04:39.606 [5540] >WARNING: clssnmPollingThread: node serv-db01 (1) at 90% heartbeat fatal, eviction in 4.125 seconds
[    CSSD]2008-07-27 16:04:40.606 [5540] >TRACE: clssnmPollingThread: node serv-db01 (1) is impending reconfig
[    CSSD]2008-07-27 16:04:40.606 [5540] >WARNING: clssnmPollingThread: node serv-db01 (1) at 90% heartbeat fatal, eviction in 3.125 seconds
[    CSSD]2008-07-27 16:04:40.606 [5540] >TRACE: clssnmPollingThread: diskTimeout set to (57000)ms impending reconfig status(1)
[    CSSD]2008-07-27 16:04:41.606 [5540] >TRACE: clssnmPollingThread: node serv-db01 (1) is impending reconfig
[    CSSD]2008-07-27 16:04:41.606 [5540] >WARNING: clssnmPollingThread: node serv-db01 (1) at 90% heartbeat fatal, eviction in 2.125 seconds
[    CSSD]2008-07-27 16:04:42.606 [5540] >TRACE: clssnmPollingThread: node serv-db01 (1) is impending reconfig
[    CSSD]2008-07-27 16:04:42.606 [5540] >WARNING: clssnmPollingThread: node serv-db01 (1) at 90% heartbeat fatal, eviction in 1.125 seconds
[    CSSD]2008-07-27 16:04:43.606 [5540] >TRACE: clssnmPollingThread: node serv-db01 (1) is impending reconfig
[    CSSD]2008-07-27 16:04:43.606 [5540] >WARNING: clssnmPollingThread: node serv-db01 (1) at 90% heartbeat fatal, eviction in 0.125 seconds
[    CSSD]2008-07-27 16:04:43.731 [5540] >TRACE: clssnmPollingThread: node serv-db01 (1) is impending reconfig
[    CSSD]2008-07-27 16:04:43.731 [5540] >TRACE: clssnmPollingThread: Eviction started for node serv-db01 (1), flags 0x000f, state 3, wt4c 0
[    CSSD]2008-07-27 16:04:43.731 [5640] >TRACE: clssnmDoSyncUpdate: Initiating sync 8
[    CSSD]2008-07-27 16:04:43.731 [5640] >TRACE: clssnmDoSyncUpdate: diskTimeout set to (57000)ms
[    CSSD]2008-07-27 16:04:43.731 [5640] >TRACE: clssnmSetupAckWait: Ack message type (11)
[    CSSD]2008-07-27 16:04:43.731 [5640] >TRACE: clssnmSetupAckWait: node(1) is ALIVE
[    CSSD]2008-07-27 16:04:43.731 [5640] >TRACE: clssnmSetupAckWait: node(2) is ALIVE
[    CSSD]2008-07-27 16:04:43.731 [5640] >TRACE: clssnmSetupAckWait: node(3) is ALIVE
[    CSSD]2008-07-27 16:04:43.731 [5640] >TRACE: clssnmSetupAckWait: node(4) is ALIVE
[    CSSD]2008-07-27 16:04:43.731 [5640] >TRACE: clssnmSetupAckWait: node(5) is ALIVE
[    CSSD]2008-07-27 16:04:43.731 [5640] >TRACE: clssnmSendSync: syncSeqNo(8)
[    CSSD]2008-07-27 16:04:43.731 [5648] >TRACE: clssnmHandleSync: Acknowledging sync: src[2] srcName[serv-db02] seq[1] sync[8]
[    CSSD]2008-07-27 16:04:43.731 [5648] >TRACE: clssnmHandleSync: diskTimeout set to (57000)ms
[    CSSD]2008-07-27 16:04:43.731 [4340] >USER: NMEVENT_SUSPEND [00][00][00][3e]
[    CSSD]2008-07-27 16:04:43.731 [5640] >TRACE: clssnmWaitForAcks: Ack message type(11), ackCount(4)
[    CSSD]2008-07-27 16:04:43.731 [5640] >TRACE: clssnmWaitForAcks: node(1) is expiring, msg type(11)
[    CSSD]2008-07-27 16:04:43.731 [5640] >TRACE: clssnmWaitForAcks: done, msg type(11)
[    CSSD]2008-07-27 16:04:43.731 [5640] >TRACE: clssnmDoSyncUpdate: Terminating node 1, serv-db01, misstime(60000) state(3)
[    CSSD]2008-07-27 16:04:43.731 [5640] >TRACE: clssnmSetupAckWait: Ack message type (13)
No information was written to the alert logs on any of the nodes.
We contacted Oracle Support and they said it was a network issue, etc. But my SA was adamant that it was an Oracle problem. Anyway, I went on my vacation. There was a suggestion (the SA had an Oracle contact) that the SGA needed to be increased. It was at 800 MB per node. My junior DBA was forced to raise it to 2 GB on each node based on the SA's suggestion. Then, all of a sudden, from the next day, the node evictions stopped.
I still cannot believe that increasing the SGA has anything to do with node eviction. I told my upper management that node eviction has nothing to do with the SGA. But the consensus in my IT department is that the SGA increase solved the issue. Does anybody think there is any connection between the increase in SGA and node eviction? I have read the node eviction papers on Metalink and they do not mention the SGA at all.
I would really appreciate any help in this regard.
Thank You,
Sat

WARNING: clssnmPollingThread: node serv
As you know, the above warning message appears because of a network delay, which is what causes the node evictions. If a node doesn't send a network heartbeat for <misscount> seconds, the node is evicted from the cluster. Have you checked with your network team about any glitches around the time of the node evictions?
I have read the node eviction papers in metalink and they do not mention about SGA at all.
One of the other prime reasons for node eviction is a lack of resources on the server. I am not sure that increasing the SGA would have resolved the node eviction issue. Why don't you produce a test case and submit it to Oracle Support for more clarification?
Jaffar
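
The countdown in the ocssd.log excerpt from the original post can be cross-checked against the moment the eviction actually starts. Here is a minimal Python sketch, assuming the 10g log line format quoted in this thread. The names `parse_warning` and `projected_eviction` are hypothetical helpers written for this illustration, not part of any Oracle tool. It adds the "eviction in N seconds" value to each warning's timestamp; if all warnings refer to the same missed-heartbeat window, the results should agree with each other and with the "Eviction started" trace.

```python
import re
from datetime import datetime, timedelta

# Matches 10g-style ocssd.log polling-thread warnings as quoted in this thread.
WARNING_RE = re.compile(
    r"\[\s*CSSD\](?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[\d+\] >WARNING: clssnmPollingThread: node (?P<node>\S+) \((?P<num>\d+)\) "
    r"at (?P<pct>\d+)% heartbeat fatal, eviction in (?P<left>\d+\.\d+) seconds"
)


def parse_warning(line):
    """Return (timestamp, node, percent, seconds_left), or None if no match."""
    m = WARNING_RE.search(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
    return ts, m.group("node"), int(m.group("pct")), float(m.group("left"))


def projected_eviction(line):
    """Timestamp at which the warning says the node will be evicted."""
    parsed = parse_warning(line)
    if parsed is None:
        return None
    ts, _node, _pct, left = parsed
    return ts + timedelta(seconds=left)


if __name__ == "__main__":
    sample = [
        "[    CSSD]2008-07-27 16:04:14.605 [5540] >WARNING: clssnmPollingThread: "
        "node serv-db01 (1) at 50% heartbeat fatal, eviction in 29.125 seconds",
        "[    CSSD]2008-07-27 16:04:29.605 [5540] >WARNING: clssnmPollingThread: "
        "node serv-db01 (1) at 75% heartbeat fatal, eviction in 14.125 seconds",
    ]
    for line in sample:
        print(projected_eviction(line))
```

For the two sample lines, both projections land at about 16:04:43.730, one millisecond before the "Eviction started" trace at 16:04:43.731 in the posted log. That is consistent with a single uninterrupted heartbeat gap running up to the misstime(60000) reported on termination, though it says nothing about what caused the gap.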

Similar Messages

  • Rac node evicted and asm related

    Hi friends,
    I have a few doubts about the RAC environment.
    1. In a 2-node RAC, if you forget to mention '+' while adding a datafile to a tablespace, what will happen? Will the datafile be created, or will an error be thrown? If it is created, where exactly is it located, and how can users on the other node work with that tablespace? What steps must be performed to make that datafile usable for users on all nodes?
    2. In a RAC environment, how do you check how many sessions are connected to a particular node?
    3. In RAC, if a node is evicted due to a network failure and we then rebuild the network, are there any manual steps needed to access the failed node afterwards, or will it automatically become available in the cluster group again? Which service performs this activity?
    4. While configuring Clusterware, you choose the voting disk and OCR disk locations and the redundancy level. Suppose you go for normal redundancy: how many disks can you select for each file, one or two?

    [grid@srvtestdb1 ~]$ ps -ef|grep tns
    root 65 2 0 Aug29 ? 00:00:00 [netns]
    grid 4449 1 0 Aug29 ? 00:00:25 /u01/app/11.2.0/grid/bin/tnslsnr LISTENER_SCAN2 -inherit
    grid 4454 1 0 Aug29 ? 00:00:23 /u01/app/11.2.0/grid/bin/tnslsnr LISTENER_SCAN3 -inherit
    grid 4481 1 0 Aug29 ? 00:00:33 /u01/app/11.2.0/grid/bin/tnslsnr LISTENER -inherit
    grid 37028 1 0 09:38 ? 00:00:00 /u01/app/11.2.0/grid/bin/tnslsnr LISTENER_SCAN1 -inherit
    grid 37901 36372 0 09:45 pts/0 00:00:00 grep tns
    [grid@srvtestdb1 ~]$
    [grid@srvtestdb1 ~]$ srvctl config scan_listener
    SCAN Listener LISTENER_SCAN1 exists. Port: TCP:1521
    SCAN Listener LISTENER_SCAN2 exists. Port: TCP:1521
    SCAN Listener LISTENER_SCAN3 exists. Port: TCP:1521
    [grid@srvtestdb1 ~]$
    [grid@srvtestdb1 ~]$ srvctl status scan_listener
    SCAN Listener LISTENER_SCAN1 is enabled
    SCAN listener LISTENER_SCAN1 is running on node srvtestdb1
    SCAN Listener LISTENER_SCAN2 is enabled
    SCAN listener LISTENER_SCAN2 is running on node srvtestdb1
    SCAN Listener LISTENER_SCAN3 is enabled
    SCAN listener LISTENER_SCAN3 is running on node srvtestdb1
    [grid@srvtestdb1 ~]$ srvctl status scan
    SCAN VIP scan1 is enabled
    SCAN VIP scan1 is running on node srvtestdb1
    SCAN VIP scan2 is enabled
    SCAN VIP scan2 is running on node srvtestdb1
    SCAN VIP scan3 is enabled
    SCAN VIP scan3 is running on node srvtestdb1

  • Oracle 10g RAC on Solaris Node Eviction

    I have been having periodic node evictions on my server. I've found several threads regarding RAC node reboots, but nothing specific. In my case, the node eviction warning appears to be "immediate":
    [cssd(9530)]CRS-1612:node mbdmb2 (0) at 50% heartbeat fatal, eviction in 0.000 seconds
    [cssd(9530)]CRS-1612:node mbdmb2 (0) at 50% heartbeat fatal, eviction in 0.000 seconds
    [cssd(9530)]CRS-1611:node mbdmb2 (0) at 75% heartbeat fatal, eviction in 0.000 seconds
    [cssd(9530)]CRS-1611:node mbdmb2 (0) at 75% heartbeat fatal, eviction in 0.000 seconds
    [cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
    [cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
    [cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
    [cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
    [cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
    [cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
    [cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
    [cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
    [cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
    [cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
    [cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
    [cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
    [cssd(9530)]CRS-1607:CSSD evicting node mbdmb2. Details in /u01/crs/oracle/product/10.2/app/log/mbdmb1/cssd/ocssd.log.
    Other people's logs seem to show some time to recover, and the node only reboots when it eventually runs out of time:
    2009-08-31 16:05:41.405
    [cssd(4968)]CRS-1612:node simsd1 (1) at 50% heartbeat fatal, eviction in 29.611 seconds
    2009-08-31 16:05:42.403
    [cssd(4968)]CRS-1612:node simsd1 (1) at 50% heartbeat fatal, eviction in 28.613 seconds
    2009-08-31 16:05:56.412
    [cssd(4968)]CRS-1611:node simsd1 (1) at 75% heartbeat fatal, eviction in 14.604 seconds
    2009-08-31 16:05:57.411
    [cssd(4968)]CRS-1611:node simsd1 (1) at 75% heartbeat fatal, eviction in 13.605 seconds
    2009-08-31 16:06:05.413
    [cssd(4968)]CRS-1610:node simsd1 (1) at 90% heartbeat fatal, eviction in 5.603 seconds
    2009-08-31 16:06:06.412
    [cssd(4968)]CRS-1610:node simsd1 (1) at 90% heartbeat fatal, eviction in 4.604 seconds
    2009-08-31 16:06:07.410
    [cssd(4968)]CRS-1610:node simsd1 (1) at 90% heartbeat fatal, eviction in 3.606 seconds
    2009-08-31 16:06:08.409
    [cssd(4968)]CRS-1610:node simsd1 (1) at 90% heartbeat fatal, eviction in 2.607 seconds
    2009-08-31 16:06:09.407
    [cssd(4968)]CRS-1610:node simsd1 (1) at 90% heartbeat fatal, eviction in 1.609 seconds
    2009-08-31 16:06:10.405
    [cssd(4968)]CRS-1610:node simsd1 (1) at 90% heartbeat fatal, eviction in 0.611 seconds
    2009-08-31 16:06:11.061
    [cssd(4968)]CRS-1609:CSSD detected a network split. Details in C:\product\11.1.0\crs\log\simsd2\cssd\ocssd.log.
    2009-08-31 16:14:37.873
    I'm led to think this is due to something in the setting of the heartbeat loss window. There are some threads suggesting the hangcheck-timer, but it does not appear to be available for Solaris. I'm wondering where, if anywhere, I can check or change this setting.

    Ah, thanks. It looks like even just looking at the log yielded something different. I was grepping the alert log instead, which apparently doesn't show as much (and shows the time as 0). The ocssd.log shows it with the time to live.
    One more question: can you tell from this whether it is related to the network heartbeat, the disk heartbeat, or something else?
    Thanks!
    [    CSSD]2010-02-24 06:46:17.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
    [    CSSD]2010-02-24 06:46:17.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
    [    CSSD]2010-02-24 06:46:22.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
    [    CSSD]2010-02-24 06:46:22.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
    [    CSSD]2010-02-24 06:46:27.685 [14] >TRACE: clssgmRegisterClient: proc(17/1009503f0), client(344/10097f
    7f0)
    [    CSSD]2010-02-24 06:46:27.685 [14] >TRACE: clssgmExecuteClientRequest: GRKJOIN recvd from client 344 (
    10097f7f0)
    [    CSSD]2010-02-24 06:46:27.685 [14] >TRACE: clssgmJoinGrock: grock DG_FLASH51 new client 10097f7f0 with
    con 100932430, requested num -1
    [    CSSD]2010-02-24 06:46:27.685 [14] >TRACE: clssgmAddGrockMember: adding member to grock DG_FLASH51
    [    CSSD]2010-02-24 06:46:27.685 [14] >TRACE: clssgmAddMember: member (2/100921830) added. pbsz(123) prsz
    (42) flags 0x0 to grock (100914210/DG_FLASH51)
    [    CSSD]2010-02-24 06:46:27.686 [14] >TRACE: clssgmQueueGrockEvent: groupName(DG_FLASH51) count(3) maste
    r(0) event(1), incarn 208505, mbrc 3, to member 0, events 0x0, state 0x0
    [    CSSD]2010-02-24 06:46:27.686 [14] >TRACE: clssgmCommonAddMember: Local member(2) node(1) flags 0x0 0x
    100 grock (2/100914210/DG_FLASH51)
    [    CSSD]2010-02-24 06:46:27.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
    [    CSSD]2010-02-24 06:46:27.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
    [    CSSD]2010-02-24 06:46:28.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 50 2.123767e-314art
    beat fatal, eviction in 59.577 seconds
    [    CSSD]2010-02-24 06:46:28.941 [18] >TRACE: clssnmPollingThread: node mbdmb2 (2) is impending reconfig,
    flag 1, misstime 60423
    [    CSSD]2010-02-24 06:46:28.941 [18] >TRACE: clssnmPollingThread: diskTimeout set to (117000)ms impendin
    g reconfig status(1)
    [    CSSD]2010-02-24 06:46:29.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 50 2.123767e-314art
    beat fatal, eviction in 58.577 seconds
    [    CSSD]2010-02-24 06:46:30.363 [17] >TRACE: clssgmDispatchCMXMSG(): msg type(12) src(2) dest(1) size(36
    0) tag(00d2002a) incarnation(88)
    [    CSSD]2010-02-24 06:46:32.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
    [    CSSD]2010-02-24 06:46:32.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
    [    CSSD]2010-02-24 06:46:37.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
    [    CSSD]2010-02-24 06:46:37.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
    [    CSSD]2010-02-24 06:46:42.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
    [    CSSD]2010-02-24 06:46:42.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
    [    CSSD]2010-02-24 06:46:47.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
    [    CSSD]2010-02-24 06:46:47.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
    [    CSSD]2010-02-24 06:46:52.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
    [    CSSD]2010-02-24 06:46:57.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
    [    CSSD]2010-02-24 06:46:57.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
    [    CSSD]2010-02-24 06:46:58.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 75 2.123767e-314art
    beat fatal, eviction in 29.577 seconds
    [    CSSD]2010-02-24 06:46:59.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 75 2.123767e-314art
    beat fatal, eviction in 28.577 seconds
    [    CSSD]2010-02-24 06:47:02.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
    [    CSSD]2010-02-24 06:47:02.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
    [    CSSD]2010-02-24 06:47:07.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
    [    CSSD]2010-02-24 06:47:07.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
    [    CSSD]2010-02-24 06:47:12.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
    [    CSSD]2010-02-24 06:47:12.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
    [    CSSD]2010-02-24 06:47:16.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 90 2.123767e-314art
    beat fatal, eviction in 11.577 seconds
    [    CSSD]2010-02-24 06:47:17.725 [17] >TRACE: clssgmDispatchCMXMSG(): msg type(12) src(2) dest(1) size(36
    0) tag(00d3002a) incarnation(88)
    [    CSSD]2010-02-24 06:47:17.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
    [    CSSD]2010-02-24 06:47:17.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 90 2.123767e-314art
    beat fatal, eviction in 10.577 seconds
    [    CSSD]2010-02-24 06:47:17.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
    [    CSSD]2010-02-24 06:47:18.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 90 2.123767e-314art
    beat fatal, eviction in 9.577 seconds
    [    CSSD]2010-02-24 06:47:19.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 90 2.123767e-314art
    beat fatal, eviction in 8.577 seconds
    [    CSSD]2010-02-24 06:47:20.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 90 2.123767e-314art
    beat fatal, eviction in 7.577 seconds
    [    CSSD]2010-02-24 06:47:21.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 90 2.123767e-314art
    beat fatal, eviction in 6.577 seconds
    [    CSSD]2010-02-24 06:47:22.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
    [    CSSD]2010-02-24 06:47:22.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 90 2.123767e-314art
    beat fatal, eviction in 5.577 seconds
    [    CSSD]2010-02-24 06:47:22.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
    [    CSSD]2010-02-24 06:47:26.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 90 2.123767e-314art
    beat fatal, eviction in 1.577 seconds
    [    CSSD]2010-02-24 06:47:26.941 [19] >TRACE: clssnmSendingThread: sent 4 status msgs to all nodes
    [    CSSD]2010-02-24 06:47:27.703 [17] >TRACE: clssgmPeerEventHndlr: receive failed, node 2 (mbdmb2) (1009
    0eb90), rc 11
    [    CSSD]2010-02-24 06:47:27.704 [17] >TRACE: clssgmPeerDeactivate: node 2 (mbdmb2), death 0, state 0x800
    00001 connstate 0xf
    [    CSSD]2010-02-24 06:47:27.704 [17] >TRACE: clssgmPeerListener: discarded 0 future msgsfor 2
    [    CSSD]2010-02-24 06:47:27.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 90 2.123767e-314art
    beat fatal, eviction in 0.577 seconds
    [    CSSD]2010-02-24 06:47:28.521 [18] >TRACE: clssnmPollingThread: Eviction started for node mbdmb2 (2),
    flags 0x0001, state 3, wt4c 0

  • 11g R1 Node Evictions on Linux

    We are getting random node evictions on Linux. OSWatcher shows that we sometimes get interconnect times of around 2.0 ms; they are typically below 0.500 ms. We have a VLAN interconnect switch. If times above 0.500 ms are seen, how long must such high times persist before they cause an eviction? For example, would a 30-second period of interconnect times above 0.500 ms do it?
    Thanks in advance!

    You can check the values of disktimeout and misscount in your CRS: run crsctl get css disktimeout for the first, and look at the ocrdump for the second.
    Have you already checked similar questions here? This one looks similar to your problem and has a good discussion: RAC nodes rebooting
    Regards.
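
As a rough illustration of how the latency question relates to misscount: going by the rule quoted earlier in this thread, a node is evicted when no network heartbeat arrives from it for misscount seconds. The toy model below is a sketch under stated assumptions, not a description of how CSSD is implemented. The one-heartbeat-per-second rate and the misscount of 30 are hypothetical values chosen for the example, and `longest_gap` and `would_evict` are made-up helper names.

```python
def longest_gap(arrival_times):
    """Largest interval, in seconds, between consecutive heartbeat arrivals."""
    ordered = sorted(arrival_times)
    return max((b - a for a, b in zip(ordered, ordered[1:])), default=0.0)


def would_evict(arrival_times, misscount_s):
    """True if some gap between heartbeats reaches the misscount threshold."""
    return longest_gap(arrival_times) >= misscount_s


if __name__ == "__main__":
    MISSCOUNT_S = 30.0  # hypothetical value; read the real one from your cluster

    # One heartbeat per second for a minute, every one delayed by 2 ms.
    slow = [t + 0.002 for t in range(60)]

    # Same stream, but heartbeats 10..44 never arrive (a 36-second gap).
    silent = [float(t) for t in range(60) if not 10 <= t <= 44]

    print(would_evict(slow, MISSCOUNT_S))
    print(would_evict(silent, MISSCOUNT_S))
```

In this model a uniform 2 ms delay leaves the gaps between heartbeats at about one second, so it never approaches the threshold; only a sustained loss of heartbeats does. If that carries over to the real cluster, elevated latency readings are more useful as a hint that packets may also be getting dropped than as a direct cause.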

  • RAC Node eviction question...

    Say we have a 3-node RAC cluster on OEL 5.3. What happens if one node is evicted from it? I know the other two instances will do dynamic remastering... and something more.
    I want to know each and every step in detail: what really happens when one node goes down in a RAC environment?
    Experts please comment.
    Many Thanks.

    I want to know each and every steps in detail.
    Assume you know "each and every steps in detail." What will you do differently based upon this information?
    Handle:      vh_dba
    Status Level:      Newbie (30)
    Registered:      Jan 10, 2010
    Total Posts:      38
    Total Questions:      16 (15 unresolved)
    So many questions with only a single answer.
    :-(

  • 11gR2 Windows 2008 R2 node eviction issue

    Hi
    We are facing cluster node evictions when the teamed network is down for less than 5 seconds. Are there any settings that need to be changed? Our network team has recently been performing a firmware upgrade of all modules, and they told us that our HP blade servers and network are completely redundant. But the node evictions still happen. We tried different scenarios by disabling the separate network cards. Whenever we disable and enable the private teamed network, even for less than 2 seconds, a node eviction happens.
    According to CSS Timeout Computation in Oracle Clusterware [ID 294430.1], it should wait at least 30 seconds, but it does not.
    I am attaching the logs here. This is the node2 alert log:
    [ctssd(5264)]CRS-2409:The clock on host shadbtestrac02 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.
    2010-08-10 10:34:34.722
    [ctssd(5264)]CRS-2409:The clock on host shadbtestrac02 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.
    2010-08-10 10:35:15.485
    [cssd(5124)]CRS-1612:Network communication with node shadbtestrac01 (1) missing for 50% of timeout interval. Removal of this node from cluster in 14.743 seconds
    2010-08-10 10:35:23.331
    [cssd(5124)]CRS-1611:Network communication with node shadbtestrac01 (1) missing for 75% of timeout interval. Removal of this node from cluster in 6.896 seconds
    2010-08-10 10:35:27.387
    [cssd(5124)]CRS-1610:Network communication with node shadbtestrac01 (1) missing for 90% of timeout interval. Removal of this node from cluster in 2.840 seconds
    2010-08-10 10:35:30.242
    [cssd(5132)]CRS-1609:This node is unable to communicate with other nodes in the cluster and is going down to preserve cluster integrity; details at (:CSSNM00008:) in C:\OracleGI\11.2.0\log\shadbtestrac02\cssd\ocssd.log.
    2010-08-10 10:35:39.665
    [cssd(4724)]CRS-1608:This node was evicted by node 1, shadbtestrac01; details at (:CSSNM00005:) in C:\OracleGI\11.2.0\log\shadbtestrac02\cssd\ocssd.log.
    2010-08-10 10:35:39.665
    [cssd(4548)]CRS-1608:This node was evicted by node 1, shadbtestrac01; details at (:CSSNM00005:) in C:\OracleGI\11.2.0\log\shadbtestrac02\cssd\ocssd.log.
    2010-08-10 10:35:39.680
    [cssd(3808)]CRS-1608:This node was evicted by node 1, shadbtestrac01; details at (:CSSNM00005:) in C:\OracleGI\11.2.0\log\shadbtestrac02\cssd\ocssd.log.
    2010-08-10 10:35:39.712
    [ctssd(5268)]CRS-2402:The Cluster Time Synchronization Service aborted on host shadbtestrac02. Details at (:ctsselect_mmg5_1: in C:\OracleGI\11.2.0\log\shadbtestrac02\ctssd\octssd.log.
    2010-08-10 10:35:39.712
    [C:\OracleGI\11.2.0\bin\oraagent.exe(5984)]CRS-5822:Agent 'C:\OracleGI\11.2.0\bin\oraagent.exe_system' disconnected from server. Details at (:CRSAGF00117:) in C:\OracleGI\11.2.0\log\shadbtestrac02\agent\crsd\oraagent\oraagent.log.
    2010-08-10 10:35:39.712
    [C:\OracleGI\11.2.0\bin\orarootagent.exe(5768)]CRS-5822:Agent 'C:\OracleGI\11.2.0\bin\orarootagent.exe_system' disconnected from server. Details at (:CRSAGF00117:) in C:\OracleGI\11.2.0\log\shadbtestrac02\agent\crsd\orarootagent\orarootagent.log.
    Node1 alert log:
    2010-08-10 09:13:28.642
    [C:\OracleGI\11.2.0\bin\oraagent.exe(5652)]CRS-5011:Check of resource "CPCSR" failed: details at "(:CLSN00007:)" in "C:\OracleGI\11.2.0\log\shadbtestrac01\agent\crsd\oraagent\oraagent.log"
    2010-08-10 09:13:29.828
    [crsd(3104)]CRS-2765:Resource 'ora.cpcsr.db' has failed on server 'shadbtestrac01'.
    2010-08-10 09:13:31.700
    [crsd(3104)]CRS-2765:Resource 'ora.cpcsr.cpcsboa.dev.sha.svc' has failed on server 'shadbtestrac01'.
    2010-08-10 09:13:31.700
    [crsd(3104)]CRS-2771:Maximum restart attempts reached for resource 'ora.cpcsr.cpcsboa.dev.sha.svc'; will not restart.
    2010-08-10 09:13:44.102
    [crsd(3104)]CRS-2758:Resource 'ora.cpcsr.db' is in an unknown state.
    2010-08-10 10:35:18.777
    [cssd(4272)]CRS-1612:Network communication with node shadbtestrac02 (2) missing for 50% of timeout interval. Removal of this node from cluster in 14.697 seconds
    2010-08-10 10:35:26.437
    [cssd(4272)]CRS-1611:Network communication with node shadbtestrac02 (2) missing for 75% of timeout interval. Removal of this node from cluster in 7.037 seconds
    2010-08-10 10:35:30.493
    [cssd(4272)]CRS-1610:Network communication with node shadbtestrac02 (2) missing for 90% of timeout interval. Removal of this node from cluster in 2.981 seconds
    2010-08-10 10:35:33.488
    [cssd(4280)]CRS-1607:Node shadbtestrac02 is being evicted in cluster incarnation 173888169; details at (:CSSNM00007:) in C:\OracleGI\11.2.0\log\shadbtestrac01\cssd\ocssd.log.
    2010-08-10 10:35:40.009
    [ohasd(3320)]CRS-8011:reboot advisory message from host: shadbtestrac02, component: ag164619, with time stamp: L-2010-08-10-10:35:39.000
    [ohasd(3320)]CRS-8013:reboot advisory message text: clsnomon_status: need to reboot, unexpected failure 8 received from CSS
    2010-08-10 10:36:09.805
    [cssd(12776)]CRS-1601:CSSD Reconfiguration complete. Active nodes are shadbtestrac01 .
    2010-08-10 10:36:09.961
    [crsd(3104)]CRS-5504:Node down event reported for node 'shadbtestrac02'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'Generic'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.AHPSR'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.AHPSR_AHPSBC.TEST.SHA'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.APPOHDR'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.APPOITR'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.APPOITR_OREMS.DEV.SHA'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.CFSR'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.CFSR_CFS.DEV.SHA'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.CFSTESTR'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.CFSTESTR_CFSADS.TEST.SHA'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.CPCSR'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.CPCSR_CPCSBOA.DEV.SHA'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.FASTR'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.FASTR_FAST.DEV.SHA'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.FASTTRGR'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.FASTTRGR_FAST.TRNG.SHA'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.FASTTSTR'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.FASTTSTR_FAST.TEST.SHA'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.MAXIMOR'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.MAXIMOR_MAXIMO.DEV.SHA'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.MAXIMOR_MAXIMO.TEST.SHA'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.SHAPMSR'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.SHAPMSR_SHAPMS.DEV.SHA'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.TEPMSR'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.TEPMSR_TEPMS.DEV.SHA'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.WHAT1R'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.WHAT1R_WHAT1ADS.TEST.SHA'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.WHAT2R'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.WHAT2R_WHAT2ADS.TEST.SHA'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.WWWDBR'.
    2010-08-10 10:36:22.894
    [crsd(3104)]CRS-2773:Server 'shadbtestrac02' has been removed from pool 'ora.WWWDBR_PLC.DEV.SHA'.
    2010-08-10 10:37:48.802
    [C:\OracleGI\11.2.0\bin\orarootagent.exe(5092)]CRS-5818:Aborted command 'check for resource: ora.shadbtestrac02.vip 1 1' for resource 'ora.shadbtestrac02.vip'. Details at (:CRSAGF00113:) in C:\OracleGI\11.2.0\log\shadbtestrac01\agent\crsd\orarootagent\orarootagent.log.
    2010-08-10 10:37:49.722
    [C:\OracleGI\11.2.0\bin\orarootagent.exe(4540)]CRS-5818:Aborted command 'check for resource: ora.net1.network shadbtestrac01 1' for resource 'ora.net1.network'. Details at (:CRSAGF00113:) in C:\OracleGI\11.2.0\log\shadbtestrac01\agent\crsd\orarootagent\orarootagent.log.
    2010-08-10 10:37:49.722
    [C:\OracleGI\11.2.0\bin\orarootagent.exe(9064)]CRS-5818:Aborted command 'check for resource: ora.shadbtestrac01.vip 1 1' for resource 'ora.shadbtestrac01.vip'. Details at (:CRSAGF00113:) in C:\OracleGI\11.2.0\log\shadbtestrac01\agent\crsd\orarootagent\orarootagent.log.
    2010-08-10 10:37:49.738
    [C:\OracleGI\11.2.0\bin\orarootagent.exe(9448)]CRS-5818:Aborted command 'check for resource: ora.scan2.vip 1 1' for resource 'ora.scan2.vip'. Details at (:CRSAGF00113:) in C:\OracleGI\11.2.0\log\shadbtestrac01\agent\crsd\orarootagent\orarootagent.log.
    2010-08-10 10:37:50.284
    [C:\OracleGI\11.2.0\bin\orarootagent.exe(10020)]CRS-5818:Aborted command 'check for resource: ora.scan1.vip 1 1' for resource 'ora.scan1.vip'. Details at (:CRSAGF00113:) in C:\OracleGI\11.2.0\log\shadbtestrac01\agent\crsd\orarootagent\orarootagent.log.
    2010-08-10 10:38:06.788
    [C:\OracleGI\11.2.0\bin\orarootagent.exe(5300)]CRS-5818:Aborted command 'check for resource: ora.scan3.vip 1 1' for resource 'ora.scan3.vip'. Details at (:CRSAGF00113:) in C:\OracleGI\11.2.0\log\shadbtestrac01\agent\crsd\orarootagent\orarootagent.log.
    2010-08-10 10:41:03.721
    [C:\OracleGI\11.2.0\bin\orarootagent.exe(5484)]CRS-5818:Aborted command 'check for resource: ora.net1.network shadbtestrac01 1' for resource 'ora.net1.network'. Details at (:CRSAGF00113:) in C:\OracleGI\11.2.0\log\shadbtestrac01\agent\crsd\orarootagent\orarootagent.log.
    2010-08-10 10:41:57.323
    [crsd(3104)]CRS-2765:Resource 'ora.net1.network' has failed on server 'shadbtestrac01'.
    2010-08-10 14:57:31.510
    [C:\OracleGI\11.2.0\bin\oraagent.exe(5656)]CRS-5818:Aborted command 'check for resource: ora.LISTENER.lsnr shadbtestrac01 1' for resource 'ora.LISTENER.lsnr'. Details at (:CRSAGF00113:) in C:\OracleGI\11.2.0\log\shadbtestrac01\agent\crsd\oraagent\oraagent.log.
    2010-08-10 14:57:32.648
    [C:\OracleGI\11.2.0\bin\oraagent.exe(6416)]CRS-5014:Agent "C:\OracleGI\11.2.0\bin\oraagent.exe" timed out starting process "C:\OracleGI\11.2.0\bin\lsnrctl.exe" for action "check": details at "(:CLSN00009:)" in "C:\OracleGI\11.2.0\log\shadbtestrac01\agent\crsd\oraagent\oraagent.log"
    2010-08-10 14:59:16.732
    [C:\OracleGI\11.2.0\bin\oraagent.exe(12912)]CRS-5818:Aborted command 'check for resource: ora.LISTENER.lsnr shadbtestrac01 1' for resource 'ora.LISTENER.lsnr'. Details at (:CRSAGF00113:) in C:\OracleGI\11.2.0\log\shadbtestrac01\agent\crsd\oraagent\oraagent.log.
    2010-08-10 15:00:57.289
    [C:\OracleGI\11.2.0\bin\oraagent.exe(12676)]CRS-5818:Aborted command 'check for resource: ora.LISTENER.lsnr shadbtestrac01 1' for resource 'ora.LISTENER.lsnr'. Details at (:CRSAGF00113:) in C:\OracleGI\11.2.0\log\shadbtestrac01\agent\crsd\oraagent\oraagent.log.
    2010-08-10 15:00:57.289
    [C:\OracleGI\11.2.0\bin\oraagent.exe(13848)]CRS-5818:Aborted command 'check for resource: ora.LISTENER_SCAN1.lsnr 1 1' for resource 'ora.LISTENER_SCAN1.lsnr'. Details at (:CRSAGF00113:) in C:\OracleGI\11.2.0\log\shadbtestrac01\agent\crsd\oraagent\oraagent.log.
    2010-08-10 15:02:55.802
    [C:\OracleGI\11.2.0\bin\oraagent.exe(12580)]CRS-5818:Aborted command 'check for resource: ora.LISTENER.lsnr shadbtestrac01 1' for resource 'ora.LISTENER.lsnr'. Details at (:CRSAGF00113:) in C:\OracleGI\11.2.0\log\shadbtestrac01\agent\crsd\oraagent\oraagent.log.
    2010-08-10 15:04:11.899
    [C:\OracleGI\11.2.0\bin\oraagent.exe(3588)]CRS-5818:Aborted command 'check for resource: ora.LISTENER.lsnr shadbtestrac01 1' for resource 'ora.LISTENER.lsnr'. Details at (:CRSAGF00113:) in C:\OracleGI\11.2.0\log\shadbtestrac01\agent\crsd\oraagent\oraagent.log.
    2010-08-10 15:05:23.207
    [C:\OracleGI\11.2.0\bin\oraagent.exe(13996)]CRS-5818:Aborted command 'check for resource: ora.LISTENER.lsnr shadbtestrac01 1' for resource 'ora.LISTENER.lsnr'. Details at (:CRSAGF00113:) in C:\OracleGI\11.2.0\log\shadbtestrac01\agent\crsd\oraagent\oraagent.log.
    2010-08-10 15:06:34.748
    [C:\OracleGI\11.2.0\bin\oraagent.exe(12388)]CRS-5818:Aborted command 'check for resource: ora.LISTENER.lsnr shadbtestrac01 1' for resource 'ora.LISTENER.lsnr'. Details at (:CRSAGF00113:) in C:\OracleGI\11.2.0\log\shadbtestrac01\agent\crsd\oraagent\oraagent.log.
    2010-08-10 15:06:37.291
    [crsd(5112)]CRS-5831:Agent 'C:\OracleGI\11.2.0\bin\oraagent.exe' has exceeded maximum failures and has been disabled. Details at (:CRSAGF00129:) in C:\OracleGI\11.2.0\log\shadbtestrac01\crsd\crsd.log.
    2010-08-10 15:08:35.243
    [ohasd(3948)]CRS-2765:Resource 'ora.crsd' has failed on server 'shadbtestrac01'.
    2010-08-10 15:08:36.569
    [C:\OracleGI\11.2.0\bin\orarootagent.exe(5076)]CRS-5822:Agent 'C:\OracleGI\11.2.0\bin\orarootagent.exe_system' disconnected from server. Details at (:CRSAGF00117:) in C:\OracleGI\11.2.0\log\shadbtestrac01\agent\crsd\orarootagent\orarootagent.log.
    2010-08-10 15:08:53.744
    [crsd(4420)]CRS-1012:The OCR service started on node shadbtestrac01.
    2010-08-10 15:09:06.630
    [crsd(4420)]CRS-1201:CRSD started on node shadbtestrac01.
    If anyone is facing a similar issue, please share your inputs.
    Thanks
    Rao

    Hi Rao,
    have you disabled Media Sense?
    http://support.microsoft.com/default.aspx?scid=kb;en-us;239924
    If Media Sense is not disabled, Windows reports an unplugged or unlinked network card immediately, which tells the cluster that the network is down. In that case Oracle does not wait: since the network is reported as down, a reboot is initiated (no need to wait for the heartbeat to time out).
    Sebastian

  • Node eviction while registering a new DSN

    Hi,
    I encounter a node eviction when trying to register a DSN in cache grid configuration. I tried about 6 times with the same result. Could you please take a look?
    My software runs on Solaris 10.
    My system consists of 2 DSNs: tt_0.0.0.1 and tt_0.1.0.1
    I have 2 servers with TT and CRS installed: node 06 hosts active tt_0.0.0.1 and standby tt_0.1.0.1; node 07 hosts standby tt_0.0.0.1 and active tt_0.1.0.1
    Oracle Server is hosted on 07
    I managed to register tt_0.0.0.1:
    server06$ ttisql tt_0.0.0.1
    Command> call ttcachestart ;
    Command> call ttcacheuidpwdset('pin', 'pin'); # pin is a datastore user (uid)
    Command> call ttGridCreate('ttgridPin');
    Command> call ttGridNameSet('ttgridPin');
    Command> call ttGridAttach(1, '0.0.0.1_active', 'dfso4e06', 53381, '0.0.0.1_stdby', 'dfso4e07', 53382);
    Command> call ttrepstart ;
    Command> call ttgridnodestatus;
    < TTGRIDPIN, 1, 1, T, DFSO4E06, TTGRIDPIN_TT_0.0.0.11_1A, 10.197.65.67, 53381, T, DFSO4E07, TTGRIDPIN_TT_0.0.0.12_1B, 10.197.65.185, 53382 >
    Command> exit
    server07$ ttIsql tt_0.1.0.1
    Command> call ttcachestart ;
    Command> call ttcacheuidpwdset('pin', 'pin');
    Command> call ttGridNameSet('ttgridPin');
    Command> call ttGridAttach(1, '0.1.0.1_active', 'dfso4e07', 53383, '0.1.0.1_stdby', 'dfso4e06', 53384);
    Command> call ttrepstart ;
    Everything works fine, below is my cachegrid configuration:
    ttadmin@dfso4e06:/exec/products/CRS/v11.1.0.7/log/dfso4e06$ ttIsql tt_0.0.0.1
    Command> call ttGridNameSet('ttgridPin');
    Command> call ttgridnodestatus;
    < TTGRIDPIN, 1, 1, T, DFSO4E06, TTGRIDPIN_TT_0.0.0.11_1A, 10.197.65.67, 53381, T, DFSO4E07, TTGRIDPIN_TT_0.0.0.12_1B, 10.197.65.185, 53382 >
    < TTGRIDPIN, 2, 1, T, dfso4e07, TTGRIDPIN_0.1.0.1_active_2A, 10.197.65.185, 53383, F, dfso4e06, TTGRIDPIN_0.1.0.1_stdby_2B, 10.197.65.67, 53384 >
    Note: the standby server shows status F (detached); it should go to T (attached) as soon as I successfully run "ttCWAdmin -create".
    My CRS / TT integration looks like this:
    ttadmin@dfso4e06:/exec/products/CRS/v11.1.0.7/log/dfso4e06$ ttCWAdmin -status
    Master Datastore 1:
    Host:dfso4e06
    Status:AVAILABLE
    State:ACTIVE
    Master Datastore 2:
    Host:dfso4e07
    Status:AVAILABLE
    State:STANDBY
    I used the ttCWAdmin framework to register that DSN. Now I want to do the same with the other one: tt_0.1.0.1.
    I run:
    ttadmin@dfso4e07:/exec/applis/BRM/timesten/TimesTen/tt1121_HA/info$ ttCWAdmin -create -dsn tt_0.1.0.1
    Successful connection with Oracle clusterware stack
    replication DDL = create active standby pair "TT_0.1.0.1" on "dfso4e07","TT_0.1.0.1" on "dfso4e06" RETURN TWOSAFE
    Enter internal UID with ADMIN privileges: pin
    Enter password for the same UID:
    IMDB Cache is Enabled
    Enter Oracle password for the same UID:
    Enter any phrase for password encryption:
    Data store exists and uid/pwd are verified on host dfso4e07
    Create active standby pair replication scheme on host dfso4e07 and make it the active master? (Y/N)
    As soon as I hit 'Y' + Enter, the countdown to eviction starts on the other node (06):
    ttadmin@dfso4e06:/exec/products/CRS/v11.1.0.7/log/dfso4e06$ tail -f alertdfso4e06.log
    2011-03-30 02:42:37.166
    [cssd(3323)]CRS-1612:node dfso4e07 (2) at 50% heartbeat fatal, eviction in 14.783 seconds
    2011-03-30 02:42:38.175
    [cssd(3323)]CRS-1612:node dfso4e07 (2) at 50% heartbeat fatal, eviction in 13.773 seconds
    2011-03-30 02:42:45.244
    [cssd(3323)]CRS-1611:node dfso4e07 (2) at 75% heartbeat fatal, eviction in 6.703 seconds
    2011-03-30 02:42:49.283
    [cssd(3323)]CRS-1610:node dfso4e07 (2) at 90% heartbeat fatal, eviction in 2.663 seconds
    2011-03-30 02:42:50.292
    [cssd(3323)]CRS-1610:node dfso4e07 (2) at 90% heartbeat fatal, eviction in 1.653 seconds
    2011-03-30 02:42:51.302
    [cssd(3323)]CRS-1610:node dfso4e07 (2) at 90% heartbeat fatal, eviction in 0.643 seconds
    2011-03-30 02:42:51.954
    [cssd(3323)]CRS-1607:CSSD evicting node dfso4e07. Details in /exec/products/CRS/v11.1.0.7/log/dfso4e06/cssd/ocssd.log.
    The ocssd.log referenced in the last message doesn't say much.
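
    To line the countdown up with other logs, the alert log lines above can be parsed into timestamps. The following is a minimal Python sketch, assuming the two-line layout shown above (a timestamp line followed by the message line); the function names are made up for this illustration and are not part of any Oracle tool.

```python
import re
from datetime import datetime, timedelta

# Matches the CRS-1610/1611/1612 heartbeat warnings as they appear in the alert log above.
WARNING_RE = re.compile(
    r"CRS-16(?:10|11|12):node (?P<node>\S+) \((?P<num>\d+)\) at (?P<pct>\d+)% "
    r"heartbeat fatal, eviction in (?P<secs>\d+(?:\.\d+)?) seconds"
)
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def parse_eviction_warnings(lines):
    """Return a list of (timestamp, node, percent, seconds_left) tuples."""
    warnings = []
    last_ts = None
    for raw in lines:
        line = raw.strip()
        try:
            # In this log layout each message is preceded by its own timestamp line.
            last_ts = datetime.strptime(line, TS_FORMAT)
            continue
        except ValueError:
            pass
        match = WARNING_RE.search(line)
        if match and last_ts is not None:
            warnings.append(
                (last_ts, match["node"], int(match["pct"]), float(match["secs"]))
            )
    return warnings


def projected_eviction_times(warnings):
    """Add the remaining seconds to each warning's timestamp."""
    return [ts + timedelta(seconds=left) for ts, _node, _pct, left in warnings]


# Two of the warnings quoted above, used as sample input.
SAMPLE = """\
2011-03-30 02:42:37.166
[cssd(3323)]CRS-1612:node dfso4e07 (2) at 50% heartbeat fatal, eviction in 14.783 seconds
2011-03-30 02:42:51.302
[cssd(3323)]CRS-1610:node dfso4e07 (2) at 90% heartbeat fatal, eviction in 0.643 seconds
"""

if __name__ == "__main__":
    for when in projected_eviction_times(parse_eviction_warnings(SAMPLE.splitlines())):
        print(when.isoformat(sep=" ", timespec="milliseconds"))
```

    Both sample warnings project an eviction time within about 10 ms of the CRS-1607 message at 02:42:51.954, which suggests the countdown was never reset, i.e. node 06 saw no heartbeat from dfso4e07 during the whole window.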
    Below, my sys.odbc.ini:
    [ODBC Data Sources]
    tt_0.0.0.1=TimesTen 11.2.1 Driver
    tt_0.1.0.1=TimesTen 11.2.1 Driver
    # New data source definitions can be added below.
    [tt_0.0.0.1]
    DataStore=/exec/applis/BRM/brm7.4/tt/db_files/tt_0.0.0.1
    UID=pin
    PWD=pin
    OracleNetServiceName=PINDB
    DatabaseCharacterSet=UTF8
    ConnectionCharacterSet=UTF8
    PLSQL=1
    oraclepwd=pin
    OracleId=PINDB
    Driver=$TIMESTEN_HOME/lib/libtten.so
    #Shared-memory size in megabytes allocated for the datastore.
    PermSize= 1024
    #Shared-memory size in megabytes allocated for temporary data partition, generally half the size of PermSize.
    #TempSize=32
    PassThrough=0
    #Use large log buffer, log file sizes
    LogFileSize=512
    LogBufMB=512
    #Async repl flushes to disk before sending batches so this makes it faster on Linux
    LogFlushMethod=2
    #Limit Ckpt rate to 10 mb/s
    CkptFrequency=200
    CkptLogVolume=0
    CkptRate=10
    Connections=200
    #Oracle recommends setting LockWait to 30 seconds.
    LockWait=30
    DurableCommits=0
    CacheGridEnable=1
    [tt_0.1.0.1]
    DataStore=/exec/applis/BRM/brm7.4/tt/db_files/tt_0.1.0.1
    UID=pin
    PWD=pin
    OracleNetServiceName=PINDB
    DatabaseCharacterSet=UTF8
    ConnectionCharacterSet=UTF8
    PLSQL=1
    oraclepwd=pin
    OracleId=PINDB
    Driver=$TIMESTEN_HOME/lib/libtten.so
    #Shared-memory size in megabytes allocated for the datastore.
    PermSize= 1024
    #Shared-memory size in megabytes allocated for temporary data partition, generally half the size of PermSize.
    ## JB ##TempSize=32
    PassThrough=0
    #Use large log buffer, log file sizes
    LogFileSize=512
    LogBufMB=512
    #Async repl flushes to disk before sending batches so this makes it faster on Linux
    LogFlushMethod=2
    #Limit Ckpt rate to 10 mb/s
    CkptFrequency=200
    CkptLogVolume=0
    CkptRate=10
    Connections=200
    #Oracle recommends setting LockWait to 30 seconds.
    LockWait=30
    DurableCommits=0
    CacheGridEnable=1
    Note: pin is the cache admin I created. I also created and populated cache groups within tt_0.0.0.1.
    Below is my cluster.oracle.ini:
    [repdb1_1121]
    MasterHosts = dfso4e06,dfso4e07
    ScriptInstallDir = /exec/applis/BRM/timesten/TimesTen/tt1121_HA/info/crs_scripts
    [cachedb1_1121]
    MasterHosts = dfso4e06,dfso4e07
    ScriptInstallDir = /exec/applis/BRM/timesten/TimesTen/tt1121_HA/info/crs_scripts
    CacheConnect = Y
    [tt_0.0.0.1]
    MasterHosts = dfso4e06,dfso4e07
    ScriptInstallDir = /exec/applis/BRM/timesten/TimesTen/tt1121_HA/info/crs_scripts
    CacheConnect = Y
    ReturnServiceAttribute = RETURN TWOSAFE
    GridPort = 53381,53382
    [tt_0.1.0.1]
    MasterHosts = dfso4e07,dfso4e06
    ScriptInstallDir = /exec/applis/BRM/timesten/TimesTen/tt1121_HA/info/crs_scripts
    CacheConnect = Y
    ReturnServiceAttribute = RETURN TWOSAFE
    GridPort = 53383,53384
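
    As a quick sanity check on the cluster.oracle.ini above, here is a small Python sketch that parses the file with the standard configparser module and reports any GridPort value claimed by more than one DSN section. The helper name duplicate_grid_ports is made up for this illustration, and treating a repeated port as a conflict is an assumption (both DSNs here use the same two hosts). It says nothing about the eviction itself; it only rules out one configuration mistake.

```python
import configparser


def duplicate_grid_ports(ini_text):
    """Return {port: [dsn, ...]} for every GridPort listed by more than one section."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep attribute names exactly as written in the file
    parser.read_string(ini_text)
    owners = {}
    for dsn in parser.sections():
        raw = parser[dsn].get("GridPort")
        if raw is None:
            continue  # sections without cache grid settings are skipped
        for port in raw.split(","):
            owners.setdefault(int(port), []).append(dsn)
    return {port: dsns for port, dsns in owners.items() if len(dsns) > 1}


# Excerpt of the file above, reduced to the attributes the check looks at.
SAMPLE = """\
[repdb1_1121]
MasterHosts = dfso4e06,dfso4e07
[tt_0.0.0.1]
MasterHosts = dfso4e06,dfso4e07
GridPort = 53381,53382
[tt_0.1.0.1]
MasterHosts = dfso4e07,dfso4e06
GridPort = 53383,53384
"""

if __name__ == "__main__":
    print(duplicate_grid_ports(SAMPLE))
```

    For the configuration as posted, the four grid ports are distinct, so this check finds nothing to report.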

    Since I changed some Timeout parameters, I have a new error coming up:
    ttadmin@dfso4e07:/users/applix/vpb/ttadmin$ ttCWAdmin -create -dsn tt_0.1.0.1
    Successful connection with Oracle clusterware stack
    replication DDL = create active standby pair "TT_0.1.0.1" on "dfso4e07","TT_0.1.0.1" on "dfso4e06" RETURN TWOSAFE
    Enter internal UID with ADMIN privileges: pin
    Enter password for the same UID:
    IMDB Cache is Enabled
    Enter Oracle password for the same UID:
    Enter any phrase for password encryption:
    Data store exists and uid/pwd are verified on host dfso4e07
    Create active standby pair replication scheme on host dfso4e07 and make it the active master? (Y/N)Y
    Waiting for confirmation from hosts...
    Active Standby pair replication scheme created on host dfso4e07, which will be the active master at start
    ====================================================================
    Warning: If data store(s) already exist(s) on host(s) dfso4e06, they may be destroyed when the active master is duplicated.
    If the data store(s) need(s) to be preserved, please back up the data store(s) manually, before executing
    ttCWAdmin -start Command
    ====================================================================
    Registering TimesTen Cluster with Oracle Clusterware
    ====================================================================
    Number of unregistered dsn resources = 2
    Registering dsn resources:... Registration complete.
    ====================================================================
    ====================================================================
    Number of unregistered service resources = 2
    Registering service resources:... (ttCWAdmin:) crsctl.c(14915): TT48004: clscrs_register_resource failed with status = 200.
    Warning: some resource could not be registered
    ====================================================================
    (ttCWAdmin:) crsctl.c(1789): TT48014: Failed to register the cluster for DSN TT_0.1.0.1 with Oracle Clusterware.
    I performed the required check:
    oracle@dfso4e07:/users/applix/vpb/oracle$ crsctl check crs
    Cluster Synchronization Services appears healthy
    Cluster Ready Services appears healthy
    Event Manager appears healthy
    I found this interesting though in ttcwerrors.log:
    2011-03-31 01:35:37.26 Err : : 10032: (ttCRSmaster:TT_0.0.0.1) ttctl.c(366): 08001:[TimesTen][TimesTen 11.2.1.6.6 ODBC Driver][TimesTen]TT0830: Cannot create data store file. OS-detected error: Permission denied -- file "db.c", lineno 11505, procedure "sbDbConnect"
    2011-03-31 01:35:37.26 Err : : 10032: (ttCRSmaster:TT_0.0.0.1) ttCRSmaster.c(1978): Native error 830 for the dsn TT_0.0.0.1
    2011-03-31 01:35:39.74 Err : : 10032: (ttCRSmaster:TT_0.0.0.1) ttctl.c(366): 08001:[TimesTen][TimesTen 11.2.1.6.6 ODBC Driver][TimesTen]TT0830: Cannot create data store file. OS-detected error: Permission denied -- file "db.c", lineno 11505, procedure "sbDbConnect"
    2011-03-31 01:35:42.25 Err : : 10032: (ttCRSmaster:TT_0.0.0.1) ttctl.c(366): 08001:[TimesTen][TimesTen 11.2.1.6.6 ODBC Driver][TimesTen]TT0830: Cannot create data store file. OS-detected error: Permission denied -- file "db.c", lineno 11505, procedure "sbDbConnect"
    I tried running truss, but couldn't find out where permission was denied...
    Any help?
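
    One way to narrow down a "Permission denied" like the TT0830 above is to walk the directories leading to the DataStore path and check them one by one. The sketch below is in Python and uses only os.access; the function name is made up for this illustration. It only gives a meaningful answer when run as the same OS user as the failing process (the post does not say which user ttCRSmaster runs as), and it ignores ACLs and other mechanisms that os.access may not reflect.

```python
import os


def check_datastore_dir(datastore_path):
    """Return a list of (directory, problem) tuples for the current OS user.

    Every ancestor directory needs search (execute) permission, and the
    directory that will hold the data store files needs write permission.
    """
    parent = os.path.dirname(os.path.abspath(datastore_path))

    # Collect the parent directory and all of its ancestors up to the root.
    ancestors = []
    current = parent
    while True:
        ancestors.append(current)
        up = os.path.dirname(current)
        if up == current:
            break
        current = up

    problems = []
    for directory in reversed(ancestors):  # check from the root downward
        if not os.path.isdir(directory):
            problems.append((directory, "missing or not a directory"))
            break
        if not os.access(directory, os.X_OK):
            problems.append((directory, "no search (execute) permission"))
            break
    else:
        # Only reached when every ancestor exists and can be traversed.
        if not os.access(parent, os.W_OK):
            problems.append((parent, "no write permission"))
    return problems


if __name__ == "__main__":
    # DataStore value taken from the sys.odbc.ini quoted earlier in the thread.
    print(check_datastore_dir("/exec/applis/BRM/brm7.4/tt/db_files/tt_0.0.0.1"))
```

    An empty list means this simple check found nothing wrong for the user running it; it does not prove that the clusterware-started process sees the same permissions.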

  • Is it possible to put two different colors in tree parent node background and child nodes background?

    Is it possible to put two different colors in tree parent
    node background and child nodes background?
    Any help will be very helpful.
    Thanks

    Hi PanosE,
    Yes, you can set up another Standard Edition Server in child domain and then deploy pool pairing.
    You need to deploy a new Front End Pool for the new Standard Edition Server.
    A similar case for your reference.
    https://social.technet.microsoft.com/Forums/office/en-US/eca4299c-8edb-481e-b328-c7deba2a79ba/lync-2013-standard-edition-lync-fe-pools-in-multiple-domain-single-forest-senario?forum=lyncdeploy
    Best regards,
    Eric

  • Stop managed server without node manager and admin server

    What are the commonly used ways to stop a WebLogic managed server when neither the node manager nor the administration server is running?
    (I have only one solution: at managed server startup, dump the process ID to a file; then, when I want to stop the server, send a signal to that process ID and kill the JVM. But this does not seem very clean.)
    (The managed server is started while both the node manager and the admin server are down; I provide the admin server's boot.properties to the managed server so it can start.)
    UPDATED: I don't want to start either the admin server or the node manager, even temporarily.
    Edited by: user12163080 on Jun 24, 2010 4:40 AM

    Hi,
    I read the Oracle WebLogic WLST documentation: without the admin server, you cannot connect to the managed server through a WLST script. See the lines below:
    "The start command starts Managed Servers or clusters in a domain using Node Manager.
    To use the start command, WLST must be connected to a running Administration Server.
    To start Managed Servers without requiring a running Administration Server, use the
    nmStart command with WLST connected to Node Manager."
    "You shut down a server to which WLST is connected by entering the shutdown command
    without any arguments.
    When connected to a Managed Server instance, you only use the shutdown command to shut
    down the Managed Server instance to which WLST is connected; you cannot shut down another
    server while connected to a Managed Server instance.
    WLST uses Node Manager to shut down a Managed Server. When shutting down a Managed
    Server, Node Manager must be running.
    In the event of an error, the command returns"
    There are two options: if the admin server is running, you can stop any managed server through it.
    Alternatively, if the node manager is running without the admin server, you can stop any managed server through it.
    The last resort is to kill the PID of the particular managed server.
    Regards,
    S.vinoth babu
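
    The PID-file approach mentioned in the question and as the last resort above can be sketched in Python as follows. This is an illustration for POSIX systems, not a WebLogic tool: the function names and the pid file are hypothetical, and it assumes the JVM shuts down in an orderly way on SIGTERM, which is worth verifying for the server in question. SIGKILL is used only after a grace period.

```python
import os
import signal
import time


def write_pid_file(pid_file, pid=None):
    """Record a process ID (the current process by default) at startup."""
    with open(pid_file, "w") as handle:
        handle.write(str(os.getpid() if pid is None else pid))


def process_alive(pid):
    """Signal 0 checks for existence without delivering a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stop_by_pid_file(pid_file, grace_seconds=30.0, poll_interval=0.5):
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL.

    Returns "not running", "terminated" or "killed".
    """
    with open(pid_file) as handle:
        pid = int(handle.read().strip())
    if not process_alive(pid):
        return "not running"
    os.kill(pid, signal.SIGTERM)
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        if not process_alive(pid):
            return "terminated"
        time.sleep(poll_interval)
    os.kill(pid, signal.SIGKILL)
    return "killed"
```

    A stale pid file can point at an unrelated process that reused the PID, so a real script would also want to confirm that the PID still belongs to the managed server before signalling it.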

  • Set node id and baudrate via lss (CANOpen)

    Hi there,
    for a research project we are trying to use 2 CAN volume flow sensors (http://www.hydrotechnik.com/english/QT106_DSEN.pdf). To use them in our CAN network I first have to configure the node-id and baudrate of each one. The manufacturer told me to do this via Layer Setting Services (LSS). Can I do that using LabVIEW, and if so, how?
    I can use one of the following NI CAN cards: NI PCI-CAN/2 and NI PCI-8512.
    Thanks for any hints.
    Greetings,
    Thomas
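
    For orientation, the LSS exchange itself is just a handful of 8-byte CAN frames, so it can be sent with any raw CAN frame API. The Python sketch below only builds the request payloads; it does not talk to any NI hardware or LabVIEW library. The identifiers, command specifiers and bit timing table are given from memory of CiA 305 and should be treated as assumptions to verify against the standard and the sensor manual; the function names are made up for this illustration.

```python
# Assumed from CiA 305: the master sends requests on 0x7E5, slaves answer on 0x7E4.
LSS_MASTER_COB_ID = 0x7E5
LSS_SLAVE_COB_ID = 0x7E4

# Assumed standard bit timing table: bit rate in kbit/s -> table index.
BIT_TIMING_INDEX = {1000: 0, 800: 1, 500: 2, 250: 3, 125: 4, 50: 6, 20: 7, 10: 8}


def _frame(command, *payload):
    """LSS frames always carry 8 data bytes; unused bytes are zero."""
    return bytes([command, *payload]).ljust(8, b"\x00")


def switch_state_global(configuration):
    """Command 0x04: put all LSS slaves into configuration (1) or waiting (0) state."""
    return _frame(0x04, 1 if configuration else 0)


def configure_node_id(node_id):
    """Command 0x11: set the pending node-ID (1..127)."""
    if not 1 <= node_id <= 127:
        raise ValueError("node-ID must be in the range 1..127")
    return _frame(0x11, node_id)


def configure_bit_timing(kbit_per_s):
    """Command 0x13: table selector 0 (standard table) plus the table index."""
    return _frame(0x13, 0x00, BIT_TIMING_INDEX[kbit_per_s])


def store_configuration():
    """Command 0x17: ask the slave to store the pending settings."""
    return _frame(0x17)


def lss_sequence(node_id, kbit_per_s):
    """Payloads to send, in order, on LSS_MASTER_COB_ID."""
    return [
        switch_state_global(True),
        configure_node_id(node_id),
        configure_bit_timing(kbit_per_s),
        store_configuration(),
        switch_state_global(False),
    ]


if __name__ == "__main__":
    for payload in lss_sequence(node_id=5, kbit_per_s=250):
        print(hex(LSS_MASTER_COB_ID), payload.hex(" "))
```

    Because the global switch command addresses every LSS slave on the bus, each of the two sensors would probably have to be connected on its own while it is configured. A real implementation would also wait for and check the slave's response after each request, and the new bit rate may only take effect after an activation command or a power cycle, depending on the device.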

    Hi all,
    1) JB is right
    NI does not recommend the NI CANopen LabVIEW Library for use in new designs
    But it is a second way to communicate over CANOpen.
    2) I started the Installer...as attachment you will find a screenshot
    The installer works with LV 8.5, 8.6, 2009 and 2010 (I have installed LV 2012 - and I can't choose that option)
    => I / you have to install this library manually
    Regards
    Dippi
    Attachments:
    CANOpenLibrary.jpg 141 KB

  • RAC Node hang and unexpected reboot

    Hello friends
    We are facing an intermittent issue where a node hangs and then shuts down unexpectedly. This is a 2-node RAC 10.2.03 running on Windows 2003. Here is the crsd.log:
    2009-07-16 17:24:03.058: [ OCRMSG][5252]prom_rpc: CLSC recv failure..ret code 7
    2009-07-16 17:24:03.058: [ OCRMSG][5252]prom_rpc: possible OCR retry scenario
    2009-07-16 17:24:03.058: [ COMMCRS][5616]clscsendx: (0000000002AF5C60) Physical connection (0000000003892080) not active
    2009-07-16 17:24:03.058: [ OCRMSG][5616]prom_rpc: CLSC send failure..ret code 11
    2009-07-16 17:24:03.058: [ OCRMSG][5616]prom_rpc: possible OCR retry scenario
    2009-07-16 17:24:03.105: [ COMMCRS][5252]clscsendx: (0000000002AF5C60) Connection not active
    2009-07-16 17:24:03.105: [ OCRMSG][5252]prom_rpc: CLSC send failure..ret code 6
    2009-07-16 17:24:03.105: [ OCRMSG][5252]prom_rpc: possible OCR retry scenario
    2009-07-16 17:24:03.105: [ COMMCRS][5616]clscsendx: (0000000002AF5C60) Connection not active
    2009-07-16 17:24:03.105: [ OCRMSG][5616]prom_rpc: CLSC send failure..ret code 6
    2009-07-16 17:24:03.105: [ OCRMSG][5616]prom_rpc: possible OCR retry scenario
    2009-07-16 17:24:03.152: [ COMMCRS][5252]clscsendx: (0000000002AF5C60) Connection not active
    2009-07-16 17:24:03.152: [ OCRMSG][5252]prom_rpc: CLSC send failure..ret code 6
    2009-07-16 17:24:03.152: [ OCRMSG][5252]prom_rpc: possible OCR retry scenario
    2009-07-16 17:24:03.168: [ COMMCRS][5616]clscsendx: (0000000002AF5C60) Connection not active
    2009-07-16 17:24:03.168: [ OCRMSG][5616]prom_rpc: CLSC send failure..ret code 6
    2009-07-16 17:24:03.168: [ OCRMSG][5616]prom_rpc: possible OCR retry scenario
    2009-07-16 17:24:03.215: [ COMMCRS][5252]clscsendx: (0000000002AF5C60) Connection not active
    2009-07-16 17:24:03.215: [ OCRMSG][5252]prom_rpc: CLSC send failure..ret code 6
    2009-07-16 17:24:03.215: [ OCRMSG][5252]prom_rpc: possible OCR retry scenario
    2009-07-16 17:24:03.215: [ COMMCRS][5616]clscsendx: (0000000002AF5C60) Connection not active
    2009-07-16 17:24:03.215: [ OCRMSG][5616]prom_rpc: CLSC send failure..ret code 6
    2009-07-16 17:24:03.215: [ OCRMSG][5616]prom_rpc: possible OCR retry scenario
    2009-07-16 17:24:03.261: [ COMMCRS][5616]clscsendx: (0000000002AF5C60) Connection not active
    2009-07-16 17:24:03.261: [ OCRMSG][5616]prom_rpc: CLSC send failure..ret code 6
    2009-07-16 17:24:03.261: [ OCRMSG][5616]prom_rpc: possible OCR retry scenario
    Please shed some light on what the issue may be.

    I suggest you install IPD/OS (http://www.oracle.com/technology/products/database/clustering/ipd_download_homepage.html) on your cluster. It collects the relevant OS statistics, so when a node reboot happens you can see what state the nodes were in at that time and then fix the problem. The hang is often caused by something other than Oracle RAC.

  • Root.sh failed in one node - CLSMON and UDLM

    Hi experts.
    My environment is:
    2-node SunCluster Update3
    Oracle RAC 10.2.0.1 (planning to upgrade to 10.2.0.4)
    The problem: I installed the CRS services on both nodes without issues.
    After that, running root.sh fails on one node:
    /u01/app/product/10/CRS/root.sh
    WARNING: directory '/u01/app/product/10' is not owned by root
    WARNING: directory '/u01/app/product' is not owned by root
    WARNING: directory '/u01/app' is not owned by root
    WARNING: directory '/u01' is not owned by root
    Checking to see if Oracle CRS stack is already configured
    Checking to see if any 9i GSD is up
    Setting the permissions on OCR backup directory
    Setting up NS directories
    Oracle Cluster Registry configuration upgraded successfully
    WARNING: directory '/u01/app/product/10' is not owned by root
    WARNING: directory '/u01/app/product' is not owned by root
    WARNING: directory '/u01/app' is not owned by root
    WARNING: directory '/u01' is not owned by root
    clscfg: EXISTING configuration version 3 detected.
    clscfg: version 3 is 10G Release 2.
    Successfully accumulated necessary OCR keys.
    Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
    node <nodenumber>: <nodename> <private interconnect name> <hostname>
    node 0: spodhcsvr10 clusternode1-priv spodhcsvr10
    node 1: spodhcsvr12 clusternode2-priv spodhcsvr12
    clscfg: Arguments check out successfully.
    NO KEYS WERE WRITTEN. Supply -force parameter to override.
    -force is destructive and will destroy any previous cluster
    configuration.
    Oracle Cluster Registry for cluster has already been initialized
    Sep 22 13:34:17 spodhcsvr10 root: Oracle Cluster Ready Services starting by user request.
    Startup will be queued to init within 30 seconds.
    Sep 22 13:34:20 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Adding daemons to inittab
    Expecting the CRS daemons to be up within 600 seconds.
    Sep 22 13:34:34 spodhcsvr10 last message repeated 3 times
    Sep 22 13:34:34 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:34:40 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:35:43 spodhcsvr10 last message repeated 9 times
    Sep 22 13:36:07 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:36:07 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:36:14 spodhcsvr10 su: libsldap: Status: 85 Mesg: openConnection: simple bind failed - Timed out
    Sep 22 13:36:19 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:37:35 spodhcsvr10 last message repeated 11 times
    Sep 22 13:37:40 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:37:40 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:37:42 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:38:03 spodhcsvr10 last message repeated 3 times
    Sep 22 13:38:10 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:39:12 spodhcsvr10 last message repeated 9 times
    Sep 22 13:39:13 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:39:13 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:39:19 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:40:42 spodhcsvr10 last message repeated 12 times
    Sep 22 13:40:46 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:40:46 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:40:49 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:42:05 spodhcsvr10 last message repeated 11 times
    Sep 22 13:42:11 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:42:12 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:42:19 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:42:19 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:42:19 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:43:49 spodhcsvr10 last message repeated 13 times
    Sep 22 13:43:51 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:43:51 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:43:56 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Failure at final check of Oracle CRS stack.
    I went through ocssd.log and found the following:
    [    CSSD]2010-09-22 14:04:14.739 [6] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/vx/rdsk/racdg/ora_vote1)
    [    CSSD]2010-09-22 14:04:14.742 [6] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2478) LATS(0) Disk lastSeqNo(2478)
    [    CSSD]2010-09-22 14:04:14.742 [7] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (1//dev/vx/rdsk/racdg/ora_vote2)
    [    CSSD]2010-09-22 14:04:14.744 [7] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2478) LATS(0) Disk lastSeqNo(2478)
    [    CSSD]2010-09-22 14:04:14.745 [8] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (2//dev/vx/rdsk/racdg/ora_vote3)
    [    CSSD]2010-09-22 14:04:14.746 [8] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2478) LATS(0) Disk lastSeqNo(2478)
    [    CSSD]2010-09-22 14:04:14.785 [1] >TRACE: clssscSclsFatal: read value of disable
    [    CSSD]2010-09-22 14:04:14.785 [10] >TRACE: clssnmFatalThread: spawned
    [    CSSD]2010-09-22 14:04:14.785 [1] >TRACE: clssscSclsFatal: read value of disable
    [    CSSD]2010-09-22 14:04:14.786 [11] >TRACE: clssnmconnect: connecting to node 0, flags 0x0001, connector 1
    [    CSSD]2010-09-22 14:04:23.075 >USER: Oracle Database 10g CSS Release 10.2.0.1.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
    [    CSSD]2010-09-22 14:04:23.075 >USER: CSS daemon log for node spodhcsvr10, number 0, in cluster NET_RAC
    [  clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=spodhcsvr10DBG_CSSD))
    [    CSSD]2010-09-22 14:04:23.082 [1] >TRACE: clssscmain: local-only set to false
    [    CSSD]2010-09-22 14:04:23.096 [1] >TRACE: clssnmReadNodeInfo: added node 0 (spodhcsvr10) to cluster
    [    CSSD]2010-09-22 14:04:23.106 [1] >TRACE: clssnmReadNodeInfo: added node 1 (spodhcsvr12) to cluster
    [    CSSD]2010-09-22 14:04:23.129 [5] >TRACE: [0]Node monitor: dlm attach failed error LK_STAT_NOTCREATED
    [    CSSD]CLSS-0001: skgxn not active
    [    CSSD]2010-09-22 14:04:23.129 [5] >TRACE: clssnm_skgxnmon: skgxn init failed, rc 30
    [    CSSD]2010-09-22 14:04:23.132 [1] >TRACE: clssnmInitNMInfo: misscount set to 600
    [    CSSD]2010-09-22 14:04:23.136 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0//dev/vx/rdsk/racdg/ora_vote1)
    [    CSSD]2010-09-22 14:04:23.139 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (1//dev/vx/rdsk/racdg/ora_vote2)
    [    CSSD]2010-09-22 14:04:23.143 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (2//dev/vx/rdsk/racdg/ora_vote3)
    [    CSSD]2010-09-22 14:04:25.139 [6] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/vx/rdsk/racdg/ora_vote1)
    [    CSSD]2010-09-22 14:04:25.142 [6] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2488) LATS(0) Disk lastSeqNo(2488)
    [    CSSD]2010-09-22 14:04:25.143 [7] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (1//dev/vx/rdsk/racdg/ora_vote2)
    [    CSSD]2010-09-22 14:04:25.144 [7] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2488) LATS(0) Disk lastSeqNo(2488)
    [    CSSD]2010-09-22 14:04:25.145 [8] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (2//dev/vx/rdsk/racdg/ora_vote3)
    [    CSSD]2010-09-22 14:04:25.148 [8] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2489) LATS(0) Disk lastSeqNo(2489)
    [    CSSD]2010-09-22 14:04:25.186 [1] >TRACE: clssscSclsFatal: read value of disable
    [    CSSD]2010-09-22 14:04:25.186 [10] >TRACE: clssnmFatalThread: spawned
    [    CSSD]2010-09-22 14:04:25.186 [1] >TRACE: clssscSclsFatal: read value of disable
    [    CSSD]2010-09-22 14:04:25.187 [11] >TRACE: clssnmconnect: connecting to node 0, flags 0x0001, connector 1
    [    CSSD]2010-09-22 14:04:33.449 >USER: Oracle Database 10g CSS Release 10.2.0.1.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
    [    CSSD]2010-09-22 14:04:33.449 >USER: CSS daemon log for node spodhcsvr10, number 0, in cluster NET_RAC
    [  clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=spodhcsvr10DBG_CSSD))
    [    CSSD]2010-09-22 14:04:33.457 [1] >TRACE: clssscmain: local-only set to false
    [    CSSD]2010-09-22 14:04:33.470 [1] >TRACE: clssnmReadNodeInfo: added node 0 (spodhcsvr10) to cluster
    [    CSSD]2010-09-22 14:04:33.480 [1] >TRACE: clssnmReadNodeInfo: added node 1 (spodhcsvr12) to cluster
    [    CSSD]2010-09-22 14:04:33.498 [5] >TRACE: [0]Node monitor: dlm attach failed error LK_STAT_NOTCREATED
    [    CSSD]CLSS-0001: skgxn not active
    [    CSSD]2010-09-22 14:04:33.498 [5] >TRACE: clssnm_skgxnmon: skgxn init failed, rc 30
    [    CSSD]2010-09-22 14:04:33.500 [1] >TRACE: clssnmInitNMInfo: misscount set to 600
    [    CSSD]2010-09-22 14:04:33.505 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0//dev/vx/rdsk/racdg/ora_vote1)
    [    CSSD]2010-09-22 14:04:33.508 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (1//dev/vx/rdsk/racdg/ora_vote2)
    [    CSSD]2010-09-22 14:04:33.510 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (2//dev/vx/rdsk/racdg/ora_vote3)
    [    CSSD]2010-09-22 14:04:35.508 [6] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/vx/rdsk/racdg/ora_vote1)
    [    CSSD]2010-09-22 14:04:35.510 [6] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2499) LATS(0) Disk lastSeqNo(2499)
    [    CSSD]2010-09-22 14:04:35.510 [7] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (1//dev/vx/rdsk/racdg/ora_vote2)
    [    CSSD]2010-09-22 14:04:35.512 [7] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2499) LATS(0) Disk lastSeqNo(2499)
    [    CSSD]2010-09-22 14:04:35.513 [8] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (2//dev/vx/rdsk/racdg/ora_vote3)
    [    CSSD]2010-09-22 14:04:35.514 [8] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2499) LATS(0) Disk lastSeqNo(2499)
    [    CSSD]2010-09-22 14:04:35.553 [1] >TRACE: clssscSclsFatal: read value of disable
    [    CSSD]2010-09-22 14:04:35.553 [10] >TRACE: clssnmFatalThread: spawned
    [    CSSD]2010-09-22 14:04:35.553 [1] >TRACE: clssscSclsFatal: read value of disable
    [    CSSD]2010-09-22 14:04:35.553 [11] >TRACE: clssnmconnect: connecting to node 0, flags 0x0001, connector 1
    I believe the main error is:
    [    CSSD]2010-09-22 14:04:33.498 [5] >TRACE: [0]Node monitor: dlm attach failed error LK_STAT_NOTCREATED
    [    CSSD]CLSS-0001: skgxn not active
    The communication between UDLM and CLSMON also seems to be involved, but I don't know how to resolve this.
    My UDLM version is 3.3.4.9.
    Does anybody have any ideas about this?
    Thanks!

    I have now finally installed CRS and run root.sh without errors (I think the problem was caused by some old files left over from earlier installation attempts).
    But now I have another problem: while installing the DB software, at the step that copies the installation to the remote node, that node has a failure in the CLSMON/CSSD daemon and panics:
    Sep 23 16:10:51 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 138. Respawning
    Sep 23 16:10:52 spodhcsvr10 root: Oracle CSSD failure. Rebooting for cluster integrity.
    Sep 23 16:10:52 spodhcsvr10 root: [ID 702911 user.alert] Oracle CSSD failure. Rebooting for cluster integrity.
    Sep 23 16:10:51 spodhcsvr10 root: [ID 702911 user.error] Oracle CLSMON terminated with unexpected status 138. Respawning
    Sep 23 16:10:52 spodhcsvr10 root: [ID 702911 user.alert] Oracle CSSD failure. Rebooting for cluster integrity.
    Sep 23 16:10:56 spodhcsvr10 Cluster.OPS.UCMMD: fatal: received signal 15
    Sep 23 16:10:56 spodhcsvr10 Cluster.OPS.UCMMD: [ID 770355 daemon.error] fatal: received signal 15
    Sep 23 16:10:59 spodhcsvr10 root: Oracle Cluster Ready Services waiting for SunCluster and UDLM to start.
    Sep 23 16:10:59 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 23 16:10:59 spodhcsvr10 root: [ID 702911 user.error] Oracle Cluster Ready Services waiting for SunCluster and UDLM to start.
    Sep 23 16:10:59 spodhcsvr10 root: [ID 702911 user.error] Cluster Ready Services completed waiting on dependencies.
    Notifying cluster that this node is panicking
    The installation on the first node continues and reports an error when copying to the second node.
    Any ideas? Thanks!

  • Azure node.js and phonegap integration

    Hi all,
    I am having a little problem integrating Azure, node.js and PhoneGap. Any assistance is greatly appreciated.
    I am trying to write some multiplayer HTML5 games that work both on the iPhone and on the web. I would like the server to be hosted on Azure using node.js.
    The online Azure/node.js examples, such as the chat application, seem to work fine both locally and when uploaded to Azure (as, for example, a webpage).
    The problem I am having is when I run the client on localhost (such as when using PhoneGap) and try to connect to the Azure server. Generally, to connect through socket.io you specify a host and port. On Azure, you specify the port with the command:
    app.listen(process.env.port, function....
    This generates a strange port string such as \\.\pipe\c20fd94c-84ba-43dd-b7f2-160cf0564a62 rather than the kind of URL I am used to, such as "www.webpage.com:3000".
    So the question is: how do I connect to a node.js server on Azure from a webpage hosted elsewhere (such as on an iPhone)?
    Any tips greatly appreciated.
    Cheers,
    Scott

    Hi Scott,
    Have you considered Azure Mobile Services? Please see this tutorial:
    http://azure.microsoft.com/en-us/documentation/articles/mobile-services-javascript-backend-phonegap-get-started/
    Regards,
    Will

  • Node creation and addition to existing Element Node

    Hi all,
    I have a class that opens an XML file and loads the Document. Then, from a method, I am trying to add sub-nodes to one of the Element nodes of the XML document. It just doesn't work.
    This is the method that is supposed to add the nodes:
    // Requires the javax.xml.parsers.* and org.w3c.dom.* imports
    public Node getSubNode( Node iCurrentNode ) throws ParserConfigurationException
    {
        // Method local variables
        DocumentBuilder lBuilder = null;
        Document lDocument = null;
        DocumentBuilderFactory lFactory = DocumentBuilderFactory.newInstance();
        // Prepare the document and final Node to return
        lBuilder = lFactory.newDocumentBuilder();
        lDocument = lBuilder.newDocument();
        Element lSubMenu = lDocument.createElement("sub_menu");
        lSubMenu.setAttribute("ID", "User20");
        lSubMenu.setAttribute("label", "Montreal user 1");
        // This call is where WRONG_DOCUMENT_ERR is raised
        iCurrentNode.appendChild( lSubMenu );
        /*Element lSubMenu2 = lDocument.createElement("sub_menu");
        lSubMenu2.setAttribute("ID", "User21");
        lSubMenu2.setAttribute("label", "Montreal user 2");
        lDocument.appendChild(lSubMenu2);*/
        return iCurrentNode;
    }
    I always get a "org.apache.crimson.tree.DomEx: WRONG_DOCUMENT_ERR: That node doesn't belong in this document."
    So I understand by this that I am working with a Document that I declared in this method and thus, for an unknown reason, seem unable to add this node to my existing Element node.
    What should I do to be able to add these nodes to my input Node?
    Dominique Paquin

    OK, to answer my own question, and for the benefit of people searching here for an answer in the future: I had used 2 different Document instances to create my nodes, one for the basic Node structure and one for the node that would be added to it. Since importNode (called on the first one) removes the parent from the imported node, I was no longer in a position to access the parent reference.
    I simply created a single Document as a class-level variable and used that single instance to create all my nodes (the original and the one added to the original). It solved all my problems.
    I don't know if I was clear enough, but if you need further explanation drop me a line.
    [email protected]
    http://www.okiok.com
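
    The fix described in this thread (create every node from the one Document that owns the target node) can be sketched as follows. This is only a minimal illustration, not the poster's actual class: the class name SubMenuExample, the helper names addSubMenu and adopt, and the "menu" root element are assumptions made up for the example. It relies on the standard DOM calls Node.getOwnerDocument() and Document.importNode().

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class SubMenuExample {

    // Returns the Document that owns the given node (hypothetical helper).
    // getOwnerDocument() returns null when the node is itself a Document.
    static Document ownerOf(Node node) {
        return (node instanceof Document) ? (Document) node : node.getOwnerDocument();
    }

    // Creates the new element from the parent's own Document, so that
    // appendChild() cannot fail with WRONG_DOCUMENT_ERR.
    static Element addSubMenu(Node parent, String id, String label) {
        Element subMenu = ownerOf(parent).createElement("sub_menu");
        subMenu.setAttribute("ID", id);
        subMenu.setAttribute("label", label);
        parent.appendChild(subMenu);
        return subMenu;
    }

    // Alternative for a node that was already built in another Document:
    // importNode() makes a copy owned by the target Document (the copy has
    // no parent yet), and that copy can then be appended.
    static Node adopt(Node parent, Node foreign) {
        Node copy = ownerOf(parent).importNode(foreign, true);
        return parent.appendChild(copy);
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element menu = doc.createElement("menu"); // assumed root element name
        doc.appendChild(menu);

        addSubMenu(menu, "User20", "Montreal user 1");
        addSubMenu(menu, "User21", "Montreal user 2");

        System.out.println(menu.getChildNodes().getLength()); // prints 2
    }
}
```

    Using the owner Document of the node passed in avoids having to keep a shared Document field at all; the adopt variant is only needed when the node to add comes from a different Document.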

  • Kernel para and sga,pga sizing

    Hello,
    OS: HP-UX (configured to use a large SGA, no restrictions)
    The current database is running without any issues.
    Due to an increase in physical RAM, I plan to increase the SGA and PGA (RAM is increasing about 3 times).
    My questions:
    1) Should I also increase the SGA and PGA in the same proportion (3 times the existing size), or use the 70% ROT formula?
    2) Whatever figure I arrive at by analysis for the increased SGA, do I also need to change the kernel parameters (do I need to change kernel parameters every time RAM and SGA are increased)?
    Please suggest.
    Thanks

    880991 wrote:
    > OK, some questions based on your reply:
    > 1) Then how is there any gain in upgrading memory above 4 GB if SHMMAX cannot be more than 4 GB (32-bit OS)? Then the SGA also cannot be > 4 GB (as per the documents, SHMMAX should be > SGA).
    Because the SGA doesn't have to be in one segment. Which version of Oracle are you looking at? It makes a difference!
    http://kevinclosson.wordpress.com/2009/07/27/little-things-doth-crabby-make-part-x-posts-about-linux-hugepages-makes-some-crabby-it-seems/
    > 2) I read that if the operating system is Linux, it can use an SGA above 4 GB; in fact, a friend told me they implemented an SGA > 4 GB on Linux by configuring VLM/huge pages. So in that case (VLM), does SHMMAX need to be set > 4 GB (as per the memory ROT), or does VLM take care of an SGA > 4 GB even if SHMMAX is not set?
    > Interestingly, my friend also mentioned that on their server VLM is set, SHMMAX is set < 4 GB and the SGA is > 4 GB, and it still works without problems; when checking SGA usage in the dynamic views, it is shown as using above 4 GB. This contradicts what I read in the documents. How is this working? Does anyone have an idea?
    No contradiction: http://download.oracle.com/docs/cd/B28359_01/server.111/b32009/appi_vlm.htm
    > Also, is the VLM implementation restricted to Linux, or can it be set on Unix as well? And if it is set, do all components of the SGA benefit, or only the buffer cache?
    > Thanks
    It varies; some Unix systems are quite different from others. Specific answers can only be given for specific configurations. I really have to wonder why you are asking about HP-UX and 32-bit.

Maybe you are looking for

  • No goods receipt possible for purchase order 4800000097

    hi, i am facing a problem while at the time of Doing GR . it says the "No goods receipt possible for purchase order 4800000097"  when i try to proceed with the GR. I have checked all the necessary possibilities such as 1.  PO has been released 2. at

  • Why is Firefox writing so much data all the time?

    I am running Windows 7 x64 and Firefox 8. In my Task Manager I display the I/O Write Bytes of all processes. I have found that Firefox is ALWAYS the Process with the largest amount of write bytes after it has been running an hour or so even though th

  • Running two displays from MBP Retina under windows

    I'm looking at purchasing a MBP Retina 13".  I have two external monitors that can accept HDMI and DVI (neither can do displayport).  Some software that I run requires me to sometimes boot into windows.  Under windows 7, can I output a DVI signal fro

  • Report - Responsibility in EBS

    Hi I have bunch of Reports running as Concurrent Request in EBS. How do I find from Report name which responsibility it is assigned to? I got Report name but dot sure which responsibility it runs as. Please advise. Thanks and regards Vijay

  • Retro issue Infotype 14 wagetypes

    Hi Experts, Changes in Infotype 0008 wagetype is working fine and is appearing in /551 . Similar thing is not happening for IT0014 wagetypes. I've changed them from past date but the /551 doesn't get affected. Already checked the retro settings in ta