RAC node reboots from time to time

Hi,
we have a problem with our RAC: it is a three-node RAC on SLES 9, 64-bit. One node reboots from time to time. We found nothing in any log file, except for this entry in /var/log/messages on node 1:
"Feb 21 14:58:02 pmg-db1 kernel: o2net: connection to node pmg-db2 (num 1) at 192.168.0.2:7777 has been idle for 10 seconds, shutting it down."
Has anyone had a similar problem, or does anyone have an idea?
regards
Andreas
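
For anyone who wants to check how often this message shows up, here is a minimal Python sketch that scans syslog lines for the o2net idle-timeout message quoted above. The regular expression is modelled only on that one quoted line, so the exact format on other kernels is an assumption, and the name find_o2net_idle is made up for this example.

```python
import re

# Pattern modelled on the single o2net line quoted in the post (assumption:
# other kernels may word or space the message differently).
O2NET_IDLE = re.compile(
    r"(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\skernel:\s"
    r"o2net: connection to node (?P<peer>\S+) \(num (?P<num>\d+)\) "
    r"at (?P<addr>[\d.]+:\d+) has been idle for (?P<idle>\d+) seconds"
)


def find_o2net_idle(lines):
    """Return one dict per line that matches the o2net idle-timeout message."""
    events = []
    for line in lines:
        match = O2NET_IDLE.search(line)
        if match:
            events.append(match.groupdict())
    return events


# The sample is the line from /var/log/messages quoted in the post.
sample = [
    "Feb 21 14:58:02 pmg-db1 kernel: o2net: connection to node pmg-db2 "
    "(num 1) at 192.168.0.2:7777 has been idle for 10 seconds, shutting it down.",
]

events = find_o2net_idle(sample)
for event in events:
    print(event["ts"], event["host"], "lost", event["peer"],
          "at", event["addr"], "after", event["idle"], "s idle")
```

To run it against a real file, pass an open file object, for example find_o2net_idle(open("/var/log/messages")), and compare the timestamps with the times of the reboots.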

Sorry, there is no /var/log/dmesg.
Perhaps I should add another detail: the third node was added after the two-node RAC had been running for several months. At first we had the reboot problem with this third node. We found out that its interconnect was connected to a 100 Mbit module of the switch and not to a 1000 Mbit module. We changed this a few days ago, but now the second node has rebooted, and it is connected at 1000 Mbit/s.
And did I mention that we use 10.2.0.2?
regards
Andreas

Similar Messages

  • Had to erase hd and reboot from time machine.  Now login password does not work

    Had to erase hd and reboot from time machine.  Now login password does not work

    Insert the OS X 10.6 install disc and reboot the computer while holding down C.
    The second screen has a Utilities menu with Password Reset; try that.

  • RAC node rebooting frequently

    Hi all,
    I am working on a two-node RAC environment. One of my RAC nodes is rebooting very frequently. I am using Oracle 10g database and clusterware (10.2.0.1).
    I have checked the OS logs (Linux AS 4) and the RAC-related logs, but was not able to find anything. I am posting all the logs; please suggest.

    Hi, I am posting the alert log, OS log and ocssd log.
    Clusterware alert log:
    [crsd(5649)]CRS-1201:CRSD started on node ctmisdb1.
    2012-03-21 09:50:38.188
    [cssd(7490)]CRS-1601:CSSD Reconfiguration complete. Active nodes are ctmisdb1 .
    2012-03-21 09:50:46.726
    [crsd(5649)]CRS-1204:Recovering CRS resources for node ctmisdb2.
    2012-03-21 09:55:21.760
    [cssd(7490)]CRS-1601:CSSD Reconfiguration complete. Active nodes are ctmisdb1 ctmisdb2 .
    2012-03-21 12:07:46.681
    [cssd(7426)]CRS-1605:CSSD voting file is online: /dev/raw/raw2. Details in /u01/app/oracle/product/crs/log/ctmisdb1/cssd/ocssd.log.
    2012-03-21 12:07:50.432
    [cssd(7426)]CRS-1601:CSSD Reconfiguration complete. Active nodes are ctmisdb1 ctmisdb2 .
    2012-03-21 12:07:50.893
    [crsd(5549)]CRS-1012:The OCR service started on node ctmisdb1.
    2012-03-21 12:07:50.942
    [evmd(7304)]CRS-1401:EVMD started on node ctmisdb1.
    2012-03-21 12:07:52.827
    [crsd(5549)]CRS-1201:CRSD started on node ctmisdb1.
    2012-03-21 12:48:41.908
    [cssd(7448)]CRS-1605:CSSD voting file is online: /dev/raw/raw2. Details in /u01/app/oracle/product/crs/log/ctmisdb1/cssd/ocssd.log.
    2012-03-21 12:48:45.741
    [cssd(7448)]CRS-1601:CSSD Reconfiguration complete. Active nodes are ctmisdb1 ctmisdb2 .
    2012-03-21 12:48:49.173
    [crsd(5546)]CRS-1012:The OCR service started on node ctmisdb1.
    2012-03-21 12:48:49.190
    [evmd(7328)]CRS-1401:EVMD started on node ctmisdb1.
    2012-03-21 12:48:50.818
    [crsd(5546)]CRS-1201:CRSD started on node ctmisdb1.
    2012-03-21 13:26:36.398
    [cssd(7343)]CRS-1605:CSSD voting file is online: /dev/raw/raw2. Details in /u01/app/oracle/product/crs/log/ctmisdb1/cssd/ocssd.log.
    2012-03-21 13:26:40.492
    [cssd(7343)]CRS-1601:CSSD Reconfiguration complete. Active nodes are ctmisdb1 ctmisdb2 .
    2012-03-21 13:26:40.939
    [crsd(5542)]CRS-1012:The OCR service started on node ctmisdb1.
    2012-03-21 13:26:40.977
    [evmd(7223)]CRS-1401:EVMD started on node ctmisdb1.
    2012-03-21 13:26:42.772
    [crsd(5542)]CRS-1201:CRSD started on node ctmisdb1.
    Node OS log:
    Mar 21 12:06:35 ctmisdb1 rc: Starting readahead: succeeded
    Mar 21 12:06:35 ctmisdb1 messagebus: messagebus startup succeeded
    Mar 21 12:06:36 ctmisdb1 cups-config-daemon: cups-config-daemon startup succeeded
    Mar 21 12:06:36 ctmisdb1 haldaemon: haldaemon startup succeeded
    Mar 21 12:06:37 ctmisdb1 fstab-sync[6267]: removed all generated mount points
    Mar 21 12:06:37 ctmisdb1 fstab-sync[6378]: added mount point /media/cdrecorder for /dev/hde
    Mar 21 12:06:37 ctmisdb1 su(pam_unix)[6323]: session opened for user oracle by (uid=0)
    Mar 21 12:06:37 ctmisdb1 su(pam_unix)[6324]: session opened for user oracle by (uid=0)
    Mar 21 12:06:37 ctmisdb1 su(pam_unix)[6229]: session opened for user oracle by (uid=0)
    Mar 21 12:06:37 ctmisdb1 su(pam_unix)[6229]: session closed for user oracle
    Mar 21 12:06:37 ctmisdb1 su(pam_unix)[6644]: session opened for user oracle by (uid=0)
    Mar 21 12:06:37 ctmisdb1 kernel: matroxfb: cannot set xres to 800, rounded up to 832
    Mar 21 12:06:37 ctmisdb1 last message repeated 2 times
    Mar 21 12:06:41 ctmisdb1 su(pam_unix)[6323]: session closed for user oracle
    Mar 21 12:06:41 ctmisdb1 su(pam_unix)[6644]: session closed for user oracle
    Mar 21 12:06:41 ctmisdb1 su(pam_unix)[6324]: session closed for user oracle
    Mar 21 12:06:41 ctmisdb1 logger: Cluster Ready Services completed waiting on dependencies.
    Mar 21 12:06:41 ctmisdb1 last message repeated 2 times
    Mar 21 12:06:45 ctmisdb1 gdm(pam_unix)[6379]: session opened for user root by (uid=0)
    Mar 21 12:06:46 ctmisdb1 gconfd (root-7052): starting (version 2.8.1), pid 7052 user 'root'
    Mar 21 12:06:47 ctmisdb1 gconfd (root-7052): Resolved address "xml:readonly:/etc/gconf/gconf.xml.mandatory" to a read-only configuration source at position 0
    Mar 21 12:06:47 ctmisdb1 gconfd (root-7052): Resolved address "xml:readwrite:/root/.gconf" to a writable configuration source at position 1
    Mar 21 12:06:47 ctmisdb1 gconfd (root-7052): Resolved address "xml:readonly:/etc/gconf/gconf.xml.defaults" to a read-only configuration source at position 2
    Mar 21 12:06:55 ctmisdb1 gconfd (root-7052): Resolved address "xml:readwrite:/root/.gconf" to a writable configuration source at position 0
    Mar 21 12:07:41 ctmisdb1 su(pam_unix)[5547]: session opened for user oracle by (uid=0)
    Mar 21 12:07:41 ctmisdb1 logger: Running CRSD with TZ =
    Mar 21 12:07:43 ctmisdb1 su(pam_unix)[7399]: session opened for user oracle by (uid=0)
    Mar 21 12:12:49 ctmisdb1 sshd(pam_unix)[15323]: session opened for user root by root(uid=0)
    Mar 21 12:12:57 ctmisdb1 su(pam_unix)[15531]: session opened for user oracle by root(uid=0)
    Mar 21 12:47:05 ctmisdb1 syslogd 1.4.1: restart.
    ocssd log:
    [    CSSD]2012-03-21 11:24:41.045 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800661f0c0) proc(0x8006622560) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 11:24:41.078 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660cfe0) proc(0x800662ba70) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:07:44.564 >USER: Oracle Database 10g CSS Release 10.2.0.1.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
    [  clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=ctmisdb1DBG_CSSD))
    [    CSSD]2012-03-21 12:07:44.564 >USER: CSS daemon log for node ctmisdb1, number 1, in cluster crs
    [    CSSD]2012-03-21 12:07:44.581 [28260544] >TRACE: clssscmain: local-only set to false
    [    CSSD]2012-03-21 12:07:44.603 [28260544] >TRACE: clssnmReadNodeInfo: added node 1 (ctmisdb1) to cluster
    [    CSSD]2012-03-21 12:07:44.621 [28260544] >TRACE: clssnmReadNodeInfo: added node 2 (ctmisdb2) to cluster
    [    CSSD]2012-03-21 12:07:44.627 [72925824] >TRACE: clssnm_skgxnmon: skgxn init failed, rc 1
    [    CSSD]2012-03-21 12:07:44.627 [28260544] >TRACE: clssnm_skgxnonline: Using vacuous skgxn monitor
    [    CSSD]2012-03-21 12:07:44.641 [28260544] >TRACE: clssnmInitNMInfo: misscount set to 60
    [    CSSD]2012-03-21 12:07:44.655 [28260544] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0//dev/raw/raw2)
    [    CSSD]2012-03-21 12:07:46.661 [72925824] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/raw/raw2)
    [    CSSD]2012-03-21 12:07:46.690 [72925824] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(18) wrtcnt(7920) LATS(0) Disk lastSeqNo(7920)
    [    CSSD]2012-03-21 12:07:46.752 [28260544] >TRACE: clssnmFatalInit: fatal mode enabled
    [    CSSD]2012-03-21 12:07:46.752 [94777984] >TRACE: clssnmconnect: connecting to node 1, flags 0x0001, connector 1
    [    CSSD]2012-03-21 12:07:46.753 [94777984] >TRACE: clssnmconnect: connecting to node 0, flags 0x0000, connector 1
    [    CSSD]2012-03-21 12:07:46.753 [94777984] >TRACE: clssnmClusterListener: Probing node(2)
    [    CSSD]2012-03-21 12:07:46.755 [94777984] >TRACE: clssnmConnComplete: connected to node 2 (con 0x8006601040), state 3 birth 0, unique 1332303918/1332303918 prevConuni(0)
    [    CSSD]2012-03-21 12:07:46.756 [106332800] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1))
    [    CSSD]2012-03-21 12:07:46.756 [106332800] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_ctmisdb1_crs))
    [    CSSD]2012-03-21 12:07:46.757 [151810688] >TRACE: clssnmPollingThread: Connection complete
    [    CSSD]2012-03-21 12:07:46.757 [162296448] >TRACE: clssnmSendingThread: Connection complete
    [    CSSD]2012-03-21 12:07:46.757 [172782208] >TRACE: clssnmRcfgMgrThread: Connection complete
    [    CSSD]2012-03-21 12:07:46.757 [172782208] >TRACE: clssnmRcfgMgrThread: Local Join
    [    CSSD]2012-03-21 12:07:46.757 [172782208] >WARNING: clssnmLocalJoinEvent: takeover aborted due to connected but inactive nodes
    [    CSSD]2012-03-21 12:07:47.339 [94777984] >TRACE: clssnmHandleSync: Acknowledging sync: src[2] srcName[ctmisdb2] seq[5] sync[18]
    [    CSSD]2012-03-21 12:07:47.759 [172782208] >TRACE: clssnmRcfgMgrThread: lastleader(2) unique(1332311864)
    [    CSSD]2012-03-21 12:07:48.341 [94777984] >TRACE: clssnmSendVoteInfo: node(2) syncSeqNo(18)
    [    CSSD]2012-03-21 12:07:50.346 [94777984] >TRACE: clssnmUpdateNodeState: node 0, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
    [    CSSD]2012-03-21 12:07:50.346 [94777984] >TRACE: clssnmDeactivateNode: node 0 () left cluster
    [    CSSD]2012-03-21 12:07:50.346 [94777984] >TRACE: clssnmUpdateNodeState: node 1, state (1/2) unique (1332311864/1332311864) prevConuni(0) birth (0/18) (old/new)
    [    CSSD]2012-03-21 12:07:50.346 [94777984] >TRACE: clssnmUpdateNodeState: node 2, state (4/3) unique (1332303918/1332303918) prevConuni(0) birth (0/16) (old/new)
    [    CSSD]2012-03-21 12:07:50.346 [94777984] >USER: clssnmHandleUpdate: SYNC(18) from node(2) completed
    [    CSSD]2012-03-21 12:07:50.346 [94777984] >USER: clssnmHandleUpdate: NODE 1 (ctmisdb1) IS ACTIVE MEMBER OF CLUSTER
    [    CSSD]2012-03-21 12:07:50.346 [94777984] >USER: clssnmHandleUpdate: NODE 2 (ctmisdb2) IS ACTIVE MEMBER OF CLUSTER
    [    CSSD]2012-03-21 12:07:50.429 [28260544] >USER: NMEVENT_SUSPEND [00][00][00][00]
    [    CSSD]2012-03-21 12:07:50.429 [183267968] >TRACE: clssgmReconfigThread: started for reconfig (18)
    [    CSSD]2012-03-21 12:07:50.429 [183267968] >USER: NMEVENT_RECONFIG [00][00][00][06]
    [    CSSD]2012-03-21 12:07:50.429 [183267968] >TRACE: clssgmEstablishConnections: 2 nodes in cluster incarn 18
    [    CSSD]2012-03-21 12:07:50.430 [140255872] >TRACE: clssgmInitialRecv: (0x102a0360) accepted a new connection from node 2 born at 16 active (2, 2), vers (10,3,1,2)
    [    CSSD]2012-03-21 12:07:50.430 [140255872] >TRACE: clssgmInitialRecv: conns done (2/2)
    [    CSSD]2012-03-21 12:07:50.430 [183267968] >TRACE: clssgmEstablishMasterNode: MASTER for 18 is node(2) birth(16)
    [    CSSD]2012-03-21 12:07:50.430 [183267968] >TRACE: clssgmChangeMasterNode: requeued 0 RPCs
    [    CSSD]2012-03-21 12:07:50.432 [140255872] >TRACE: clssgmHandleDBDone(): src/dest (2/65535) size(72) incarn 18
    [    CSSD]CLSS-3000: reconfiguration successful, incarnation 18 with 2 nodes
    [    CSSD]CLSS-3001: local node number 1, master node number 2
    [    CSSD]2012-03-21 12:07:50.433 [183267968] >TRACE: clssgmReconfigThread: completed for reconfig(18), with status(1)
    [    CSSD]2012-03-21 12:07:50.550 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006603bb0) proc(0x8006608b00) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:07:50.551 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x80066066f0) proc(0x8006608d70) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:07:53.569 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660ec70) proc(0x8006611260) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:00.829 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006610990) proc(0x800660de00) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:04.698 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006613030) proc(0x8006612930) pid(8115) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:04.816 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006612950) proc(0x8006613c20) pid(8115) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:04.832 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006612950) proc(0x8006613c20) pid(8115) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:06.615 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006612950) proc(0x8006613c20) pid(8171) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:07.114 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006615960) proc(0x8006616350) pid(8175) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:11.373 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x80066192a0) proc(0x8006619470) pid(8302) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:11.669 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800661bf60) proc(0x800661ee20) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:17.135 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800661bf60) proc(0x800661ee70) pid(8458) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:17.268 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800661fc00) proc(0x80066220d0) pid(8460) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:17.305 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x80066223e0) proc(0x8006625250) pid(8462) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:17.353 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006625560) proc(0x8006628430) pid(8464) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:24.585 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006625560) proc(0x8006628430) pid(8645) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:27.957 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006628740) proc(0x800662b610) pid(8722) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:30.931 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800662cce0) proc(0x800662c860) pid(8801) proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:36.400 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800661c5f0) proc(0x800661eb50) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:37.863 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800662f1c0) proc(0x800661eee0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:38.537 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800662f1c0) proc(0x800661d500) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:39.232 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800661bf60) proc(0x800661d500) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:43.085 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x8006611210) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:08:58.971 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x80066112c0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:09:59.290 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x800660b190) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:10:59.589 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x800660b190) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:11:59.904 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x800660b190) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:13:00.203 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x800660b190) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:13:14.029 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x800660b190) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:14:00.501 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x8006611210) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:15:00.809 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:16:01.117 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:17:01.447 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:01.762 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:39.841 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:42.123 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:42.316 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:42.843 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:42.963 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:43.098 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b260) proc(0x800662bd20) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:44.173 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:44.368 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b260) proc(0x800660b310) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:45.351 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x8006628670) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:46.236 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:47.031 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:47.694 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:47.819 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b260) proc(0x800660b310) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:48.103 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:48.327 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b260) proc(0x800660b310) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:48.484 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x8006611210) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:48.758 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:49.529 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:50.509 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:51.060 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x800660b830) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:18:51.558 [106332800] >TRACE: clssgmClientConnectMsg: Connect from con(0x8006611630) proc(0x800662f0f0) pid() proto(10:2:1:1)
    [    CSSD]2012-03-21 12:48:39.836 >USER: Oracle Database 10g CSS Release 10.2.0.1.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
    [  clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=ctmisdb1DBG_CSSD))
    [    CSSD]2012-03-21 12:48:39.836 >USER: CSS daemon log for node ctmisdb1, number 1, in cluster crs
    [    CSSD]2012-03-21 12:48:39.849 [28260544] >TRACE: clssscmain: local-only set to false
    [    CSSD]2012-03-21 12:48:39.865 [28260544] >TRACE: clssnmReadNodeInfo: added node 1 (ctmisdb1) to cluster
    [    CSSD]2012-03-21 12:48:39.872 [28260544] >TRACE: clssnmReadNodeInfo: added node 2 (ctmisdb2) to cluster
    [    CSSD]2012-03-21 12:48:39.879 [72925824] >TRACE: clssnm_skgxnmon: skgxn init failed, rc 1
    [    CSSD]2012-03-21 12:48:39.879 [28260544] >TRACE: clssnm_skgxnonline: Using vacuous skgxn monitor
    [    CSSD]2012-03-21 12:48:39.881 [28260544] >TRACE: clssnmInitNMInfo: misscount set to 60
    [    CSSD]2012-03-21 12:48:39.888 [28260544] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0//dev/raw/raw2)
    [    CSSD]2012-03-21 12:48:41.892 [72925824] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/raw/raw2)
    [    CSSD]2012-03-21 12:48:41.915 [72925824] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(20) wrtcnt(10367) LATS(0) Disk lastSeqNo(10367)
    [    CSSD]2012-03-21 12:48:41.959 [28260544] >TRACE: clssnmFatalInit: fatal mode enabled
    [    CSSD]2012-03-21 12:48:41.959 [94777984] >TRACE: clssnmconnect: connecting to node 1, flags 0x0001, connector 1
    [    CSSD]2012-03-21 12:48:41.959 [94777984] >TRACE: clssnmconnect: connecting to node 0, flags 0x0000, connector 1
    [    CSSD]2012-03-21 12:48:41.959 [94777984] >TRACE: clssnmClusterListener: Probing node(2)
    [    CSSD]2012-03-21 12:48:41.961 [94777984] >TRACE: clssnmConnComplete: connected to node 2 (con 0x8006702790), state 3 birth 0, unique 1332303918/1332303918 prevConuni(0)
    [    CSSD]2012-03-21 12:48:41.962 [106332800] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1))
    [    CSSD]2012-03-21 12:48:41.962 [106332800] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_ctmisdb1_crs))
    [    CSSD]2012-03-21 12:48:41.963 [152330880] >TRACE: clssnmPollingThread: Connection complete
    [    CSSD]2012-03-21 12:48:41.963 [162816640] >TRACE: clssnmSendingThread: Connection complete
    [    CSSD]2012-03-21 12:48:41.963 [173302400] >TRACE: clssnmRcfgMgrThread: Connection complete
    [    CSSD]2012-03-21 12:48:41.963 [173302400] >TRACE: clssnmRcfgMgrThread: Local Join
    [    CSSD]2012-03-21 12:48:41.963 [173302400] >WARNING: clssnmLocalJoinEvent: takeover aborted due to connected but inactive nodes
    [    CSSD]2012-03-21 12:48:42.631 [94777984] >TRACE: clssnmHandleSync: Acknowledging sync: src[2] srcName[ctmisdb2] seq[13] sync[20]
    [    CSSD]2012-03-21 12:48:42.965 [173302400] >TRACE: clssnmRcfgMgrThread: lastleader(2) unique(1332314319)
    [    CSSD]2012-03-21 12:48:43.636 [94777984] >TRACE: clssnmSendVoteInfo: node(2) syncSeqNo(20)
    [    CSSD]2012-03-21 12:48:45.640 [94777984] >TRACE: clssnmUpdateNodeState: node 0, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
    [    CSSD]2012-03-21 12:48:45.640 [94777984] >TRACE: clssnmDeactivateNode: node 0 () left cluster
    [    CSSD]2012-03-21 12:48:45.640 [94777984] >TRACE: clssnmUpdateNodeState: node 1, state (1/2) unique (1332314319/1332314319) prevConuni(0) birth (0/20) (old/new)
    [    CSSD]2012-03-21 12:48:45.640 [94777984] >TRACE: clssnmUpdateNodeState: node 2, state (4/3) unique (1332303918/1332303918) prevConuni(0) birth (0/16) (old/new)
    [    CSSD]2012-03-21 12:48:45.640 [94777984] >USER: clssnmHandleUpdate: SYNC(20) from node(2) completed
    [    CSSD]2012-03-21 12:48:45.640 [94777984] >USER: clssnmHandleUpdate: NODE 1 (ctmisdb1) IS ACTIVE MEMBER OF CLUSTER
    [    CSSD]2012-03-21 12:48:45.640 [94777984] >USER: clssnmHandleUpdate: NODE 2 (ctmisdb2) IS ACTIVE MEMBER OF CLUSTER
    [    CSSD]2012-03-21 12:48:45.737 [28260544] >USER: NMEVENT_SUSPEND [00][00][00][00]
    [    CSSD]2012-03-21 12:48:45.738 [183788160] >TRACE: clssgmReconfigThread: started for reconfig (20)
    [    CSSD]2012-03-21 12:48:45.738 [183788160] >USER: NMEVENT_RECONFIG [00][00][00][06]
    [    CSSD]2012-03-21 12:48:45.738 [183788160] >TRACE: clssgmEstablishConnections: 2 nodes in cluster incarn 20
    [    CSSD]2012-03-21 12:48:45.739 [140776064] >TRACE: clssgmInitialRecv: (0x102a0370) accepted a new connection from node 2 born at 16 active (2, 2), vers (10,3,1,2)
    [    CSSD]2012-03-21 12:48:45.739 [140776064] >TRACE: clssgmInitialRecv: conns done (2/2)
    [    CSSD]2012-03-21 12:48:45.739 [183788160] >TRACE: clssgmEstablishMasterNode: MASTER for 20 is node(2) birth(16)
    [    CSSD]2012-03-21 12:48:45.739 [183788160] >TRACE: clssgmChangeMasterNode: requeued 0 RPCs
    [    CSSD]2012-03-21 12:48:45.741 [140776064] >TRACE: clssgmHandleDBDone(): src/dest (2/65535) size(72) incarn 20
    [    CSSD]CLSS-3000: reconfiguration successful, incarnation 20 with 2 nodes
    Please check and help.
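
One way to narrow this down is to pull the CSSD start times out of the clusterware alert log and see how far apart they are. The Python sketch below assumes the layout visible in the post (a timestamp line followed by its message line) and treats each CRS-1605 message as a marker for a CSSD start, which is only an inference from the changing cssd PIDs in this log. The function name events_with_code is made up for this example.

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def events_with_code(lines, code):
    """Return the timestamps of alert-log messages that contain `code`.

    Assumes each message line is preceded by its own timestamp line,
    as in the alert log pasted above.
    """
    found = []
    last_ts = None
    for raw in lines:
        line = raw.strip()
        try:
            last_ts = datetime.strptime(line, TS_FORMAT)
            continue  # this was a timestamp line, not a message
        except ValueError:
            pass
        if code in line and last_ts is not None:
            found.append(last_ts)
    return found


# Lines copied from the clusterware alert log in the post.
alert_log = [
    "2012-03-21 12:07:46.681",
    "[cssd(7426)]CRS-1605:CSSD voting file is online: /dev/raw/raw2. Details in /u01/app/oracle/product/crs/log/ctmisdb1/cssd/ocssd.log.",
    "2012-03-21 12:07:50.432",
    "[cssd(7426)]CRS-1601:CSSD Reconfiguration complete. Active nodes are ctmisdb1 ctmisdb2 .",
    "2012-03-21 12:48:41.908",
    "[cssd(7448)]CRS-1605:CSSD voting file is online: /dev/raw/raw2. Details in /u01/app/oracle/product/crs/log/ctmisdb1/cssd/ocssd.log.",
    "2012-03-21 13:26:36.398",
    "[cssd(7343)]CRS-1605:CSSD voting file is online: /dev/raw/raw2. Details in /u01/app/oracle/product/crs/log/ctmisdb1/cssd/ocssd.log.",
]

starts = events_with_code(alert_log, "CRS-1605")
gaps = [later - earlier for earlier, later in zip(starts, starts[1:])]
for start, gap in zip(starts[1:], gaps):
    print(start, "came", gap, "after the previous CSSD start")
```

On the excerpt above this finds three CSSD starts roughly 40 minutes apart, which can then be matched against the OS log (for example the "syslogd 1.4.1: restart." line) to confirm that each one follows a reboot.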

  • RAC nodes rebooting

    I'm a newbie, trying to implement 11g RAC using Openfiler on E-linux 5.3.
    So far I have successfully configured Openfiler, created volumes, configured the nodes, and configured OCFS2 and ASM.
    When I rebooted the machines, I first started the Openfiler server and the external storage. They start fine and all volumes (devices) come up fine. But when I boot the nodes one after the other, they keep rebooting after a couple of minutes, one after the other. I am clueless about how to figure out what the problem is and why this is happening. Has anyone else experienced a similar situation? How can this be resolved?
    I would appreciate any advice or help.
    Thanks

    What is the time difference between your RAC nodes? Anything over 45 seconds can possibly cause reboots.
    Check your disk timeouts and hangcheck-timer settings.
    hth
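
A rough way to check the time difference is to collect the epoch time from each node at about the same moment and compare. The Python sketch below is only an illustration: the node names and epoch values are hypothetical, the 45-second limit is the rule of thumb from the reply above rather than a documented Oracle setting, and how the samples are gathered (for example by running date +%s on each node) is left open.

```python
# Rule of thumb taken from the reply above; not a documented Oracle value.
SKEW_LIMIT_SECONDS = 45


def max_clock_skew(epoch_by_node):
    """Return (skew_seconds, earliest_node, latest_node) for the given samples."""
    earliest = min(epoch_by_node, key=epoch_by_node.get)
    latest = max(epoch_by_node, key=epoch_by_node.get)
    return epoch_by_node[latest] - epoch_by_node[earliest], earliest, latest


# Hypothetical samples: epoch seconds reported by each node.
samples = {"rac1": 1332331200.0, "rac2": 1332331262.5}

skew, earliest, latest = max_clock_skew(samples)
if skew > SKEW_LIMIT_SECONDS:
    print(f"{latest} is {skew:.1f}s ahead of {earliest}: fix time sync first")
else:
    print(f"clock skew of {skew:.1f}s is within the {SKEW_LIMIT_SECONDS}s rule of thumb")
```

If the samples are taken one node after the other, the measured skew also includes the delay between the two readings, so small values should not be over-interpreted.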

  • If use MSSQ , when oracle rac node reboot, client get TPEOS error

    Hi all,
    In my Tuxedo application, if we use Single Server, Single Queue mode and reboot any Oracle RAC node, our application is fine and the client gets the correct result. With MSSQ (Multi Server, Single Queue), the application is also fine as long as the Oracle RAC nodes are up. But if we reboot any Oracle RAC node, the client program continues to run and gets correct results, yet always gets a TPEOS error. In this situation the server receives the client request, but the client does not receive the server reply; it only gets the TPEOS error.
    Our environment is:
    Oracle RAC 10g 10.2.0.4, two instances (rac1, rac2) and two DTP services (s1, s2); TAF for s1 and s2 is set to basic
    Tuxedo 10R3, two nodes, working in MP mode, using XA to access the Oracle RAC database; the services are both transactional and non-transactional
    OS is Linux AS4 U5, 64-bit
    The service programs use OCI
    Has anyone encountered this problem?

    Hi, first of all, thank you.
    In the ULOG file there is only failover information and no other error message; on the client side there is no other error either.
    Without MSSQ, the server section of the ubb file is:
    *SERVERS
    DEFAULT:
    CLOPT="-A "
    sinUpdate_server SRVGRP=GROUP11 SRVID=80 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinUpdate_server SRVGRP=GROUP12 SRVID=160 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinCount_server SRVGRP=GROUP11 SRVID=240 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinCount_server SRVGRP=GROUP12 SRVID=320 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinSelect_server SRVGRP=GROUP11 SRVID=360 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinSelect_server SRVGRP=GROUP12 SRVID=400 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinInsert_server SRVGRP=GROUP11 SRVID=520 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinInsert_server SRVGRP=GROUP12 SRVID=560 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinDelete_server SRVGRP=GROUP11 SRVID=600 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinDelete_server SRVGRP=GROUP12 SRVID=640 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinDdl_server SRVGRP=GROUP11 SRVID=700 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinDdl_server SRVGRP=GROUP12 SRVID=740 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    lockselect_server SRVGRP=GROUP11 SRVID=800 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    lockselect_server SRVGRP=GROUP12 SRVID=840 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    #mulup_server SRVGRP=GROUP11 SRVID=1 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    #mulup_server SRVGRP=GROUP12 SRVID=60 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinUpdate_server SRVGRP=GROUP13 SRVID=83 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinUpdate_server SRVGRP=GROUP14 SRVID=164 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinCount_server SRVGRP=GROUP13 SRVID=243 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinCount_server SRVGRP=GROUP14 SRVID=324 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinSelect_server SRVGRP=GROUP13 SRVID=363 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinSelect_server SRVGRP=GROUP14 SRVID=404 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinInsert_server SRVGRP=GROUP13 SRVID=523 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinInsert_server SRVGRP=GROUP14 SRVID=564 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinDelete_server SRVGRP=GROUP13 SRVID=603 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinDelete_server SRVGRP=GROUP14 SRVID=644 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinDdl_server SRVGRP=GROUP13 SRVID=703 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    sinDdl_server SRVGRP=GROUP14 SRVID=744 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    lockselect_server SRVGRP=GROUP13 SRVID=803 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    lockselect_server SRVGRP=GROUP14 SRVID=844 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    #mulup_server SRVGRP=GROUP13 SRVID=13 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    #mulup_server SRVGRP=GROUP14 SRVID=64 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
    WSL SRVGRP=GROUP11 SRVID=1000
    CLOPT="-A -- -n//120.3.8.237:7200 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
    WSL SRVGRP=GROUP12 SRVID=1001
    CLOPT="-A -- -n//120.3.8.238:7200 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
    WSL SRVGRP=GROUP13 SRVID=1003
    CLOPT="-A -- -n//120.3.8.237:7203 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
    WSL SRVGRP=GROUP14 SRVID=1004
    CLOPT="-A -- -n//120.3.8.238:7204 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
    With MSSQ, the server section of the ubb file is:
    *SERVERS
    DEFAULT:
    CLOPT="-A -p 1,60:1,30"
    sinUpdate_server SRVGRP=GROUP11 SRVID=80 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinUpdate11 REPLYQ=Y
    sinUpdate_server SRVGRP=GROUP12 SRVID=160 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinUpdate12 REPLYQ=Y
    sinCount_server SRVGRP=GROUP11 SRVID=240 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinCount11 REPLYQ=Y
    sinCount_server SRVGRP=GROUP12 SRVID=320 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinCount12 REPLYQ=Y
    sinSelect_server SRVGRP=GROUP11 SRVID=360 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinSelec11 REPLYQ=Y
    sinSelect_server SRVGRP=GROUP12 SRVID=400 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinSelect12 REPLYQ=Y
    sinInsert_server SRVGRP=GROUP11 SRVID=520 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinInsert11 REPLYQ=Y
    sinInsert_server SRVGRP=GROUP12 SRVID=560 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinInsert12 REPLYQ=Y
    sinDelete_server SRVGRP=GROUP11 SRVID=600 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDelete11 REPLYQ=Y
    sinDelete_server SRVGRP=GROUP12 SRVID=640 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDelete12 REPLYQ=Y
    sinDdl_server SRVGRP=GROUP11 SRVID=700 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDdl11 REPLYQ=Y
    sinDdl_server SRVGRP=GROUP12 SRVID=740 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDdl12 REPLYQ=Y
    lockselect_server SRVGRP=GROUP11 SRVID=800 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=lockselect11 REPLYQ=Y
    lockselect_server SRVGRP=GROUP12 SRVID=840 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=lockselect12 REPLYQ=Y
    #mulup_server SRVGRP=GROUP11 SRVID=1 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=mulup11 REPLYQ=Y
    #mulup_server SRVGRP=GROUP12 SRVID=60 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=mulup12 REPLYQ=Y
    sinUpdate_server SRVGRP=GROUP13 SRVID=83 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinUpdate13 REPLYQ=Y
    sinUpdate_server SRVGRP=GROUP14 SRVID=164 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinUpdate14 REPLYQ=Y
    sinCount_server SRVGRP=GROUP13 SRVID=243 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinCount13 REPLYQ=Y
    sinCount_server SRVGRP=GROUP14 SRVID=324 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinCount14 REPLYQ=Y
    sinSelect_server SRVGRP=GROUP13 SRVID=363 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinSelec13 REPLYQ=Y
    sinSelect_server SRVGRP=GROUP14 SRVID=404 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinSelect14 REPLYQ=Y
    sinInsert_server SRVGRP=GROUP13 SRVID=523 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinInsert13 REPLYQ=Y
    sinInsert_server SRVGRP=GROUP14 SRVID=564 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinInsert14 REPLYQ=Y
    sinDelete_server SRVGRP=GROUP13 SRVID=603 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDelete13 REPLYQ=Y
    sinDelete_server SRVGRP=GROUP14 SRVID=644 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDelete14 REPLYQ=Y
    sinDdl_server SRVGRP=GROUP13 SRVID=703 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDdl13 REPLYQ=Y
    sinDdl_server SRVGRP=GROUP14 SRVID=744 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDdl14 REPLYQ=Y
    lockselect_server SRVGRP=GROUP13 SRVID=803 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=lockselect13 REPLYQ=Y
    lockselect_server SRVGRP=GROUP14 SRVID=844 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=lockselect14 REPLYQ=Y
    #mulup_server SRVGRP=GROUP13 SRVID=13 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=mulup13 REPLYQ=Y
    #mulup_server SRVGRP=GROUP14 SRVID=64 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=mulup14 REPLYQ=Y
    WSL SRVGRP=GROUP11 SRVID=1000
    CLOPT="-A -- -n//120.3.8.237:7200 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
    WSL SRVGRP=GROUP12 SRVID=1001
    CLOPT="-A -- -n//120.3.8.238:7200 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
    WSL SRVGRP=GROUP13 SRVID=1003
    CLOPT="-A -- -n//120.3.8.237:7203 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
    WSL SRVGRP=GROUP14 SRVID=1004
    CLOPT="-A -- -n//120.3.8.238:7204 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
    Is there any error in the ubb file above, or is MSSQ being used incorrectly?
    Looking forward to your answer, thanks.
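
One way to sanity-check a `*SERVERS` section like the one above is a small script. The following is a minimal sketch, not a Tuxedo tool. It assumes that each entry reserves server IDs `SRVID` through `SRVID + MAX - 1` within its group, and that an `RQADDR` should be shared only by copies of the same executable in the same group; both assumptions should be checked against the UBBCONFIG reference for your Tuxedo release. The function names `parse_servers` and `check_servers` are hypothetical.

```python
import re
from collections import defaultdict


def parse_servers(ubb_text):
    """Collect non-comment lines that carry both SRVGRP and SRVID."""
    servers = []
    for line in ubb_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        params = dict(re.findall(r"(\w+)=(\S+)", line))
        if "SRVGRP" in params and "SRVID" in params:
            params["NAME"] = line.split()[0]
            servers.append(params)
    return servers


def check_servers(servers):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []

    # Assumption: IDs SRVID .. SRVID+MAX-1 are reserved per entry and
    # must not overlap with another entry in the same group.
    ranges = defaultdict(list)
    for s in servers:
        first = int(s["SRVID"])
        last = first + int(s.get("MAX", 1)) - 1
        for other_first, other_last, other_name in ranges[s["SRVGRP"]]:
            if first <= other_last and other_first <= last:
                problems.append(
                    f"{s['SRVGRP']}: {s['NAME']} ({first}-{last}) overlaps "
                    f"{other_name} ({other_first}-{other_last})"
                )
        ranges[s["SRVGRP"]].append((first, last, s["NAME"]))

    # Assumption: one RQADDR names one MSSQ set, i.e. one executable
    # in one group.
    owners = defaultdict(set)
    for s in servers:
        if "RQADDR" in s:
            owners[s["RQADDR"]].add((s["NAME"], s["SRVGRP"]))
    for rqaddr, who in owners.items():
        if len(who) > 1:
            problems.append(f"RQADDR {rqaddr} is shared by {sorted(who)}")

    return problems
```

Calling `check_servers(parse_servers(open("ubbconfig").read()))` (the file name is a placeholder) returns an empty list when neither condition is violated. It says nothing about whether MSSQ is the right choice for the workload.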

  • How to speed up my system- reboot from time machine or increase free space?

    I've just upgraded my system to 3GB of RAM, and some actions are still quite sluggish. I suspect this is because I have only 3GB free on my 120GB hard drive.
    Would restoring the system from a Time Machine backup speed up the computer, in that all the files should then be contiguous, or should I just aim to make more space on the hard drive?

    Ouch! You need to delete files - it's not a fragmentation issue, but a lack of available space for virtual memory and swap files. You should always have 10-15% of your capacity free (so, for you, that means keeping 11-16 GB free space on your drive).
    At this point, I'd recommend deleting files to free up space, then running Disk Utility > Verify Disk (and if errors are found, boot from the Install DVD and Repair Disk).
    To find your largest files, use something like Disk Inventory X:
    http://www.derlien.com
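
The free-space rule of thumb above can be checked with a few lines of Python. This is a minimal sketch: the 10% default mirrors the lower bound given in the reply, and the function name `free_space_report` is hypothetical.

```python
import shutil


def free_space_report(path="/", min_free_fraction=0.10):
    """Report free space on the volume holding `path`.

    `min_free_fraction` is the rule-of-thumb threshold (10% here).
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    free_fraction = usage.free / usage.total
    return {
        "total_gb": usage.total / 1e9,
        "free_gb": usage.free / 1e9,
        "free_fraction": free_fraction,
        "ok": free_fraction >= min_free_fraction,
    }
```

With 3 GB free on a 120 GB drive, the free fraction is 2.5%, so `ok` would be `False`.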

  • Can I install a new internal HDD and reboot from time machine?

    I'm away from home at the moment and my internal HDD has failed, so I am putting in a new HDD. I have the Time Machine backup on an external HDD but no OS X disks with me. Can I reboot the OS from my Time Machine backup?

    Allan Eckert wrote:
    The only time the system on the DVD matters, is when you install from the DVD.
    One exception:  If you do a full system restore from a Snow Leopard backup using a Leopard Install disk, your Mac either won't start up or will kernel panic.   See #E8 in Time Machine - Troubleshooting.

  • Linux RAC NODES Rebooting

    We have had a 2-node RAC cluster running GC for about 3 months. But lately (in the last 3 weeks) we have seen NODE2 reboot 5-6 times with CSSD errors:
    Oracle clsomon failed with fatal status 12
    Oracle CSSD failure 134.
    Oracle CRS failure. Rebooting for cluster integrity.
    This environment is RHEL4 U4 with all RAC components running 10.2.0.3. Has anyone encountered the same?
    thanks

    Chandra,
    We looked into ocssd.log and didn't find anything unusual.
    Below is the log from the failed node.
    [    CSSD]2008-xx-xx 11:49:49.172 [1199618400] >TRACE: clssgmClientConnectMsg: Connect from con(0x786ba0) proc(0x7ba7b0) pid() proto(10:2:1:1)
    [    CSSD]2008-xx-xx 11:50:11.573 [1199618400] >TRACE: clssgmClientConnectMsg: Connect from con(0x786ba0) proc(0x7ba7b0) pid() proto(10:2:1:1)
    [    CSSD]2008-xx-xx 11:50:44.376 [1199618400] >TRACE: clssgmClientConnectMsg: Connect from con(0x786ba0) proc(0x7ba7b0) pid() proto(10:2:1:1)
    [    CSSD]2008-xx-xx 11:51:44.652 [1199618400] >TRACE: clssgmClientConnectMsg: Connect from con(0x786ba0) proc(0x7ba7b0) pid() proto(10:2:1:1)
    [    CSSD]2008-xx-xx 11:52:44.921 [1199618400] >TRACE: clssgmClientConnectMsg: Connect from con(0x786ba0) proc(0x7adaf0) pid() proto(10:2:1:1)
    [    CSSD]2008-xx-xx 11:58:02.771 >USER: Oracle Database 10g CSS Release 10.2.0.3.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
    [  clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=xxxxxDBG_CSSD))
    [    CSSD]2008-xx-xx 11:58:02.771 >USER: CSS daemon log for node xxxxxx, number 2, in cluster xxxxxxx-crs
    [    CSSD]2008-xx-xx 11:58:02.801 [2538463008] >TRACE: clssscmain: local-only set to false
    [    CSSD]2008-xx-xx 11:58:02.844 [2538463008] >TRACE: clssnmReadNodeInfo: added node 1 (xxxx) to cluster
    [    CSSD]2008-xx-xx 11:58:02.853 [2538463008] >TRACE: clssnmReadNodeInfo: added node 2 (xxxx) to cluster
    [    CSSD]2008-xx-xx 11:58:02.862 [1115699552] >TRACE: clssnm_skgxnmon: skgxn init failed
    [    CSSD]2008-xx-xx 11:58:02.862 [2538463008] >TRACE: clssnm_skgxnonline: Using vacuous skgxn monitor
    Looking at the log on the failed node: per the OS log it failed at around 11:53. After the reboot it took back the resources, and it will run for a couple of days without any issue before the same thing happens again.
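
The silent window in the ocssd.log excerpt above (last entry at 11:52:44, restart banner at 11:58:02) can be located mechanically. The following is a minimal sketch that assumes all lines fall on the same day, since the dates in the excerpt are masked, and uses an arbitrary 120-second threshold. The function name `find_gaps` is hypothetical.

```python
import re
from datetime import timedelta

# Matches the time-of-day part of a stamp such as "11:49:49.172".
STAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2})\.(\d{3})")


def find_gaps(lines, threshold_s=120.0):
    """Return (previous_stamp, next_stamp, gap_seconds) for each gap
    between consecutive timestamped lines that exceeds threshold_s.
    Lines without a timestamp are skipped."""
    gaps = []
    prev = None
    for line in lines:
        match = STAMP.search(line)
        if not match:
            continue
        h, m, s, ms = map(int, match.groups())
        t = timedelta(hours=h, minutes=m, seconds=s, milliseconds=ms)
        if prev is not None:
            delta = (t - prev[0]).total_seconds()
            if delta > threshold_s:
                gaps.append((prev[1], match.group(0), delta))
        prev = (t, match.group(0))
    return gaps
```

A gap like this only narrows down when the node stopped logging; the cause still has to come from the OS logs or OSWatcher-style data covering that window.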

  • RAC node reboot

    Hi,
    May I ask how to prevent split-brain from happening on a healthy two-node RAC? I understand Oracle decides to restart one node based on the health of network messaging; on the other hand, I think there are bugs from 10g to 11g that evict a node due to IPC timeout.
    Thanks

    You should check the following to identify the actual issue:
    Refer to the database log and its associated trace files, then asm.log and its associated trace files, then drill down further into the ocssd, crsd and evmd log files.
    From the trace files you will get the reason for the node eviction, normally one of the following:
    Reason 0 = No reconfiguration
    Reason 1 = The Node Monitor generated the reconfiguration.
    Reason 2 = An instance death was detected.
    Reason 3 = Communications Failure
    Reason 4 = Reconfiguration after suspend
    Once you know the reason, then look for the cause and fix it. For troubleshooting and data gathering refer to metalink notes.
    Thanks.
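
When scanning many trace files, the reason codes listed above are easier to handle as a lookup table. This is a minimal sketch built only from the list in the reply; the names `RECONFIG_REASONS` and `describe_reason` are hypothetical, and the way the code appears in a given trace file is not assumed here.

```python
# Reason codes exactly as listed in the reply above.
RECONFIG_REASONS = {
    0: "No reconfiguration",
    1: "The Node Monitor generated the reconfiguration",
    2: "An instance death was detected",
    3: "Communications Failure",
    4: "Reconfiguration after suspend",
}


def describe_reason(code):
    """Translate a reconfiguration reason code into its description."""
    return RECONFIG_REASONS.get(code, f"Unknown reason code {code}")
```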

  • Solaris RAC nodes re-booting

    I have a pre-production 2-node cluster running on Solaris 10, Oracle 10.2.0.3 with the Oracle CRS, and using a NetApp filer as the shared storage.
    I also have a separate Solaris server running Grid Control 10.2.0.3, with the repository as one of the databases on the RAC (don't know if this is relevant to my problem).
    Periodically both RAC nodes reboot, with no trace of why (the GC server is fine). There is nothing logged in the Solaris logs (messages file), CRS logs, Oracle logs or the NetApp logs.
    All that is shown is the relevant service starting up following the shutdown.
    Has anyone any experience of this, or any thoughts on which component may cause such an issue?
    Thanks in advance
    Bob

    What type of Sun hardware are you using?
    Below is the Action Plan Oracle support sent me on my SR on this issue, not sure if any of this was provided to you or would be of help.
    ACTION PLAN
    ============
    1. There is nothing in the files at all that sheds any light on the issue.
    Again, 3 separate sets of clusters all losing all nodes at the same time is a very strange occurrence. Please be sure to have the admin look for
    anything in common with all clusters.
    2. Advise placing OSWatcher on the systems: Note.301137.1 Ext/Pub OS Watcher User Guide.
    If we should have another occurrence, we will want the OSWatcher logs from 1 hr before the issue through the issue.
    Also see if the Unix admin perhaps has any OS stats from this occurrence.
    3. Advise setting ntpd to run with the -x option. I do see that you are having negative time changes
    at times.
    -x will give us a gradual slew rather than an abrupt time change.
    4. Advise setting this when you can.
    Please do the following:
    set the diagwait parameter:
    crsctl set css diagwait N [-force]
    Where N is the number of seconds to wait for a filesystem sync to
    complete (after this wait the node will reboot regardless of whether the
    sync has completed). This change must be made with the clusterware
    down, which will require the '-force', or with the stack up on just 1
    node, after which the stack on that node must be restarted before the
    stack starts up on any of the other nodes.
    N should be set to 25 (25 seconds)
    5. Advise that you have pcw mlr#6 Patch 5980915 on the systems as well,
    but I do not believe that this was an Oracle bug; the reason for placing the patch on is the advanced diagnostics that are in that patchset.
    6. The two issues Sun is working on:
    Sun is working to resolve a time skew issue and a Solaris 10 kernel SIGALRM Sun#6292092 in addition to Sun#6595936.
    7. We do have a diagnostic oprocd that some sites have used, but on their test systems. It stops reboots and dumps information, but I have
    been hesitant to place it on production boxes. If you continue to have issues we may consider downloading the oprocd_skewfix_noreboot from
    Bug 6279879, but at this time I do not believe that is warranted.
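
To make item 3 of the action plan concrete, here is a rough sketch of the difference between stepping and slewing the clock. The constants are taken from the ntpd documentation as I recall it (a 128 ms default step threshold, raised to 600 s by `-x`, and a maximum slew rate of 500 ppm); treat them as assumptions and verify them against your ntpd version. The function names are hypothetical.

```python
# Assumed values from the ntpd documentation; verify for your version.
MAX_SLEW_PPM = 500.0               # maximum slew rate, parts per million
STEP_THRESHOLD_DEFAULT_S = 0.128   # default step threshold
STEP_THRESHOLD_WITH_X_S = 600.0    # step threshold when ntpd runs with -x


def will_step(offset_s, use_x):
    """True if an offset of this size would be corrected by an abrupt step."""
    threshold = STEP_THRESHOLD_WITH_X_S if use_x else STEP_THRESHOLD_DEFAULT_S
    return abs(offset_s) > threshold


def slew_duration_s(offset_s):
    """Seconds needed to absorb the offset at the maximum slew rate."""
    return abs(offset_s) * 1_000_000 / MAX_SLEW_PPM
```

Under these assumptions a 2-second offset is stepped without `-x` but slewed with it, and absorbing it takes about 4000 seconds. The cluster then never sees the clock jump, which is the point of the recommendation.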

  • SCAN LISTENER runs from only one node at a time from /etc/hosts !

    Dear all ,
    Recently I had to configure RAC on Oracle 11g (R2) on AIX 6.1. Since it is not possible to configure DNS at the moment, I did not put the SCAN IP into DNS/GNS; I just added the SCAN IP to the hosts file like this:
    cat /etc/hosts
    SCAN 172.17.0.22
    Got the info from : http://www.freeoraclehelp.com/2011/12/scan-setup-for-oracle-11g-release211gr2.html#ORACLE11GR2RACINS
    After completing all the RAC configuration steps, every service is OK except SCAN_LISTENER. This listener is up on only one node at a time. The first time I checked it from node1, it showed:
    srvctl status scan_listener
    SCAN listener LISTENER_SCAN1 is enabled
    SCAN listener LISTENER_SCAN1 is running on node dcdbsvr1
    Now when I relocate it from node 2 using
    "srvctl relocate scan -i 1 -n DCDBSVR2", the output shows:
    srvctl status scan_listener
    SCAN listener LISTENER_SCAN1 is enabled
    SCAN listener LISTENER_SCAN1 is running on node dcdbsvr2
    Besides this, when we try to relocate it from node2 in the following way, it shows the error:
    srvctl relocate scan -i 2 -n DCDBSVR2
    resource ora.scan2.vip does not exists
    Now my question: how can I run the SCAN and SCAN_LISTENER on both of the nodes?
    Here is my listener file (which is in the GRID home location) configuration :
    Listener File OF NODE1 AND NODE 2:
    ==================================
    ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER_SCAN1=ON
    ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER=ON
    LISTENER_SCAN1 =
    (DESCRIPTION =
    (ADDRESS = (PROTOCOL = IPC) (KEY = LISTENER_SCAN1)))
    ADR_BASE_LISTENER_SCAN1 = /U01/APP/ORACLE
    2)
    Another issue: when I run the command "ifconfig -a", it shows the SCAN IP on either node1 or node2. Suppose the SCAN IP is on node1; if I then run the "relocate" command from node2, the IP moves to node 2. Is this the correct behavior? Please advise.
    Thanks in advance.
    Edited by: shipon_97 on Jan 10, 2012 7:22 AM
    Edited by: shipon_97 on Jan 10, 2012 7:31 AM

    > After configuring all the steps of RAC, every service is OK except SCAN_LISTENER. This listener is up on only one node at a time. The first time I checked it from node1, it showed:
    If I am not wrong, and after looking at the document you sent, you will be able to use only one SCAN when you use the /etc/hosts file, and it will be up on only one node, the one where you added this SCAN entry in the /etc/hosts file.
    > Now my question: how can I run the SCAN and SCAN_LISTENER on both of the nodes?
    Probably you can't in your case. I think you can run only one, and on one node only.
    > srvctl status scan_listener
    > SCAN listener LISTENER_SCAN1 is enabled
    > SCAN listener LISTENER_SCAN1 is running on node dcdbsvr1
    > now when I relocate it from node 2 using
    > "srvctl relocate scan -i 1 -n DCDBSVR2" , then the output shows :
    > srvctl status scan_listener
    > SCAN listener LISTENER_SCAN1 is enabled
    > SCAN listener LISTENER_SCAN1 is running on node dcdbsvr2
    You moved the SCAN listener from node 1 to node 2. OK.
    > Besides this, when we try to relocate it from node2 in the following way, it shows the error:
    > srvctl relocate scan -i 2 -n DCDBSVR2
    > resource ora.scan2.vip does not exists
    Since you have only one SCAN, you can't relocate "2". So use "1" here as well.
    FYI
    http://www.oracle.com/technetwork/database/clustering/overview/scan-129069.pdf
    Salman
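
A quick way to see how many addresses a SCAN name resolves to is to ask the resolver directly. This is a minimal sketch using the standard library; the function name `resolve_addresses` is hypothetical. A name defined in /etc/hosts will typically come back with a single address, whereas a DNS-based SCAN is usually set up to return several.

```python
import socket


def resolve_addresses(name):
    """Return the sorted, de-duplicated addresses that `name` resolves to."""
    infos = socket.getaddrinfo(name, None)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})
```

For example, `resolve_addresses("SCAN")` on one of the cluster nodes (using the host name from the hosts entry in the question) shows whether more than one SCAN address is available for additional SCAN listeners.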

  • DBconsole is terminated with signal 99 from time to time in 10g RAC!!

    Hi, all.
    The database is 2 node RAC 10.2.0.2.0 on 32-bit Windows 2003 EE SP1.
    We are not using Grid Control. I installed Enterprise Manager by using
    DBCA.
    From time to time, Console and agent processes are restarted "automatically".
    --> emdb.nohup
    ----- Tue Oct 16 00:04:43 2007::Console Launched with PID 4384 at time Tue Oct 16 00:04:43 2007 -----
    ----- Mon Oct 22 23:08:50 2007::Process with pid 4384 not found. Checking for file D:\oracle\product\10.2.0\db_1\nmsp110_RAC1/sysman/log/exitStatus_dbconsole -----
    ----- Mon Oct 22 23:08:50 2007::Exitfile D:\oracle\product\10.2.0\db_1\nmsp110_RAC1/sysman/log/exitStatus_dbconsole not found. Signaling abnormal exit -----
    ----- Mon Oct 22 23:08:51 2007::DBConsole exited at Mon Oct 22 23:08:51 2007 with signal 99 -----
    ----- Mon Oct 22 23:08:51 2007::DBConsole has exited due to an internal error -----
    ----- Mon Oct 22 23:08:51 2007:: - checking for corefile at D:\oracle\product\10.2.0\db_1\nmsp110_RAC1/sysman/emd -----
    ----- Mon Oct 22 23:08:51 2007::Restarting DBConsole. -----
    launchprocess::Launching process : D:\oracle\product\10.2.0\db_1\jdk/bin/java -server -Xmx256M -XX:MaxPermSize=96m -XX:MinHeapFreeRatio=20 -XX:MaxHeapFreeRatio=40 -DORACLE_HOME=D:\oracle\product\10.2.0\db_1 -Doracle.home=D:\oracle\product\10.2.0\db_1/oc4j -Doracle.oc4j.localhome=D:\oracle\product\10.2.0\db_1\nmsp110_RAC1/sysman -DEMSTATE=D:\oracle\product\10.2.0\db_1\nmsp110_RAC1 -Doracle.j2ee.dont.use.memory.archive=true -Djava.protocol.handler.pkgs=HTTPClient -Doracle.security.jazn.config=D:\oracle\product\10.2.0\db_1/oc4j/j2ee/OC4J_DBConsole_nmsp110_RAC1/config/jazn.xml -Djava.security.policy=D:\oracle\product\10.2.0\db_1/oc4j/j2ee/OC4J_DBConsole_nmsp110_RAC1/config/java2.policy -Djava.security.properties=D:\oracle\product\10.2.0\db_1/oc4j/j2ee/home/config/jazn.security.props -DEMDROOT=D:\oracle\product\10.2.0\db_1\nmsp110_RAC1 -Dsysman.md5password=true -Drepapi.oracle.home=D:\oracle\product\10.2.0\db_1 -Ddisable.checkForUpdate=true -Djava.awt.headless=true -jar D:\oracle\product\10.2.0\db_1/oc4j/j2ee/home/oc4j.jar -config D:\oracle\product\10.2.0\db_1/oc4j/j2ee/OC4J_DBConsole_nmsp110_RAC1/config/server.xml
    launchprocess::Launched process successfuly. Process Id is : 4444
    ----- Mon Oct 22 23:08:51 2007::Console Launched with PID 4444 at time Mon Oct 22 23:08:51 2007 -----
    As a result, some enterprise manager sessions become pending with library cache
    lock or gc cr request wait event.
    How can I respond to this problem??
    Thanks and Regards.
    Message was edited by:
    user507290

    Dear DBMS_Direct.
    Thanks for your reply.
    There are errors in emagent.trc and emoms.trc.
    But I do not know what they mean.
    There are several log files in the sysman log directory.
    1. emoms.trc
    2. emoms.log
    3. emagent.trc
    4. emagent.log
    5. emdb.nohup
    6. emdctl.trc
    7. emagent_perl.trc
    What processes are writing log to these files and what do they mean?
    Thanks and Regards.

  • During upgrade to Lion on Macbook Pro the upgrade stops at OS utilites after rebooting asking to restore from time machine backup or install new copy

    My daughter is currently upgrading her MacBook Pro to OS X Lion. During the upgrade the system rebooted and then stopped at the OS X Utilities menu. I have installed this same upgrade on my MacBook Pro and the family iMac without issue. Did the upgrade encounter a problem? The only options are Restore from Time Machine Backup, Reinstall Mac OS X, Get Help, or Disk Utility.
    I'm not sure how recent the Time Machine backup is for her system, and she is concerned that she may lose a lot of updates she has made to iTunes and iPhoto.
    Please help.

    I have 8 GB of RAM, but would that even matter during install? Performance once installed and running, sure, but I question whether the installer would demand that much more, or why it would affect Mountain Lion when restoring from a Time Machine backup.

  • I have a late 2006 iMac that has just started giving me a message to reboot after a black screen comes down slowly from the top of the screen to the bottom.  I reloaded software and restored from Time Machine, it now happens frequently.  Any insight?

    I have a late 2006 iMac that has just started giving me a message to reboot after a black screen comes down slowly from the top of the screen to the bottom.  I reloaded software and restored from Time Machine, but it keeps happening, now several times a day.  Any insight as to the issue or a proposed solution?

    Unplug any peripherals you have except your keyboard, reboot, and check activity monitor for apps.  Keep running apps to a minimum to find the App causing the issue.

  • When my macbook goes to sleep from non activity when I return the wifi is not on.  When trying to log back onto my hub it says it is out of time and I have to reboot each time.  Most annoying, any solutions please?

    When my macbook goes to sleep from non activity when I return the wifi is not on.  When trying to log back onto my hub it says it is out of time and I have to reboot each time.  Most annoying, any solutions please?

    If you are not using data and the iPhone goes to sleep, it will automatically drop the wi-fi, and will not reconnect until you wake the phone up. That is the way it is designed. The only way around that is to turn off cellular data while at home and on wi-fi.
    Cannot answer the xbox question since this is an iPhone forum and I would have no idea.
