Master node down

Dear DBAs,
For maintenance reasons we had to shut down the 3 cluster servers (Windows 2003, 64-bit).
After starting up the master node (with the others still down), Windows gets stuck on "Applying profile"; however, if we disconnect this server from the network, it starts.
We noticed that after reconnecting the server to the network, it takes a long time to discover the storage disks.
We are planning to move to another master node.
Could you please send me a link with step-by-step instructions on how to move to a new master node?
The database is Oracle 10g, version 10.2.0.4.0.
OS: Windows 2003, 64-bit
Your quick reply is highly appreciated.
Regards
Elie

"we noticed that after reconnecting the server to the network, it takes long time to discover the storage disks."
I am not sure that moving to another master node would decrease that time. You need to check with your network and storage people why discovery is taking so long, since the same thing will happen with the new node as well.
"we are planning to move to another master node."
You would need to take a backup of this database and restore it on the new host; once that is done, you would add the other nodes to that node (a rough outline follows below). But as said above, first find out why you are experiencing the slow connectivity.
Aman....
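As a rough outline of the backup-and-restore approach mentioned above, here is a minimal RMAN sketch for 10g. It assumes controlfile autobackups are enabled, the same file paths exist on the new host, and the pfile/spfile and password file have already been copied over; the DBID below is a placeholder you would replace with the source database's DBID. This is only a sketch, not a complete runbook; check the RMAN documentation for restoring to a new host.
rman target /
RMAN> backup database plus archivelog;        # on the current master
RMAN> backup current controlfile;
# copy the backup pieces (plus pfile/spfile and password file) to the same locations on the new host, then on the new host:
rman target /
RMAN> set dbid 1234567890;                    # placeholder: use the DBID of the source database
RMAN> startup nomount;
RMAN> restore controlfile from autobackup;
RMAN> alter database mount;
RMAN> restore database;
RMAN> recover database;
RMAN> alter database open resetlogs;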

Similar Messages

  • Abrupt shutdown of master node causes problem

    A service in usmbcom1 (LMID) makes a tpacall to a service which is present in both usmbapp1 (master node LMID) and usmbapp2 (slave node LMID). (Here usmbcom1, usmbapp1 and usmbapp2 are LMIDs, whereas the corresponding physical machines' unames are usmbd5, usmbd3 and usmbd4; they are separate Sun boxes.) LDBAL = Y and tpacall is done at 25/sec.
    Now there are 2 scenarios.
    1. While tpacall is in progress we kill the servers in usmbapp1 using the kill command (not kill -9), then clean up the IPCs. Only a few (3-5 out of a total of 5000) messages were lost. This is understandable, since messages which were already in the queue got lost. The rest of the messages were processed by usmbapp2.
    2. In this case we switched off the Sun box usmbapp1 (machine name usmbd3) while tpacall was in progress. This time we lost approximately 50% of the messages. However, if we go to the slave machine, i.e. usmbapp2, and manually make it master (tmadmin ... master), then from that point on we stopped losing messages.
    Does that mean manual intervention is necessary if the DBBL goes down? Is there anything I am missing while configuring the system?

    Hi Scott,
    You did understand the scenario and the problems, and the answers are quite convincing.
    Actually, the QA team here is doing failover testing and they have both of these (kill and machine shutdown) as test cases.
    However, I would like to know what you meant by High Availability Solutions.
    Do you also mean that if I shut down my master machine, no event would be written in the ULOG of the slave that could be monitored and used to convert the slave into the master programmatically (I mean through tpadmcall)?
    Thanks
    Somashish
    Scott Orshan <[email protected]> wrote:
    Hi,
    I'm not sure if I completely comprehend your situation, but let me take a guess.
    When you killed the processes (including the Bridge), which by the way is a bad thing to do to TUXEDO, TCP notified the other connected nodes that the connection had dropped. This happens fairly quickly.
    But if you just turn off a machine, TCP may not detect it until it times out, which can take several minutes. Since TUXEDO was doing Round Robin load balancing, half the requests were sent to the Bridge, with a destination of the dead machine.
    To answer your final question, the DBBL has to be migrated manually, unless you are using one of our High Availability solutions that uses an external monitor.
    The reason is that it is very hard to distinguish between a network failure or slowdown, and a real failure of the Master node. And it would be very bad to have two machines in the domain acting as the Master.
         Scott Orshan
         BEA Systems

  • How do we know the Master Node in RAC ? ?

    Hi Experts,
    We have implemented a 2-node RAC 11g R2 on the Linux platform. My query is: how do we know which is the master node in RAC?
    Thanks
    Venkat

    Hi,
    There is no such thing as "master node" in RAC configurations. All nodes are equal.
    Sebastian wrote: The only thing closest to something like a "Master" is that only one node has the role to update the Oracle Cluster Registry (OCR), all other nodes only read the OCR.
    {message:id=9827969}
    and
    {message:id=2154683}
    Regards,
    Levi Pereira
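    If you still want to see which node has been playing that OCR-updating role, one indirect hint is which node has been taking the automatic OCR backups, since those are generally performed by the node currently acting as the OCR writer (this only reflects the current writer if the role has not moved since the last backup). A minimal check, run as root with the Grid Infrastructure bin directory in the PATH:
    ocrconfig -showbackup
    The node names listed next to the recent automatic backups are the hint; the crsd.log approach discussed in the "Identify the OCR master node for 11.2" thread further down is more direct.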

  • Master node

    Hi,
    I am reading 'Advanced RAC troubleshooting' written by Riyaj Shamsudeen and have some questions about the wait event 'gc current/cr grant 2-way'.
    It says:
    CR – disk read
    Select c1 from t1 where n1 = :b1;
    1. User process in instance 1 requests the master for a PR mode block.
    2. Assuming no current owner, the master node grants the lock to instance 1.
    3. User process in instance 1 reads the block from disk and holds PR.
    Why is the master node instance 2? I think that if there is no current owner, instance 1 should read the block from disk directly and become the master.
    If only one block is read from disk and the object that the block belongs to has never been read before, how is the master node determined? Is it the requesting node?
    Please help me.
    Thanks.

    Hi
    The master node is determined by the number of times a particular node accesses a particular block; the node that accesses the block most often becomes the master for that block, as I believe.
    The wait event "gc current" appears when the request is for the block in current mode, i.e. for updating the block.
    "cr grant 2-way" appears when the same block is modified on two nodes.
    Corrections are highly appreciated.
    Tinku
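    If you want to check empirically which instance is currently mastering a given object, here is a minimal sketch against the GCS views; SCOTT.T1 is just the example table implied by the book excerpt and is a placeholder, and you need access to the V$ views. Note that CURRENT_MASTER in V$GCSPFMASTER_INFO is 0-based (0 means instance 1), and only objects that have been through dynamic remastering necessarily show up there.
    sqlplus -s / as sysdba <<'SQL'
    select o.owner, o.object_name,
           m.current_master, m.previous_master, m.remaster_cnt
      from v$gcspfmaster_info m
      join dba_objects o on o.data_object_id = m.data_object_id
     where o.owner = 'SCOTT' and o.object_name = 'T1';
    exit
    SQL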

  • Master Node on RAC

    Hi,
    I would just like to know which node is the master in RAC. Is there any command for this?
    cheers
    fzheng

    One way to find that information is to look at the $ORA_CRS_HOME/log/`hostname`/cssd/ocssd.* log files; you would find something like this:
    ocssd.l05:[    CSSD]CLSS-3001: local node number 1, master node number 1
    ocssd.l05:[    CSSD]2007-02-08 13:58:44.508 [507920] >TRACE: clssgmEstablishMasterNode: MASTER for 21 is node(1) birth(6)
    ocssd.l05:[    CSSD]2007-02-08 13:58:44.508 [507920] >TRACE: clssgmMasterCMSync: Synchronizing group/lock status
    ocssd.l05:[    CSSD]2007-02-08 13:58:44.514 [507920] >TRACE: clssgmMasterSendDBDone: group/lock status synchronization complete
    ocssd.l05:[    CSSD]CLSS-3001: local node number 1, master node number 1
    ocssd.l05:[    CSSD]2007-02-08 14:01:46.236 [524304] >TRACE: clssgmEstablishMasterNode: MASTER for 22 is node(1) birth(6)
    ocssd.l05:[    CSSD]2007-02-08 14:01:46.236 [524304] >TRACE: clssgmMasterCMSync: Synchronizing group/lock status
    ocssd.l05:[    CSSD]2007-02-08 14:01:46.241 [524304] >TRACE: clssgmMasterSendDBDone: group/lock status synchronization complete
    But this information is not really that critical for ongoing operations or regular maintenance; it is just informational for all practical purposes.
    HTH
    Thanks
    Chandra Pabba
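    A minimal command-line version of the same check, assuming the default CRS log layout shown above (adjust $ORA_CRS_HOME and the hostname directory for your installation); the last matching lines reflect the most recent reconfiguration:
    grep "master node number" $ORA_CRS_HOME/log/`hostname`/cssd/ocssd.* | tail -3
    grep "clssgmEstablishMasterNode" $ORA_CRS_HOME/log/`hostname`/cssd/ocssd.* | tail -3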

  • VDI secondary data node down

    Dear all,
    I have a setup with one primary and two secondaries.
    The primary and one secondary (A) are running fine, but when I ran ./vda-db-status on the other secondary (B) it showed:
    data node down
    Secondary B had actually been shut down due to a power failure. When we restarted the server it gave us a boot archive error; fsck -F ufs /dev/rdisk/.... solved that problem.
    But after booting, it was still down in the MySQL Cluster status (this server is also a data node in the cluster).
    I checked svcs: svc:/application/database/vdadb:core is showing offline*.
    In /var/opt/SUNWvda/mysql-cluster/ndb_3.error.log
    I found the following:
    Current byte-offset of file-pointer is: 1566
    Time: Tuesday 8 June 2010 - 13:08:28
    Status: Ndbd file system error, restart node initial
    Message: File not found (Ndbd file system inconsistency error, please report a bug)
    Error: 2815
    Error data: DBLQH: File system open failed. OS errno: 2
    Error object: DBLQH (Line: 3083) 0x0000000a
    Program: /opt/SUNWvda/mysql/bin/ndbd
    Pid: 875
    Version: mysql-5.1.37 ndb-7.0.8a
    Trace: /var/opt/SUNWvda/mysql-cluster/ndb_3_trace.log.1
    ***EOM***
    Time: Tuesday 8 June 2010 - 13:32:00
    Status: Ndbd file system error, restart node initial
    Message: File not found (Ndbd file system inconsistency error, please report a bug)
    Error: 2815
    Error data: DBLQH: File system open failed. OS errno: 2
    Error object: DBLQH (Line: 3083) 0x0000000a
    Program: /opt/SUNWvda/mysql/bin/ndbd
    Pid: 5686
    Version: mysql-5.1.37 ndb-7.0.8a
    Trace: /var/opt/SUNWvda/mysql-cluster/ndb_3_trace.log.2
    ***EOM***
    Time: Tuesday 8 June 2010 - 13:42:26
    Status: Ndbd file system error, restart node initial
    Message: File not found (Ndbd file system inconsistency error, please report a bug)
    Error: 2815
    Error data: DBLQH: File system open failed. OS errno: 2
    Error object: DBLQH (Line: 3083) 0x0000000a
    Program: /opt/SUNWvda/mysql/bin/ndbd
    Pid: 764
    Version: mysql-5.1.37 ndb-7.0.8a
    Trace: /var/opt/SUNWvda/mysql-cluster/ndb_3_trace.log.3
    ***EOM***
    ______________________________________________________________________________________________________________

    Further, I looked at vdadb:core.log and found the following:
    .........................(logs omitted)
    [ Jun  8 13:08:16 Executing start method ("/opt/SUNWvda/lib/vda-db-service start") ]
    Configuration:
    MGMT_NODE=[0]; NDBD_NODE=[1]; SQL_NODE=[0]; MULTI_HOST_MODE=[1];
    NDBD_CONNECTSTRING=[mycompnay.com]; NDBD_INITIAL_ARG=[]; NDBD_NODE_ID=[3];
    MYSQL_BIN=[opt/SUNWvda/mysql/bin];
    Starting the Sun Virtual Desktop Infrastructure Database service:
    - Starting Data Node... 2010-06-08 13:08:18 [ndbd] INFO -- Configuration fetched from 'mycompnay.com:1186', generation: 1
    Arguments: [mycompnay.com  ]...
    Error
    [ Jun  8 13:14:46 Method "start" exited with status 95 ]
    [ Jun  8 13:31:35 Leaving maintenance because disable requested. ]
    [ Jun  8 13:31:35 Disabled. ]
    [ Jun  8 13:31:56 Enabled. ]
    [ Jun  8 13:31:56 Executing start method ("/opt/SUNWvda/lib/vda-db-service start") ]
    Configuration:
    MGMT_NODE=[0]; NDBD_NODE=[1]; SQL_NODE=[0]; MULTI_HOST_MODE=[1];
    NDBD_CONNECTSTRING=[mycompnay.com]; NDBD_INITIAL_ARG=[]; NDBD_NODE_ID=[3];
    MYSQL_BIN=[opt/SUNWvda/mysql/bin];
    Starting the Sun Virtual Desktop Infrastructure Database service:
    - Starting Data Node... 2010-06-08 13:31:57 [ndbd] INFO -- Configuration fetched from 'mycompnay.com:1186', generation: 1
    Arguments: [mycompnay.com  ]...
    Error
    [ Jun  8 13:38:27 Method "start" exited with status 95 ]
    [ Jun  8 13:38:27 Leaving maintenance because disable requested. ]
    [ Jun  8 13:38:27 Disabled. ]
    [ Jun  8 13:42:21 Executing start method ("/opt/SUNWvda/lib/vda-db-service start") ]
    Configuration:
    MGMT_NODE=[0]; NDBD_NODE=[1]; SQL_NODE=[0]; MULTI_HOST_MODE=[1];
    NDBD_CONNECTSTRING=[mycompnay.com]; NDBD_INITIAL_ARG=[]; NDBD_NODE_ID=[3];
    MYSQL_BIN=[opt/SUNWvda/mysql/bin];
    Starting the Sun Virtual Desktop Infrastructure Database service:
    - Starting Data Node... 2010-06-08 13:42:22 [ndbd] INFO -- Configuration fetched from 'mycompnay.com:1186', generation: 1
    Arguments: [mycompnay.com  ]...
    Error
    [ Jun  8 13:48:50 Method "start" exited with status 95 ]
    Any ideas?
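    The status line in the error log ("restart node initial") points at an initial restart of this data node so it can rebuild its local NDB file system from the surviving data node. A rough sketch with the plain MySQL Cluster tools, using node id 3 and the management host exactly as they appear in the logs above; whether going through SMF or the VDI tooling is the supported path for Sun VDI should be confirmed against the VDI documentation:
    svcadm disable svc:/application/database/vdadb:core      # stop the SMF service first (run as root)
    # --initial wipes this node's local NDB data; it is then rebuilt from the other data node
    /opt/SUNWvda/mysql/bin/ndbd --initial --ndb-connectstring=mycompnay.com:1186
    /opt/SUNWvda/mysql/bin/ndb_mgm -c mycompnay.com:1186 -e show   # wait until node 3 shows as started
    svcadm enable svc:/application/database/vdadb:core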

  • How to sort master-node in master-detail scenario without losing subnodes?

    Hi,
    I've a master-detail scenario and want to sort my master node.
    How can I sort the master node without losing the detail-subnodes?
    If I take a look at class CL_WDR_TABLE_METHOD_HNDL, method IF_WD_TABLE_METHOD_HNDL~APPLY_SORTING,
    sorting is done by:
    - unload node with context_node->get_static_attributes_table into an internal table
    - keeping node state like lead_selection(s) and attribute_properties
    - sort internal table
    - bind internal table to node
    - set lead_selection and properties
    But all subnodes are gone.
    How do you sort a master node?
    Thanks and Regards
    Carsten

    I think you have to write your own logic for that. Maybe you can implement IF_WD_TABLE_METHOD_HNDL in your class and extend the current logic to support subnodes.

  • TUXEDO11 in MP mode can't boot TMS_ORA on the non-master node

    I have Tuxedo 11 installed on an Ubuntu 9.10 server as the master node (SITE1) and on CentOS 6.2 as the non-master node (SITE2). The client program uses WSL to communicate with the servers. Tuxedo 11 has no patch, and both Tuxedo 11 and Oracle 10gR2 are 32-bit, running on 32-bit OSes.
    On both nodes a TMS_ORA associated with an Oracle 10gR2 database was installed. When I issue "tmboot -y", the servers on the master node boot normally; however, the TMS_ORA server and the server using TMS_ORA on SITE2 report "Assume started (pipe).". There is no core file for these servers on SITE2, and in the ULOG on SITE2 there is no Error or Warning concerning the failed start of TMS_ORA.
    In order to check that my servers and TMS_ORA work OK on SITE2, I used the master command under tmadmin to first swap the master and non-master node, and after the migration succeeded, on SITE2 I issued the "tmshutdown -cy" command and then "tmboot -y". Surprisingly, all the servers booted correctly on both nodes. Then I migrated the master node back to SITE1; the servers were still alive and my client program could successfully call them, which means the TMS_ORA and the servers using TMS_ORA on both nodes work fine.
    The problem is, when I "tmshutdown -s server" (those on SITE2, either TMS_ORA or a server using TMS_ORA) and then use "tmboot -s server" to boot them again, I get "Assume started (pipe)." reported and those server processes don't appear on SITE2.
    It seems that I can't boot TMS_ORA on SITE2 from the master node SITE1, but I can boot all the servers correctly if SITE2 is acting as the master node. Servers that don't use TMS_ORA on SITE2 can be booted successfully from SITE1.
    Can anybody figure out what's wrong? Thanks in advance.
    Best regards,
    Orlando
    Edited by: user10950876 on 2012-6-13 3:02 PM
    Edited by: user10950876 on 2012-6-13 3:33 PM

    Hi Todd,
    Thank you for your reply. Following is my ULOG and tmboot report:
    ubuntu9:~/tuxapp$tmboot -y
    Booting all admin and server processes in /home/xp/tuxapp/tuxconfig
    INFO: Oracle Tuxedo, Version 11.1.1.2.0, 32-bit, Patch Level (none)
    Booting admin processes ...
    exec DBBL -A :
    on SITE1 -> process id=8803 ... Started.
    exec BBL -A :
    on SITE1 -> process id=8804 ... Started.
    exec BBL -A :
    on SITE2 -> process id=3964 ... Started.
    Booting server processes ...
    exec TMS_ORA -A :
    on SITE1 -> process id=8812 ... Started.
    exec TMS_ORA -A :
    on SITE1 -> process id=8838 ... Started.
    exec TMS_ORA2 -A :
    on SITE2 -> CMDTUX_CAT:819: INFO: Process id=3967 Assume started (pipe).
    exec TMS_ORA2 -A :
    on SITE2 -> CMDTUX_CAT:819: INFO: Process id=3968 Assume started (pipe).
    exec WSL -A -- -n //128.0.88.24:5000 -m 3 -M 5 -x 5 :
    on SITE1 -> process id=8841 ... Started.
    8 processes started.
    ULOG on ubuntu9
    134547.ubuntu9!DBBL.8803.3071841984.0: 06-14-2012: client high water (0), total client (0)
    134547.ubuntu9!DBBL.8803.3071841984.0: 06-14-2012: Tuxedo Version 11.1.1.2.0, 32-bit
    134547.ubuntu9!DBBL.8803.3071841984.0: LIBTUX_CAT:262: INFO: Standard main starting
    134549.ubuntu9!DBBL.8803.3071841984.0: CMDTUX_CAT:4350: INFO: BBL started on SITE1 - Release 11112
    134550.ubuntu9!BBL.8804.3072861888.0: 06-14-2012: Tuxedo Version 11.1.1.2.0, 32-bit, Patch Level (none)
    134550.ubuntu9!BBL.8804.3072861888.0: LIBTUX_CAT:262: INFO: Standard main starting
    134550.ubuntu9!BRIDGE.8806.3072931520.0: 06-14-2012: Tuxedo Version 11.1.1.2.0, 32-bit
    134550.ubuntu9!BRIDGE.8806.3072931520.0: LIBTUX_CAT:262: INFO: Standard main starting
    134555.ubuntu9!DBBL.8803.3071841984.0: CMDTUX_CAT:4350: INFO: BBL started on SITE2 - Release 11112
    134556.ubuntu9!BRIDGE.8806.3072931520.0: CMDTUX_CAT:1371: INFO: Connection received from redhat62
    134557.ubuntu9!TMS_ORA.8812.3057321664.0: 06-14-2012: Tuxedo Version 11.1.1.2.0, 32-bit
    134557.ubuntu9!TMS_ORA.8812.3057321664.0: LIBTUX_CAT:262: INFO: Standard main starting
    134559.ubuntu9!TMS_ORA.8838.3056805568.0: 06-14-2012: Tuxedo Version 11.1.1.2.0, 32-bit
    134559.ubuntu9!TMS_ORA.8838.3056805568.0: LIBTUX_CAT:262: INFO: Standard main starting
    134559.ubuntu9!WSL.8841.3072153920.0: 06-14-2012: Tuxedo Version 11.1.1.2.0, 32-bit
    134559.ubuntu9!WSL.8841.3072153920.0: LIBTUX_CAT:262: INFO: Standard main starting
    134559.ubuntu9!WSH.8842.3072411328.0: 06-14-2012: Tuxedo Version 11.1.1.2.0, 32-bit
    134559.ubuntu9!WSH.8842.3072411328.0: WSNAT_CAT:1030: INFO: Work Station Handler joining application
    134559.ubuntu9!WSH.8843.3073169088.0: 06-14-2012: Tuxedo Version 11.1.1.2.0, 32-bit
    134559.ubuntu9!WSH.8843.3073169088.0: WSNAT_CAT:1030: INFO: Work Station Handler joining application
    134559.ubuntu9!WSH.8844.3073066688.0: 06-14-2012: Tuxedo Version 11.1.1.2.0, 32-bit
    134559.ubuntu9!WSH.8844.3073066688.0: WSNAT_CAT:1030: INFO: Work Station Handler joining application
    ULOG on redhat62
    134615.redhat62!tmloadcf.3961.3078567616.-2: 06-14-2012: client high water (0), total client (0)
    134615.redhat62!tmloadcf.3961.3078567616.-2: 06-14-2012: Tuxedo Version 11.1.1.2.0, 32-bit
    134615.redhat62!tmloadcf.3961.3078567616.-2: CMDTUX_CAT:872: INFO: TUXCONFIG file /home/tuxedo/tuxedo/simpapp/tuxconfig has been updated
    134617.redhat62!BSBRIDGE.3963.3078089312.0: 06-14-2012: Tuxedo Version 11.1.1.2.0, 32-bit
    134617.redhat62!BSBRIDGE.3963.3078089312.0: LIBTUX_CAT:262: INFO: Standard main starting
    134619.redhat62!BBL.3964.3079420512.0: 06-14-2012: Tuxedo Version 11.1.1.2.0, 32-bit, Patch Level (none)
    134619.redhat62!BBL.3964.3079420512.0: LIBTUX_CAT:262: INFO: Standard main starting
    134620.redhat62!BRIDGE.3965.3077868128.0: 06-14-2012: Tuxedo Version 11.1.1.2.0, 32-bit
    134620.redhat62!BRIDGE.3965.3077868128.0: LIBTUX_CAT:262: INFO: Standard main starting
    134620.redhat62!BRIDGE.3965.3077868128.0: CMDTUX_CAT:4488: INFO: Connecting to ubuntu9 at //128.0.88.24:1800
    ubb file content (just in case you want to see it too; I've commented out all the services in the ubb file except TMS_ORA2 on SITE2, to make it more distinct):
    *RESOURCES
    IPCKEY 123456
    DOMAINID TUXTEST
    MASTER SITE1, SITE2
    MAXACCESSERS 50
    MAXSERVERS 35
    MAXCONV 10
    MAXGTT 20
    MAXSERVICES 70
    OPTIONS LAN, MIGRATE
    MODEL MP
    LDBAL Y
    *MACHINES
    DEFAULT: MAXWSCLIENTS=30
    ubuntu9 LMID=SITE1
    APPDIR="/home/xp/tuxapp"
    TUXCONFIG="/home/xp/tuxapp/tuxconfig"
    TUXDIR="/home/xp/tuxedo11gR1"
    TLOGDEVICE="/home/xp/tuxapp/TLOG"
    TLOGNAME="TLOG"
    TLOGSIZE=100
    TYPE=Linux
    ULOGPFX="/home/xp/tuxapp/ULOG"
    ENVFILE="/home/xp/tuxapp/ENVFILE"
    UID=1000
    GID=1000
    redhat62 LMID=SITE2
    TUXDIR="/usr/oracle/tuxedo11gR1"
    APPDIR="/home/tuxedo/tuxedo/simpapp"
    TLOGDEVICE="/home/tuxedo/tuxedo/simpapp/TLOG"
    TLOGNAME="TLOG"
    TUXCONFIG="/home/tuxedo/tuxedo/simpapp/tuxconfig"
    TYPE=Linux
    ULOGPFX="/home/tuxedo/tuxedo/simpapp/ULOG"
    ENVFILE="/home/tuxedo/tuxedo/simpapp/ENVFILE"
    UID=501
    GID=501
    *GROUPS
    BANK1
    LMID=SITE1 GRPNO=1 TMSNAME=TMS_ORA TMSCOUNT=2
    OPENINFO="Oracle_XA:Oracle_XA+Acc=P/scott/tiger+SesTm=120+MaxCur=5+LogDir=.+SqlNet=xpdev"
    CLOSEINFO="NONE"
    BANK2
    LMID=SITE2 GRPNO=2 TMSNAME=TMS_ORA2 TMSCOUNT=2
    OPENINFO="Oracle_XA:Oracle_XA+Acc=P/scott/scott+SesTm=120+MaxCur=5+LogDir=.+SqlNet=tuxdev"
    CLOSEINFO="NONE"
    WSGRP
    LMID=SITE1 GRPNO=3
    OPENINFO=NONE
    *NETGROUPS
    DEFAULTNET NETGRPNO=0 NETPRIO=100
    SITE1_SITE2 NETGRPNO=1 NETPRIO=200
    *NETWORK
    SITE1 NETGROUP=DEFAULTNET
    NADDR="//128.0.88.24:1800"
    NLSADDR="//128.0.88.24:1500"
    SITE2 NETGROUP=DEFAULTNET
    NADDR="//128.0.88.215:1800"
    NLSADDR="//128.0.88.215:1500"
    *SERVERS
    DEFAULT:
    CLOPT="-A"
    #XFER SRVGRP=BANK1 SRVID=1
    #TLR_ORA SRVGRP=BANK1 SRVID=2
    #TLR_ORA2 SRVGRP=BANK2 SRVID=3
    WSL SRVGRP=WSGRP SRVID=4
    CLOPT="-A -- -n //128.0.88.24:5000 -m 3 -M 5 -x 5"
    *SERVICES
    #INQUIRY
    #WITHDRAW
    #DEPOSIT
    #XFER_NOXA
    #XFER_XA
    Edited by: user10950876 on 2012-6-13 10:58 PM
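    One way to see from the running domain whether the SITE2 servers ever attached to the bulletin board is tmadmin's printserver (psr) command, restricted to one machine with -m; a minimal sketch, run with TUXCONFIG pointing at the configuration above:
    tmadmin
    > psr -m SITE2
    > psr -m SITE1
    > quit
    If TMS_ORA2 really booted, it should be listed for SITE2 with a process id, rather than only the "Assume started (pipe)" message from tmboot.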

  • Identify the OCR master node for 11.2

    My customer is on an 11.1.0.6 RAC DB with 11.2 CRS+ASM and is interested in finding the "OCR master node" at any point in time.
    I noticed that one way to identify the OCR master node is to search the
    $ORA_CRS_HOME/log/hostname/crsd/crsd.log
    file for the line "I AM THE NEW OCR MASTER" or "MASTER" with the most recent timestamp. Is this applicable to the 11.2 release?
    And what are the other alternative ways to identify the master node?
    Thanks in advance.

    Hi,
    As mentioned before, you can use the RAC FAQ Oracle Support note to determine the masters in a RAC system, except that this note does not elaborate on the OCR master you are asking about (called the OCR Writer in the documentation these days).
    However, your approach of checking $ORA_CRS_HOME/log/hostname/crsd/crsd.log works, and the message is pretty much the same in 11.2 as it was pre-11.2. Note, though, that checking only crsd.log may not always tell you the OCR master. Reason: crsd.log is used in a rolling fashion. Once the log entries reach approximately 50 MB, the file is rolled over to crsd.l01 or similar and a fresh crsd.log is started; 10 archived logs are maintained.
    For an average cluster this will last for a while, but in general there might come a time when all these logs have been rolled over while the OCR master has never changed. In that case you cannot use the logs anymore. Luckily, you should not have to find the OCR master all the time. Why are you interested in knowing which node the OCR master resides on at all times?
    If you do need to, you should therefore cat all crsd.l* files under the respective directory on all nodes to determine this (a sketch follows below). But again, that should not be necessary.
    Hope that helps. Thanks,
    Markus
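    A minimal sketch of that check across the current and rolled-over logs, run on each node (adjust $ORA_CRS_HOME for your Grid Infrastructure home); the "OCR MASTER" string matches the "I AM THE NEW OCR MASTER" line mentioned above:
    grep -i "OCR MASTER" $ORA_CRS_HOME/log/`hostname`/crsd/crsd.log $ORA_CRS_HOME/log/`hostname`/crsd/crsd.l* 2>/dev/null | tail -3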

  • Sun Cluster 3.2 - zpools - master node

    How do we determine who is mastering the zpool from the clustering software? With SVM, we can determine who is the master node of the diskset. Thanks in advance.
    Ryan

    Why do you need to? I'm pretty sure the HAStoragePlus resource will validate the zpool, then once it is under its control you don't need to worry about which node masters it. It will be governed by which node has the HASP resource online. If it isn't online, then the zpool is deported and not owned at all.
    I would guess that zpool import will give you some idea of whether a pool is mastered. If it errors, it's either owned or wasn't deported properly.
    Regards,
    Tim
    ---
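    A minimal sketch of those two checks, assuming a resource group named data-rg containing the HAStoragePlus resource and a pool named datapool (both names are hypothetical):
    clresourcegroup status data-rg     # the node where the group is Online is the one that has imported the pool
    clresource status                  # per-resource view of the HAStoragePlus resource state on each node
    zpool status datapool              # on the owning node the pool shows as ONLINE
    zpool import                       # on a non-owning node this lists the pool only if it was exported cleanly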

  • 8130: CREATE ACTIVE STANDBY PAIR must only be run on one of the MASTER node

    CkptFrequency=600
    CkptLogVolume=128
    OracleNetServiceName=abmsrv1
    PassThrough=1
    Please give me some help. Thanks.
    Post edited by user11036969

    Here's the original post (the forum seems to have truncated it for some reason):
    Content of the new Post:
    Command> CREATE ACTIVE STANDBY PAIR abmmd ON "node1",abmmd ON "node2"
    > RETURN RECEIPT
    > STORE abmmd ON "node1" PORT 21000 TIMEOUT 30
    > STORE abmmd ON "node2" PORT 20000 TIMEOUT 30;
    8130: CREATE ACTIVE STANDBY PAIR must only be run on one of the MASTER nodes.
    [ABMMD]
    Driver=/abm/tt02/tt/TimesTen/tt1121/lib/libtten.so
    DataStore=/abm/tt02/tt/tt11g/data/abm
    LogDir=/abm/tt02/tt/tt11g/data/logs
    SMPOptLevel=1
    TypeMode =0
    DurableCommits=0
    ExclAccess=0
    Connections=1000
    Isolation=1
    LockLevel=0
    PermSize=50000
    TempSize=1000
    ThreadSafe=1
    WaitForConnect=0
    Logging=1
    LogFileSize=256
    LogPurge=1
    CkptFrequency=600
    CkptLogVolume=128
    OracleNetServiceName=abmsrv1
    PassThrough=1
    Please give me some help. Thanks.
    This error means that when TimesTen processes this statement and asks the operating system for the official hostname of the local node, the OS returns something different from 'node1' or 'node2'.
    It may be that you have incorrectly set the hostname to include a DNS domain (e.g. node1.xxx.yy.zzz).
    Chris
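    A quick way to see what the operating system will hand back to TimesTen on each machine (this value is what gets compared against the names in the CREATE ACTIVE STANDBY PAIR statement):
    hostname        # should print exactly node1 (or node2), with no DNS domain appended
    uname -n        # the kernel's nodename, usually the source of the "official" host name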

  • BDB Native Version 5.0.21 - asynchronous write at the master node

    Hi There,
    As part of performance tuning, we are thinking of introducing asynchronous writes at the master node in replication code that uses BDB Native Edition (11g).
    Are there any known issues with asynchronous writes at the master node? We'd like to confirm with Oracle before we promote this to production.
    For asynchronous writes at the master node we have configured a TaskExecutor with the following configuration:
    <bean id="MasterAsynchronousWriteTaskExecutor" class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
    <property name="corePoolSize" value="3"/>
    <property name="maxPoolSize" value="10"/>
    <property name="daemon" value="true"/>
    <property name="queueCapacity" value="200000"/>
    <property name="threadNamePrefix" value="Master_Entity_Writer_Thread"/>
    <property name="threadGroupName" value="BDBMasterWriterThreads"/>
    </bean>
    Local tests showed no issues. Please let us know at your earliest convenience whether any changes are required to the corePoolSize, "maxPoolSize" and "queueCapacity" values as a result of the asynchronous writes.
    To summarize, 2 questions:
    1) Are there any known issues with the asynchronous write at the master node for BDB Native, version 5.0.21?
    2) If there are no issues, are any changes required to corePoolSize, “maxPoolSize” and “queueCapacity” values as a result of asynchronous write, and based on the configuration above?
    Thank you!

    Hello,
    If you have not already, please take a look at the documentation on "Database and log file archival" at:
    http://download.oracle.com/docs/cd/E17076_02/html/programmer_reference/transapp_archival.html
    Are you periodically creating backups of your database files? These snapshots are either a standard backup, which creates a consistent picture of the databases as of a single instant in time, or an on-line backup (also known as a hot backup), which creates a consistent picture of the databases as of an unspecified instant during the period of time when the snapshot was made. After backing up the database files you should periodically archive the log files being created in the environment. And I believe the question here is how often the periodic archive should take place to establish the best protocol for catastrophic recovery in the case of a failure like the physical hardware being destroyed, etc.
    As the documentation describes, it is often helpful to think of database archival in terms of full and incremental filesystem backups. A snapshot is a full backup, whereas the periodic archival of the current log files is an incremental backup. For example, it might be reasonable to take a full snapshot of a database environment weekly or monthly, and archive additional log files daily. Using both the snapshot and the log files, a catastrophic crash at any time can be recovered to the time of the most recent log archival, a time long after the original snapshot.
    What other details can you provide about how much activity there is on your system with regard to log file creation, how often a full backup is being taken, etc.?
    Thanks,
    Sandra
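    A minimal sketch of the snapshot-plus-log-archival routine described above, using the standard BDB utilities; the environment home /data/bdb-env and the backup target under /backups are placeholders:
    db_hotbackup -h /data/bdb-env -b /backups/bdb-full-$(date +%Y%m%d)   # full snapshot (hot backup)
    db_archive -h /data/bdb-env -l        # list all log file names in the environment
    db_archive -h /data/bdb-env           # list log files no longer needed for normal recovery, i.e. safe to archive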

  • SDL Link Out of Service - Node down

    We keep seeing these RTMT alerts and then receive "SDL Link out of service" or "server node down". I have been working with TAC, but they have not been able to pinpoint whether it is a CUCM issue or a network issue. I had the network team check the CPU usage during this time and nothing major was happening; same thing on the CUCM side.
     At Tue Oct 28 14:56:48 PDT 2014 on node  the following SyslogSeverityMatchFound events generated: 
    SeverityMatch : Alert
    MatchedEvent : Oct 28 14:56:23 PUB local7 1 : 18: PUB: Oct 28 2014 21:56:23.242 UTC :  %UC_Location Bandwidth Manager-1-LBMLinkOOS: %[LocalNodeId=1][LocalApplicationID=700][RemoteIPAddress=][RemoteApplicationID=700][LinkID=1:700:SUB02:700][AppID=Cisco Location Bandwidth Manager][ClusterID=EvergreenHospital][NodeID=PUB]: LBM link to remote application is out of service AppID : Cisco Syslog Agent ClusterID
    Has anyone experienced this issue? We recently upgraded to 9.x.

    We are experiencing the exact same thing. There doesn't seem to be anything out of the ordinary in the logs, but these errors randomly kick off, as far as we can tell, for no reason.
    I opened a TAC case and they related it to CPU load on our Publisher. I bumped it as recommended and the error cleared for a few days, but it just came back today.
    Does anyone have any insight? As I said, it doesn't seem to be service impacting, so is it just another 'Cisco Feature'?

  • Error "Comatose" when leave the master node.

    Hi All,
    I'm trying to configure a cluster on OES11 and I have many problems with it. This is one of them:
    OES5:~ # cluster status
    Master_IP_Address_Resource Running OES1 3
    DATAP_SERVER Comatose OES5 6
    The "Comatose" status shows for the data resource when I try to migrate DATAP_SERVER to the second node (OES5) with the command "cluster migrate DATAP_SERVER OES5".
    The same problem occurs when I run "cluster leave" on the master node: the Master_IP_Address_Resource comes up on the second node, but DATAP_SERVER does not.
    My system:
    Openfire(10.10.5.56) : ISCSI target with: 1GB for SBD, 19GB for DATA
    OES1(10.10.5.155): Master Node, eDirectory
    IP Cluster: 10.10.5.44
    Data Resource: 10.10.5.43
    Mounted ISCSI initiator SBD and DATA From Openfire
    OES5(10.10.5.123): Second Node.
    Mounted ISCSI initiator SBD and DATA From Openfire
    I saw this in the log file on the second node, OES5:
    Apr 26 23:31:15 oes1 ncs-resourced: DATAP_SERVER.load: Error opening NSS management file (No such file or directory) on server at /sbin/nss line 49.
    It seems to be because the "_admin" folder doesn't exist on the second node.
    I wonder how I can create an _admin folder on the second node; it's not a normal folder.
    Do you have any idea?
    Thanks for reading.
    ndhuynh

    This is an NSS problem (likely caused by eDir not running correctly at the time the server started up).
    The easiest way to fix this is to reboot the node and check NSS with the command "nss /pools". If that command fails, you can further check eDir status with the command "ndsstat".
    If the reboot comes back good, you don't need to do anything. If it doesn't, please contact NTS.
    Regards,
    Changju

  • Changing Cluster Master node

    Hello,
    I have a two-node RAC setup. I just need to change the master node to a different node in the cluster. How do I change the master node?
    Please help me out.

    Hi,
    A master node does not exist in RAC, and I am pretty sure you are not talking about the "OCR Master"; you are talking about the RESOURCE MASTER.
    The only thing closest to something like a "master" is that only one node has the role of updating the Oracle Cluster Registry (OCR); all other nodes only read the OCR.
    However, this is just a role, which can easily switch between the nodes (but is normally fixed as long as the responsible node lives).
    This node is then called the OCR master.
    How do you change the node which holds the "OCR Master" role?
    You can't decide that, and you can't change it manually; the Clusterware does it automatically without human intervention.
    The OCR master is not a MASTER NODE, it's only a role.
    Good answers here:
    Re: Identify the OCR master node for 11.2
    Re: which node will become master
    Please don't confuse the concept of the OCR MASTER with the RESOURCE MASTER (e.g. of a data block).
    All nodes hold resource masterships, which is why I said all nodes are equal.
    With "billions" of data blocks spread out across the memory of the cluster (the instances' SGAs), one node maintains extensive information (i.e. locks, versions, etc.) about a particular resource (i.e. a data block).
    So, one node is the resource master of data block "data_01", another node is the resource master of data block "data_02", and so on. If a node which holds a resource mastership fails (is shut down), GCS chooses another node in the cluster to be the master of that particular resource.
    http://www.oracleracsig.org/pls/apex/RAC_SIG.download_my_file?p_file=1003567
    RAC object remastering ( Dynamic remastering )  Oracle database internals by Riyaj
    Message was edited by: Levi-Pereira
