Converting 8 node to 4 node caused CLUSTER waits

Hi,
We had an 8-node 10.2.0.3 cluster which was working fine. For various reasons, we reinstalled the cluster with 4 nodes (fresh OS installs, but everything else was the same) and mounted the previous ASM diskgroups in the new cluster. Everything seems to be working fine, except that we have lots of waits in the Cluster wait class, mostly gc cr block busy. Does anyone have any idea about that?
thanks

AWR shows a few SQL statements causing the cluster waits, but these are not new queries. On the 8-node cluster we ran the same SQL queries, but we rarely had CLUSTER waits.

I'm not sure about the effect of going from 8 nodes to 4 nodes, but I think you need to investigate and tune those queries first.
Check the Top 5 Timed Events, wait events, objects, indexes, gathered statistics, block size, and so on.
Good Luck
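
As a rough sketch of the "check Top 5 Timed Events" advice: assuming the per-event wait times have been copied out of an AWR report into (event, wait class, seconds) rows, something like the following ranks the events and shows what share of the waited time falls in the Cluster class. The function name and the sample numbers are hypothetical placeholders, not measurements from this system.

```python
from collections import defaultdict

def summarize_waits(rows, top_n=5):
    """rows: iterable of (event_name, wait_class, seconds_waited)."""
    rows = list(rows)
    total = sum(seconds for _, _, seconds in rows)
    by_class = defaultdict(float)
    for _, wait_class, seconds in rows:
        by_class[wait_class] += seconds
    # Top N events by time waited, largest first
    top = sorted(rows, key=lambda r: r[2], reverse=True)[:top_n]
    # Fraction of total wait time per wait class
    share = {c: (s / total if total else 0.0) for c, s in by_class.items()}
    return top, share

# Hypothetical placeholder numbers, for illustration only
sample = [
    ("gc cr block busy", "Cluster", 600.0),
    ("db file sequential read", "User I/O", 300.0),
    ("log file sync", "Commit", 100.0),
]
top_events, class_share = summarize_waits(sample)
print(top_events[0][0], round(class_share["Cluster"], 2))
```

Comparing this summary between a snapshot from the old cluster (if one survives) and the new one would show whether the same statements simply moved into the Cluster class.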

Similar Messages

  • Do I use same oracle account on 2 cluster nodes cause problem?

    If I use the same oracle account on 2 cluster nodes running 2 databases, then when a failover happens both databases will be running on one node. Will running both under the same oracle account cause an SHM ... memory conflict?
    Or do I have to use an oracle01 account on node1 and an oracle02 account on node2? Is it not possible to use the same account name?
    Thanks.

    I'm not 100% certain I understood the question, so I'll rephrase them and answer them.
    Q. If I have the same Oracle account on each cluster node, e.g. uid=100 (oracle) gid=100 (oinstall), groups dba=200, can I run two databases, one on each cluster node without problems?
    A. Yes. Having multiple DBs on one node is not a problem and doesn't cause shared memory problems. Obviously each database needs a different database name and thus different SID.
    Q. Can I have two different Oracle accounts on each cluster node e.g. uid=100 (oraclea) gid=100 (oinstall), groups dba=200 and e.g. uid=300 (oracleb) gid=100 (oinstall), groups dba=200, and run two databases, one for each Oracle user?
    A. Yes. The different Oracle user names would need to be associated with different Oracle installations, i.e. Oracle HOMEs. So you might have /oracle/oracle/product/10.2.0/db_1 (oraclea) and /oracle/oracle/product/11.0.1.0/db_1 (oracleb). The ORACLE_HOME is then used to determine the Oracle user name by checking the user of the Oracle binary in the ${ORACLE_HOME}/bin directory.
    Tim
    ---

  • Failed to add node to cluster

    Hey, I am currently migrating my cluster.
    I removed the server pool master according to the metalink note by doing a failover (stopped the agent on the server pool master)
    Deleted the old master (node2) from the server pool.
    Executed the cleanup script on node2 and switched it off
    Modified the cluster.conf on the remaining node and removed the entries for the old master node2.
    Replaced the old server with new hardware (same name, same IP).
    Now I try to add this server to the server pool, but I get a timeout message:
    OVM-1006 Register Oracle VM Server (node2) Failed: errcode=00001, errmsg=CDS accquire lock /etc/ovs-agent/db/srv.lock timeout. locker process is 8339
    Where can I look ?
    Christian

    Lemeunier wrote:
    > environment: sles 10 sp3, oes2, cluster services
    >
    > problem: reconfiguring oes to add a node to the cluster is causing the
    > error *failed to add node to cluster*
    >
    > history: I installed a 4 node cluster in a HP C7000 blade. We had to
    > replace the network switch in the blade center by a virtual connect
    > flex-10. This resulted in a loss of network connectivity, so I removed 3
    > of 4 nodes from cluster and eDirectory.
    > This worked fine, replication and time synchronisation were successful
    > and all server objects belonging to these 3 servers were deleted.
    >
    > Now the new switch has been configured and network connection
    > reestablished. Reconfiguring eDirectory and other oes2 services
    > succeeds, all server objects are recreated, eDirectory is in sync, but
    > reconfiguring cluster services does not succeed.
    >
    > What do I have to do, to reconfigure cluster service and add nodes to
    > the cluster?
    >
    > Thank you for all hints.
    >
    > Ursula
    >
    >
    Did you remove the cluster rpms and then reinstall them? I would
    recommend following TID 3131978 to see if that helps.

  • How to Delete the node from cluster when the machine crashed?

    In a three-node RAC on 11g R2, how do I delete a node from the cluster when the machine has crashed?
    There is no way to repair the machine, so we have to add a new one.
    What steps should I follow?

    hi
    If you want to delete the RAC1 node:
    Check the current node list with $ ./olsnodes
    1) Delete the instance using dbca from any active node
    crs_stat -t
    srvctl stop asm -n rac1
    2) Delete the listener
    3) Delete the Oracle home from the oracle user
    $ORACLE_HOME/bin/runInstaller -updatenodelist ORACLE_HOME=<db_home> "CLUSTER_NODES={RAC1}"
    4) Delete the ASM home
    $ORACLE_HOME/bin/runInstaller -updatenodelist ORACLE_HOME=<asm_home> "CLUSTER_NODES={RAC1}"
    5) Update the cluster node list
    $ORACLE_HOME/bin/runInstaller -updatenodelist ORACLE_HOME=<db_home> "CLUSTER_NODES={active nodes like rac2,rac3}"
    6) Update the ASM home
    $ORACLE_HOME/bin/runInstaller -updatenodelist ORACLE_HOME=<asm_home> "CLUSTER_NODES={active nodes like rac2,rac3}"
    7) Remove the ONS configuration for the deleted node
    cd $ORA_CRS_HOME
    cd crs/opmn/conf
    Check the remote port in ons.config:
    $ cat ons.config
    remoteport=6200
    cd crs_home/bin
    $ ./racgons remove_config rac1:6200
    8) Go to the CRS home and run
    $ORA_CRS_HOME/crs/install/rootdelete.sh
    $ORA_CRS_HOME/crs/install/rootdeletenode.sh
    Check the result with ./olsnodes

  • Multiple databases/instances on 4-node RAC Cluster including Physical Stand

    OS: Windows 2003 Server R2 X64
    DB: 10.2.0.4
    Virtualization: NONE
    Node Configuration: x64 architecture - 4-Socket Quad-Core (16 CPUs)
    Node Memory: 128GB RAM
    We are planning the following on the above-mentioned 4-node RAC cluster:
    Node 1: DB1 with instanceDB11 (Active-Active: Load-balancing & Failover)
    Node 2: DB1 with instanceDB12 (Active-Active: Load-balancing & Failover)
    Node 3: DB1 with instanceDB13 (Active-Passive: Failover only) + DB2 with instanceDB21 (Active-Active: Load-balancing & Failover) + DB3 with instanceDB31 (Active-Active: Load-balancing & Failover) + DB4 with instance41 (Active-Active: Load-balancing & Failover)
    Node 4: DB1 with instanceDB14 (Active-Passive: Failover only) + DB2 with instanceDB22 (Active-Active: Load-balancing & Failover) + DB3 with instanceDB32 (Active-Active: Load-balancing & Failover) + DB4 with instance42 (Active-Active: Load-balancing & Failover)
    Note: DB1 will be the physical primary PROD OLTP database and will be open in READ-WRITE mode 24x7x365.
    Note: DB2 will be a Physical Standby of DB1 and will be open in Read-Only mode for reporting purposes during the day-time, except for 3 hours at night when it will apply the logs.
    Note: DB3 will be a Physical Standby of a remote database DB4 (not part of this cluster) and will be mounted in Managed Recovery mode for automatic failover/switchover purposes.
    Note: DB4 will be the physical primary Data Warehouse DB.
    Note: Going to 11g is NOT an option.
    Note: Data Guard broker will be used across the board.
    Please answer/advise of the following:
    1. Is the above configuration supported and why so? If not, what are the alternatives?
    2. Is the above configuration recommended and why so? If not, what are the recommended alternatives?

    Hi,
    As far as I understand, there's nothing wrong with the configuration, but you need to consider the points below when finalizing the design.
    1. Number of CPUs on each server
    2. Memory on each server
    3. If you have a RAC physical standby, then redo apply (MRP0) will run on only one instance.
    4. You are placing the physical standby on the 3rd and 4th nodes of the same 4-node cluster as DB1, where the DB13 and DB14 instances are used only for failover. Assuming the primary and the physical standby reside in the same data center, a disaster or a power failure affecting the entire data center would take out both, so this may not be a highly available architecture. If you are going to use extended RAC for this configuration, then it makes sense, with Node 1 and Node 2 in Datacenter A and Nodes 3 and 4 in Datacenter B.
    Thanks,
    Keyur

  • How to remove a node from 4 node sun cluster 3.1

    Dear All,
    We have four nodes in a cluster.
    Could anyone please guide me on how to remove a single node from a 4-node cluster?
    What procedure and steps do I have to follow?
    Thanks in advance.
    Veera.

    Google is pretty good at finding the right pages in our docs quickly. I tried >how to remove a node Solaris Cluster< and it came up with
    http://docs.sun.com/app/docs/doc/819-2971/gcfso?a=view
    Tim
    ---

  • Automatic restart of services on a 1 node rac cluster with Clusterware

    How do we enable a service to automatically start up when the db starts up?
    Thanks,
    Dave

    srvctl enable service -d DB

    Thanks for your reply M. Nauman. I researched that command and found that we do have it enabled, and that it only works if the database instance was previously taken down. Since the database does not go down on an Archiver Hung error (we are using the FRA with an alternate location), this never kicks in to bring up the service. What we are looking for is something that triggers when the archive logs error out and switch from the FRA (Flash Recovery Area) to our alternate disk location. Or more precisely, when the FRA destination goes back to a Valid status (after we've run an archive log backup to clear it).
    I found out from our 2 senior DBAs that our other 2-node RAC environment does not suffer from this problem, only the newly created 1-node RAC cluster environment. The problem is that we don't know what the difference is (a parameter on the db or cluster, or something else) or how to set it.
    Anyone know?
    Thanks,
    Gib
    Message was edited by:
    Gib2008

  • Error converting DOM nodes into SOAP nodes

    Hi,
    I am making an HTTP call using an HTTP binding activity in Oracle SOA Suite 11g. The HTTP call requires some header information. When I pass the header information through headers in the invoke activity and invoke the HTTP call, I get the error "Error converting DOM nodes into SOAP nodes". What might be the reason, and how can I solve it?
    Naresh

    Hello,
    It appears your code is trying to repeat the message part itself. Split-Join strictly adheres to the WSDL definition for incoming and outgoing messages.
    Instead of repeating <RootElement>, which is defined as a single occurrence in your message definition within the WSDL, you should find a way to tweak the WSDL so that the repeating node (the one that becomes a single message after the split) sits inside one root element.
    This will require implementing a transformation within the Split-Join, based on what the target input message needs. Below is an example:
    WSDL contains:
    <wsdl:message name="inputMessageName">
         <wsdl:part name="partInput" element="rootElement"/>
    </wsdl:message>
    What your current XML structure is (note how the message itself repeats below although it is defined as single in the WSDL; unfortunately there is no maxOccurs for a message in a WSDL definition, and although you can define multiple parts, that is not the case here):
    <soap:Body>
    <rootElement>
    </rootElement>
    <rootElement>
    </rootElement>
    <rootElement>
    </rootElement>
    </soap:Body>
    What Split-Join expects (if you have an existing WSDL, you need to tweak it to conform to the structure below, then use a transformation within the Split-Join to convert it into the correct XML structure for the outgoing message):
    <soap:Body>
    <rootElement>
         <repetitiveElementSpecificToIndividualSplitRequest/>
         <repetitiveElementSpecificToIndividualSplitRequest/>
         <repetitiveElementSpecificToIndividualSplitRequest/>
    </rootElement>
    </soap:Body>
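
To make the reshaping concrete, here is a minimal sketch in Python using the standard xml.etree.ElementTree module. It is generic XML manipulation, not Split-Join or Oracle Service Bus API code. The function name wrap_repeats, the un-prefixed Body element (used so the fragment parses without namespace declarations), and the payload values a/b/c are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

ITEM_TAG = "repetitiveElementSpecificToIndividualSplitRequest"

def wrap_repeats(body_xml, item_tag=ITEM_TAG):
    """Move each repeated <rootElement> under one <rootElement> as a child item."""
    body = ET.fromstring(body_xml)
    repeats = body.findall("rootElement")
    single_root = ET.Element("rootElement")
    for old in repeats:
        item = ET.SubElement(single_root, item_tag)
        item.text = old.text          # carry over text content
        item.extend(list(old))        # carry over any child elements
        body.remove(old)
    body.append(single_root)
    return body

# Hypothetical payload: three repeated root elements directly under the body
src = ("<Body><rootElement>a</rootElement>"
       "<rootElement>b</rootElement>"
       "<rootElement>c</rootElement></Body>")
out = wrap_repeats(src)
print(ET.tostring(out, encoding="unicode"))
```

In a real flow this step would be an XQuery or XSLT transformation; the sketch only shows the before and after shapes.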
    I hope this helps.
    Regards,
    Ankit

  • All connections are connecting to 2nd node only in a 2 Node RAC Cluster

    Hello,
    I have a 10.2.0.3 database on a two-node RAC cluster with only one service configured. This service is set to be preferred on both nodes.
    However, all the connections are landing on Node2 only. Any idea where to look?
    $> srvctl config service -d PSDB
    psdbsrv1 PREF: psdb1 psdb2 AVAIL:
    Thanks,
    MM

    Application is using the following connection string.
    jdbc:oracle:thin:@(DESCRIPTION =(ADDRESS = (PROTOCOL = TCP)(HOST = PQ2-PS-db-01-vip)(PORT = 1521))(ADDRESS = (PROTOCOL = TCP)(HOST = PQ2-PS-db-02-vip)(PORT = 1521)) (LOAD_BALANCE = yes) (CONNECT_DATA =(SERVER = DEDICATED)(SERVICE_NAME = PSDBSRV1)(FAILOVER_MODE =(TYPE = SELECT)(METHOD = BASIC)(RETRIES = 180)(DELAY = 5))))
    --MM

  • 2 Node Failover Cluster - ISCSI Disks as 1 volume?

    Hi,
    Not sure if I'm in the correct forum. If I'm not, I apologize. I need some advice.
    I have created a 2-node failover cluster with 2 HP Blades. I also currently have 2 NAS servers (HP X1600 24TB servers running 2008 Storage Server). The ultimate goal would be to combine all of the storage space from the NAS servers into 1 volume addressable by the failover cluster (as well as disk space from any additional NAS servers added in the future).
    Right now, I can add the iSCSI disk space from the NAS targets as different volumes under cluster shared volumes. Because of the 16TB limit in the iSCSI target, I essentially have 2 iSCSI disks on each NAS: one of 16TB and the other of 4TB (the NAS drives are configured for RAID 5, so there's a 4TB loss). So, I have 4 iSCSI disks in the cluster, each as its own volume.
    Any thoughts on making the 4 drives addressable as one volume?
    Regards,
    -Eric
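
The capacity arithmetic in the post can be sketched as follows. The figures (24TB raw per NAS, 4TB lost to RAID 5, 16TB per-disk limit in the iSCSI target) are taken from the post; the function names are hypothetical.

```python
TB_PER_NAS_RAW = 24          # HP X1600 raw capacity per NAS (from the post)
RAID5_OVERHEAD_TB = 4        # RAID 5 loss reported in the post
ISCSI_TARGET_LIMIT_TB = 16   # per-disk limit in the iSCSI target

def split_into_luns(usable_tb, limit_tb):
    """Carve usable space into iSCSI disks no larger than limit_tb."""
    luns = []
    remaining = usable_tb
    while remaining > 0:
        size = min(remaining, limit_tb)
        luns.append(size)
        remaining -= size
    return luns

def plan(nas_count):
    """Return the list of iSCSI disk sizes and the total spanned capacity."""
    usable = TB_PER_NAS_RAW - RAID5_OVERHEAD_TB
    all_luns = split_into_luns(usable, ISCSI_TARGET_LIMIT_TB) * nas_count
    return all_luns, sum(all_luns)

luns, total_tb = plan(nas_count=2)
print(luns, total_tb)  # [16, 4, 16, 4] 40
```

So a single spanned volume over the current four disks would be about 40TB, and each additional NAS of the same type would add another 20TB.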

    We're running Server 2012 Data Center on the cluster nodes.
    I was thinking the same about needing 3rd party software to do what I'd like. The data is mostly security camera video from our security system. Since it's not really critical data, I'm just looking for a way to maximize the available hard drive space and make it addressable as one volume or network share...
    -Eric

    You can build Storage Spaces (simple, not clustered, as that would waste 50% of your capacity; MSFT can do mirror and parity with R2 for clustered only) from iSCSI LUs. Dog slow and unsupported, but you'll have linear spanned space. See:
    Rough Guide To Setting Up A Scale-Out File Server
    http://www.aidanfinn.com/?p=13176
    Creating Virtual SoFS with shared VHDX
    http://www.aidanfinn.com/?p=15145
    You don't need SoFS (obviously), but in this article Aidan creates Storage Spaces from iSCSI LUNs.
    Good luck!

  • Replace 2 Nodes in Cluster

    We have a 2-node SQL cluster and are looking for the best way to replace the nodes with two new servers. I was thinking of removing 1 SQL node and then removing the node from the Windows failover cluster MMC, then unplugging the crossover cable, plugging it into the new server, and giving the new server the same name as the one that was just removed. Then add it to the cluster and start with the SQL nodes. Thoughts? Any articles, etc. to follow?

    Hi,
    You can refer to the following solutions for the same scenario:
    Add or Remove Nodes in a SQL Server Failover Cluster (Setup)
    http://technet.microsoft.com/en-us/library/ms191545.aspx
    Replace broken node on SQL 2008 failover cluster
    http://social.msdn.microsoft.com/Forums/sqlserver/en-US/a25cba7a-4762-45b5-be4c-18fc13ec7eab/replace-broken-node-on-sql-2008-failover-cluster?forum=sqldisasterrecovery
    Hope this helps.

  • Shutdown inactive node cause reboot active node

    Hi,
    When I try to shut down an inactive node of Oracle Clusterware (two nodes), the active node reboots (OS: Oracle Linux 5.6 with ocfs2).
    First it acquires the VIP of the second node, then it reboots and works correctly.
    Anyone can help me?
    Have a nice day

    user2907588 wrote:
    Here is the log of the active node:
    Jan 31 15:35:23 esse3-db1 avahi-daemon[7041]: Registering new address record for 192.168.101.222 on eth0.
    *> Jan 31 15:56:30 esse3-db1 kernel: bnx2: eth2 NIC Copper Link is Down*
    Jan 31 15:56:32 esse3-db1 kernel: bnx2: eth2 NIC Copper Link is Up, 100 Mbps full duplex, receive & transmit flow control ON
    Jan 31 15:56:59 esse3-db1 kernel: o2net: connection to node esse3-db2.unisalento.it (num 0) at 192.168.101.202:7777 has been idle for 30.0 seconds, shutting it down.
    Jan 31 15:56:59 esse3-db1 kernel: (0,17):o2net_idle_timer:1503 here are some times that might help debug the situation: (tmr 1359644189.432221 now 1359644219.431047 dr 1359644189.432200 adv 1359644189.432221:1359644189.432221 func (1f70fe7a:504) 1359644047.329097:1359644047.329102)
    Jan 31 15:56:59 esse3-db1 kernel: o2net: no longer connected to node esse3-db2.unisalento.it (num 0) at 192.168.101.202:7777
    *> Jan 31 15:57:29 esse3-db1 kernel: (6377,17):o2net_connect_expired:1664 ERROR: no connection established with node 0 after 30.0 seconds, giving up and returning errors.*
    *> Jan 31 16:01:40 esse3-db1 syslogd 1.4.1: restart.*
    Edited by: user2907588 on 1-feb-2013 7.39

    Hello,
    It looks like one of the nodes in the cluster was evicted earlier because of the eth2 NIC outage between the nodes, which removed the failed node (this could be what you are referring to as "INACTIVE").
    Please check the lines I have highlighted in the log above. Check whether you are able to ping the specified IP and do password-less ssh to the other node (101), and ask your system/network administrator to look into it.
    Regards,
    Naga

  • Adding nodes to cluster in 10g r2 10.1.0.3

    I apologize if this is a repeat but my browser crashed before I could watch my post. I am asking a hypothetical question regarding adding nodes to the cluster. I am trying to get a feel for how much risk is involved in the operation and if there is any chance we could corrupt the current configuration?
    I was reading the article from Murali Vallath and noticed that he made it a point to say that you should make a full cold backup before you perform step 6...
    Step 6: Add New Instance(s)
    DBCA has all the required options to add additional instances to the cluster.
    Requirements:
    Make a full cold backup of the database before commencing the upgrade process.
    Is there risk of corrupting the database during this step?
    We are running 10.2.0.3 on linux Itanium on RHEL4 and we are running a 2 node cluster. We are using OCFS2 for the OCR and Voting devices and we are using ASM and also ASMLIB for our shared storage option. We also are running EMC Powerpath on our hosts.
    Any tips or heads up would be greatly appreciated.
    Thanks.

    Duplicate post :- adding nodes to cluster in 10g r2 10.1.0.3

  • 2 node failover cluster power down

    I have a 2-node failover cluster. When I power down a node that has the SQL Server instance and resources, all the resources and services fail over to the other node. When I see that all the resources and services report "online" I then power that node. I am being told that this is improper because failover may not have completed. Is that correct?
    Also, in our 2 node failover cluster is there a proper sequence to restarting the powered down nodes?

    Hi,
    The cluster group containing SQL Server can be configured for automatic failback to the primary node when it becomes available again. By default, this is set to off.
    To Configure:
    Right-click the group containing SQL Server in the cluster administrator, select 'properties' then 'failback' tab.
    To prevent an auto-failback, select 'Prevent Failback', to allow select 'Allow Failback' then one of the following options:
    Immediately: Not recommended as it can disrupt clients
    Failback between n and n1 hours: allows a controlled failback to a preferred node (if it's online) during a certain period.
    The related article:
    Windows Failover Clustering Overview
    http://blogs.technet.com/b/rob/archive/2008/05/07/failover-clustering.aspx
    Hope this helps.

  • How to check that data was propagated to all nodes in cluster?

    Hi.
    We are using WebLogic 10.3.5 and Coherence 3.6. Both applications work in cluster mode, and we are using replicated mode as the Coherence topology. We also use a NamedCache to store and retrieve data from the Coherence cluster. Now I have a task to measure the time that data propagation to all nodes takes. From my point of view, Coherence should raise some kind of event when every node in the cluster has been filled with the same data. Or maybe there is a standard Coherence (WebLogic?) listener that provides such information.
    I would appreciate help with solving this task.

    Jonathan.Knight wrote:
    Hi,
    If you are using a replicated cache then the time taken to replicate the data is the time taken to do a put. Coherence will not return from a put method call on a NamedCache until the data has reached all the nodes. That is why replicated caches are a bad idea for clusters with a lot of nodes where there are frequent updates as they are slow.
    JK

    Hi JK,
    actually, AFAIK, it is not 100% correct.
    From what I remember from an earlier discussion or email, replication in a replicated cache is synchronous to one other member (the lease owner), and asynchronous thereafter. The synchronous part of the protocol involves the mutating member and the entry lease owner (which may be the same). As I understand the lease owner orders the operations and resolves races between multiple mutators, and drives the asynchronous part of the replication to all other members.
    In short, total network cost is linear with nodes, but latency wise you do not need to wait until all updates actually took place on all other nodes (that would be a really sad scenario when some nodes are communicating slowly).
    Best regards,
    Robert
