Regarding Three node cluster

Hi,
Can I know how the High Availability feature works for a three-node cluster? I configured a two-node cluster and it works fine: if one node goes down, the other runs the application. What happens with three-node or higher topology clusters?
Is the High Availability feature confined to two-node clusters? If not, how does it work for higher topology clusters?
Can anyone help me with this?

This assumes you use the agent builder to do this. If that is the case, you will need to put each of the generated packages on all of the nodes of the cluster. However, it's probably far easier to just create the resources from the command line. You only actually need to specify a start method, e.g.:
# clrg create my-rg
# clrs create -t SUNW.gds -g my-rg -p network_aware=false -p Start_command=/var/tmp/gds-start my-gds-rs
where /var/tmp/gds-start in my case contains:
#!/bin/ksh
sleep 100000 &
but in your case would contain the start commands for your application, which leaves some sort of process tree in place. If the processes in the tree all fail, then the service is restarted.
Have a look at the other resources on the web about writing GDS services.
Tim
---

Similar Messages

  • Failover Cluster Core Resources question on a Windows 2008R2 three node cluster

    We have a three node Windows 2008R2 cluster with SQL Server 2008 R2 as a clustered resource. There are three resource groups in this cluster 1) Available Storage 2) Cluster Group 3) SQL Server.  The Available Storage and SQL Server resource groups
    reside on one node while the Cluster Group resides on another.  The only resources residing in the Cluster Resource Group are the Cluster name and IP.  I'd like to fail over the Cluster Resource Group to be on the same node as everything else. 
    I'm not sure what the implications are of doing this.  Failing over the Cluster Group shouldn't have any impact on the SQL Server Resource Group, correct? Or would there be an interruption to SQL because of the failover of the Cluster Group?  It's
    a critical application, and I'm trying to gather some information for a change request. I know I'm going to be asked whether this impacts the production database and everybody using it.
    Thanks
    RG

    No, that should not impact anything.  The cluster group is completely separate from the SQL group.
    . : | : . : | : . tim

  • Sun Cluster with three nodes

    I need a manual or advice for introducing a third node into a RAC with Sun Cluster. I don't know whether the quorum votes readjust automatically or whether I have to add new quorum votes manually, whether I have to add a third mediator in SVM, etc.
    Many thanks, and sorry for my English.

    After you have added your nodes to the cluster, you will need to expand the RG's node list to include the new nodes if you need the RG to run on them. This is not automatic. Something like:
    # clrg set -n <nodelist> <rg_name>
    is what you need.
    I'm not sure I understand what you said about the quorum count. Only nodes and quorum devices (QD) or quorum servers (QS) get a vote, cabinets do not. So each node gets a vote and a QD/QS gets a vote count equal to the number of nodes it connects to minus 1. Thus with a two node cluster, you have 3 votes with one QD. With a 4 node cluster with one fully connected QD/QS, you have 7 votes (after re-adding it).
    Hope that helps,
    Tim
    P.S. <shameless plug> I can recommend a good book on the product: "Oracle Solaris Cluster Essentials" ;-)
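The vote arithmetic in the reply above can be checked with a small shell sketch. The function names are made up for illustration, and the majority rule (more than half of all configured votes) is general quorum behaviour rather than something stated in the thread.

```shell
#!/bin/sh
# Sketch of the quorum vote arithmetic described above.
# Function names are hypothetical; nothing here talks to a real cluster.

# Usage: quorum_votes NODES [QD_CONNECTIONS ...]
#   NODES           number of cluster nodes (one vote each)
#   QD_CONNECTIONS  for each quorum device/server, how many nodes it connects to
quorum_votes() {
    qv_total=$1
    shift
    for qv_connected in "$@"; do
        # Each QD/QS contributes (connected nodes - 1) votes.
        qv_total=$((qv_total + qv_connected - 1))
    done
    echo "$qv_total"
}

# Votes needed for a majority of the configured total.
quorum_majority() {
    echo $(( $1 / 2 + 1 ))
}

quorum_votes 2 2    # two nodes, one QD connected to both
quorum_votes 4 4    # four nodes, one fully connected QD/QS
```

The two calls correspond to the two-node and four-node cases mentioned in the reply; passing further arguments models additional quorum devices.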

  • SQL Server 2012 standard edition on 2 node cluster

    I have an environment with three SQL instances running an application. Two of the instances are enterprise edition and one is standard. I am looking to move all instances up to SQL Server 2012 and hoping to put them on a 3 node Windows Failover Cluster.
    I understand that standard edition is limited to two nodes. What I want to know is what my options are?
    Can I install standard edition on the cluster but only on two of the three nodes?
    Do I need to create a second Windows Cluster? Is this even possible?
    Regards, Matt Bowler MCITP, My Blog

    Can I install standard edition on the cluster but only on two of the three nodes?
    Yes, that's possible. SQL Server Setup would block you from adding a third node. It doesn't check how many nodes there are in the Windows cluster.
    Balmukund Lakhani

  • Metaset for 3 nodes cluster

    Hi All,
    I have a question about setting up a metaset for a 3-node cluster.
    After I created a set and added hosts and drives to it, everything was OK.
    However, when I tried to add 3 hosts as mediators, I was told that no more than
    2 hosts can be added as mediators.
    Does a 3-node cluster not need a mediator?
    Or is there another way to do it?
    Regards,
    Cheung

    So you are talking about a configuration where you have three nodes A, B and C with two arrays Y and Z. Both arrays Y and Z are connected to all hosts A,B and C. Then the scenario you are envisaging is something like: node A fails, one minute gap, array Y fails, another delay and then node B fails? If so, then I think you are right, you would be able to take over ownership of the array.
    However, that's a pretty extreme case consisting of multiple, separate failures. If you have time between the array Y failing and node B failing, then you could simply remove one of the stale replicas from the replica list, thus leaving you with a majority still working when node B fails.
    So the question is, how important is it that you survive this kind of scenario? I believe VxVM will survive this, but I'm not 100% sure. I gather there are similar issues with it with regard to its private regions. So I can't offer any guarantees as to this being a perfect solution. ZFS, on the other hand, works in a completely different way. My understanding is that you will always have a consistent file system even in the event of this type of cascading failure. Of course, it is Solaris 10 and above only, unlike SVM or VxVM.
    Hope that helps.
    Tim
    ---

  • Hyper-V - 2 node cluster goes down if one server shuts down

    Hi all,
    I built a 2 node cluster with tiered storage and then I started doing some tests:
    * Drain one node and all the VMs moved to the other node, perfect
    * shutdown the drained node.
    The entire cluster crashed!!! The remaining node is trying to re-connect to the iSCSI SAN without success.
    * I booted the drained node. And it would not reconnect to the iSCSI SAN either. I had to force the reconnect in the iSCSI control panel to make it reconnect.
    So why would shutting down one node kill the cluster ? Sure it was the node that had the tiered pool online, but even then, isn't failover cluster supposed to put that one back and working on the other node ?
    Why did the active node lose the iSCSI connection too? It had VMs running on it prior to the shutdown of the other node. My DC that was running on that other node is also now unavailable; I can't ping it or anything.
    So what did I miss in the configuration of the cluster ? I followed the msdn 2 node hyper-v cluster doc.
    I am really worried atm since I had over the past 3 months a ton of issues with hyper-v going from using tiered storage, shutting down nodes, MAC address on the VMs and the hosts,... I thought that after hyper-v 2008, Microsoft had really made some progress
    with Hyper-V but I truly regret not going with VMWare again this time around.
    That cluster was supposed to go into homologation phase tomorrow at the datacenter but now I am unsure if I ll ever be able to trust it to work.
    The SAN is an MD3200i which is reported as Hyper-V ready.
    Any hint on where I have gone wrong would be appreciated.
    Regards,
    Edit: even from the host with PowerShell I was not able to shut down the DC and reboot it cleanly. It said the integration services were not reachable... these are 2012 R2 servers...
    Edit2: One of my VM is gone ! Can't even find the file on the disks either locally on the hosts or on the SAN. WTF!!!

    Actually comes across very reasonable. And I think you are right. I tend to compare Hyper-V to vSphere with vCenter included. I have not seen nor used VMM. Also true that Storage Pools and iSCSI is not Hyper-V, but to me it comes as a package just as much
    as ESXi 'comes' with it.
    As for burning personal hours and money on books, I have, just as much as I go to conferences when I can and can afford it. And the only thing I would envy you is the fact that you have colleagues to bounce ideas off / lean on if necessary.
    As for the few hundred bucks for the management suite, I believe the stack you speak of would actually cost my current company about 14k$; that is not a few hundred bucks. That is pretty much the cost of one of the 2 SANs.  By 14k$, I mean that
    we have 6 servers with 2 sockets each, running a lot of VMs, which means Datacenter licences which list price is 3.6k$. I am not even including the CALs. Or am I mistaken on the licensing ?
    If VMWare, I would be going with essentials which, at the same server perimeter, would be 30% less expensive. We don't really need the full blown one at our level.
    I am also locked by hardware that was ordered by my predecessor which do not provide the service we need them for. (I blame the vendor on that one, my predecessor was not an infrastructure guy). As for storage, if you are referring to SMB3 and using
    a failover cluster to provide the disks to the hyper-v hosts, I agree. I am just not too sure on the technology yet and went more for safety until I can test it thoroughly in the dev environment.
    I also hope Microsoft will add an easier way to set the media type on disks as well as allow for more than just SSD and HDD or even allow us to define our own.
    I actually fooled the system this time around because SSD are too expensive and too high a failure rate compared to 15kRPM (yes the performance are lovely) at the moment. So I made the 15kRPMs into SSD.
    For the remote management issue, i meant that actually, at this time, I have to disable the domain firewall to be able to manage the hyper-v 2012 R2. I tried hvremote, adding all the necessary rules, etc... What I did for 2012 worked perfectly fine and I
    had full controls. 2012 R2 does not. I have another thread on this in the forum and I ll come back to it as soon as I can.
    I don't really care for the no interface thingy, I enjoy Powershell and scripting fine :) Does a lot of automation for me :) I am used to scripting anyway and it is faster to reproduce steps that way. You do it once, and you got something you can apply with
    little changes to everything.
    I joined the company in November last year and I got dropped a full stack to upgrade in 6 months while maintaining the current one. Encountering the problem now is better and is, fundamentally, good, though it is time consuming.
    By full stack I mean:
    * Help the dev re-design the apps so it handles load better. Get out of Windows 2003 and migrate to 2012 and validate all the applications on 2012 as well as improve security.
    * Implementing a monitoring system for the infrastructure and the applications.
    * Upgrade SQL Server 2005 to, in this case 2012 Standard (no choice and enterprise not in the price range of the company). Converge our current test and prod environments. Optimize all the queries... And naturally validate the applications.
    * Upgrade the certificate authorities so they are available on all sites. Haven't scratched that one yet.
    * Design a fully site redundant architecture so that if a site goes dark we have no impact. If there is partial failure on one site, no issue either, and so on XD. I wish I had AG :)
    * Implement a single windows domain on all the sites, that will be a relief :) Running 4 domains and about 10 different workgroups atm
    * Upgrade the firewalls, switches, servers hardware, ... Implement the necessary networks for virtualization and improved security rules... (Don't buy SonicWalls, had a worse headache with them than Hyper-V :) )
    * Migrate the users from old domain to new one. Windows 8 does not have the user migration tool anymore :(
    * Plan the DR tests and processes :)
    * Gotta get Hyper-V replica working as well as backups.
    * And naturally make all the documentation so that if I happen to get under a bus, anyone in the company can just follow the documentation in case of an emergency.
    And I am sure I am still missing some part of the environment no one knows about at some places :) Found a new network today XD.
    As for the mixed VM management, I mean some that are in HA on the cluster and others that don't need and must not failover. Am sure there is a way to configure them in the failover cluster so they don't failover. I just need to spend the 15 minutes looking
    into it :)
    Since I have had that SAN, I spent about 5 days on the phone with Dell support. That iSCSI issue is not the first one I got :( But in the production environment, that issue has disappeared. So at this time, i have switched working on other issues. And with
    the information you provided, I'll be able to know if one of the node of the cluster lose the connectivity which will help avoid this issue in the future.
    And it is a very good and interesting challenge. I am just running into way too many issues doing a 10 year old system upgrade :) I was expecting to have an easier time with Microsoft Virtualization than I've had.
    Oh and just for the fun, I also had VM disappear, no more file, no nothing, just poof. Have not figured how that happened yet. Had the backup so not much of an issue, but still, it went poof when the original issue happened.
    Thanks for the time and the tips :)

  • Creation of diskset in Two node cluster

    Hi All ,
    I have created one diskset in solaris 9 using SVM in two node cluster.
    After diskset creation, I mounted the diskset in a primary node in the mount point /test. But, the disk set is mounting on both the nodes.
    I created this diskset for failover purposes, if one node goes down the other node will take care.
    My idea is to create a failover resource (diskset resources) in the two node cluster.
    Below are steps used for creating the disk set.
    root@host2# /usr/cluster/bin/scdidadm -L d8
    8        host1:/dev/rdsk/c1t9d0   /dev/did/rdsk/d8
    8        host2:/dev/rdsk/c1t9d0   /dev/did/rdsk/d8
    metaset -s diskset -a -h host2 host1
    metaset -s diskset -a -m host2 host1
    metaset -s diskset -a /dev/did/rdsk/d8
    metainit -s diskset d40 1 1 /dev/did/dsk/d8s0
    newfs /dev/md/diskset/rdsk/d40
    mount /dev/md/diskset/dsk/d40 /test
    root@host2# metaset -s diskset
    Set name = diskset, Set number = 1
    Host                Owner
      host2                  Yes
      host1
    Mediator Host(s)    Aliases
      host2
      host1
    Driv Dbase
    d8   Yes
    Please let me know how to mount the disk set on one node only.
    If i am wrong, please correct me.
    Regards,
    R. Rajesh Kannan.

    The file system will only mount on both (all) nodes if you mount it globally, i.e. with the global flag, or if there is an entry in /etc/vfstab that has the global option.
    Given your output below, I would guess you have a global mount for /test defined in /etc/vfstab.
    Regards,
    Tim
    ---
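One way to check the /etc/vfstab guess above is to look for the global option on the /test entry. This is a minimal sketch: `has_global_mount` is a hypothetical helper name, it assumes the usual seven-field vfstab layout with mount options in the last field, and on Solaris `nawk` may need to be substituted for `awk` because the legacy awk lacks `-v`.

```shell
#!/bin/sh
# Report whether a mount point carries the "global" option in a vfstab-style file.
# Usage: has_global_mount MOUNT_POINT [VFSTAB_FILE]
# Returns 0 if a global entry exists, 1 if not, 2 if the file is unreadable.
has_global_mount() {
    hgm_mount_point=$1
    hgm_vfstab=${2:-/etc/vfstab}
    [ -r "$hgm_vfstab" ] || return 2
    awk -v mp="$hgm_mount_point" '
        $1 ~ /^#/ { next }                 # skip comment lines
        $3 == mp {
            n = split($7, opts, ",")       # field 7 holds the mount options
            for (i = 1; i <= n; i++)
                if (opts[i] == "global") found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$hgm_vfstab"
}
```

Running `has_global_mount /test && echo "global entry found"` on each node would show whether the vfstab entry explains the file system appearing on both nodes.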

  • Switching resource group in 2 node cluster fails

    hi,
    I configured a 2-node cluster to provide high availability for my Oracle DB 9.2.0.7.
    I created a resource group named oracleha-rg,
    and later created the following resources:
    oraclelh-rs for the logical hostname
    hastp-rs for the HA storage resource
    oracle-server-rs for the Oracle resource
    and listener-rs for the listener
    Whenever I try to switch the resource group between nodes, it gives me the following in dmesg:
    Feb  6 16:17:49 DB1 Cluster.RGM.global.rgmd: [ID 224900 daemon.notice] launching method <hafoip_stop> for resource <oraclelh-rs>, resource group <oracleha-rg>, node <DB1>, timeout <300> seconds
    Feb  6 16:17:49 DB1 Cluster.RGM.global.rgmd: [ID 784560 daemon.notice] resource oraclelh-rs status on node DB1 change to R_FM_UNKNOWN
    Feb  6 16:17:49 DB1 Cluster.RGM.global.rgmd: [ID 922363 daemon.notice] resource oraclelh-rs status msg on node DB1 change to <Stopping>
    Feb  6 16:17:49 DB1 ip: [ID 678092 kern.notice] TCP_IOC_ABORT_CONN: local = 010.050.033.009:0, remote = 000.000.000.000:0, start = -2, end = 6
    Feb  6 16:17:49 DB1 ip: [ID 302654 kern.notice] TCP_IOC_ABORT_CONN: aborted 0 connection
    Feb  6 16:17:49 DB1 Cluster.RGM.global.rgmd: [ID 784560 daemon.notice] resource oraclelh-rs status on node DB1 change to R_FM_OFFLINE
    Feb  6 16:17:49 DB1 Cluster.RGM.global.rgmd: [ID 922363 daemon.notice] resource oraclelh-rs status msg on node DB1 change to <LogicalHostname offline.>
    Feb  6 16:17:49 DB1 Cluster.RGM.global.rgmd: [ID 515159 daemon.notice] method <hafoip_stop> completed successfully for resource <oraclelh-rs>, resource group <oracleha-rg>, node <DB1>, time used: 0% of timeout <300 seconds>
    Feb  6 16:17:49 DB1 Cluster.RGM.global.rgmd: [ID 443746 daemon.notice] resource oraclelh-rs state on node DB1 change to R_OFFLINE
    Feb  6 16:17:49 DB1 Cluster.RGM.global.rgmd: [ID 224900 daemon.notice] launching method <hastorageplus_postnet_stop> for resource <hastp-rs>, resource group <oracleha-rg>, node <DB1>, timeout <1800> seconds
    Feb  6 16:17:49 DB1 Cluster.RGM.global.rgmd: [ID 784560 daemon.notice] resource hastp-rs status on node DB1 change to R_FM_UNKNOWN
    Feb  6 16:17:49 DB1 Cluster.RGM.global.rgmd: [ID 922363 daemon.notice] resource hastp-rs status msg on node DB1 change to <Stopping>
    Feb  6 16:17:49 DB1 SC[,SUNW.HAStoragePlus:8,oracleha-rg,hastp-rs,hastorageplus_postnet_stop]: [ID 843127 daemon.warning] Extension properties FilesystemMountPoints and GlobalDevicePaths and Zpools are empty.
    Feb  6 16:17:49 DB1 Cluster.RGM.global.rgmd: [ID 515159 daemon.notice] method <hastorageplus_postnet_stop> completed successfully for resource <hastp-rs>, resource group <oracleha-rg>, node <DB1>, time used: 0% of timeout <1800 seconds>
    Feb  6 16:17:49 DB1 Cluster.RGM.global.rgmd: [ID 443746 daemon.notice] resource hastp-rs state on node DB1 change to R_OFFLINE
    Feb  6 16:17:49 DB1 Cluster.RGM.global.rgmd: [ID 784560 daemon.notice] resource hastp-rs status on node DB1 change to R_FM_OFFLINE
    Feb  6 16:17:49 DB1 Cluster.RGM.global.rgmd: [ID 922363 daemon.notice] resource hastp-rs status msg on node DB1 change to <>
    Feb  6 16:17:49 DB1 Cluster.RGM.global.rgmd: [ID 529407 daemon.error] resource group oracleha-rg state on node DB1 change to RG_OFFLINE_START_FAILED
    Feb  6 16:17:49 DB1 Cluster.RGM.global.rgmd: [ID 529407 daemon.notice] resource group oracleha-rg state on node DB1 change to RG_OFFLINE
    Feb  6 16:17:49 DB1 Cluster.RGM.global.rgmd: [ID 447451 daemon.notice] Not attempting to start resource group <oracleha-rg> on node <DB1> because this resource group has already failed to start on this node 2 or more times in the past 3600 seconds
    Feb  6 16:17:49 DB1 Cluster.RGM.global.rgmd: [ID 447451 daemon.notice] Not attempting to start resource group <oracleha-rg> on node <DB2> because this resource group has already failed to start on this node 2 or more times in the past 3600 seconds
    Feb  6 16:17:49 DB1 Cluster.RGM.global.rgmd: [ID 674214 daemon.notice] rebalance: no primary node is currently found for resource group <oracleha-rg>.
    Feb  6 16:19:08 DB1 Cluster.RGM.global.rgmd: [ID 603096 daemon.notice] resource hastp-rs disabled.
    Feb  6 16:19:17 DB1 Cluster.RGM.global.rgmd: [ID 603096 daemon.notice] resource oraclelh-rs disabled.
    Feb  6 16:19:22 DB1 Cluster.RGM.global.rgmd: [ID 603096 daemon.notice] resource oracle-rs disabled.
    Feb  6 16:19:27 DB1 Cluster.RGM.global.rgmd: [ID 603096 daemon.notice] resource listener-rs disabled.
    Feb  6 16:19:51 DB1 Cluster.RGM.global.rgmd: [ID 529407 daemon.notice] resource group oracleha-rg state on node DB1 change to RG_OFF_PENDING_METHODS
    Feb  6 16:19:51 DB1 Cluster.RGM.global.rgmd: [ID 529407 daemon.notice] resource group oracleha-rg state on node DB2 change to RG_OFF_PENDING_METHODS
    Feb  6 16:19:51 DB1 Cluster.RGM.global.rgmd: [ID 224900 daemon.notice] launching method <bin/oracle_listener_fini> for resource <listener-rs>, resource group <oracleha-rg>, node <DB1>, timeout <30> seconds
    Feb  6 16:19:51 DB1 Cluster.RGM.global.rgmd: [ID 515159 daemon.notice] method <bin/oracle_listener_fini> completed successfully for resource <listener-rs>, resource group <oracleha-rg>, node <DB1>, time used: 0% of timeout <30 seconds>
    Feb  6 16:19:51 DB1 Cluster.RGM.global.rgmd: [ID 529407 daemon.notice] resource group oracleha-rg state on node DB1 change to RG_OFFLINE
    Feb  6 16:19:51 DB1 Cluster.RGM.global.rgmd: [ID 529407 daemon.notice] resource group oracleha-rg state on node DB2 change to RG_OFFLINE
    and the resource group fails to switch...
    any help please?

    Hi,
    this forum is for Oracle Clusterware, not Solaris Cluster. You probably should close this thread and open your question in the corresponding Solaris Cluster forum, to get help.
    Regards
    Sebastian

  • Two node cluster - disk not responding to selection

    I'm building a 2-node cluster (Solaris 10/SC3.2) on Dell 1950/PERC6i servers, with the quorum provided by a virtual server. Because I still need to introduce the quorum server to the cluster, my cluster nodes are still in install mode.
    I have tried to add the quorum using scsetup or clsetup, but I always get the same message:
    root@node01:~# scsetup
    Failed to get node zone list
    Failed to get node zone list
        This program has detected that the cluster "installmode" attribute is
        still enabled. As such, certain initial cluster setup steps will be
        performed at this time. This includes adding any necessary quorum
        devices, then resetting both the quorum vote counts and the
        "installmode" property.
        Please do not proceed if any additional nodes have yet to join the
        cluster.
        Is it okay to continue (yes/no) [yes]?  yes
    Unable to establish the list of cluster nodes.
    Press Enter to continue:
    Also, the most important issue is that immediately after the restart of the first node during the scinstall procedure, I started getting the following messages:
    Feb  4 17:33:20 node01 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,25e3@3/pci1028,1f0c@0/sd@0,0 (sd3):
    Feb  4 17:33:20 node01      disk not responding to selection
    Feb  4 17:33:21 node01 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,25e3@3/pci1028,1f0c@0/sd@0,0 (sd3):
    Feb  4 17:33:21 node01      disk not responding to selection
    Feb  4 17:26:46 node02 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,25e3@3/pci1028,1f0c@0/sd@1,0 (sd4):
    Feb  4 17:26:46 node02      disk not responding to selection
    Feb  4 17:26:46 node02 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,25e3@3/pci1028,1f0c@0/sd@0,0 (sd3):
    Feb  4 17:26:46 node02      disk not responding to selection
    Both nodes are extremely slow; I could only use telnet to log in because of long timeouts for many services.
    Here is the output from cfgadm -al:
    root@node01:~# cfgadm -al
    Ap_Id                          Type         Receptacle   Occupant     Condition
    c0                             scsi-bus     connected    configured   unknown
    c0::dsk/c0t0d0                 disk         connected    configured   unknown
    c0::dsk/c0t1d0                 disk         connected    configured   unknown
    c4                             fc           connected    unconfigured unknown
    c5                             fc           connected    unconfigured unknown
    usb0/1                         unknown      empty        unconfigured ok
    usb0/2                         unknown      empty        unconfigured ok
    usb1/1                         unknown      empty        unconfigured ok
    usb1/2                         unknown      empty        unconfigured ok
    usb2/1                         unknown      empty        unconfigured ok
    usb2/2                         unknown      empty        unconfigured ok
    usb3/1                         unknown      empty        unconfigured ok
    usb3/2                         unknown      empty        unconfigured ok
    usb4/1                         usb-hub      connected    configured   ok
    usb4/1.1                       usb-device   connected    configured   ok
    usb4/1.2                       usb-device   connected    configured   ok
    usb4/2                         unknown      empty        unconfigured ok
    usb4/3                         unknown      empty        unconfigured ok
    usb4/4                         unknown      empty        unconfigured ok
    usb4/5                         usb-hub      connected    configured   ok
    usb4/5.1                       unknown      empty        unconfigured ok
    usb4/5.2                       unknown      empty        unconfigured ok
    usb4/5.3                       unknown      empty        unconfigured ok
    usb4/5.4                       unknown      empty        unconfigured ok
    usb4/6                         unknown      empty        unconfigured ok
    usb4/7                         unknown      empty        unconfigured ok
    usb4/8                         unknown      empty        unconfigured ok
    How do I solve these two problems: the SCSI issue and the problem with the node list?
    Best regards,
    Vladimir

    Over the last weekend I reinstalled both nodes. I rebuilt the virtual disks (LUNs) under the PERC 6i controller, choosing RAID0 instead of RAID1 as before. I'm still not sure whether RAID0 helped or whether it was just the rebuilding of the disks. The MegaCli tool from LSI could not tell me what was wrong with the disks/partitions I used on the first try; according to MegaCli, the controller status showed no errors.
    Probably something was wrong with the partitioning. As on the first try, I used Solaris 10 with the latest patches (01/2009) and Sun Cluster 3.2, also with the latest security patch.
    After reinstalling, everything is almost fine :-). The only difference from our other production clusters (where the patch level of Solaris and Sun Cluster is not the same as on this test cluster) is that the cacao container agent is offline:
    offline        Feb_07   svc:/application/management/common-agent-container-1:default
    Because of that, I have the following service disabled:
    disabled       Feb_07   svc:/system/cluster/rgm:default
    Does anyone know how serious this is? And how can I now enable svc:/application/management/common-agent-container-1:default?
    Here is svcs -xv output:
    svc:/application/print/server:default (LP print server)
    State: disabled since Sat Feb 07 10:42:05 2009
    Reason: Disabled by an administrator.
       See: http://sun.com/msg/SMF-8000-05
       See: man -M /usr/share/man -s 1M lpsched
    Impact: 2 dependent services are not running:
            svc:/application/print/rfc1179:default
            svc:/application/print/ipp-listener:default
    svc:/system/cluster/rgm:default (Resource Group Manager Daemon)
    State: disabled since Sat Feb 07 10:42:06 2009
    Reason: Disabled by an administrator.
       See: http://sun.com/msg/SMF-8000-05
    Impact: 1 dependent service is not running:
            svc:/application/management/common-agent-container-1:default
    svc:/system/cluster/scsymon-srv:default (Sun Cluster SyMON Server Daemon)
    State: offline since Sat Feb 07 10:42:07 2009
    Reason: Dependency svc:/application/management/sunmcagent:default is absent.
       See: http://sun.com/msg/SMF-8000-E2
    Impact: This service is not running.
    Regards,
    Vladimir

  • Creating a Two Node Cluster

    Good afternoon,
    I'm looking to build/create an inexpensive two node cluster. I have a SLES11SP1 server that is running XEN as a virtual hosting server, I run about five servers in a virtual environment. I have three USB drives set up to host my Guest servers, what I would like to do is to purchase another USB drive, so that I can use that as an iscsi/SAN server location, it would be 2TB in size, and then I would build my new servers making them cluster enabled. They should then be able to "see" the SAN/iscsi/storage location.
    Does anyone have any further suggestions?
    Thanks
    -DS

    Originally Posted by gleach1
    while I wouldn't recommend running servers off USB drives (unless this is for testing), I don't see any issue in doing it but the performance may not be fantastic
    I really appreciate your responding back and assisting in this testing. This cluster is for testing only, I have a small server farm in my basement, and I have a customer that is using clustering and I would like to at least be able to have something to test with
    I've actually run a test cluster off a USB drive using vmware workstation before and it seemed to run fine just as a test system
    If you set up your xen host as an iscsi server, use the USB disk as the storage you present to the guests and set them up with iscsi initiators it should work like any other iscsi san would, obviously a touch slower...
    Is there some documentation that explains how to do the iscsi server setup that you described? I have the "Configuring Novell Cluster Services in a XEN Virtualization Environment" but I really don't see anything about the iscsi initiator setup, I was going to add the USB drive as a /storage volume on the XEN host, and then point the cluster to that? I will also be adding a third card to handle the clustering network.

  • New Howto guide available: HOW to INSTALL and CONFIGURE A TWO-NODE CLUSTER

    Hi,
    I am not sure that this has been posted or not. But for all of us who are too lazy (like myself) to read the full set of documentation, or who have not enough time, please have a look at this short document with screen shots that explains how to setup a simple two-node cluster.
    http://www.sun.com/software/solaris/howtoguides/twonodecluster.jsp
    I still recommend to browse through the "Sun Cluster Concepts Guide for Solaris OS" http://docs.sun.com/app/docs/doc/819-0421
    Have fun
    Hartmut

    I am afraid there is no such document. But I can try to give you some hints.
    N+1 can be seen as two things:
    - the physical setup of the cluster and
    - the logical configuration of services
    A physical N+1 setup would mean, that you have one server that is connected to all shared storage, but the other N nodes are only connected to the shared storage they need.
    This configuration would also need the logical N+1 setup, namely that all HA services could only run on 2 nodes, their primary node and as a backup this one dedicated node that is connected to all shared storage.
    If you do not have the physical N+1 setup, you could easily have a logical N+1 setup by having an N*N topology, i.e. all nodes connected to all storage, but logically each service again would have only 2 nodes on its nodelist.
    It seems that the second option is the one that is used more often, as it offers you the ability to reconfigure things in case of failures.
    With regards to asymmetric and symmetric I am not quite sure what you mean: And I do not think that these terms are used in the SC docs.
    What I can think of is, that asymmetric means that you have one node being active with 1 or more HA services and have one inactive secondary node as a backup.
    Regards
    Hartmut

  • How do I install SQL Management Objects for SQL 2008 on a 2 node cluster?

    Hi,
    One of the software packages we use needs SMO to be installed. This is on a SQL 2008 2-node cluster. In the Control Panel I do NOT see it installed. I do see the Client Tools SDK installed, and I even did a repair of it, but I still do not see SMO.
    Can someone please help me with the SMO install on this 2-node SQL 2008 cluster? I could not find a working link to install it.
    Thanks!
    Suresh.
    Suresh Channamraju

    Hi Suresh,
    According to your description, you need to install SMO on the two nodes of a SQL 2008 cluster, right?
    If you want to develop an application that uses SQL Server Management Objects (SMO), you should select the Client Tools SDK when you install SQL Server. To install the Client Tools SDK without installing SQL Server, install Shared Management Objects from the SQL Server 2008 Feature Pack.
    http://www.microsoft.com/en-us/download/details.aspx?id=6375
    By default, the SMO assemblies are installed in the C:\Program Files\Microsoft SQL Server\100\SDK\Assemblies\ directory.
    Regards,
    Charlie Liao
    TechNet Community Support

  • 64 BW 3.1 on a 3 node cluster

    Hello,
    We currently have a 2-node cluster for our BW/MSSQL server, and we are planning to add a 3rd node, making our BW system a 3-node cluster. We are using MSCS with MSSQL server. I am not sure if we can use the cluster wizard. If anyone has done this, could they share their experience?
    Many Thanks in advance

    Hello Arun,
    Thanks for your quick response. Actually, we are planning to include the CI on the 3rd node of the cluster. We are not planning to install another DI.
    We currently have the CI set up so that it can fail over to Node A or Node B; what we want is to be able to fail the CI over to the 3rd node, Node C, as well.
    Has anyone successfully done that on MSCS?
    Regards,
    saud

  • GI installation on a single-node cluster error.

    Hello, I am trying to install GI on a single-node cluster (Solaris 10 / SPARC), but the root.sh script fails with the following error (this is not a GI installation for a Standalone Server):
    root@selvac./dev/ASM/OCRVTD_DG # /app/oracle/grid/11.2/root.sh
    Running Oracle 11g root script...
    The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME= /app/oracle/grid/11.2
    Enter the full pathname of the local bin directory: [usr/local/bin]:
    Copying dbhome to /usr/local/bin ...
    Copying oraenv to /usr/local/bin ...
    Copying coraenv to /usr/local/bin ...
    Creating /var/opt/oracle/oratab file...
    Entries will be added to the /var/opt/oracle/oratab file as needed by
    Database Configuration Assistant when a database is created
    Finished running generic part of root script.
    Now product-specific root actions will be performed.
    Using configuration parameter file: /app/oracle/grid/11.2/crs/install/crsconfig_params
    Creating trace directory
    LOCAL ADD MODE
    Creating OCR keys for user 'root', privgrp 'root'..
    Operation successful.
    OLR initialization - successful
    root wallet
    root wallet cert
    root cert export
    peer wallet
    profile reader wallet
    pa wallet
    peer wallet keys
    pa wallet keys
    peer cert request
    pa cert request
    peer cert
    pa cert
    peer root cert TP
    profile reader root cert TP
    pa root cert TP
    peer pa cert TP
    pa peer cert TP
    profile reader pa cert TP
    profile reader peer cert TP
    peer user cert
    pa user cert
    Adding daemon to inittab
    ACFS-9200: Supported
    ACFS-9300: ADVM/ACFS distribution files found.
    ACFS-9312: Existing ADVM/ACFS installation detected.
    ACFS-9314: Removing previous ADVM/ACFS installation.
    ACFS-9315: Previous ADVM/ACFS components successfully removed.
    ACFS-9307: Installing requested ADVM/ACFS software.
    ACFS-9308: Loading installed ADVM/ACFS drivers.
    ACFS-9327: Verifying ADVM/ACFS devices.
    ACFS-9309: ADVM/ACFS installation correctness verified.
    CRS-2672: Attempting to start 'ora.mdnsd' on 'selvac'
    CRS-2676: Start of 'ora.mdnsd' on 'selvac' succeeded
    CRS-2672: Attempting to start 'ora.gpnpd' on 'selvac'
    CRS-2676: Start of 'ora.gpnpd' on 'selvac' succeeded
    CRS-2672: Attempting to start 'ora.cssdmonitor' on 'selvac'
    CRS-2672: Attempting to start 'ora.gipcd' on 'selvac'
    CRS-2676: Start of 'ora.cssdmonitor' on 'selvac' succeeded
    CRS-2676: Start of 'ora.gipcd' on 'selvac' succeeded
    CRS-2672: Attempting to start 'ora.cssd' on 'selvac'
    CRS-2672: Attempting to start 'ora.diskmon' on 'selvac'
    CRS-2676: Start of 'ora.diskmon' on 'selvac' succeeded
    CRS-2676: Start of 'ora.cssd' on 'selvac' succeeded
    ASM created and started successfully.
    Disk Group OCRVTD_DG created successfully.
    The ora.asm resource is not ONLINE
    Did not succssfully configure and start ASM at /app/oracle/grid/11.2/crs/install/crsconfig_lib.pm line 6465.
    /app/oracle/grid/11.2/perl/bin/perl -I/app/oracle/grid/11.2/perl/lib -I/app/oracle/grid/11.2/crs/install /app/oracle/grid/11.2/crs/install/rootcrs.pl execution failed
    I also found the "PRVF-5150: Path OCRL:DISK1 is not a valid path on all nodes" error, but as I have read that it is a bug, I ignored it.
    I think my ASM disk group for OCR and voting is OK: accessible by the grid user, with permissions 660. It seems ASM does not start, or does not start in time.
    Any help is welcome.
    Thanks in advance.

    Thanks a lot for the hint. I had already checked this doc, but I don't think it is the problem. Actually, the error "The ora.asm resource is not ONLINE" is not correct. After root.sh fails, ora.asm is ONLINE:
    root@selvac./app/oracle/grid/11.2/bin # ./crsctl check resource ora.asm -init
    root@selvac./app/oracle/grid/11.2/bin # ./crsctl stat resource ora.asm -init
    NAME=ora.asm
    TYPE=ora.asm.type
    TARGET=ONLINE
    STATE=ONLINE on selvac
    The last part of the /app/oracle/grid/11.2/cfgtoollogs/crsconfig/rootcrs_selvac.log file reads :
    ASM created and started successfully.
    Disk Group OCRVTD_DG created successfully.
    End Command output
    2011-04-14 13:24:16: Executing cmd: /app/oracle/grid/11.2/bin/crsctl check resource ora.asm -init
    2011-04-14 13:24:17: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
    2011-04-14 13:24:17: Command output:
    NAME=ora.asm
    TYPE=ora.asm.type
    TARGET=ONLINE
    STATE=OFFLINE
    End Command output
    2011-04-14 13:24:17: Checking the status of ora.asm
    2011-04-14 13:24:22: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
    2011-04-14 13:24:22: Command output:
    NAME=ora.asm
    TYPE=ora.asm.type
    TARGET=ONLINE
    STATE=OFFLINE
    End Command output
    2011-04-14 13:24:22: Checking the status of ora.asm
    2011-04-14 13:24:27: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
    2011-04-14 13:24:28: Command output:
    NAME=ora.asm
    TYPE=ora.asm.type
    TARGET=ONLINE
    STATE=OFFLINE
    End Command output
    2011-04-14 13:24:28: Checking the status of ora.asm
    2011-04-14 13:24:33: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
    2011-04-14 13:24:33: Command output:
    NAME=ora.asm
    TYPE=ora.asm.type
    TARGET=ONLINE
    STATE=OFFLINE
    End Command output
    2011-04-14 13:24:33: Checking the status of ora.asm
    2011-04-14 13:24:38: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
    2011-04-14 13:24:38: Command output:
    NAME=ora.asm
    TYPE=ora.asm.type
    TARGET=ONLINE
    STATE=OFFLINE
    End Command output
    2011-04-14 13:24:38: Checking the status of ora.asm
    2011-04-14 13:24:43: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
    2011-04-14 13:24:43: Command output:
    NAME=ora.asm
    TYPE=ora.asm.type
    TARGET=ONLINE
    STATE=OFFLINE
    End Command output
    2011-04-14 13:24:43: Checking the status of ora.asm
    2011-04-14 13:24:48: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
    2011-04-14 13:24:49: Command output:
    NAME=ora.asm
    TYPE=ora.asm.type
    TARGET=ONLINE
    STATE=OFFLINE
    End Command output
    2011-04-14 13:24:49: Checking the status of ora.asm
    2011-04-14 13:24:54: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
    2011-04-14 13:24:54: Command output:
    NAME=ora.asm
    TYPE=ora.asm.type
    TARGET=ONLINE
    STATE=OFFLINE
    End Command output
    2011-04-14 13:24:54: Checking the status of ora.asm
    2011-04-14 13:24:59: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
    2011-04-14 13:24:59: Command output:
    NAME=ora.asm
    TYPE=ora.asm.type
    TARGET=ONLINE
    STATE=OFFLINE
    End Command output
    2011-04-14 13:24:59: Checking the status of ora.asm
    2011-04-14 13:25:04: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
    2011-04-14 13:25:04: Command output:
    NAME=ora.asm
    TYPE=ora.asm.type
    TARGET=ONLINE
    STATE=OFFLINE
    End Command output
    2011-04-14 13:25:04: Checking the status of ora.asm
    2011-04-14 13:25:09: The ora.asm resource is not ONLINE
    2011-04-14 13:25:09: Running as user grid: /app/oracle/grid/11.2/bin/cluutil -ckpt -oraclebase /app/grid -writeckpt -name ROOTCRS_BOOTCFG -state FAIL
    2011-04-14 13:25:09: s_run_as_user2: Running /bin/su grid -c ' /app/oracle/grid/11.2/bin/cluutil -ckpt -oraclebase /app/grid -writeckpt -name ROOTCRS_BOOTCFG -state FAIL '
    2011-04-14 13:25:10: Removing file /var/tmp/mbahSaGPn
    2011-04-14 13:25:10: Successfully removed file: /var/tmp/mbahSaGPn
    2011-04-14 13:25:10: /bin/su successfully executed
    2011-04-14 13:25:10: Succeeded in writing the checkpoint:'ROOTCRS_BOOTCFG' with status:FAIL
    2011-04-14 13:25:10: ###### Begin DIE Stack Trace ######
    2011-04-14 13:25:10: Package File Line Calling
    2011-04-14 13:25:10: --------------- -------------------- ---- ----------
    2011-04-14 13:25:10: 1: main rootcrs.pl 322 crsconfig_lib::dietrap
    2011-04-14 13:25:10: 2: crsconfig_lib crsconfig_lib.pm 6465 main::__ANON__
    2011-04-14 13:25:10: 3: crsconfig_lib crsconfig_lib.pm 6390 crsconfig_lib::perform_initial_config
    2011-04-14 13:25:10: 4: main rootcrs.pl 671 crsconfig_lib::perform_init_config
    2011-04-14 13:25:10: ####### End DIE Stack Trace #######
    2011-04-14 13:25:10: 'ROOTCRS_BOOTCFG' checkpoint has failed
    So this must be a bug. During root.sh execution ora.asm is OFFLINE, but after the failure it is ONLINE. It might be a question of waiting, retrying, or a timeout: I see that the "Checking the status of ora.asm" step is repeated several times during root.sh, but perhaps not often enough. Now root.sh has failed and the installation has halted, yet ASM is ONLINE.
    Any other ideas?
    Thanks again.
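
    One way to test the timing theory above is to measure how long ora.asm actually takes to report ONLINE. The log shows about ten status checks roughly five seconds apart before root.sh gives up. The sketch below polls the same crsctl command used in the post, for a caller-chosen number of attempts. It is a diagnostic aid only: it does not change what root.sh does, and the function name wait_online and the STATUS_CMD variable are invented for this example.

```shell
# Sketch only. STATUS_CMD defaults to the crsctl call shown in the post above;
# override it to try the loop without a Grid Infrastructure installation.
STATUS_CMD=${STATUS_CMD:-"/app/oracle/grid/11.2/bin/crsctl stat resource ora.asm -init"}

wait_online() {
    attempts=$1   # maximum number of status checks
    delay=$2      # seconds to sleep between checks
    while [ "$attempts" -gt 0 ]; do
        # Keep only the value of the STATE= line, e.g. "ONLINE on selvac".
        state=$($STATUS_CMD 2>/dev/null | sed -n 's/^STATE=//p')
        case $state in
            ONLINE*) echo "online"; return 0 ;;
        esac
        attempts=$((attempts - 1))
        if [ "$attempts" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    echo "not online"
    return 1
}
```

    For example, "wait_online 60 5" checks every five seconds for up to about five minutes. If ASM regularly comes ONLINE only after the point where root.sh stopped checking, that would support the timeout explanation.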

  • 2 node cluster down and can't boot

    Hi,
    Due to a power problem, both nodes of my 2-node cluster went down abruptly. Now I cannot boot either node. Booting gives the following error:
    Rebooting with command: boot
    Boot device: /pci@1c,600000/scsi@2/disk@0,0:a File and args:
    SunOS Release 5.10 Version Generic_127127-11 64-bit
    Copyright 1983-2008 Sun Microsystems, Inc. All rights reserved.
    Use is subject to license terms.
    Hardware watchdog enabled
    Jun 25 13:54:55 svc.startd[8]: svc:/system/cluster/cl_boot_check:default: Method "/usr/cluster/lib/svc/method/svc_boot_check start" failed with exit status 1.
    Jun 25 13:54:56 svc.startd[8]: svc:/system/cluster/cl_boot_check:default: Method "/usr/cluster/lib/svc/method/svc_boot_check start" failed with exit status 1.
    Jun 25 13:54:56 svc.startd[8]: svc:/system/cluster/cl_boot_check:default: Method "/usr/cluster/lib/svc/method/svc_boot_check start" failed with exit status 1.
    Jun 25 13:54:56 svc.startd[8]: system/cluster/cl_boot_check:default failed: transitioned to maintenance (see 'svcs -xv' for details)
    Hostname: clnode1
    Requesting System Maintenance Mode
    (See /lib/svc/share/README for more information.)
    Console login service(s) cannot run
    Root password for system maintenance (control-d to bypass):
    Could anyone suggest how to solve this?
    Thanks in advance

    Following is the log file:
    root@clnode1 #
    root@clnode1 #
    root@clnode1 # svcs -l cl_boot_check
    fmri svc:/system/cluster/cl_boot_check:default
    name Sun Cluster boot check
    enabled true
    state maintenance
    next_state none
    state_time Wed Jun 25 15:02:20 2008
    alt_logfile /etc/svc/volatile/system-cluster-cl_boot_check:default.log
    restarter svc:/system/svc/restarter:default
    dependency require_all/none svc:/system/filesystem/usr:default (online)
    root@clnode1 #
    root@clnode1 #
    root@clnode1 #
    root@clnode1 #
    root@clnode1 # cat /etc/svc/volatile/system-cluster-cl_boot_check:default.log
    [ Jun 25 15:02:14 Enabled. ]
    [ Jun 25 15:02:19 Executing start method ("/usr/cluster/lib/svc/method/svc_boot_check start") ]
    /usr/cluster/lib/svc/method/svc_boot_check: svc:/system/cluster/sc_failfast:default: not found
    /usr/cluster/lib/svc/method/svc_boot_check: svc:/system/cluster/rpc-pmf:default: not found
    /usr/cluster/lib/svc/method/svc_boot_check: svc:/system/cluster/sc_ng_zones:default^J^J#^J# The following is the list of services that this script tries to disable^J# when booting a non-global zone in non-cluster mode. While booting in a^J# cluster mode nothing is done to change the state, thereby the following^J# services keeps the same state that it was in before the boot.^J#^JCLUSTER_LOCAL_ZONE_OTHER_SVCS=svc:/system/cluster/sc_restarter:default^J^J^Jif [ ! -f /lib/svc/share/smf_include.sh ]^Jthen^J^I#^J^I# This is an smf service. It should run only on Solaris 10 and above.^J^I#^J^Iexit 0^Jfi^J^J. /lib/svc/share/smf_include.sh^J^J#^J# Get the zone name.^J#^JZONENAME=global^JERROR=0^Jif [  -ne 0 ]; then^J^Iecho Error: not found
    [ Jun 25 15:02:20 Method "start" exited with status 1 ]
    [ Jun 25 15:02:20 Executing start method ("/usr/cluster/lib/svc/method/svc_boot_check start") ]
    /usr/cluster/lib/svc/method/svc_boot_check: svc:/system/cluster/sc_failfast:default: not found
    /usr/cluster/lib/svc/method/svc_boot_check: svc:/system/cluster/rpc-pmf:default: not found
    /usr/cluster/lib/svc/method/svc_boot_check: svc:/system/cluster/sc_ng_zones:default^J^J#^J# The following is the list of services that this script tries to disable^J# when booting a non-global zone in non-cluster mode. While booting in a^J# cluster mode nothing is done to change the state, thereby the following^J# services keeps the same state that it was in before the boot.^J#^JCLUSTER_LOCAL_ZONE_OTHER_SVCS=svc:/system/cluster/sc_restarter:default^J^J^Jif [ ! -f /lib/svc/share/smf_include.sh ]^Jthen^J^I#^J^I# This is an smf service. It should run only on Solaris 10 and above.^J^I#^J^Iexit 0^Jfi^J^J. /lib/svc/share/smf_include.sh^J^J#^J# Get the zone name.^J#^JZONENAME=global^JERROR=0^Jif [  -ne 0 ]; then^J^Iecho Error: not found
    [ Jun 25 15:02:20 Method "start" exited with status 1 ]
    [ Jun 25 15:02:20 Executing start method ("/usr/cluster/lib/svc/method/svc_boot_check start") ]
    /usr/cluster/lib/svc/method/svc_boot_check: svc:/system/cluster/sc_failfast:default: not found
    /usr/cluster/lib/svc/method/svc_boot_check: svc:/system/cluster/rpc-pmf:default: not found
    /usr/cluster/lib/svc/method/svc_boot_check: svc:/system/cluster/sc_ng_zones:default^J^J#^J# The following is the list of services that this script tries to disable^J# when booting a non-global zone in non-cluster mode. While booting in a^J# cluster mode nothing is done to change the state, thereby the following^J# services keeps the same state that it was in before the boot.^J#^JCLUSTER_LOCAL_ZONE_OTHER_SVCS=svc:/system/cluster/sc_restarter:default^J^J^Jif [ ! -f /lib/svc/share/smf_include.sh ]^Jthen^J^I#^J^I# This is an smf service. It should run only on Solaris 10 and above.^J^I#^J^Iexit 0^Jfi^J^J. /lib/svc/share/smf_include.sh^J^J#^J# Get the zone name.^J#^JZONENAME=global^JERROR=0^Jif [  -ne 0 ]; then^J^Iecho Error: not found
    [ Jun 25 15:02:20 Method "start" exited with status 1 ]
    root@clnode1 #
    root@clnode1 #
    root@clnode1 #
    root@clnode1 #
    root@clnode1 #
    root@clnode1 #
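
    The useful part of the log above is the short list of services that svc_boot_check reports as "not found". As a small aid for reading such a log, the sketch below extracts those service FMRIs from standard input and removes duplicates. The function name missing_fmris is invented for this example, and the pattern is based only on the line format seen in this post. Garbled lines, such as the long sc_ng_zones line, are not matched and have to be inspected by hand.

```shell
# List the SMF service FMRIs that a svc_boot_check log reports as "not found".
# Reads the log on standard input and prints each FMRI once.
missing_fmris() {
    # Match lines ending in "<fmri>: not found" and keep only the FMRI.
    sed -n 's|^.*: \(svc:/[^ :]*:[a-z_]*\): not found$|\1|p' | sort -u
}

# Example use against the log file named in the post (not run here):
#   missing_fmris < /etc/svc/volatile/system-cluster-cl_boot_check:default.log
```

    The boot messages themselves point to 'svcs -xv' for more detail on why a service is in maintenance.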

Maybe you are looking for

  • Scanning enabled but still won't scan

    I have enabled my Mac to be scanned to and have been able to in the past. My 8600 began to have issues several months ago, beginning with my document feeder and then to my flatbed scanner. I was at first able to scan from my printer's interface bo

  • Framework order with account assignement U and item category B

    Hi guys, I have seen many thread regarding this topic but could not find my answer. I want to create a Framework order (I can't use K forbidden on my project... long story) withAcc *** U and Itm B. The thing is during Invoice verification all field r

  • Appraisal Documents (WD UI) open the same page in the pop up browser.

    Hi Expert, I am using the business package for Appraisal Documents. I have a problem navigating to the new browser window. When the user clicks on a link in 'Appraisal Documents (WD UI)' it should open a new pop-up for Appraisal Document(HAP_MAIN_

  • ADF Faces to Trinidad migrated app popup not working

    I have 10g version application (ADF bc and jsf) migrated to 11g using automated migration of jdeveloper and this application has lot of popups . for some reason pop ups are not working in the converted app if i open a popup programmatically it works

  • How to recover the Outlook password?

    I have just started a new job, and am on my own in an office (no IT support).  I am using Outlook 2007, but we can't receive any emails. The pst file is almost 3GB so I have made another one and am trying to clear old stuff from the old one to the ne