Sun cluster resource group online but faulted

Hi,
Recently our storage admin deleted volume d9 from the oradg disk set by mistake. We created a new volume d29 and restored the data to it, but we forgot to remove d9 from the disk set. After a reboot the orarg resource group failed to come online with a faulted status, because the ora-stor resource is faulted.
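In case it is useful, a minimal cleanup sketch, assuming d9 is a stale Solaris Volume Manager metadevice left behind in the oradg disk set and that ora-stor (an HAStoragePlus-type resource) is faulting on it; these are standard SVM and Sun Cluster 3.2 commands, but verify them against your own configuration first:
# metastat -s oradg -p            (confirm d9 is still defined in the disk set)
# metaclear -s oradg d9           (remove the stale metadevice)
# clresourcegroup offline orarg
# clresourcegroup online orarg    (retry bringing the faulted group online)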

You cannot rename a resource group in that way in Sun Cluster.
You have two options:
- Recreate the resource group with the new name.
- Use an unsupported procedure to change the name in the CCR. This requires downtime of both nodes, and as it is unsupported I am not going to describe it here. If that is what you want to do, please log a call with Sun.
If you think renaming resource groups is a useful feature, may I also ask you to contact your Sun Service representative so that they can log an RFE for the feature.

Similar Messages

  • Rename Sun Cluster Resource Group

    Hi All,
    We have a 2-node Sun Cluster 3.0 running on 2 x V440 servers. We want to change the name of a resource group.
    Can I use the "scrgadm -c -g RG_NAME -h nodelist -y property" command to change the resource group name? Can I do this online while the cluster is running, or do I need to bring the cluster into maintenance mode? Any help would be appreciated.
    Thanks.

    You cannot rename a resource group in that way in Sun Cluster.
    You have two options:
    - Recreate the resource group with the new name.
    - Use an unsupported procedure to change the name in the CCR. This requires downtime of both nodes, and as it is unsupported I am not going to describe it here. If that is what you want to do, please log a call with Sun.
    If you think renaming resource groups is a useful feature, may I also ask you to contact your Sun Service representative so that they can log an RFE for the feature.
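    For what it is worth, a minimal sketch of the first option with SC 3.0/3.1-style commands; the group, resource and node names below are only placeholders, and the resource steps are repeated for every resource in the group:
    # scswitch -F -g old-rg                  (take the old group offline)
    # scswitch -n -j old-rs                  (disable each resource)
    # scrgadm -r -j old-rs                   (remove each resource)
    # scrgadm -r -g old-rg                   (remove the old group)
    # scrgadm -a -g new-rg -h node1,node2    (create the group under the new name)
    # scrgadm -a -j new-rs -g new-rg -t <resource-type> ...   (re-create each resource)
    # scswitch -Z -g new-rg                  (bring the new group online)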

  • Why can't I switch the resource group online

    I used the scsetup command to set up the Oracle RAC data service.
    Then, to switch the resource group online:
    scswitch -Z -g rac-framework-rg
    #scstat -g
    -- Resource Groups and Resources --
                Group Name          Resources
                ----------          ---------
     Resources: nfs-rg              cluster1-nfs nfs-stor nfs-res
     Resources: rac-framework-rg    rac_framework rac_udlm rac_cvm
    -- Resource Groups --
                Group Name          Node Name   State            Suspended
                ----------          ---------   -----            ---------
         Group: nfs-rg              sysb        Online           No
         Group: nfs-rg              sysc        Offline          No
         Group: rac-framework-rg    sysc        Online faulted   No
         Group: rac-framework-rg    sysb        Online faulted   No
    -- Resources --
                Resource Name    Node Name   State          Status Message
                -------------    ---------   -----          --------------
      Resource: cluster1-nfs     sysb        Online         Online - LogicalHostname online.
      Resource: cluster1-nfs     sysc        Offline        Offline
      Resource: nfs-stor         sysb        Online         Online
      Resource: nfs-stor         sysc        Offline        Offline
      Resource: nfs-res          sysb        Online         Online - Service is online.
      Resource: nfs-res          sysc        Offline        Offline
      Resource: rac_framework    sysc        Start failed   Faulted - Error in previous reconfiguration.
      Resource: rac_framework    sysb        Start failed   Faulted - Error in previous reconfiguration.
      Resource: rac_udlm         sysc        Offline        Offline
      Resource: rac_udlm         sysb        Offline        Offline
      Resource: rac_cvm          sysc        Offline        Offline
      Resource: rac_cvm          sysb        Offline        Offline
    Thanks!

    The reason for this behaviour is that it allows the admin to diagnose why the previous reconfiguration failed, without the node going into a reboot loop.
    The comment in the shell script says:
    # SCMSGS
    # @explanation
    # Error was detected during previous reconfiguration of the
    # RAC framework component. Error is indicated in the message.
    # As a result of error, the ucmmd daemon was stopped and node
    # was rebooted.
    # On node reboot, the ucmmd daemon was not started on the node
    # to allow investigation of the problem.
    # RAC framework is not running on this node. Oracle parallel
    # server/ Real Application Clusters database instances will
    # not be able to start on this node.
    # @user_action
    # Review logs and messages in /var/adm/messages and
    # /var/cluster/ucmm/ucmm_reconf.log. Resolve the problem that
    # resulted in reconfiguration error. Reboot the node to start
    # RAC framework on the node.
    # Refer to the documentation of Sun Cluster support for Oracle
    # Parallel Server/ Real Application Clusters. If problem
    # persists, contact your Sun service representative.
    This should give you some idea of where the problem lies.
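    As a concrete starting point (only a sketch of the user_action above, not a guaranteed fix), on the affected node you could do something like:
    # tail -50 /var/cluster/ucmm/ucmm_reconf.log
    # grep -i ucmm /var/adm/messages | tail -50
    # init 6     (after resolving the underlying problem, reboot the node so ucmmd and the RAC framework start again)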
    Regards,
    Tim
    ---

  • File and print sharing resource is online but isn't responding to connection attempts

    hi,
    I'm on Windows 8.1 Enterprise. All of a sudden, when I try to access drives on my network in Windows Explorer, I get "file and print sharing resource is online but isn't responding to connection attempts". This formerly did not have any issues.
    I did a system restore to a point where I know it worked, but it still does not connect to the network drives.
    Please advise,thanks James
    AJ Anning

    Hi,
    I searched the Cisco support forum and found a thread with a problem similar to yours; its solution may be helpful for you.
    https://supportforums.cisco.com/discussion/11533701/cisco-anyconnect-3008057-certificate-validation-failure
    You can follow the replies in that thread to check the Cisco certificate on your system.
    Open Certificate Manager: open Run, type certmgr.msc, and press Enter.
    Roger Lu
    TechNet Community Support

  • Information on the Virtual IP Address of cluster resource group

    Hi,
    I would like to know how we can find a Virtual IP address of resource group. i.e. The IP address associated to the SUNW.LogicalHostname resource of the resource group.
    Is it always present in the output of the command 'ifconfig -a'?
    What is difference between IPMP and the Virtual IP Address of the resource groups? Can these IP Addresses be common / same?
    Thanks,
    Chaitanya

    Chaitanya,
    There seems to be a little confusion in your question so let me try and explain.
    Resource groups do not necessarily have to have a logical (virtual) IP address. They will only have a logical IP address if you configure one by adding a SUNW.LogicalHostname resource to the resource group. When you create that resource, you give it the hostname of the IP address you want it to control. This host name should be in the name service, i.e. /etc/hosts, NIS, DNS or LDAP.
    When you have added such a resource to a resource group (RG), bringing the RG online will result in the logical IP address being plumbed up on one of the NICs that form the IPMP group that host the relevant subnet. So, if you have two NICs (ce0, ce1) in an IPMP group that support the 129.156.10.x/24 subnet and you added a logical IP address 129.156.10.42 to RG foo-rg, then bringing foo-rg online will result in 129.156.10.42 being plumbed in on either ce0 or ce1. ifconfig -a will then show this in the output, usually as ce0:1 or ce0:2 or ce1:3, etc.
    An IPMP group is a Solaris feature that supports IP multi-pathing. It uses either ICMP probes or link detection to determine the health of a network interface. In the event that a NIC fails, the IP addresses that are hosted on that NIC are transferred to the remaining NIC. So, if your host has an IP of 129.156.10.41 plumbed on ce0 and it fails, it will be migrated to ce1.
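    To make that concrete, a minimal sketch with SC 3.1-style commands, reusing the hypothetical names from the example above (foo-lh would resolve to 129.156.10.42 in /etc/hosts on every node):
    # scrgadm -a -g foo-rg
    # scrgadm -a -L -g foo-rg -l foo-lh    (adds a SUNW.LogicalHostname resource for host foo-lh)
    # scswitch -Z -g foo-rg                (bring the group online; ifconfig -a then shows the address on ce0:N or ce1:N)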
    That's a very short description of a much more detailed topic. Please have a look at the relevant sections in the Solaris and Solaris Cluster documentation on docs.sun.com.
    Hope that helps,
    Tim
    ---

  • SUN  CLUSTER RESOURCE FOR LEGATO CLIENT (LGTO.CLNT) in Oracle database

    hi everyone
    I am trying to create an LGTO.clnt resource in the oracle-rg resource group in Sun Cluster 3.2 with the following command:
    clresource create -g resource_group_name -t LGTO.clnt \
    -x clientname=virtual_hostname -x owned_paths=pathname_1,
    pathname_2[,...] resource_name
    I just need to know what the value of the Owned_Paths variable in the above command is,
    i.e. which path it is referring to ($ORACLE_HOME, the global devices path, etc.)?

    Hello,
    The Owned_Paths parameter lists the paths (or mount points) the Legato client will be able to back up from.
    To configure a Legato client in the NetWorker console (and have it managed as a cluster client) you need to declare in Owned_Paths the paths you want to save.
    The saveset paths can be directories under the Owned_Paths.
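    As a purely illustrative example (the names are hypothetical, and the paths are simply the mount points you want the Legato client to back up, not $ORACLE_HOME unless you also want that saved):
    # clresource create -g oracle-rg -t LGTO.clnt \
        -x clientname=ora-lh \
        -x owned_paths=/oradata,/oraarch lgto-clnt-rs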
    Regards
    Pablo Villanueva.

  • 11g r2 non rac using asm for sun cluster os (two node but non-rac)

    I am going to install the Grid Infrastructure for a non-RAC database using ASM in a two-node Sun Cluster environment.
    How do I create the candidate disks for ASM in Solaris Cluster (SPARC) for the Grid installation? Please provide the steps if anyone knows them.

    Please refer the thread Re: 11GR2 ASM in non-rac node not starting... failing with error ORA-29701
    and this doc http://docs.oracle.com/cd/E11882_01/install.112/e24616/presolar.htm#CHDHAAHE
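    For what it is worth, a common way to present a shared LUN to ASM on Solaris Cluster is to use a slice of the raw DID device and give the Grid/Oracle software owner access to it; a rough sketch with hypothetical device names and ownership (check the documents above for the exact requirements of your release):
    # cldevice list -v                       (find the DID device that corresponds to the shared LUN)
    # chown oracle:dba /dev/did/rdsk/d5s6    (use your Grid software owner and group)
    # chmod 660 /dev/did/rdsk/d5s6
    During the Grid installation, point the ASM disk discovery string at that device, e.g. /dev/did/rdsk/d5s6.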

  • Cluster Resource control and agent

    Hi,
    I have written a custom agent to make our Java-based application highly available. It runs in a scalable resource group, so I was wondering if there is a way to bring the resource group offline on just one node. One command I know of is scswitch -z -g <groupname> -h <list of nodes where you want the service to be up>. This works fine, but in my case the resource group runs on multiple nodes: if I take the node list of the resource group, remove the node where I want to stop my application and pass all the remaining nodes, it will also bring the resource group online on any node where an administrator had previously stopped it. As a workaround I would have to remember all the nodes where it was brought down and script the creation of the list of nodes where it should be up. So I was wondering whether this feature is available in Sun Cluster 3.1, or whether it should be an enhancement request to Sun.
    Secondly, my agent has some extension properties defined, and I was wondering if there is a way to define these properties so that they can have a different value on each node of the cluster. I know Sun Cluster resource groups have some per-node properties, such as RG_state, which have different values on each node. Is there a way to extend this to extension properties too?
    Thanks in advance.
    Vivek

    You can change the node list of the resource group: scswitch -cg <rg> -h <nodes>, where <nodes> are all the nodes except the node(s) on which you want to take the resource offline.
    Eivind
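    For illustration only, using the command the original poster mentioned: assuming a scalable group app-rg currently online on node1, node2 and node3, and that you want it offline on node2 alone (all names hypothetical), you would switch it onto the reduced list:
    # scswitch -z -g app-rg -h node1,node3
    The caveat described in the question still applies: this brings the group online on every listed node, including nodes where it had been stopped deliberately.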

  • Resource Failover on Sun Cluster

    Hi:
    I am a newbie on Solaris Cluster (I have worked with VCS for 4 years) and I am evaluating SC as an alternative to VCS.
    I am testing on a two-node cluster (SF V880, 4 CPUs, 16 GB RAM). I have created a failover resource group with two resources:
    - A logical hostname
    - An HAStoragePlus resource (5 file systems)
    I have enabled monitoring and managing of the resource group. To test switching the resource group I executed:
    clresourcegroup switch -n xxxx app1_rg and it works fine.
    If I reboot the server with the resource group online, the resource group is relocated to the other member of the cluster.
    I have found a problem (I suppose it will turn out to be a configuration error) when I try to force a failure in the resources. For example, if I umount all the file systems of the HAStoragePlus resource, the cluster doesn't detect this failure (the same when I unplumb the network interface).
    Could somebody help me with this?
    Thanks in advance (I'm sorry for my bad English).

    Hi,
    It is not a configuration error, but a matter of expectations. The HAStoragePlus resource does not monitor the file system status, so the behaviour is as expected. This is not much of a problem, because an application probe will detect that the underlying file system is gone anyway. But because many people have expressed the desire for file system monitoring, there are discussions underway to implement it. It is not available right now, though.
    The network resource is different. Unplumbing is not a valid way to inject a network error: the logical host monitors the status of the underlying IPMP group, and unplumbing does not change that. If you want to test a network error, you have to physically remove the cables.
    Cheers
    Detlef

  • Fe_rpc_command: cmd_type(enum) on Sun cluster

    Hi
    No matter what I do I seem to run into this problem.
    I designed three HA agents and one is really simple and all went ok until I tried to bring the resource group online:
    scswitch -Z -g Ingres_rg
    scstat -g shows:
    -- Resource Groups and Resources --
                Group Name    Resources
                ----------    ---------
     Resources: Ingres_rg     nodec Ingres_rs
    -- Resource Groups --
                Group Name    Node Name   State            Suspended
                ----------    ---------   -----            ---------
         Group: Ingres_rg     node2       Online faulted   No
         Group: Ingres_rg     node1       Offline          No
    -- Resources --
                Resource Name   Node Name   State     Status Message
                -------------   ---------   -----     --------------
      Resource: nodec           node2       Online    Online - LogicalHostname online.
      Resource: nodec           node1       Offline   Offline
      Resource: Ingres_rs       node2       Online    Faulted - Unable to Failover to other node.
      Resource: Ingres_rs       node1       Offline   Offline
    And then in /var/adm/messages:
    Mar 25 17:54:58 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource Ingres_rs status msg on node node2 change to <Starting>
    Mar 25 17:54:58 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <IngresHB_svc_start.ksh> for resource <Ingres_rs>, resource group <Ingres_rg>, node <node2>, timeout <300> seconds
    Mar 25 17:54:58 node2 Cluster.RGM.rgmd: [ID 268902 daemon.notice] 45 fe_rpc_command: cmd_type(enum):<1>:cmd=</opt/pirar01IngresHB/bin/IngresHB_svc_start.ksh>:tag=<Ingres_rg.Ingres_rs.0>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
    Mar 25 17:54:59 node2 Cluster.PMF.pmfd: [ID 887656 daemon.notice] Process: tag="Ingres_rg,Ingres_rs,0.svc", cmd="/global/disk2s0/ing_nc_1/ingres/utility/ingstart", Failed to stay up.
    Mar 25 17:54:59 node2 Cluster.RGM.rgmd: [ID 494478 daemon.notice] resource Ingres_rs in resource group Ingres_rg has requested restart of the resource on node2.
    Mar 25 17:54:59 node2 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <IngresHB_svc_start.ksh> completed successfully for resource <Ingres_rs>, resource group <Ingres_rg>, node <node2>, time used: 0% of timeout <300 seconds>
    Mar 25 17:54:59 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource Ingres_rs state on node node2 change to R_JUST_STARTED
    Mar 25 17:54:59 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource Ingres_rs state on node node2 change to R_ONLINE_UNMON
    Mar 25 17:54:59 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource Ingres_rs state on node node2 change to R_STOPPING
    Mar 25 17:54:59 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource Ingres_rs status msg on node node2 change to <Stopping>
    Mar 25 17:54:59 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <IngresHB_svc_stop.ksh> for resource <Ingres_rs>, resource group <Ingres_rg>, node <node2>, timeout <300> seconds
    Mar 25 17:54:59 node2 Cluster.RGM.rgmd: [ID 268902 daemon.notice] 45 fe_rpc_command: cmd_type(enum):<1>:cmd=</opt/pirar01IngresHB/bin/IngresHB_svc_stop.ksh>:tag=<Ingres_rg.Ingres_rs.1>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
    Mar 25 17:55:00 node2 pirar01.IngresHB:1.0,Ingres_rg,Ingres_rs: [ID 970707 daemon.error] Failed to stop IngresHB using the custom stop command; trying SIGKILL now.
    Mar 25 17:55:00 node2 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <IngresHB_svc_stop.ksh> completed successfully for resource <Ingres_rs>, resource group <Ingres_rg>, node <node2>, time used: 0% of timeout <300 seconds>
    Mar 25 17:55:00 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource Ingres_rs state on node node2 change to R_OFFLINE
    Mar 25 17:55:00 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource Ingres_rs status on node node2 change to R_FM_OFFLINE
    Mar 25 17:55:00 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource Ingres_rs status msg on node node2 change to <>
    Mar 25 17:55:00 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource Ingres_rs state on node node2 change to R_STARTING
    Mar 25 17:55:00 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource Ingres_rs status on node node2 change to R_FM_UNKNOWN
    Mar 25 17:55:00 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource Ingres_rs status msg on node node2 change to <Starting>
    Mar 25 17:55:00 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <IngresHB_svc_start.ksh> for resource <Ingres_rs>, resource group <Ingres_rg>, node <node2>, timeout <300> seconds
    Mar 25 17:55:00 node2 Cluster.RGM.rgmd: [ID 268902 daemon.notice] 45 fe_rpc_command: cmd_type(enum):<1>:cmd=</opt/pirar01IngresHB/bin/IngresHB_svc_start.ksh>:tag=<Ingres_rg.Ingres_rs.0>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
    Mar 25 17:55:00 node2 Cluster.PMF.pmfd: [ID 887656 daemon.notice] Process: tag="Ingres_rg,Ingres_rs,0.svc", cmd="/global/disk2s0/ing_nc_1/ingres/utility/ingstart", Failed to stay up.
    Mar 25 17:55:01 node2 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <IngresHB_svc_start.ksh> completed successfully for resource <Ingres_rs>, resource group <Ingres_rg>, node <node2>, time used: 0% of timeout <300 seconds>
    Mar 25 17:55:01 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource Ingres_rs state on node node2 change to R_JUST_STARTED
    Mar 25 17:55:01 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource Ingres_rs state on node node2 change to R_ONLINE_UNMON
    Mar 25 17:55:01 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource Ingres_rs status on node node2 change to R_FM_ONLINE
    Mar 25 17:55:01 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource Ingres_rs status msg on node node2 change to <>
    Mar 25 17:55:01 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource Ingres_rs state on node node2 change to R_MON_STARTING
    Mar 25 17:55:01 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <IngresHB_mon_start.ksh> for resource <Ingres_rs>, resource group <Ingres_rg>, node <node2>, timeout <300> seconds
    Mar 25 17:55:01 node2 Cluster.RGM.rgmd: [ID 268902 daemon.notice] 45 fe_rpc_command: cmd_type(enum):<1>:cmd=</opt/pirar01IngresHB/bin/IngresHB_mon_start.ksh>:tag=<Ingres_rg.Ingres_rs.7>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
    Mar 25 17:55:01 node2 Cluster.RGM.rgmd: [ID 494478 daemon.notice] resource Ingres_rs in resource group Ingres_rg has requested failover of the resource group on node2.
    Mar 25 17:55:01 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource Ingres_rs status on node node2 change to R_FM_FAULTED
    Mar 25 17:55:01 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource Ingres_rs status msg on node node2 change to <Unable to Failover to other node.>
    Mar 25 17:55:01 node2 [pirar01.IngresHB:1.0,Ingres_rs,Ingres_rg]: [ID 115617 daemon.error] Error in failing over the resource group:Ingres_rg
    Mar 25 17:55:01 node2 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <IngresHB_mon_start.ksh> completed successfully for resource <Ingres_rs>, resource group <Ingres_rg>, node <node2>, time used: 0% of timeout <300 seconds>
    Mar 25 17:55:01 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource Ingres_rs state on node node2 change to R_ONLINE
    Mar 25 17:55:01 node2 Cluster.RGM.rgmd: [ID 529407 daemon.notice] resource group Ingres_rg state on node node2 change to RG_ONLINE
    What am I missing here ?
    Could it be related to :
    Mar 25 17:55:00 node2 Cluster.RGM.rgmd: [ID 268902 daemon.notice] 45 fe_rpc_command: cmd_type(enum):<1>:cmd=</opt/pirar01IngresHB/bin/IngresHB_svc_start.ksh>:tag=<Ingres_rg.Ingres_rs.0>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
    Any workaround ?
    Thanks in advance for help.
    Armand

    Hi Thorsten
    Yes you are right, I used scdsbuilder in this case.
    Thank you for the directions pointed out.
    I am not sure I fully understand what you are saying, but let me clarify a little what I am trying to do here.
    When ingstart runs, it simply starts the DBMS installation in a specific order, that is: name server (iigcn), recovery (dmfrcp), archiver (dmfacp), DBMS server (iidbms), net server (iigcc). After that, ingstart simply exits.
    So from that standpoint the enumerated processes can be considered children of ingstart, but in terms of PID correlation, ingstart is not the parent of any of those processes.
    The same is true for ingstop.
    On the other hand, in a pure failover scenario, all I want to do is fail over from node1 to node2 when node1 goes down, either because of a hardware problem or an OS problem. That means that along with the host failover (the logical hostname moving from node1 to node2), all I need to do is run an ingstop (there are reasons to do that) followed by an ingstart. Probing is not required for me in this case at all; I can see the value of probing in an active/active cluster, i.e. scalable resources and so on, but not in a failover one.
    So maybe you could clarify, if possible, how I should make use of Child_mon_level in this case, and point me at any other resources I may need to consult.
    Much appreciated the help.
    Thanks
    Armand

  • Problem switching on resource group

    I am in the process of setting up a new two node cluster. I do have Sun Cluster 3.2 installed on a pair of recently patched Solaris 10 T1000 servers.
    I ran the command "scswitch -z -h mw3 -g ensemble-rg" and the command just hung; it has not completed or timed out. I tried to stop it with "scswitch -k -Q -g ensemble-rg" on the mw3 server, but that also has not completed.
    I tried to run "clresourcegroup online +" and "clresourcegroup offline -v +" and got the same message:
    clresourcegroup: (C667636) ensemble-rg: resource group is undergoing a reconfiguration, try again later
    What do I need to do to get the resource group and hosts completed?
    Thank you,
    Tom.

    Hello Tim,
    At this point there is nothing in the ensemble-rg. The logical host ens-perf has its own IP address, separate from the two nodes in the cluster, and no other machine on the network answers on that IP address.
    What can I look at to show me what might be wrong with the cluster configuration?
    Here is what is in /var/adm/messages for today:
    Nov 20 09:02:47 mw3 Cluster.CCR: [ID 499775 daemon.notice] resource group ensemble-rg added.
    Nov 20 09:03:27 mw3 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_validate> for resource <ens-perf>, resource group <ensemble-rg>, node <mw3>, timeout <300> seconds
    Nov 20 09:03:27 mw3 Cluster.RGM.rgmd: [ID 375444 daemon.notice] 8 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_validate>:tag=<ensemble-rg.ens-perf.2>: Calling security_clnt_connect(..., host=<mw3>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
    Nov 20 09:03:27 mw3 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hafoip_validate> completed successfully for resource <ens-perf>, resource group <ensemble-rg>, node <mw3>, time used: 0% of timeout <300 seconds>
    Nov 20 09:03:28 mw3 Cluster.CCR: [ID 973933 daemon.notice] resource ens-perf added.
    The scswitch command that hung was started at 09:04:51 according to ps.
    I still have the scswitch and clresourcegroup commands that are still hung.
    I was creating the resource group to install the application on the cluster nodes - so I don't have any application logs to check at this point because this is a brand new cluster that I am setting up to test.
    Here is the output for clrg status:
    clrg status
    === Cluster Resource Groups ===
    Group Name     Node Name     Suspended     Status
    ----------     ---------     ---------     ------
    ensemble-rg    mw4           No            Offline
                   mw3           No            Pending online
    I only have one resource group
    From an scstat:
    -- Device Group Servers --
                             Device Group    Primary    Secondary
                             ------------    -------    ---------
      Device group servers:  ensemble        mw3        mw4
      Device group servers:  journal         mw3        mw4
      Device group servers:  wij             mw3        mw4
    -- Device Group Status --
                            Device Group    Status
                            ------------    ------
      Device group status:  ensemble        Online
      Device group status:  journal         Online
      Device group status:  wij             Online
    -- Multi-owner Device Groups --
                            Device Group    Online Status
    -- Resource Groups and Resources --
                Group Name     Resources
                ----------     ---------
     Resources: ensemble-rg    ens-perf
    Thank you,
    Tom.
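    In case others hit the same C667636 message: one hedged approach on Sun Cluster 3.2 is to quiesce the group and retry (the commands are standard, but whether they free a wedged reconfiguration depends on what is actually stuck):
    # clresourcegroup quiesce ensemble-rg    (stop the RGM from continuing the in-progress reconfiguration)
    # clresourcegroup status ensemble-rg
    # clresourcegroup online -n mw3 ensemble-rg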

  • Sun Cluster 3.2  without share storage. (Sun StorageTek Availability Suite)

    Hi all.
    I have a two-node Sun Cluster.
    I have configured and installed AVS on these nodes (AVS remote mirror replication).
    AVS is working fine, but I don't understand how to integrate it into the cluster.
    Here is what I did:
    Created a remote mirror with AVS:
    v210-node1# sndradm -P
    /dev/rdsk/c1t1d0s1      ->      v210-node0:/dev/rdsk/c1t1d0s1
    autosync: on, max q writes: 4096, max q fbas: 16384, async threads: 2, mode: sync, group: AVS_TEST_GRP, state: replicating
    v210-node1# 
    v210-node0# sndradm -P
    /dev/rdsk/c1t1d0s1      <-      v210-node1:/dev/rdsk/c1t1d0s1
    autosync: on, max q writes: 4096, max q fbas: 16384, async threads: 2, mode: sync, group: AVS_TEST_GRP, state: replicating
    v210-node0#
    Created a resource group in Sun Cluster:
    v210-node0# clrg status avs_test_rg
    === Cluster Resource Groups ===
    Group Name       Node Name       Suspended      Status
    avs_test_rg      v210-node0      No             Offline
                     v210-node1      No             Online
    v210-node0#
    Created a SUNW.HAStoragePlus resource with the AVS device:
    v210-node0# cat /etc/vfstab  | grep avs
    /dev/global/dsk/d11s1 /dev/global/rdsk/d11s1 /zones/avs_test ufs 2 no logging
    v210-node0#
    v210-node0# clrs show avs_test_hastorageplus_rs
    === Resources ===
    Resource:                                       avs_test_hastorageplus_rs
      Type:                                            SUNW.HAStoragePlus:6
      Type_version:                                    6
      Group:                                           avs_test_rg
      R_description:
      Resource_project_name:                           default
      Enabled{v210-node0}:                             True
      Enabled{v210-node1}:                             True
      Monitored{v210-node0}:                           True
      Monitored{v210-node1}:                           True
    v210-node0#
    By default everything works fine.
    But if I need to switch the RG to the other node, I have a problem:
    v210-node0# clrs status avs_test_hastorageplus_rs
    === Cluster Resources ===
    Resource Name               Node Name    State     Status Message
    avs_test_hastorageplus_rs   v210-node0   Offline   Offline
                                v210-node1   Online    Online
    v210-node0# 
    v210-node0# clrg switch -n v210-node0 avs_test_rg
    clrg:  (C748634) Resource group avs_test_rg failed to start on chosen node and might fail over to other node(s)
    v210-node0#
    If I change the replication state to logging, everything works:
    v210-node0# sndradm -C local -l
    Put Remote Mirror into logging mode? (Y/N) [N]: Y
    v210-node0# clrg switch -n v210-node0 avs_test_rg
    v210-node0# clrs status avs_test_hastorageplus_rs
    === Cluster Resources ===
    Resource Name               Node Name    State     Status Message
    avs_test_hastorageplus_rs   v210-node0   Online    Online
                                v210-node1   Offline   Offline
    v210-node0#
    How can I do this without creating an SC agent for it?
    Anatoly S. Zimin

    Normally you use AVS to replicate data from one Solaris Cluster to another. Can you just clarify whether you are replicating to another cluster or trying to do it between a single cluster's nodes? If it is the latter, then this is not something that Sun officially supports (IIRC); rather, it is something that has been developed in the open source community. As such it will not be documented in the main Sun SC documentation set. Furthermore, support and/or questions for it should be directed to the author of the module.
    Regards,
    Tim
    ---

  • Apply one non-kernel Solaris10 patch at Sun Cluster ***Beginner Question***

    Dear Sir/Madam,
    Our two Solaris 10 servers are running Sun Cluster 3.3. One server, "cluster-1", has one online running zone, "classical". The other server, "cluster-2", has two online running zones, "romantic" and "modern". We are trying to install a regular non-kernel patch, #145200-03, on cluster-1 live; it has no prerequisites and does not need a reboot afterwards. Our goal is to install this patch in the global zone and in the three local zones, i.e. classical, romantic and modern, on both cluster servers, cluster-1 and cluster-2.
    Unfortunately, when we began patching cluster-1, it could patch the running zone "classical", but we got the following errors, which prevent it from continuing with the zones "romantic" and "modern" that are running on cluster-2. And when we try to patch cluster-2, we get a similar patching error about failing to boot the non-global zone "classical", which is on cluster-1.
    Any idea how I could resolve this? Do we have to shut down the cluster in order to apply this patch? I would prefer to apply the patch with Sun Cluster running. If not, what is the preferred way to apply a simple non-reboot patch to all the zones on both nodes of the Sun Cluster?
    I'd like to hear from folks who have experience with patching in Sun Cluster.
    Thanks, Mr. Channey
    p.s. Below is the output from the patch #145200-03 run, plus the zoneadm and clrg outputs on cluster-1.
    root@cluster-1# patchadd 145200-03
    Validating patches...
    Loading patches installed on the system...
    Done!
    Loading patches requested to install.
    Done!
    Checking patches that you specified for installation.
    Done!
    Approved patches will be installed in this order:
    145200-03
    Preparing checklist for non-global zone check...
    Checking non-global zones...
    Failed to boot non-global zone romantic
    exiting
    root@cluster-1# zoneadm list -iv
      ID NAME        STATUS      PATH               BRAND     IP
       0 global      running     /                  native    shared
      15 classical   running     /zone-classical    native    shared
       - romantic    installed   /zone-romantic     native    shared
       - modern      installed   /zone-modern       native    shared
    root@cluster-1# clrg status
    === Cluster Resource Groups ===
    Group Name    Node Name    Suspended    Status
    ----------    ---------    ---------    ------
    classical     cluster-1    No           Online
                  cluster-2    No           Offline
    romantic      cluster-1    No           Offline
                  cluster-2    No           Online
    modern        cluster-1    No           Offline
                  cluster-2    No           Online

    Hi Hartmut,
    I kind of got the idea. Just want to make sure. The zones 'romantic' and 'modern' show "installed" as the current status at cluster-1. These 2 zones are in fact running and online at cluster-2. So I will issue your commands below at cluster-2 to detach these zones to "configured" status :
    cluster-2 # zoneadm -z romantic detach
    cluster-2 # zoneadm -z modern detach
    Afterwards, I apply the Solaris patch at cluster-2. Then, I go to cluster-1 and apply the same Solaris patch. Once I am done patching both cluster-1 and cluster-2, I will
    go back to cluster-2 and run the following commands to force these zones back to "installed" status :
    cluster-2 # zoneadm -z romantic attach -f
    cluster-2 # zoneadm -z modern attach -f
    CORRECT ?? Please let me know if I am wrong or if there's any step missing. Thanks much, Humphrey
    root@cluster-1# zoneadm list -iv
      ID NAME        STATUS      PATH               BRAND     IP
       0 global      running     /                  native    shared
      15 classical   running     /zone-classical    native    shared
       - romantic    installed   /zone-romantic     native    shared
       - modern      installed   /zone-modern       native    shared

  • Two resource groups one on each node.

    Hi, I am setting up a Sun Cluster with two nodes and storage.
    I want to run Oracle and WebLogic.
    I don't want to run both on the same server, as the load matters and there are two boxes that can be used.
    I am planning to set up two resource groups with logical hosts, one on each box, and bind Oracle to one logical host and WebLogic to the second logical host on the other box.
    My requirement is this:
    I want the first box to be the primary for the Oracle resource group and the second box to be the primary for the WebLogic resource group.
    Can you please tell me how to configure different resource groups to start on different boxes?
    Thanks in advance for the help.
    Ramesh.

    Hi, this document talks about binding the resources in different ways, but it does not suggest how to set a dependency between resources in different resource groups.
    To make it a little clearer:
    I have box1 and box2 as the nodes of the cluster.
    The resources are distributed as follows.
    Box1                 Box2
    ----                 ----
    resourcegroup1       Resourcegroup2
    logicalresource1     logicalresource2
    Weblogicresource     Oracleresource
    I will configure the resource groups to start on different nodes at startup by providing the node list in a different sequence for each resource group.
    But I want to set a dependency here: "Weblogicresource" should not start before "Oracleresource".
    Can I straightaway set the dependency as we do for resources in the same group, or is there another method to do this?
    Since I do not have a ready/usable test cluster, I am unable to try it before posting the question.
    Please let me know the possible solution for this.
    Message was edited by:
    Ramesh_PS1
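    For illustration, a hedged sketch with SC 3.1-style commands and the names from the layout above (check the RGM documentation for the exact semantics of dependencies that cross resource groups before relying on them):
    # scrgadm -a -g resourcegroup1 -h box1,box2    (box1 is the preferred primary)
    # scrgadm -a -g Resourcegroup2 -h box2,box1    (box2 is the preferred primary)
    # scrgadm -c -g resourcegroup1 -y RG_dependencies=Resourcegroup2
    # scrgadm -c -j Weblogicresource -y Resource_dependencies=Oracleresource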

  • Creating Logical hostname in sun cluster

    Can someone tell me what exactly a logical hostname in Sun Cluster means?
    When registering a logical hostname resource in a failover group, what exactly do I need to specify?
    For example, I have two nodes in a Sun Cluster. How do I create or configure a logical hostname, and which IP address should it point to (should it point to the IP addresses of the cluster nodes)? Can I get clarification on this?
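    For illustration, a minimal SC 3.2-style sketch using the group and resource names that appear later in this thread (abc-vip is a hypothetical hostname; it must resolve, e.g. in /etc/hosts on every node, to a spare IP address on the public subnet rather than to a node's own address):
    # clreslogicalhostname create -g abc_rg -h abc-vip abc_lg
    # clresourcegroup online -eM abc_rg    (manage the group, enable its resources and bring it online)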

    Thanks Thorsten for ur continue help...
    The output of clrs status abc_lg
    === Cluster Resources ===
    Resource Name    Node Name    State     Status Message
    -------------    ---------    -----     --------------
    abc_lg           node1        Offline   Offline
                     node2        Offline   Offline
    The status is offline...
    the output of clresourcegroup status
    === Cluster Resource Groups ===
    Group Name    Node Name    Suspended    Status
    ----------    ---------    ---------    ------
    abc_rg        node1        No           Unmanaged
                  node2        No           Unmanaged
    You say that the resource should be enabled after creating it. I am using GDS and I am just following the steps provided (in the developer's guide) to achieve high availability.
    I have 1) a logical hostname resource and
    2) an application resource in my failover resource group.
    When I bring the failover resource group online, what should the status of my failover resource group and of the resources in the group be?
