Help: node panic in Sun Cluster 3.3

We have a two-node cluster; both nodes are connected to the same storage.
The node that owned the resources accidentally lost power, and this caused the other node in the cluster to panic.
We checked the log on the node that panicked. It reported "reservation conflict" just before the panic.
Does anybody know what is wrong?

Hi.
What storage is used as the shared storage?
How many paths does each node have to this storage?
A cluster node reserves the storage LUNs for dedicated access (storage reservation). If, at failover, the surviving node cannot set its reservation on the shared devices, it panics.
Check these settings and recommendations:
https://blogs.oracle.com/js/entry/prevent_reservation_conflict_panic_if
Regards.
Edited by: Nik on 10.08.2012 1:37
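
To confirm that the surviving node really logged the message before it panicked, a quick search of its system log can help. This is only a sketch for bash or ksh: the function name is made up for illustration, and it assumes the log is the standard Solaris syslog file /var/adm/messages (pass another path if yours differs).

```shell
# Hypothetical helper: print, with line numbers, every log line that
# mentions a reservation conflict (case-insensitive).
# $1 is the log file; /var/adm/messages is an assumed default.
find_reservation_conflicts() {
    grep -in 'reservation conflict' "${1:-/var/adm/messages}"
}

# Example: find_reservation_conflicts /var/adm/messages
```

The lines just before each match usually show which device the conflict was reported on.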

Similar Messages

  • Close/shutdown the Sun Cluster Package/resource Group

    Hi,
    I have a Sun Cluster system.
    I want to know which script runs when Sun Cluster shuts down the package "app-gcota-rg", as I may need to modify it. Where can I find this information on the system?
    In which directory and log file?
    Any suggestions?
    Resource Groups --
           Group Name     Node Name    State
    Group: ora_gcota_rg   ytgcota-1    Online
    Group: ora_gcota_rg   ytgcota-2    Offline
    Group: app-gcota-rg   ytgcota-1    Online
    Group: app-gcota-rg   ytgcota-2    Offline

    Hi,
    You would first find out which resources belong to app-gcota-rg.
    Run "clrs list -g app-gcota-rg". Then find out which of the resources is the one dealing with your application, and determine its resource type:
    "clrs show -v <resource name> | fgrep Type". If it is a standard type such as HA Oracle, it is an extremely bad idea to hack the scripts, as you will lose support. If the type is SUNW.gds, the scripts to start, stop and monitor the application are user supplied. You can find their pathnames using:
    "clrs show -v <resource-name> | fgrep _command". This should display the full pathnames.
    Regards
    Hartmut
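
    The three lookups above can be strung together in one pass. A minimal sketch for bash or ksh, assuming the clrs command is on the PATH; show_rg_methods is a made-up name, and plain grep is used in place of fgrep, which matches the same lines for these fixed strings.

    ```shell
    # Hypothetical helper: for each resource in resource group $1, print
    # its Type line(s) and any *_command properties, using the same clrs
    # calls quoted in the reply above.
    show_rg_methods() {
        rg=$1
        for rs in $(clrs list -g "$rg"); do
            echo "== $rs"
            clrs show -v "$rs" | grep Type
            clrs show -v "$rs" | grep _command
        done
    }

    # Example: show_rg_methods app-gcota-rg
    ```

    Resources that print no _command lines are not using user-supplied scripts of that kind, so those are the ones to leave alone.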

  • Sun Cluster Core Conflict - on SUN Java install

    Hi
    We had a prototype cluster that we were playing with over two nodes.
    We decided to uninstall the cluster by putting the node into single-user mode and running scinstall -r.
    Afterwards we found that the Java Availability Suite was a little messed up - maybe because the kernel/registry had not been updated - it thought the cluster and agent software was uninstalled and would not let us re-install. All the executables from /etc/cluster/bin had been removed from the nodes.
    So, on both nodes we ran the uninstall program from /var/sadm/prod/... and then selected cluster and agents to uninstall.
    On the first node, this completely removed the Sun Cluster components and then allowed us to re-install the cluster software successfully.
    On the second node, for some reason, it has left behind the component "Sun Cluster Core", and will not allow us to remove it with the uninstall.
    When we try to re-install we get the following:
    "Conflict - incomplete version of Sun Cluster Core has been detected"
    It then points us to the Sun Cluster upgrade guide on sun.com.
    My question is - how do we 'clean up' this node and remove the sun cluster core so we can re-install the sun cluster software from scratch?
    I don't quite understand how this has been left behind....
    thanks in advance
    S1black.

    You can use prodreg directly to clean up when your de-install has gone bad.
    Use:
    # prodreg browse
    to list the products. You may need to recurse down into the individual items. Then use:
    # prodreg unregister ...
    to unregister and pkgrm to remove the packages manually.
    That has worked for me in the past. Not sure if it is the 'official' way though!
    Regards,
    Tim
    ---

  • Cluster node panic on booting

    Hi
    I have set up a two-node cluster with Sun Cluster 3.1 u4 on Sun V890 servers and a StorageTek 6140. The cluster runs Oracle RAC with Oracle 10g and Clusterware.
    When everything was finished, I mirrored the boot disk with SVM on the nodes, but during the Solaris boot it panics like this:
    Jun 23 15:14:37 hisa ID[SUNWudlm.udlm]: [ID 795570 local0.error] Unix DLM version (2) and SUN Unix DLM library version (1): compatible.
    Jun 23 15:14:37 hisa Cluster.OPS.UCMMD: [ID 525628 daemon.notice] CMM: Cluster has reached quorum.
    Jun 23 15:14:37 hisa Cluster.OPS.UCMMD: [ID 377347 daemon.notice] CMM: Node hisa (nodeid = 1) is up; new incarnation number = 1182582874.
    Jun 23 15:14:37 hisa Cluster.OPS.UCMMD: [ID 377347 daemon.notice] CMM: Node hisb (nodeid = 2) is up; new incarnation number = 1182582873.
    Jun 23 15:14:38 hisa java[1656]: [ID 807473 user.error] pkcs11_softtoken: Keystore version failure.
    Jun 23 15:15:30 hisa cl_dlpitrans: [ID 624622 kern.notice] Notifying cluster that this node is panicking
    Jun 23 15:15:30 hisa unix: [ID 836849 kern.notice]
    Jun 23 15:15:30 hisa ^Mpanic[cpu2]/thread=2a100047cc0:
    Jun 23 15:15:30 hisa unix: [ID 213328 kern.notice] kstat_q_exit: qlen == 0
    Jun 23 15:15:30 hisa unix: [ID 100000 kern.notice]
    Jun 23 15:15:30 hisa genunix: [ID 723222 kern.notice] 000002a100047020 SUNW,UltraSPARC-IV+:kstat_q_panic+8 (300026ab150, 0, ffffffffffffffff, 2200061, 300026ab150, 5800)
    Jun 23 15:15:30 hisa genunix: [ID 179002 kern.notice] %l0-3: 0000000000000002 0000060001815000 0000000000000000 0000030000241b80
    Jun 23 15:15:30 hisa %l4-7: 0000030000241b80 0000000000000000 0000000000000000 0000000001297400
    Jun 23 15:15:31 hisa genunix: [ID 723222 kern.notice] 000002a1000470d0 md:md_kstat_done+cc (600060dda08, 60001fc5938, 0, 600060dda30, 200, 300026ab040)
    Jun 23 15:15:31 hisa genunix: [ID 179002 kern.notice] %l0-3: 00000300026ab040 00000300026ab150 0000000000000009 0000000000000008
    Jun 23 15:15:31 hisa %l4-7: 00000300026d5d00 0000000000000002 0000000000000008 0000000000000000
    Jun 23 15:15:31 hisa genunix: [ID 723222 kern.notice] 000002a100047180 md_sp:sp_done+114 (0, 600060dda08, 0, 60001fc5938, 6000750ddf0, 704b7800)
    Jun 23 15:15:31 hisa genunix: [ID 179002 kern.notice] %l0-3: 0000000000200061 00000300026aaff0 000000000000000b 000000000000000a
    Jun 23 15:15:31 hisa %l4-7: 0000000000004000 0000000000000000 0000000000000001 00000000704b7800
    Jun 23 15:15:31 hisa genunix: [ID 723222 kern.notice] 000002a100047230 md_stripe:stripe_done+13c (4, 6000721cb38, 703c1400, 6000750f930, 60007509ce8, 60007509d40)
    Jun 23 15:15:31 hisa genunix: [ID 179002 kern.notice] %l0-3: 000006000750b730 0000000000004000 0000000000000000 0000000000000001
    Jun 23 15:15:31 hisa %l4-7: 0000000000000000 0000000000000000 00000000703c1400 0000000000000000
    Jun 23 15:15:31 hisa genunix: [ID 723222 kern.notice] 000002a1000472e0 did:did_done+3c (60004b8b9c0, 60001236a80, 6000750b770, 6000131d280, 2200061, 0)
    Jun 23 15:15:31 hisa genunix: [ID 179002 kern.notice] %l0-3: 0000000001202390 00000000018cafd8 0000000001202310 0000000000200061
    Jun 23 15:15:31 hisa %l4-7: 0000000002200061 00000000fdffffff 00000000fdfffc00 000000007b666310
    Jun 23 15:15:32 hisa genunix: [ID 723222 kern.notice] 000002a100047390 ssd:ssd_return_command+198 (60001236a80, 60004b8b9c0, 4, 6000131d280, 4, 4)
    Jun 23 15:15:32 hisa genunix: [ID 179002 kern.notice] %l0-3: 0000000000000020 00000000018cad68 00000000018cac00 000000000126ced8
    Jun 23 15:15:32 hisa %l4-7: 0000000000000020 00000000018caf08 00000000018cac00 0000000000000004
    Jun 23 15:15:32 hisa genunix: [ID 723222 kern.notice] 000002a100047440 ssd:ssdintr+268 (60006e0f458, 0, 0, 6000594d680, 60004b8b9c0, 60001236a80)
    Jun 23 15:15:32 hisa genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 0000000000000000 0000000000004000 0000060006e0f4f8
    Jun 23 15:15:32 hisa %l4-7: 0000000000000000 0000000000000000 0000000000000001 0000000000000000
    Jun 23 15:15:32 hisa genunix: [ID 723222 kern.notice] 000002a1000474f0 scsi_vhci:vhci_intr+7b0 (600011e8dc0, 60006e0f4b8, 600018b13e0, 0, 60001822388, 60006e0f458)
    Jun 23 15:15:32 hisa genunix: [ID 179002 kern.notice] %l0-3: 0000060001863a40 0000000000000000 0000060006e0f4b8 0000000000000000
    Jun 23 15:15:32 hisa %l4-7: 0000000000000000 0000060006e0f4f8 00000600018b1284 0000000000000028
    Jun 23 15:15:32 hisa genunix: [ID 723222 kern.notice] 000002a1000475d0 fcp:ssfcp_cmd_callback+64 (600018b1438, 0, 1, 813, 600018b1248, 600011c2f40)
    Jun 23 15:15:32 hisa genunix: [ID 179002 kern.notice] %l0-3: 0000000000000002 0000060001815000 0000000000000000 0000030000241b80
    Jun 23 15:15:32 hisa %l4-7: 0000030000241b80 0000000000000000 0000000000000000 0000000001297400
    Jun 23 15:15:33 hisa genunix: [ID 723222 kern.notice] 000002a100047680 qlc:ql_fast_fcp_post+178 (600018b15d8, 128ae70, 600018b1438, 60001236fc0, 60001237038, 128ae70)
    Jun 23 15:15:33 hisa genunix: [ID 179002 kern.notice] %l0-3: 0000000000400000 00000000018d5148 0000000000000803 0000000000000001
    Jun 23 15:15:33 hisa %l4-7: 00000600018b1438 00000600018b1438 00000600018b1438 00000600018b1278
    Jun 23 15:15:33 hisa genunix: [ID 723222 kern.notice] 000002a100047730 qlc:ql_24xx_status_entry+1ec (0, 300012008c0, 2a100047958, 2a10004796c, 0, 0)
    Jun 23 15:15:33 hisa genunix: [ID 179002 kern.notice] %l0-3: 0000000000000811 00000600018b15d8 0000000000000000 0000000000080811
    Jun 23 15:15:33 hisa %l4-7: 00000000fff7ffff 0000000000000001 0000000000000001 0000000000000000
    Jun 23 15:15:33 hisa genunix: [ID 723222 kern.notice] 000002a1000477e0 qlc:ql_response_pkt+248 (60001236fc0, 2a100047958, 2a10004796c, 2a100047968, 20aa, 2840)
    Jun 23 15:15:33 hisa genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 0000000000004000 0000000000002000 0000000000000000
    Jun 23 15:15:33 hisa %l4-7: 0000000000000000 00000300012008c0 0000000000000000 0000000000000000
    Jun 23 15:15:33 hisa genunix: [ID 723222 kern.notice] 000002a100047890 qlc:ql_isr+664 (60001236fc0, a2, 8000, a2, ffffffffffffffff, 60001237018)
    Jun 23 15:15:34 hisa genunix: [ID 179002 kern.notice] %l0-3: 0000000000002000 0000000000004000 0000060001236fd8 00000000012db3a8
    Jun 23 15:15:34 hisa %l4-7: 0000000000000001 0000000000000000 0000000000000000 0000000000000003
    Jun 23 15:15:34 hisa genunix: [ID 723222 kern.notice] 000002a100047970 qlc:___const_seg_900000101+db4 (60001236fc0, 0, 60001236fc0, 0, 0, 60001237018)
    Jun 23 15:15:34 hisa genunix: [ID 179002 kern.notice] %l0-3: 0000000000002000 0000000000004000 0000060001236fd8 00000000012db3a8
    Jun 23 15:15:34 hisa %l4-7: 0000000000000001 0000000000000001 0000000000000000 00000000000001ab
    Jun 23 15:15:34 hisa genunix: [ID 723222 kern.notice] 000002a100047a20 pcisch:pci_intr_wrapper+b4 (6000122f7b0, 600010b5230, 0, 0, 0, 6000136c738)
    Jun 23 15:15:34 hisa genunix: [ID 179002 kern.notice] %l0-3: 00000000018d5170 00000600010f8c80 00000000018d51b8 0000000000000001
    Jun 23 15:15:34 hisa %l4-7: 0000030000220220 0000060001236fc0 0000000000000000 00000000012dc158
    Jun 23 15:15:34 hisa unix: [ID 100000 kern.notice]
    Jun 23 15:15:34 hisa genunix: [ID 672855 kern.notice] syncing file systems...
    Has anyone ever seen something like this?
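
    A trace like this is easier to read once the register dumps are stripped and only the call frames remain. A rough sketch for bash or ksh: panic_frames is a made-up name, and the filter assumes the frame lines look exactly like the ones above (tagged genunix with message ID 723222, with the module:function+offset in the tenth field).

    ```shell
    # Hypothetical helper: print only the module:function+offset of each
    # stack frame from a Solaris panic log (file arguments or stdin).
    # Assumes frame lines are tagged "genunix: [ID 723222 kern.notice]".
    panic_frames() {
        awk '$5 == "genunix:" && $7 == "723222" { print $10 }' "$@"
    }

    # Example: panic_frames /var/adm/messages
    ```

    For the log above this lists the frames from kstat_q_panic down through the md, md_sp, md_stripe, did, ssd, scsi_vhci, fcp, qlc and pcisch modules, which makes it easier to see which drivers were involved when searching for known issues.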

    Thank you for your attention!
    I would like to add some more information about this issue.
    We did not use a switch between the hosts and the storage; they are connected directly with fibre cables. I don't know whether this could cause problems. We did not use QFS either. Besides the panic at boot, the messages file sometimes shows this:
    Jun 26 15:32:20 hisb genunix: [ID 454863 kern.info] dump on /dev/dsk/c4t500000E0147BACB0d0s1 size 8198 MB
    Jun 26 15:33:18 hisb cacao[978]: [ID 388282 daemon.warning] com.sun.cacao.ModuleManager.garbage : Cannot garbage class loader for module com.sun.cacao.snmpv3_adaptor
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 529407 daemon.notice] resource group rac-rg state on node hisb change to RG_PENDING_OFFLINE
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource rac-svm-rs state on node hisb change to R_MON_STOPPING
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource rac-udlm-rs state on node hisb change to R_MON_STOPPING
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource rac-framework-rs state on node hisb change to R_MON_STOPPING
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <bin/rac_framework_monitor_stop> for resource <rac-framework-rs>, resource group <rac-rg>, timeout <3600> seconds
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <bin/rac_udlm_monitor_stop> for resource <rac-udlm-rs>, resource group <rac-rg>, timeout <300> seconds
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <bin/rac_svm_monitor_stop> for resource <rac-svm-rs>, resource group <rac-rg>, timeout <300> seconds
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <bin/rac_framework_monitor_stop> completed successfully for resource <rac-framework-rs>, resource group <rac-rg>, time used: 0% of timeout <3600 seconds>
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource rac-framework-rs state on node hisb change to R_ONLINE_UNMON
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <bin/rac_svm_monitor_stop> completed successfully for resource <rac-svm-rs>, resource group <rac-rg>, time used: 0% of timeout <300 seconds>
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource rac-svm-rs state on node hisb change to R_ONLINE_UNMON
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource rac-svm-rs state on node hisb change to R_STOPPING
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource rac-svm-rs status on node hisb change to R_FM_UNKNOWN
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource rac-svm-rs status msg on node hisb change to <Stopping>
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <bin/rac_svm_stop> for resource <rac-svm-rs>, resource group <rac-rg>, timeout <300> seconds
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <bin/rac_udlm_monitor_stop> completed successfully for resource <rac-udlm-rs>, resource group <rac-rg>, time used: 0% of timeout <300 seconds>
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource rac-udlm-rs state on node hisb change to R_ONLINE_UNMON
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource rac-udlm-rs state on node hisb change to R_STOPPING
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <bin/rac_udlm_stop> for resource <rac-udlm-rs>, resource group <rac-rg>, timeout <300> seconds
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource rac-udlm-rs status on node hisb change to R_FM_UNKNOWN
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource rac-udlm-rs status msg on node hisb change to <Stopping>
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource rac-svm-rs status msg on node hisb change to <RAC framework is running>
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <bin/rac_svm_stop> completed successfully for resource <rac-svm-rs>, resource group <rac-rg>, time used: 0% of timeout <300 seconds>
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource rac-svm-rs state on node hisb change to R_OFFLINE
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource rac-udlm-rs status msg on node hisb change to <RAC framework is running>
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <bin/rac_udlm_stop> completed successfully for resource <rac-udlm-rs>, resource group <rac-rg>, time used: 0% of timeout <300 seconds>
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource rac-udlm-rs state on node hisb change to R_OFFLINE
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource rac-framework-rs state on node hisb change to R_STOPPING
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <bin/rac_framework_stop> for resource <rac-framework-rs>, resource group <rac-rg>, timeout <300> seconds
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource rac-framework-rs status on node hisb change to R_FM_UNKNOWN
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource rac-framework-rs status msg on node hisb change to <Stopping>
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource rac-framework-rs status msg on node hisb change to <RAC framework is running>
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <bin/rac_framework_stop> completed successfully for resource <rac-framework-rs>, resource group <rac-rg>, time used: 0% of timeout <300 seconds>
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource rac-framework-rs state on node hisb change to R_OFFLINE
    Jun 26 15:33:21 hisb Cluster.RGM.rgmd: [ID 529407 daemon.notice] resource group rac-rg state on node hisb change to RG_OFFLINE
    Jun 26 15:33:21 hisb xntpd[568]: [ID 866926 daemon.notice] xntpd exiting on signal 15
    Jun 26 15:33:23 hisb root: [ID 702911 user.error] Oracle CRSD 1099 set to stop
    Jun 26 15:33:23 hisb root: [ID 702911 user.error] Oracle CRSD 1099 shutdown completed
    Jun 26 15:33:23 hisb root: [ID 702911 user.error] Oracle EVMD set to stop
    Jun 26 15:33:23 hisb root: [ID 702911 user.error] Oracle CSSD being stopped
    Jun 26 15:33:41 hisb FIN_SVC_CTRL: [ID 702911 local0.error] Warning:      Because one or more of the sun cluster userland cluster      services are offline this service goes offline
    Jun 26 15:33:41 hisb cl_eventlogd[843]: [ID 247336 daemon.error] Going down on signal 15.
    Jun 26 15:33:43 hisb root: [ID 702911 user.error] Oracle CSSD graceful shutdown
    Jun 26 15:33:44 hisb Cluster.PNM: [ID 226280 daemon.notice] PNM daemon exiting.
    Regards,
    Caicia

  • Sun Cluster + meta set shared disks -

    Guys, I am looking for some instructions that I believe most Sun administrators would know.
    I am trying to create some cluster resource groups and resources, but before that I am creating the file systems that will be used by the two nodes in the Sun Cluster 3.2 cluster. We use SVM.
    I have some drives that I plan to use for this specific cluster resource group, which is yet to be created.
    I know I have to create a metaset, since that is how the other resource groups in my environment are already set up, so I will go with the same concept.
    # metaset -s TESTNAME
    Set name = TESTNAME, Set number = 5
    Host Owner
    server1
    server2
    Mediator Host(s) Aliases
    server1
    server2
    # metaset -s TESTNAME -a /dev/did/dsk/d15
    metaset: server1: TESTNAME: drive d15 is not common with host server2
    # scdidadm -L | grep d6
    6 server1:/dev/rdsk/c10t6005076307FFC4520000000000004133d0 /dev/did/rdsk/d6
    6 server2:/dev/rdsk/c10t6005076307FFC4520000000000004133d0 /dev/did/rdsk/d6
    # scdidadm -L | grep d15
    15 server1:/dev/rdsk/c10t6005076307FFC4520000000000004121d0 /dev/did/rdsk/d15
    Do you see what I am trying to say? If I add d6 to the metaset it goes through fine, but d15 does not, since it is listed against only one node, as you can see from the scdidadm output above.
    Please let me know how I can share the drive d15 with the other node as well, the same as d6. Thanks much for your help.
    -Param
    Edited by: paramkrish on Feb 18, 2010 11:01 PM

    Hi, thanks for your reply. You got me wrong. I am not asking you to be liable for the changes you recommend, since I know that is not reasonable when asking for help. I am aware this is not a support site but a forum to exchange information that people already know.
    We have a support contract, but it covers only the Sun hardware, and those support folks are reasonably OK when it comes to Solaris and setup but are not experts. I will certainly seek their help when needed, but that is my last option. Since I thought the problem I am seeing might be something trivial, I quickly posted a question in this forum.
    We do have a test environment, but it does not have two nodes, only one node with zone clusters. Hence I do not see this problem in the test environment, and I think trying "cldev populate" there would be of no use to me either, since we do not have two nodes.
    I will check the logs as you suggested and will get back if I find something. If you have any other thoughts, feel free to let me know (don't bother about the risks, since I know I can take care of that).
    -Param
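
    The d6 versus d15 difference in the scdidadm output above can be checked for every DID device at once. A minimal sketch for bash or ksh, assuming the three-column "instance host:path did-path" layout shown in that output; dids_missing_paths is a made-up name.

    ```shell
    # Hypothetical helper: read "scdidadm -L" output on stdin and print
    # each DID device that is seen by fewer than $1 hosts (default 2),
    # followed by the number of hosts that do see it.
    dids_missing_paths() {
        awk -v want="${1:-2}" '
            NF >= 3 {
                split($2, hp, ":")           # hp[1] is the host name
                key = $3 SUBSEP hp[1]
                if (!(key in seen)) {        # count each host once per DID
                    seen[key] = 1
                    count[$3]++
                }
            }
            END {
                for (d in count)
                    if (count[d] < want) print d, count[d]
            }
        '
    }

    # Example: scdidadm -L | dids_missing_paths 2
    ```

    Any device it prints is one that metaset would reject as "not common" with the other host, so it is worth confirming that the LUN is actually presented to both nodes before adding it.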

  • Sun Cluster, vx mode - "mode: enabled: cluster inactive"

    Hi,
    I have installed Sun Cluster 3.2 on Solaris 9 (Solaris 9 9/05). I want to make it an active-active setup with shared Veritas disk groups. This setup also has VxVM 5 (Veritas-5.0_MP1_RP4.4) with rolling patch 4, and Solaris has all the latest patches applied via "updatemanager". The shared storage comes from a DMX800.
    In order to get VxVM into cluster mode I have installed licenses for CVM and VCS, and also the ORCLudlm (3.3.4.8) package.
    The sun cluster install has all the necessary framework packages. But the VX mode refuses to be in cluster mode:
    #vxdctl -c mode
    mode: enabled: cluster inactive
    The issue is that the udlm daemon "dlmmon" isn't starting.
    I also see the errors below:
    cacao: Error: Fail to start cacao agent. (instance default)
    Error: Fail to start cacao agent. (instance default)
    and the messages file on nodeA shows the error below:
    [ID 988885 daemon.error] libpnm error: can't connect to PNMd on nodeB
    I am at my wits end on how to resolve this issue :(
    Any help is appreciated.
    Regards,
    Ashish

    Well it could be the problem I ran into... and I went round and round for ages trying to figure out what was wrong - before I realised my mistake.
    Assuming you have VxVM/CVM licensed properly, check that ORCLudlm is installed on all nodes. Then create your rac-framework-rg and ensure you have a rac-framework-rs, a rac-udlm-rs AND a rac-cvm-rs resource. Unless you have all of these and they can be enabled and brought online, you'll have exactly the problem you are seeing.
    Hope that helps,
    Tim
    Edited by: Tim.Read on Feb 19, 2008 4:08 AM
    Ooops missed the rac-udlm-rs ... Doh!
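
    That check could be scripted roughly as follows. A sketch only, for bash or ksh: missing_resources is a made-up helper, and the three resource names are simply the ones used in the reply; your cluster may name them differently.

    ```shell
    # Hypothetical helper: read resource names on stdin (for example from
    # "clrs list") and report which of the names given as arguments are
    # not among them.
    missing_resources() {
        have=$(cat)
        for want in "$@"; do
            found=no
            for rs in $have; do
                if [ "$rs" = "$want" ]; then
                    found=yes
                fi
            done
            if [ "$found" = no ]; then
                echo "missing: $want"
            fi
        done
    }

    # Example (resource names are assumptions taken from the reply):
    # clrs list | missing_resources rac-framework-rs rac-udlm-rs rac-cvm-rs
    ```

    No output means all of the named resources exist; whether they are enabled and online still has to be checked separately.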

  • Netegrity SiteMinder Agent on a Sun Cluster

    Salut!
    I have some doubts on how to run the Netegrity SiteMinder Agent on my Sun Cluster. Easiest solution would presumably be to run the Agent as a service on each of the physical cluster nodes.
    The application can now be accessed through the physical IP address/DNS entry of the current cluster node and through the virtual one of the resource group. The users will access it through the virtual one. Now I somehow have to ensure that the agent watches the virtual one, too. Can this be configured? Does DNS take care of that (most likely not)?
    Or do I have to integrate the agent in the cluster software itself?
    Has anybody done that before?
    Thanks for your help,
    greetings,
    Martin

    Philippe,
    DS 6 Sun Cluster Agent was not tested with SC 3.2 in Zones.
    Zone support came with SC 3.2, and DS 6 Cluster Agent was built with SC 3.1, tested with SC 3.1 and 3.2 in the Global zone.
    Regards,
    Ludovic.

  • Didadm: unable to determine hostname.  error on Sun cluster 4.0 - Solaris11

    I am trying to install Sun Cluster 4.0 on Solaris 11 (x86-64).
    The shared iSCSI quorum disks are available in /dev/rdsk/. I ran:
    devfsadm
    cldevice populate
    but I don't see DID devices getting populated in /dev/did.
    Also, when scdidadm -L is issued I get the following error. Has anyone seen the same error?
    - didadm: unable to determine hostname.
    I found that in cluster 3.2 there was Bug 6380956: didadm should exit with error message if it cannot determine the hostname
    The Sun Cluster command didadm, didadm -l in particular, requires the hostname to function correctly. It uses the standard C library function gethostname to obtain it.
    Early in the cluster boot, prior to the service svc:/system/identity:node coming online, gethostname() returns an empty string. This breaks didadm.
    Can anyone point me in the right direction to get past this issue with the shared quorum disk DID?

    Let's step back a bit. First, what hardware are you installing on? Is it a supported platform or is it some guest VM? (That might contribute to the problems).
    Next, after you installed Solaris 11, did the system boot cleanly and all the services come up? (svcs -x). If it did boot cleanly, what did 'uname -n' return? Do commands like 'getent hosts <your_hostname>' work? If there are problems here, Solaris Cluster won't be able to get round them.
    If the Solaris install was clean, what were the results of the above host name commands after OSC was installed? Do the hostnames still resolve? If not, you need to look at why that is happening first.
    Regards,
    Tim
    ---
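
    The host name questions above can be turned into a quick pass/fail check. A minimal sketch for bash or ksh, using only the uname -n and getent hosts commands mentioned in the reply; check_hostname is a made-up name.

    ```shell
    # Hypothetical helper: verify that the node reports a non-empty host
    # name and that the name resolves through the name service.
    check_hostname() {
        name=$(uname -n)
        if [ -z "$name" ]; then
            echo "uname -n returned an empty host name"
            return 1
        fi
        if getent hosts "$name" >/dev/null; then
            echo "$name resolves"
        else
            echo "$name does not resolve via getent hosts"
            return 1
        fi
    }

    # Example: check_hostname && scdidadm -L
    ```

    Running it before and after the cluster software is installed shows whether the installation changed how the host name resolves.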

  • Sun Cluster 3.2, Zones, HA-Oracle, & FSS

    I have a customer who wants to deploy a cluster utilizing Solaris 10 Zones. If the resource groups are created with the node list nodeA:zoneA, nodeB:zoneA, the Oracle resource group will be contained in the respective zone.
    Should the zone be created after the Sun Cluster software has been installed?
    When installing Oracle, should the binaries and such reside in the zone or in the global zone?
    When configuring FSS, should this be done after the resources have been configured?
    Thanks in advance,
    Ryan

    The Oracle binaries are not big at all, and there is not much I/O happening on this file system. You can easily create a UFS file system for each zone and mount it into the zone via lofs mounts. Or you can create a zpool for the binaries. My personal take would be to include them in the root path of the zones, and you are set.
    You must install the binaries in all zones that your Oracle database can fail over to. To reduce the maintenance work in the case of upgrades, I would limit the binary installation to the zones in the nodelist of your Oracle resource group. If you install the binaries on all nodes/zones of the cluster, you have more work when it comes to an upgrade.
    Kind Regards
    Detlef

  • TimesTen database in Sun Cluster environment

    Hi,
    Currently we have our application together with the TimesTen database installed at the customer on two different nodes (running on Sun Solaris 10). The second node acts as a backup to provide failover functionality, although right now only manual failover is supported.
    We are now looking into a hot-standby / high availability solution using Sun Cluster software. As understood from the documentation, applications can be 'plugged-in' to the Sun Cluster using Agents to monitor the application. Sun Cluster Agents should be already available for certain applications such as:
    # MySQL
    # Oracle 9i, 10g (HA and RAC)
    # Oracle 9iAS Application Server
    # PostgreSQL
    (See http://www.sun.com/software/solaris/cluster/faq.jsp#q_19)
    Our question is whether Sun Cluster Agents are already (freely) available for TimesTen. If so, where can we find them? If not, should we write a specific Agent for TimesTen, or handle database problems from the application?
    Does someone have any experience using TimesTen in a Sun Cluster environment?
    Thanks in advance!

    Yes, we use 2-way replication, but we don't use cache connect. The replication is created like this on both servers:
    create replication MYDB.REPSCHEME
      element SERVER01_DS datastore
        master MYDB on "SERVER01_REP"
        transmit nondurable
        subscriber MYDB on "SERVER02_REP"
      element SERVER02_DS datastore
        master MYDB on "SERVER02_REP"
        transmit nondurable
        subscriber MYDB on "SERVER01_REP"
      store MYDB on "SERVER01_REP"
        port 16004
        failthreshold 500
      store MYDB on "SERVER02_REP"
        port 16004
        failthreshold 500
    The application runs on SERVER01 and is standby on SERVER02. If an invalid state is detected in the application, the application on SERVER01 is stopped and the application on SERVER02 is started.
    In addition to this, we want to fail over if the database on SERVER01 is in an invalid state. What should the clustering agent monitor to detect an invalid state in TimesTen?

  • RAW disks for Oracle 10R2 RAC NO SUN CLUSTER

    Yes, you read it correctly... no Sun Cluster. Then why am I on the forum, right? Well, we have one Sun Cluster and another cluster that is RAC only, for testing. Between Oracle and Sun, neither accept any fault for problems with their perfectly honed products. Currently, I have multipathed fibre HBAs to a StorEdge 3510, and I've tried to get Oracle to use a raw LUN for the OCR and voting disks. It doesn't see the disk. I've made sure they are owned by oracle:dba, and also tried oracle:oinstall. When presenting /dev/rdsk/C7t<long number>d0s6 for the OCR, I get a "can not find disk path." Does Oracle raw mean SVM raw? Should I create metadevices?

    "Between Oracle and Sun, neither accept any fault for problems with their perfectly honed products"...more specific:
    Not that the word "fault" is a characterization of any liability; it is a technical characterization of acting like a responsible stakeholder when you sell your product to a corporation. I've been working on the same project for a year, as an engineer. Notwithstanding a huge expanse of management issues over the project, whenever technical gray areas have been reached and our team has tried to get information to solve the issue, the area has become a big bouncing hot potato. Specifically, when Oracle has a problem reading a storage device, according to Oracle, that is a Sun issue. According to Sun, they didn't certify the software on that piece of equipment, so go talk to Oracle. In the Sun Cluster arena, if starting the database causes a node eviction from the cluster, good luck getting any specific team to say "that's our problem". Sun will say that Oracle writes crappy cluster verify scripts, and Oracle will say that Sun has not properly certified the device for use with their product. Man, I've seen it. The first time I said, OK, how do we avoid this in the future; the second time I said, how did I let this happen again; and after more issues, money spent, hours lost, and customers pissed, do the math.
    I've even gone as far as to say, find me a plug-and-play production model for this specific environment, but good luck getting two companies to sign the specs for it... neither wants to stamp their name on the product due to the liability. Yes, you're right, I should beat the account team, but as an engineer, man, that's not my area, and I have other problems that I was hired to deal with. I could go on. What really is a slap in the face is that no one wants to work on these projects if given the choice of doing a Windows deployment, because they can pop out mind-bending amounts of builds while we plod along figuring out why clusterware doesn't like slice 6 of a /device/scsi_vhci/ . Try finding good documentation on that. ~You can deploy faster, but you can't pay more!
                                                                                                                                                                                                                                                      

  • Apply one non-kernel Solaris10 patch at Sun Cluster ***Beginner Question***

    Dear Sir/Madam,
    Our two Solaris 10 servers are running Sun Cluster 3.3. One server, "cluster-1", has one zone online and running, "classical". The other server,
    "cluster-2", has two zones online and running, namely "romantic" and "modern". We are trying to install a regular non-kernel patch, #145200-03, on cluster-1 while it is live. The patch has no prerequisites and needs no reboot afterwards. Our goal is to install this patch in the global zone and in the
    three local zones, i.e., classical, romantic and modern, on both cluster servers, cluster-1 and cluster-2.
    Unfortunately, when we began patching on cluster-1, it could patch the running zone "classical", but we got the errors below, which prevent it from continuing with the zones "romantic" and "modern", which are running on cluster-2. And when we try to patch cluster-2, we get a similar patching error about failing to boot the non-global zone "classical", which is running on cluster-1.
    Any idea how I could resolve this? Do we have to shut down the cluster in order to apply this patch? I would prefer to apply this
    patch with Sun Cluster running. If that is not possible, what is the preferred way to apply a simple non-reboot patch to all the zones on both nodes of the Sun Cluster?
    I would like to hear from folks who have experience with patching in Sun Cluster.
    Thanks, Mr. Channey
    p.s. Below are the outputs from the patch #145200-03 run, zoneadm and clrg
    on cluster-1:
    root@cluster-1# patchadd 145200-03
    Validating patches...
    Loading patches installed on the system...
    Done!
    Loading patches requested to install.
    Done!
    Checking patches that you specified for installation.
    Done!
    Approved patches will be installed in this order:
    145200-03
    Preparing checklist for non-global zone check...
    Checking non-global zones...
    Failed to boot non-global zone romantic
    exiting
    root@cluster-1# zoneadm list -iv
      ID NAME       STATUS     PATH             BRAND    IP
       0 global     running    /                native   shared
      15 classical  running    /zone-classical  native   shared
       - romantic   installed  /zone-romantic   native   shared
       - modern     installed  /zone-modern     native   shared
    root@cluster-1# clrg status
    === Cluster Resource Groups ===
    Group Name   Node Name   Suspended   Status
    classical    cluster-1   No          Online
                 cluster-2   No          Offline
    romantic     cluster-1   No          Offline
                 cluster-2   No          Online
    modern       cluster-1   No          Offline
                 cluster-2   No          Online
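
    In the output above, the zone that patchadd failed to boot ("romantic") is one that shows as "installed" but is not running on cluster-1. A rough sketch for listing such zones on a node, assuming the column layout of the `zoneadm list -iv` output pasted in this post; `installed_zones` is a made-up helper name, not a Solaris or Sun Cluster command:

```shell
#!/bin/sh
# Sketch only: print the names of zones whose STATUS column is "installed",
# i.e. zones known to this node but not running on it.
# Assumption: input has the same columns as `zoneadm list -iv`
# (ID NAME STATUS PATH BRAND IP), with a header on the first line.
installed_zones() {
    awk 'NR > 1 && $3 == "installed" { print $2 }'
}

# Example using the cluster-1 output from this post.
# On a live node one would pipe the real command instead:
#   zoneadm list -iv | installed_zones
printf '%s\n' \
    'ID NAME STATUS PATH BRAND IP' \
    '0 global running / native shared' \
    '15 classical running /zone-classical native shared' \
    '- romantic installed /zone-romantic native shared' \
    '- modern installed /zone-modern native shared' \
    | installed_zones
# prints:
#   romantic
#   modern
```

    This only reads text; it does not change any zone state.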

    Hi Hartmut,
    I kind of got the idea. Just want to make sure. The zones 'romantic' and 'modern' show "installed" as the current status at cluster-1. These 2 zones are in fact running and online at cluster-2. So I will issue your commands below at cluster-2 to detach these zones to "configured" status :
    cluster-2 # zoneadm -z romantic detach
    cluster-2 # zoneadm -z modern detach
    Afterwards, I will apply the Solaris patch at cluster-2. Then I will go to cluster-1 and apply the same Solaris patch. Once both cluster-1 and cluster-2 are patched, I will
    go back to cluster-2 and run the following commands to force these zones back to "installed" status:
    cluster-2 # zoneadm -z romantic attach -f
    cluster-2 # zoneadm -z modern attach -f
    CORRECT ?? Please let me know if I am wrong or if there's any step missing. Thanks much, Humphrey
    root@cluster-1# zoneadm list -iv
      ID NAME       STATUS     PATH             BRAND    IP
       0 global     running    /                native   shared
      15 classical  running    /zone-classical  native   shared
       - romantic   installed  /zone-romantic   native   shared
       - modern     installed  /zone-modern     native   shared
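
    The order of steps proposed in this reply can be written down as a dry run. The sketch below only prints the commands in that order and executes nothing; `print_patch_plan` is a hypothetical helper, and the node names, zone names and patch ID are the ones from this thread. It restates the plan as asked, which the thread does not confirm. In particular, which node the detach should run on is exactly the open question here, and as far as I know `zoneadm detach` expects the zone to be halted on that node.

```shell
#!/bin/sh
# Dry run only: echoes the proposed command sequence, runs none of it.
# Usage: print_patch_plan PATCH_ID ZONE [ZONE...]
# Assumption: node names cluster-1 / cluster-2 as used in the thread.
print_patch_plan() {
    patch=$1
    shift
    # Step 1: detach the listed zones (as proposed in the post, on cluster-2)
    for z in "$@"; do
        echo "cluster-2 # zoneadm -z $z detach"
    done
    # Step 2: patch both nodes, cluster-2 first
    echo "cluster-2 # patchadd $patch"
    echo "cluster-1 # patchadd $patch"
    # Step 3: force the zones back to the "installed" state
    for z in "$@"; do
        echo "cluster-2 # zoneadm -z $z attach -f"
    done
}

print_patch_plan 145200-03 romantic modern
```

    Printing the plan first makes it easy to post the exact sequence for review before touching a live cluster.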

  • CRS over SUN cluster

    hello
    sorry, i am a newbie in this area
    what are the advantages of installing oracle CRS and then the database on top of sun cluster?
    is it needed, since we already installed SUN cluster?
    also, could it work if we only install oracle database 10gR2 on the sun cluster?
    what are the advantages and disadvantages if we never use SUN cluster,
    and just install oracle CRS and then the oracle database?
    sorry for the long question
    i really appreciate your help

    "what is the advantages of installing oracle CRS then the database over sun cluster?
    is it needed ? since we already installed SUN cluster...."
    1. CRS is required irrespective of whether or not you use any vendor clusterware (such as Sun Cluster).
    2. Check this white paper from Sun Microsystems, which attempts to highlight the advantages of using Oracle RAC with Sun Cluster:
    http://www.sun.com/blueprints/0105/819-1466.pdf
    "also could it work if only we install oracle database 10gR2 on the sun cluster?"
    As indicated above, no. You would still need CRS.
    "what are the advantages and disadvantages if we never use SUN cluster,"
    See if the above-mentioned document helps you with this question.
    "just install oracle CRS then install the oracle database"
    You could very well do this.
    HTH
    Thanks
    -Chandra Pabba

  • Veritas volume replicator under sun cluster

    Hi,
    Can we use VVR under Sun Cluster? We want to replicate data on the Sun Cluster nodes to a separate box which is not part of the cluster. Is there any special configuration needed?
    Thanks and appreciate any response.

    Don't forget you will also need to obtain a VXVM Cluster functionality license to use VXVM in a Sun cluster, which is a separate license key to the base VXVM.

  • Availability of Sun Cluster 3.2

    Does anyone have news about the release date and the new features of Sun Cluster 3.2? It was once announced for the end of 2006.
    Fritz

    Hi Tim
    I more or less expected this answer from you ;-)
    We are planning to use Sun Cluster to switch Zones / Containers as GDS between nodes. We currently have some installations with Sun Cluster 3.1, but we are now in the process of evaluating a framework to deploy Solaris Zones on a large scale. This would also include containers in a clustered environment. There, Sun Cluster 3.2 seems to have some interesting new features.
    Unfortunately I did not have the resources to participate in the beta program.
    Regards
    Fritz
