Solaris Cluster 3.2 SPARC

Hi folks
First things first. I may not have great knowledge about Solaris clusters, so please be merciful :)
Here is what I have:
- 2 x Netra T1 AC200, each with 1 GB RAM, 2 x 18 GB disks, a 500 MHz SPARC CPU, and a 4-port Ethernet card
- 1 x Netra D130 array with 3 x 36 GB disks
- cables and all, switches, you name it
So, I set up the OS, all OK. I set up the cluster, all SEEMS to be OK.
But when I define my resources and so on, everything goes fine, except when I try to bring the resource group online.
In another configuration I tested the shared logical hostname and it works fine.
            Group Name     Resources
Resources:  ingresc        nodec ingresr

-- Resource Groups --

        Group Name    Node Name    State        Suspended
Group:  ingresc       node2        Unmanaged    No
Group:  ingresc       node1        Unmanaged    No

-- Resources --

           Resource Name    Node Name    State      Status Message
Resource:  nodec            node2        Offline    Offline
Resource:  nodec            node1        Offline    Offline
Resource:  ingresr          node2        Offline    Offline
Resource:  ingresr          node1        Offline    Offline
scswitch: (C969069) Request failed because resource group ingresc is in ERROR_STOP_FAILED state and requires operator attention
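Side note: once a group is stuck in ERROR_STOP_FAILED, the STOP_FAILED flag has to be cleared before any further switch attempts are accepted. A minimal sketch with the resource and group names above (this only clears the state, it does not address the underlying bug):
# clresource clear -f STOP_FAILED -n node2 ingresr
# clresourcegroup offline ingresc
# clresourcegroup online -M ingresc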
Now, in /var/adm/messages I spotted this:
Mar 6 17:09:03 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_stop> for resource <nodec>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
Mar 6 17:09:03 node2 Cluster.RGM.rgmd: [ID 510020 daemon.notice] 46 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_stop>:tag=<IngresNCG.nodec.1>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
A little bit of research points in the direction of a bug (see CR 6565601).
Here is what I see as my options:
1 - Reinstall the Solaris OS, but not Solaris Cluster 3.2, instead using Solaris Express 10/07 or 2/08. But will this combination work? Or will it work only in the combination Solaris Cluster Express and Solaris Express Developer Edition? If the latter, which versions will work together?
2 - Beg for a Solaris Cluster 3.2 patch, although in my humble opinion this should be free, since it looks to me that once you write your own stuff you run into the bug, and after all it is education.
Any ideas or help greatly appreciated.
Many thanks
Armand

Although the names are different, since I used two setups, this is the relevant part of /var/adm/messages.
It looks to me like the Ingres resource is failing:
Mar  6 17:08:03 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_prenet_start> for resource <nodec>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
Mar  6 17:08:03 node2 Cluster.RGM.rgmd: [ID 510020 daemon.notice] 46 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_prenet_start>:tag=<IngresNCG.nodec.10>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Mar  6 17:08:05 node2 svc.startd[8]: [ID 652011 daemon.warning] svc:/system/cluster/scsymon-srv:default: Method "/usr/cluster/lib/svc/method/svc_scsymon_srv start" failed with exit status 96.
Mar  6 17:08:05 node2 svc.startd[8]: [ID 748625 daemon.error] system/cluster/scsymon-srv:default misconfigured: transitioned to maintenance (see 'svcs -xv' for details)
Mar  6 17:08:09 node2 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hafoip_prenet_start> completed successfully for resource <nodec>, resource group <IngresNCG>, node <node2>, time used: 1% of timeout <300 seconds>
Mar  6 17:08:09 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_PRENET_STARTED
Mar  6 17:08:09 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_STARTING
Mar  6 17:08:09 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_start> for resource <nodec>, resource group <IngresNCG>, node <node2>, timeout <500> seconds
Mar  6 17:08:09 node2 Cluster.RGM.rgmd: [ID 510020 daemon.notice] 46 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_start>:tag=<IngresNCG.nodec.0>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource nodec status on node node2 change to R_FM_ONLINE
Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource nodec status msg on node node2 change to <LogicalHostname online.>
Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hafoip_start> completed successfully for resource <nodec>, resource group <IngresNCG>, node <node2>, time used: 0% of timeout <500 seconds>
Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_JUST_STARTED
Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_ONLINE_UNMON
Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource IngresNCR state on node node2 change to R_STARTING
Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_MON_STARTING
Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource IngresNCR status on node node2 change to R_FM_UNKNOWN
Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Starting>
Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <bin/ingres_server_start> for resource <IngresNCR>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_monitor_start> for resource <nodec>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 510020 daemon.notice] 46 fe_rpc_command: cmd_type(enum):<1>:cmd=</global/disk2s0/ing_nc_1/ingresclu/bin/ingres_server_start>:tag=<IngresNCG.IngresNCR.0>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 268902 daemon.notice] 45 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_monitor_start>:tag=<IngresNCG.nodec.7>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Mar  6 17:08:12 node2 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hafoip_monitor_start> completed successfully for resource <nodec>, resource group <IngresNCG>, node <node2>, time used: 0% of timeout <300 seconds>
Mar  6 17:08:12 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_ONLINE
Mar  6 17:08:13 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Bringing Ingres DBMS server online.>
Mar  6 17:08:30 node2 sendmail[534]: [ID 702911 mail.alert] unable to qualify my own domain name (node2) -- using short name
Mar  6 17:08:30 node2 sendmail[535]: [ID 702911 mail.alert] unable to qualify my own domain name (node2) -- using short name
Mar  6 17:08:31 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Bringing Ingres DBMS server offline.>
Mar  6 17:08:45 node2 SC[Ingres.ingres_server,IngresNCG,IngresNCR,stop]: [ID 147958 daemon.error] ERROR : HA-Ingres failed to stop.
Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource IngresNCR status on node node2 change to R_FM_FAULTED
Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Ingres DBMS server faulted.>
Mar  6 17:08:46 node2 SC[Ingres.ingres_server,IngresNCG,IngresNCR,start]: [ID 335575 daemon.error] ERROR : Stop method failed for the HA-Ingres data service.
Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 938318 daemon.error] Method <bin/ingres_server_start> failed on resource <IngresNCR> in resource group <IngresNCG> [exit code <1>, time used: 11% of timeout <300 seconds>]
Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource IngresNCR state on node node2 change to R_START_FAILED
Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 529407 daemon.notice] resource group IngresNCG state on node node2 change to RG_PENDING_OFF_START_FAILED
Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource IngresNCR state on node node2 change to R_STOPPING
Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_MON_STOPPING
Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource IngresNCR status on node node2 change to R_FM_UNKNOWN
Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Stopping>
Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <bin/ingres_server_stop> for resource <IngresNCR>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_monitor_stop> for resource <nodec>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 510020 daemon.notice] 46 fe_rpc_command: cmd_type(enum):<1>:cmd=</global/disk2s0/ing_nc_1/ingresclu/bin/ingres_server_stop>:tag=<IngresNCG.IngresNCR.1>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 268902 daemon.notice] 45 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_monitor_stop>:tag=<IngresNCG.nodec.8>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Mar  6 17:08:47 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Bringing Ingres DBMS server offline.>
Mar  6 17:08:48 node2 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hafoip_monitor_stop> completed successfully for resource <nodec>, resource group <IngresNCG>, node <node2>, time used: 0% of timeout <300 seconds>
Mar  6 17:08:48 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_ONLINE_UNMON
Mar  6 17:09:00 node2 SC[Ingres.ingres_server,IngresNCG,IngresNCR,stop]: [ID 147958 daemon.error] ERROR : HA-Ingres failed to stop.
Mar  6 17:09:02 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource IngresNCR status on node node2 change to R_FM_FAULTED
Mar  6 17:09:02 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Ingres DBMS server faulted.>
Mar  6 17:09:03 node2 Cluster.RGM.rgmd: [ID 938318 daemon.error] Method <bin/ingres_server_stop> failed on resource <IngresNCR> in resource group <IngresNCG> [exit code <2>, time used: 5% of timeout <300 seconds>]
Mar  6 17:09:03 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource IngresNCR state on node node2 change to R_STOP_FAILED
Mar  6 17:09:03 node2 Cluster.RGM.rgmd: [ID 529407 daemon.notice] resource group IngresNCG state on node node2 change to RG_PENDING_OFF_STOP_FAILED
Mar  6 17:09:03 node2 Cluster.RGM.rgmd: [ID 424774 daemon.error] Resource group <IngresNCG> requires operator attention due to STOP failure
Mar  6 17:09:03 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_STOPPING
Mar  6 17:09:03 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource nodec status on node node2 change to R_FM_UNKNOWN
Mar  6 17:09:03 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource nodec status msg on node node2 change to <Stopping>
Mar  6 17:09:03 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_stop> for resource <nodec>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
Mar  6 17:09:03 node2 Cluster.RGM.rgmd: [ID 510020 daemon.notice] 46 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_stop>:tag=<IngresNCG.nodec.1>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Mar  6 17:09:04 node2 ip: [ID 678092 kern.notice] TCP_IOC_ABORT_CONN: local = 192.168.005.085:0, remote = 000.000.000.000:0, start = -2, end = 6
Mar  6 17:09:04 node2 ip: [ID 302654 kern.notice] TCP_IOC_ABORT_CONN: aborted 0 connection
Mar  6 17:09:04 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource nodec status on node node2 change to R_FM_OFFLINE
Mar  6 17:09:04 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource nodec status msg on node node2 change to <LogicalHostname offline.>
Mar  6 17:09:04 node2 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hafoip_stop> completed successfully for resource <nodec>, resource group <IngresNCG>, node <node2>, time used: 0% of timeout <300 seconds>
Mar  6 17:09:04 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_OFFLINE
Mar  6 17:09:04 node2 Cluster.RGM.rgmd: [ID 529407 daemon.notice] resource group IngresNCG state on node node2 change to RG_ERROR_STOP_FAILED
Mar  6 17:09:04 node2 Cluster.RGM.rgmd: [ID 424774 daemon.error] Resource group <IngresNCG> requires operator attention due to STOP failure
Mar  6 17:09:04 node2 Cluster.RGM.rgmd: [ID 663692 daemon.error] failback attempt failed on resource group <IngresNCG> with error <resource group in ERROR_STOP_FAILED state requires operator attention>
Mar  6 17:09:10 node2 java[1652]: [ID 807473 user.error] pkcs11_softtoken: Keystore version failure.
Thank you
Armand

Similar Messages

  • Hardware recommendations for learning Solaris Cluster on Sparc (at home)

    On a low budget, I'd like to put together a Solaris Cluster on Sparc (at home). At "work" in the next year we will be implementing a Solaris Cluster to run Tomcat and a custom CORBA server. (These apps will be migrated from very old hardware and VCS) The CORBA server is a Sparc binary, hence the need for Sparc. I'd like my home-office cluster to be similar in function to what I have at work. At work we have (2) T5120 Servers and a 2540 (2500-M2) Array waiting. From looking at the Solaris Cluster docs, it looks like you use a 2540 in a Direct-Connect configuration. We will be going to Solaris Cluster training eventually, but not soon. In the meantime, I'd like to keep/gain some skills/experience.
    Potential (cheap) Home Cluster:
    (2) SunFire V245 or (2) T1000 or (2) something_cheap
    connected to
    (1) Storedge D2 or (1) Storedge S1
    My main desire, is for the interconnects and failover on this Home Cluster to behave the same way as the T5120s with the 2540 Array. Example, if I yank a HD (or replace) then I'd like it to give very similar messages to what I will face at work in the future. I'd like the creation of ZFS pools etc to work similarly. I'd like SCSI cards (HBAs or whatever) and cabling to be cheap.
    Any recommendations on hardware? Servers? Arrays? SCSI cards/cabling?
    Thanks,
    Scott

    I settled on:
    (2) Sunfire V210
    Storedge 3120
    Connected by VHDCI
    All used equipment at a cheap price. Should be a great little testbed.

  • Common Agent Container Problem on New Solaris Cluster 3.2 (1/09)

    Hi,
    I have just installed Solaris Cluster 3.2 u2. I get the following error when I run cluster check or sccheck -v 2:
    cluster check
    cacaocsc: unable to connect: Connection refused by peer
    cluster check: (C704199) unable to reach Common Agent Container
    sccheck -v 2
    sccheck: Requesting explorer data and node report from node1.
    sccheck: Requesting explorer data and node report from node1-cl.
    sccheck: node1: Additional explorer arguments: -w !default,cluster,disks,etc,messages,nbu,netinfo,patch,pkg,sds,lvm,sonoma,sysconfig,var,vxvm,vxfs,vxfsextended -c "/usr/cluster/lib/sccheck/vfstab-global-mount-points" -c "/usr/cluster/lib/sccheck/netapp-nas-quorum-devices"
    sccheck: node1: WARNING: EXP_CONTRACT_ID not set!
    sccheck: node1: WARNING: EXP_REPLY not set!
    sccheck: node1:
    sccheck: node1: 2 warnings found in /etc/opt/SUNWexplo/default/explorer
    sccheck: node1:
    sccheck: node1: Mar 07 21:56:15 node1[5519] explorer: ERROR explorer
    sccheck: node1: Mar 07 21:56:15 node1[5519] explorer: ERROR Module or alias sds does not exist.
    sccheck: node1: Explorer run failed:
    sccheck: node1-cl: Additional explorer arguments: -w !default,cluster,disks,etc,messages,nbu,netinfo,patch,pkg,sds,lvm,sonoma,sysconfig,var,vxvm,vxfs,vxfsextended -c "/usr/cluster/lib/sccheck/vfstab-global-mount-points" -c "/usr/cluster/lib/sccheck/netapp-nas-quorum-devices"
    sccheck: node1-cl: WARNING: EXP_CONTRACT_ID not set!
    sccheck: node1-cl: WARNING: EXP_REPLY not set!
    sccheck: node1-cl:
    sccheck: node1-cl: 2 warnings found in /etc/opt/SUNWexplo/default/explorer
    sccheck: node1-cl:
    sccheck: node1-cl: Mar 07 21:56:15 node1-cl[3851] explorer: ERROR explorer
    sccheck: node1-cl: Mar 07 21:56:15 node1-cl[3851] explorer: ERROR Module or alias sds does not exist.
    sccheck: node1-cl: Explorer run failed:
    sccheck: node1 error: Unexpected early return from server.
    sccheck: node1-cl error: Unexpected early return from server.
    sccheck: Unable to run checks on: node1,node1-cl
    Even when I try to run Cluster Manager from the web, I get the following error:
    "A communication problem was encountered by the system"
    Here are the versions I am using:
    root@node1 #
    root@node1 # cacaoadm -V
    2.2.0.1
    root@node1 #
    root@node1 # smcwebserver -V
    Version 3.1
    root@node1 #
    root@node1 #
    root@node1 # svcs -a | grep container
    online 21:30:30 svc:/application/management/common-agent-container-1:default
    root@node1 #
    root@node1 #
    root@node1 # svcs -a | grep webconsole
    online 21:20:29 svc:/system/webconsole:console
    root@node1 #
    root@node1 #
    root@node1 #
    root@node1 # cat /etc/release
    Solaris 10 5/08 s10s_u5wos_10 SPARC
    Copyright 2008 Sun Microsystems, Inc. All Rights Reserved.
    Use is subject to license terms.
    Assembled 24 March 2008
    root@node1 #
    root@node1 #
    root@node1 # cat /etc/cluster/release
    Sun Cluster 3.2u2 for Solaris 10 sparc
    Copyright 2008 Sun Microsystems, Inc. All Rights Reserved.
    root@node1 #
    root@node1 #
    Do you have any idea or tips to solve this problem?
    Thanks

    Hi Tim,
    Yes - it's running:
    default instance is ENABLED at system startup.
    Smf monitoring process:
    29410
    29411
    Uptime: 0 day(s), 0:14
    dssdbgen03p1 # svcs -a |grep -i comm
    disabled 17:03:06 svc:/network/rpc/mdcomm:default
    online 13:20:53 svc:/application/management/common-agent-container-2:default
    uninitialized 16:58:32 svc:/application/management/common-agent-container-1:default
    I also get messages like the following in /var/adm/messages when starting cacaoadm:
    Jul 7 13:20:52 dssdbgen03p1 java.lang.ClassNotFoundException: Cannot find class com.sun.cacao.rmi.impl.RMIModule in module com.sun.cacao.rmi
    Jul 7 13:20:52 dssdbgen03p1 java.lang.ClassNotFoundException: Cannot find class com.sun.cacao.invoker.impl.InvokerModule in module com.sun.cacao.invoker
    Jul 7 13:20:52 dssdbgen03p1 java.lang.ClassNotFoundException: Cannot find class com.sun.cacao.snmpv3adaptor.SnmpV3AdaptorModule in module com.sun.cacao.snmpv3_adaptor
    Jul 7 13:20:52 dssdbgen03p1 java.lang.ClassNotFoundException: Cannot find class com.sun.cacao.dtrace.impl.DTraceModule in module com.sun.cacao.dtrace
    Jul 7 13:20:52 dssdbgen03p1 java.lang.ClassNotFoundException: Cannot find class com.sun.cacao.rbac.impl.RbacModule in module com.sun.cacao.rbac
    Jul 7 13:20:52 dssdbgen03p1 java.lang.ClassNotFoundException: Cannot find class com.sun.cacao.instrum.impl.InstrumModule in module com.sun.cacao.instrum
    Jul 7 13:20:52 dssdbgen03p1 java.lang.ClassNotFoundException: Cannot find class com.sun.cacao.commandstream.CommandStreamAdaptorModule in module com.sun.cacao.command_stream_adaptor
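    For what it's worth, the usual first step when cacao refuses connections is simply to restart the container instance and re-check it; a sketch, using the service name shown earlier (this restarts the container but does not by itself explain the missing-module errors above):
    # /usr/sbin/cacaoadm status
    # svcadm restart svc:/application/management/common-agent-container-1:default
    # /usr/sbin/cacaoadm status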

  • Oracle ASM Configuration on Solaris Cluster - Oracle 11.2.0.3

    Hi,
    I want some clarifications!
    I need to set up an active/passive cluster on the Solaris 10 SPARC operating system; the HA software is Solaris Cluster and the database is Oracle 11.2.0.3.
    1) I understand "Single instance Oracle ASM is not supported with Oracle 11g release 2", so we need to go for clustered ASM - is it required to use the RAC framework in this case?
    2) When I use the RAC framework, do I need to have a license for RAC?
    I am new to Oracle, any help is appreciated.
    Regards,
    Shashank

  • Solaris Cluster 3.3 on VMware ESX 4.1

    Hi there,
    I am trying to set up Solaris Cluster 3.3 on VMware ESX 4.1.
    My first question is: has anyone out there set up Solaris Cluster on VMware across boxes?
    My tools:
    Solaris 10 U9 x64
    Solaris Cluster 3.3
    VMware ESX 4.1
    HP DL 380 G7
    HP P2000 Fibre Channel Storage
    When I try to set up the cluster, just next next next, it completes successfully. It reboots the second node first and then itself.
    After the second node comes up to the login screen, ping stops after 5 seconds. Same on either node!
    I am trying to understand why it does that. I have tried everything possible to complete this job. I set up the quorum as an RDM from VMware, so Solaris has direct access to the quorum disk now.
    I am new to Solaris and I am having the errors below. If someone would like to help me it would be much appreciated!
    Please explain in more detail, I am a newbie in Solaris :) Thanks!
    I need help especially with this error: /proc fails to mount periodically during reboots.
    Here are the error messages. Is there anyone out there who has set up Solaris Cluster on ESX 4.1?
    * cluster check (ver 1.0)
    Report Date: 2011.02.28 at 16.04.46 EET
    2011.02.28 at 14.04.46 GMT
    Command run on host:
    39bc6e2d- sun1
    Checks run on nodes:
    sun1
    Unique Checks: 5
    ===========================================================================
    * Summary of Single Node Check Results for sun1
    ===========================================================================
    Checks Considered: 5
    Results by Status
    Violated : 0
    Insufficient Data : 0
    Execution Error : 0
    Unknown Status : 0
    Information Only : 0
    Not Applicable : 2
    Passed : 3
    Violations by Severity
    Critical : 0
    High : 0
    Moderate : 0
    Low : 0
    * Details for 2 Not Applicable Checks on sun1
    * Check ID: S6708606 ***
    * Severity: Moderate
    * Problem Statement: Multiple network interfaces on a single subnet have the same MAC address.
    * Applicability: Scan output of '/usr/sbin/ifconfig -a' for more than one interface with an 'ether' line. Check does not apply if zero or only one ether line.
    * Check ID: S6708496 ***
    * Severity: Moderate
    * Problem Statement: Cluster node (3.1 or later) OpenBoot Prom (OBP) has local-mac-address? variable set to 'false'.
    * Applicability: Applicable to SPARC architecture only.
    * Details for 3 Passed Checks on sun1
    * Check ID: S6708605 ***
    * Severity: Critical
    * Problem Statement: The /dev/rmt directory is missing.
    * Check ID: S6708638 ***
    * Severity: Moderate
    * Problem Statement: Node has insufficient physical memory.
    * Check ID: S6708642 ***
    * Severity: Critical
    * Problem Statement: /proc fails to mount periodically during reboots.
    ===========================================================================
    * End of Report 2011.02.28 at 16.04.46 EET
    ===========================================================================
    Note: Please ignore the memory error; I have installed 5 GB of memory and it says it requires a minimum of 1 GB! I think it is a bug!

    @TimRead
    Hi, thanks for reply,
    I have already followed the steps in your links, but no joy on this.
    What I noticed here is that the cluster seems to be buggy, because I tried to install Cluster 3.3 on physical hardware and it gave me the exact same error messages! Interesting, isn't it?
    Please see the errors below, which I got both on top of VMware and on a physical Solaris hardware installation:
    ERROR1:
    Comment: I have tried different amounts of memory; it keeps giving that silly error.
    problem_statement : *Node has insufficient physical memory.
    <analysis>5120 MB of memory is installed on this node.The current release of Solaris Cluster requires a minimum of 1024 MB of physical memory in each node. Additional memory required for various Data Services.</analysis>
    <recommendations>Add enough memory to this node to bring its physical memory up to the minimum required level.
    ERROR2
    Comment: Although the /dev/rmt directory is there, I got the error below on cluster check
    <problem_statement>The /dev/rmt directory is missing.
    <analysis>The /dev/rmt directory is missing on this Solaris Cluster node. The current implementation of scdidadm(1M) relies on the existence of /dev/rmt to successfully execute 'scdidadm -r'. The /dev/rmt directory is created by Solaris regardless of the existence of the actual nderlying devices. The expectation is that the user will never delete this directory. During a reconfiguration reboot to add new disk devices, if /dev/rmt is missing scdidadm will not create the new devices and will exit with the following error: 'ERR in discover_paths : Cannot walk /dev/rmt' The absence of /dev/rmt might prevent a failover to this node and result in a cluster outage. See BugIDs 4368956 and 4783135 for more details.</analysis>
    ERROR3
    Comment: All NICs have different MAC addresses though, and I have also done what it suggests. No joy here as well!
    <problem_statement>Cluster node (3.1 or later) OpenBoot Prom (OBP) has local-mac-address? variable set to 'false'.
    <analysis>The local-mac-address? variable must be set to 'true.' Proper operation of the public networks depends on each interface having a different MAC address.</analysis>
    <recommendations>Change the local-mac-address? variable to true: 1) From the OBP (ok> prompt): ok> setenv local-mac-address? true ok> reset 2) Or as root: # /usr/sbin/eeprom local-mac-address?=true # init 0 ok> reset</recommendations>
    ERROR4
    Comment: No comment on this; I have done what it says, no joy...
    <problem_statement>/proc fails to mount periodically during reboots.
    <analysis>Something is trying to access /proc before it is normally mounted during the boot process. This can cause /proc not to mount. If /proc isn't mounted, some Solaris Cluster daemons might fail on startup, which can cause the node to panic. The following lines were found:</analysis>
    Thanks!

  • Oracle ASM  installation in Solaris Cluster

    hello Experts,
    Could someone please tell me how to install Oracle ASM in Solaris Cluster and how to integrate it into the cluster resources.
    Details,
    2 nodes (primary & secondary), Solaris 10 SPARC 64-bit OS
    Solaris Cluster 3.3 5/11
    Thanks & Regards

    hi,
    please take a look at this doc:
    http://docs.oracle.com/cd/E18728_01/html/821-2678/gjcwv.html
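    For orientation, such a setup generally involves registering the RAC framework resource type on the Solaris Cluster side and putting it in a scalable resource group; a very rough sketch only (verify type and property names against the doc above - the group and resource names below are invented):
    # clresourcetype register SUNW.rac_framework
    # clresourcegroup create -S rac-fwk-rg
    # clresource create -g rac-fwk-rg -t SUNW.rac_framework rac-fwk-rs
    # clresourcegroup online -emM rac-fwk-rg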
    regards,

  • Solaris Cluster Server Homogeneous System Copy

    Hi,
    We have SAP ECC 6.0 on a Solaris SPARC server. Our database is Oracle 10.2.0.2. We will buy 2 new Solaris servers and we want to use these servers as a cluster. We want to make a homogeneous system copy from the current server to the new cluster servers. How can we make a homogeneous copy from the current server to the new cluster servers? First of all, do we have to install a new ECC 6.0 on the cluster servers and then make a homogeneous system copy?
    Also, how can we install ECC 6.0 on a Solaris Cluster server? Will we have to install ECC 6.0 separately on each server of the cluster unit? I have the SAP installation document but it is not clear. Do you have another document for SAP installation on a cluster server? Please help us with these issues.
    Best regards.

    First, to move your ECC system from your current server to the new server, yes, you do need a homogeneous copy. There are other methods, like storage sub-system level copy, but they are not supported. Since your system is ABAP, you can set up a standby database on the new server and use Data Guard to replicate the database. You can move profiles, binaries and other application file systems manually since they are on external storage. You should change your profiles to use the logical hostname instead of the physical hostname, though Solaris 10 gives you the ability to move zones/containers with the same hostname.
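    For instance, pointing the instance profiles at the logical hostname is typically done with the SAPLOCALHOST profile parameter; a sketch, the hostname below is invented:
    SAPLOCALHOST = sapci-lh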
    I am assuming you have a distributed system, i.e. database and SAP separate. You can set up RAC for database HA. For SAP, you can use Sun Cluster or Veritas Cluster to move the SAP central instance with the logical hostname to the 2nd standby server.
    You can also separate the ASCS from the SAP central instance and set up failover for the CS instance with the cluster software. Check notes 821904, 870652. You can take another step in HA with ENQ replication (see http://help.sap.com/saphelp_nw70/helpdata/EN/36/67973c3f5aff39e10000000a114084/content.htm)
    -Regards

  • Cluster on VM sparc.

    Hi all,
    I have 2 SPARC T4 servers and one 2540 storage array; is it necessary to install Solaris Cluster software for the Oracle database in each VM?
    Thanks,
    jari.

    Hi,
    it depends on what you would like to achieve. A good starting point is
    SPARC: Oracle Solaris Cluster Topologies - Oracle Solaris Cluster Concepts Guide
    regards
    Walter

  • Problem with application directories on JDS - Solaris 10 11/06 sparc

    I am running a fully patched JDS on a Sun Blade with Solaris 10 11/06 sparc.
    After installing, removing, and re-installing Sun Download Manager 2.0 (web start version) I am having problems with JDS application menus.
    I used File Manager (probably accessed through the This Computer Desktop icon) to move around directories under applications:///. This information is saved in ~/.gnome2/vfolders. Folders available from the Launch menu under Applications were all messed up, but I solved this problem by removing all contents of ~/.gnome2/vfolders and restarting JDS.
    Now, when I enter the This Computer icon on the Desktop, and click on the Applications icon in there, there is an extra Applications icon there that is not associated with any application and cannot be removed (when I try and trash it, I get an error saying something about "not on same file system"). Also, when I enter the Internet folder under Applications, there is an extra Internet icon in there that also cannot be removed. These extra folders are not seen when accessing the Applications menu from the Launch menu. How can I clean up these "bad" launchers, or folders, or whatever they are, that appear to exist only under the Applications directory accessible through This Computer (computer:///)?
    Thank you...

    Maybe the answer is here:
    http://docs.sun.com/app/docs/doc/819-0918/6n3aglfdl?l=en&a=view&q=location+of+applications-all-users
    To Add a Menu Using Menu Files
    To add a menu for all users, perform the following steps:
    1. Create a directory entry file for the item that you want to add. Create the directory entry file in the /usr/share/gnome/vfolders directory. For more information about directory entry files, see Directory Entry Files.
    2. Locate the vfolder information file for the location where you want to add the menu. For example, to add a menu to the Applications menu, locate the file /etc/gnome-vfs-2.0/vfolders/applications-all-users.vfolder-info.
    3. In the vfolder information file, add a <Folder> element for the new menu. For more information about vfolder information files, see Vfolders and Menus.
    The next time users log in, the menu is in the assigned location.
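    For reference, a directory entry file is just a small .desktop-format text file; a minimal sketch (the file name and menu name below are made up - see the doc above for the exact vfolder XML):
    [Desktop Entry]
    Name=My Tools
    Comment=Locally added menu
    Icon=gnome-fs-directory
    Type=Directory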

  • Grid installation: root.sh failed on the first node on Solaris cluster 4.1

    Hi all,
    I'm trying to install Grid Infrastructure (11.2.0.3.0) on a 2-node cluster (OSC 4.1).
    When I run root.sh on the first node, I get the following output:
    xha239080-root-5.11# root.sh
    Performing root user operation for Oracle 11g
    The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME= /Grid/CRShome
    Enter the full pathname of the local bin directory: [/usr/local/bin]:
    /usr/local/bin is read only. Continue without copy (y/n) or retry (r)? [y]:
    Warning: /usr/local/bin is read only. No files will be copied.
    Creating /var/opt/oracle/oratab file...
    Entries will be added to the /var/opt/oracle/oratab file as needed by
    Database Configuration Assistant when a database is created
    Finished running generic part of root script.
    Now product-specific root actions will be performed.
    Using configuration parameter file: /Grid/CRShome/crs/install/crsconfig_params
    Creating trace directory
    User ignored Prerequisites during installation
    OLR initialization - successful
    root wallet
    root wallet cert
    root cert export
    peer wallet
    profile reader wallet
    pa wallet
    peer wallet keys
    pa wallet keys
    peer cert request
    pa cert request
    peer cert
    pa cert
    peer root cert TP
    profile reader root cert TP
    pa root cert TP
    peer pa cert TP
    pa peer cert TP
    profile reader pa cert TP
    profile reader peer cert TP
    peer user cert
    pa user cert
    Adding Clusterware entries to inittab
    CRS-2672: Attempting to start 'ora.mdnsd' on 'xha239080'
    CRS-2676: Start of 'ora.mdnsd' on 'xha239080' succeeded
    CRS-2672: Attempting to start 'ora.gpnpd' on 'xha239080'
    CRS-2676: Start of 'ora.gpnpd' on 'xha239080' succeeded
    CRS-2672: Attempting to start 'ora.cssdmonitor' on 'xha239080'
    CRS-2672: Attempting to start 'ora.gipcd' on 'xha239080'
    CRS-2676: Start of 'ora.cssdmonitor' on 'xha239080' succeeded
    CRS-2676: Start of 'ora.gipcd' on 'xha239080' succeeded
    CRS-2672: Attempting to start 'ora.cssd' on 'xha239080'
    CRS-2672: Attempting to start 'ora.diskmon' on 'xha239080'
    CRS-2676: Start of 'ora.diskmon' on 'xha239080' succeeded
    CRS-2676: Start of 'ora.cssd' on 'xha239080' succeeded
    ASM created and started successfully.
    Disk Group DATA created successfully.
    clscfg: -install mode specified
    Successfully accumulated necessary OCR keys.
    Creating OCR keys for user 'root', privgrp 'root'..
    Operation successful.
    CRS-4256: Updating the profile
    Successful addition of voting disk 9cdb938773bc4f16bf332edac499fd06.
    Successful addition of voting disk 842907db11f74f59bf65247138d6e8f5.
    Successful addition of voting disk 748852d2a5c84f72bfcd50d60f65654d.
    Successfully replaced voting disk group with +DATA.
    CRS-4256: Updating the profile
    CRS-4266: Voting file(s) successfully replaced
    ## STATE File Universal Id File Name Disk group
    1. ONLINE 9cdb938773bc4f16bf332edac499fd06 (/dev/did/rdsk/d10s6) [DATA]
    2. ONLINE 842907db11f74f59bf65247138d6e8f5 (/dev/did/rdsk/d8s6) [DATA]
    3. ONLINE 748852d2a5c84f72bfcd50d60f65654d (/dev/did/rdsk/d9s6) [DATA]
    Located 3 voting disk(s).
    Start of resource "ora.cssd" failed
    CRS-2672: Attempting to start 'ora.cssdmonitor' on 'xha239080'
    CRS-2672: Attempting to start 'ora.gipcd' on 'xha239080'
    CRS-2676: Start of 'ora.cssdmonitor' on 'xha239080' succeeded
    CRS-2676: Start of 'ora.gipcd' on 'xha239080' succeeded
    CRS-2672: Attempting to start 'ora.cssd' on 'xha239080'
    CRS-2672: Attempting to start 'ora.diskmon' on 'xha239080'
    CRS-2676: Start of 'ora.diskmon' on 'xha239080' succeeded
    CRS-2674: Start of 'ora.cssd' on 'xha239080' failed
    CRS-2679: Attempting to clean 'ora.cssd' on 'xha239080'
    CRS-2681: Clean of 'ora.cssd' on 'xha239080' succeeded
    CRS-2673: Attempting to stop 'ora.gipcd' on 'xha239080'
    CRS-2677: Stop of 'ora.gipcd' on 'xha239080' succeeded
    CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'xha239080'
    CRS-2677: Stop of 'ora.cssdmonitor' on 'xha239080' succeeded
    CRS-5804: Communication error with agent process
    CRS-4000: Command Start failed, or completed with errors.
    Failed to start Oracle Grid Infrastructure stack
    Failed to start Cluster Synchorinisation Service in clustered mode at /Grid/CRShome/crs/install/crsconfig_lib.pm line 1211.
    /Grid/CRShome/perl/bin/perl -I/Grid/CRShome/perl/lib -I/Grid/CRShome/crs/install /Grid/CRShome/crs/install/rootcrs.pl execution failed
    xha239080-root-5.11# history
    Checking the ocssd.log, I see something like the following:
    2013-09-16 18:46:24.238: [    CSSD][1]clssscmain: Starting CSS daemon, version 11.2.0.3.0, in (clustered) mode with uniqueness value 1379371584
    2013-09-16 18:46:24.239: [    CSSD][1]clssscmain: Environment is production
    2013-09-16 18:46:24.239: [    CSSD][1]clssscmain: Core file size limit extended
    2013-09-16 18:46:24.248: [    CSSD][1]clssscmain: GIPCHA down 1
    2013-09-16 18:46:24.249: [    CSSD][1]clssscGetParameterOLR: OLR fetch for parameter logsize (8) failed with rc 21
    2013-09-16 18:46:24.250: [    CSSD][1]clssscExtendLimits: The current soft limit for file descriptors is 65536, hard limit is 65536
    2013-09-16 18:46:24.250: [    CSSD][1]clssscExtendLimits: The current soft limit for locked memory is 4294967293, hard limit is 4294967293
    2013-09-16 18:46:24.250: [    CSSD][1]clssscGetParameterOLR: OLR fetch for parameter priority (15) failed with rc 21
    2013-09-16 18:46:24.250: [    CSSD][1]clssscSetPrivEnv: Setting priority to 4
    2013-09-16 18:46:24.253: [    CSSD][1]clssscSetPrivEnv: unable to set priority to 4
    2013-09-16 18:46:24.253: [    CSSD][1]SLOS: cat=-2, opn=scls_mem_lockdown, dep=11, loc=mlockall
    unable to lock memory
    2013-09-16 18:46:24.253: [    CSSD][1](:CSSSC00011:)clssscExit: A fatal error occurred during initialization
    Does anyone have any idea what is going on and how I can fix it?

    Hi,
    Solaris has several issues with DISM, e.g.:
    Solaris 10 and Solaris 11 Shared Memory Locking May Fail (Doc ID 1590151.1)
    Sounds like Solaris Cluster has a similar bug. A "workaround" is to reboot the (cluster) zone; that "fixes" the mlock error. This bug was introduced with updates in September, at least in our environment (Solaris 11.1). Before that I did not have the issue, and now I have to restart the entire zone whenever I stop CRS.
    With 11.2.0.3 the root.sh script can be rerun without prior cleanup, so you should be able to continue the installation at that point after the reboot. After root.sh completes, some configuration assistants need to be run to complete the installation. You need to execute these manually as you wiped your OUI session.
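    In other words, the retry sequence on the failed node is roughly the following (paths as in the original post; a sketch, not something I have verified on OSC 4.1):
    # init 6                   (reboot the node/zone to clear the mlockall failure)
    # /Grid/CRShome/root.sh    (with 11.2.0.3 it resumes without a prior deconfig)
    followed by running the remaining configuration assistants manually.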
    Kind Regards
    Thomas

  • HOWTO: Create 2-node Solaris Cluster 4.1/Solaris 11.1(x64) using VirtualBox

    I did this on VirtualBox 4.1 on Windows 7 and VirtualBox 4.2 on Linux.X64. Basic pre-requisites are : 40GB disk space, 8GB RAM, 64-bit guest capable VirtualBox.
    Please read all the descriptive messages/prompts shown by 'scinstall' and 'clsetup' before answering.
    0) Download from OTN
    - Solaris 11.1 Live Media for x86(~966 MB)
    - Complete Solaris 11.1 IPS Repository Image (total 7GB)
    - Oracle Solaris Cluster 4.1 IPS Repository image (~73MB)
    1) Run VirtualBox Console, create VM1 : 3GB RAM, 30GB HDD
    2) The new VM1 has 1 NIC, add 2 more NICs (total 3). Setting the NIC to any type should be okay, 'VirtualBox Host Only Adapter' worked fine for me.
    3) Start VM1, point the "Select start-up disk" to the Solaris 11.1 Live Media ISO.
    4) Select "Oracle Solaris 11.1" in the GRUB menu. Select Keyboard layout and Language.
    VM1 will boot and the Solaris 11.1 Live Desktop screen will appear.
    5) Click <Install Oracle Solaris> from the desktop, supply necessary inputs.
    Default Disk Discovery (iSCSI not needed) and Disk Selection are fine.
    Disable the "Support Registration" connection info
    6) The alternate user created during the install has root privileges (sudo). Set an appropriate name for VM1.
    7) When the VM has to be rebooted after the installation is complete, make sure the Solaris 11.1 Live ISO is ejected, or else the VM will boot from the Live CD again.
    8) Repeat steps 1-6, create VM2 and install Solaris.
    9) FTP (secure) the Solaris 11.1 IPS repository and Solaris Cluster 4.1 IPS images onto both VMs, e.g. under /home/user1/
    10) We need to setup both the packages: Solaris 11.1 Repository and Solaris Cluster 4.1
    11) All commands now to be run as root
    12) By default the 'solaris' repository is of type online (pkg.oracle.com); it needs to be updated to point at the local ISO we downloaded :-
    +$ sudo sh+
    +# lofiadm -a /home/user1/sol-11_1-repo-full.iso+
    +//output : /dev/lofi/N+
    +# mount -F hsfs /dev/lofi/N /mnt+
    +# pkg set-publisher -G '*' -M '*' -g /mnt/repo solaris+
    13) Setup the ha-cluster package :-
    +# lofiadm -a /home/user1/osc-4_1-ga-repo-full.iso+
    +//output : /dev/lofi/N+
    +# mkdir /mnt2+
    +# mount -F hsfs /dev/lofi/N /mnt2+
    +# pkg set-publisher -g file:///mnt2/repo ha-cluster+
    14) Verify both packages are fine :-
    +# pkg publisher+
    PUBLISHER                   TYPE     STATUS P LOCATION
    solaris                     origin   online F file:///mnt/repo/
    ha-cluster                  origin   online F file:///mnt2/repo/
    15) Install the complete SC4.1 package by installing 'ha-cluster-full'
    +# pkg install ha-cluster-full+
    Repeat steps 12-15 on VM2.
    Now both VMs have the OS and SC4.1 installed.
    16) By default the 3 NICs are in the "Automatic" profile and have DHCP configured. We need to activate the Fixed profile and put the 3 NICs into it. Only 1 interface, the public interface, needs to be
    configured. The other 2 are for the cluster interconnect and will be automatically configured by scinstall. Execute the following commands :-
    +# netadm enable -p ncp defaultfixed+
    +//verify+
    +# netadm list -p ncp defaultfixed+
    +#Configure the public-interface+
    +#Verify none of the interfaces are listed, add all the 3+
    +# ipadm show-if+
    +// run dladm show-phys or dladm show-link to check the interface names: they must be net0/net1/net2+
    +# ipadm create-ip net0+
    +# ipadm create-ip net1+
    +# ipadm create-ip net2+
    +# ipadm show-if+
    +//select proper IP and configure the public interface. I have used 192.168.56.171 & 172+
    +# ipadm create-addr -T static -a 192.168.56.171/24 net0/publicip+
    +#IP plumbed, restart+
    +# ipadm down-addr -t net0/publicip+
    +# ipadm up-addr -t net0/publicip+
    +//Verify publicip is fine by pinging the host+
    +# ping 192.168.56.1+
    +//Verify, net0 should be up, net1/net2 should be down+
    +# ipadm+
    17) Repeat step 16 on VM2
    18) Verify both VMs can ping each other using the public IP. Add entries to each other's /etc/hosts
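    For example, /etc/hosts on each VM ends up with entries like these (IPs and VM names as used in this walkthrough):
    +192.168.56.171  solvm1+
    +192.168.56.172  solvm2+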
    Now we are ready to run scinstall and create/configure the 2-node cluster
    19)
    +# cd /usr/cluster/bin+
    +# ./scinstall+
    select 1) Create a new cluster ...
    select 1) Create a new cluster
    select 2) Custom in "Typical or Custom Mode"
    Enter cluster name : mycluster1 (e.g)
    Add the 2 nodes : solvm1 & solvm2 and press <ctrl-d>
    Accept default "No" for <Do you need to use DES authentication>"
    Accept default "Yes" for <Should this cluster use at least two private networks>
    Enter "No" for <Does this two-node cluster use switches>
    Select "1)net1" for "Select the first cluster transport adapter"
    If there is a warning of unexpected traffic on "net1", ignore it
    Enter "net1" when it asks corresponding adapter on "solvm2"
    Select "2)net2" for "Select the second cluster transport adapter"
    Enter "net2" when it asks corresponding adapter on "solvm2"
    Select "Yes" for "Is it okay to accept the default network address"
    Select "Yes" for "Is it okay to accept the default network netmask"Now the IP addresses 172.16.0.0 will be plumbed in the 2 private interfaces
    Select "yes" for "Do you want to turn off global fencing"
    (These are SATA serial disks, so no fencing)
    Enter "Yes" for "Do you want to disable automatic quorum device selection"
    (we will add quorum disks later)
    Enter "Yes" for "Proceed with cluster creation"
    Select "No" for "Interrupt cluster creation for cluster check errors"
    The second node will be configured and 2nd node rebooted
    The first node will be configured and rebooted.
    After both nodes have rebooted, verify the cluster has been created and both nodes joined.
    On both nodes :-
    +# cd /usr/cluster/bin+
    +# ./clnode status+
    +//should show both nodes Online.+
    At this point there are no quorum disks, so one of the nodes will be designated the quorum vote. That node's VM has to be up for the other node to come up and the cluster to be formed.
    To check the current quorum status, run :-
    +# ./clquorum show+
    +//one of the nodes will have 1 vote and the other 0 (zero).+
    20)
    Now the cluster is in 'Installation Mode' and we need to add a quorum disk.
    Shutdown both the nodes as we will be adding shared disks to both of them
    21)
    Create 2 VirtualBox HDDs (VDI Files) on the host, 1 for quorum and 1 for shared filesystem. I have used a size of 1 GB for each :-
    *$ vboxmanage createhd --filename /scratch/myimages/sc41cluster/sdisk1.vdi --size 1024 --format VDI --variant Fixed*
    *0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%*
    *Disk image created. UUID: 899147b9-d21f-4495-ad55-f9cf1ae46cc3*
    *$ vboxmanage createhd --filename /scratch/myimages/sc41cluster/sdisk2.vdi --size 1024 --format VDI --variant Fixed*
    *0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%*
    *Disk image created. UUID: 899147b9-d22f-4495-ad55-f9cf15346caf*
    22)
    Attach these disks to both the VMs as shared type
    *$ vboxmanage storageattach solvm1 --storagectl "SATA" --port 1 --device 0 --type hdd --medium /scratch/myimages/sc41cluster/sdisk1.vdi --mtype shareable*
    *$ vboxmanage storageattach solvm1 --storagectl "SATA" --port 2 --device 0 --type hdd --medium /scratch/myimages/sc41cluster/sdisk2.vdi --mtype shareable*
    *$ vboxmanage storageattach solvm2 --storagectl "SATA" --port 1 --device 0 --type hdd --medium /scratch/myimages/sc41cluster/sdisk1.vdi --mtype shareable*
    *$ vboxmanage storageattach solvm2 --storagectl "SATA" --port 2 --device 0 --type hdd --medium /scratch/myimages/sc41cluster/sdisk2.vdi --mtype shareable*
    The disks are attached to SATA ports 1 & 2 of each VM. On my VirtualBox on Linux, the controller type is "SATA", whereas on Windows it is "SATA Controller".
    The "--mtype shareable' parameter is important
    23)
    Mark both disks as shared :-
    *$ vboxmanage modifyhd /scratch/myimages/sc41cluster/sdisk1.vdi --type shareable*
    *$ vboxmanage modifyhd /scratch/myimages/sc41cluster/sdisk2.vdi --type shareable*
    24) Start both VMs. We need to format the 2 shared disks
    25) From VM1, run format. In my case, the 2 new shared disks show up as 'c7t1d0' and 'c7t2d0'.
    +# format+
    select disk 1 (c7t1d0)
    [disk formatted]
    FORMAT MENU
    fdisk
    Type 'y' to accept default partition
    partition
    0
    <enter>
    <enter>
    1
    995mb
    print
    label
    <yes>
    quit
    quit
    26) Repeat step 25 for the 2nd disk (c7t2d0)
    27) Make sure the shared disks can be used for quorum :-
    On VM1
    +# ./cldevice refresh+
    +# ./cldevice show+
    On VM2
    +# ./cldevice refresh+
    +# ./cldevice show+
    The shared disks should have the same DID (d2, d3, d4 etc.). Note down the DID that you are going to use for quorum (e.g. d2).
    By default, global fencing is enabled for these disks. We need to turn it off for all disks as these are SATA disks :-
    +# cldevice set -p default_fencing=nofencing-noscrub d1+
    +# cldevice set -p default_fencing=nofencing-noscrub d2+
    +# cldevice set -p default_fencing=nofencing-noscrub d3+
    +# cldevice set -p default_fencing=nofencing-noscrub d4+
    28) It is better to do one more reboot of both VMs; otherwise I got an error when adding the quorum disk.
    29) Run clsetup to add quorum disk and to complete cluster configuration :-
    +# ./clsetup+
    === Initial Cluster Setup ===
    Enter 'Yes' for "Do you want to continue"
    Enter 'Yes' for "Do you want add any quorum devices"
    Select '1) Directly Attached Shared Disk' for the type of device
    Enter 'Yes' for "Is it okay to continue"
    Enter 'd2' (or 'd3') for 'Which global device do you want to use'
    Enter 'Yes' for "Is it okay to proceed with the update"
    The command 'clquorum add d2' is run
    Enter 'No' for "Do you want to add another quorum device"
    Enter 'Yes' for "Is it okay to reset "installmode"?"Cluster initialization is complete.!!!
    30) Run 'clquorum status' to confirm both nodes and the quorum disk have 1 vote each
    31) Run other cluster commands to explore!
    I will cover Data services and shared file system in another post. Basically the other shared disk
    can be used to create a UFS filesystem and mount it on all nodes.
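    For example, turning the other shared disk into a globally mounted UFS filesystem looks roughly like this (the DID name d3 and the mount point are placeholders - a sketch only):
    +# newfs /dev/global/rdsk/d3s0+
    +# mkdir -p /global/data+   (on both nodes)
    +//add this line to /etc/vfstab on both nodes:+
    +/dev/global/dsk/d3s0 /dev/global/rdsk/d3s0 /global/data ufs 2 yes global,logging+
    +# mount /global/data+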

    The Solaris Cluster 4.1 Installation and Concepts Guides are available at :-
    http://docs.oracle.com/cd/E29086_01/index.html
    Thanks.

  • Oracle 9.0.1 on Solaris 9 4/03 Sparc ?

    Hi,
    I have a CD pack for Oracle 9.0.1 and I was going to install it on Solaris 9 4/03. But then I read the release notes and Solaris 9 is not mentioned. It is not mentioned in the 9.2.0.1 release notes either... I only saw it for Oracle 10.
    Is Oracle 9.0.1 supported on Solaris 9 (4/03) SPARC?
    Thanks,
    Alex...

    Anyone?

  • Installation of Solaris 10 on SUN Sparc M4000

    Hi
    I am trying to install Solaris 10 on SUN SPARC M4000.
    Here is the error I have:
    +Enter filename [kernel/sparcv9/unix]: /platform/sun4v/kernel/sparcv9/unix+
    +Enter default directory for modules [platform/sun4v/kernel /kernel /usr/kernel]:+
    krtld: load_exec: fail to expand cpu/$CPU
    krtld: error during initial load/link phase
    panic - boot: exitto64 returned from client program
    Program terminated
    It seems that the version of Solaris I am trying to install is not supported by the hardware; when I use another Solaris 10 DVD, it works...
    But the problem is that I need to install this particular Solaris version because other software depends on it.
    If someone knows how to make this installation possible, it would be very helpful.
    Thanks in advance.
    BR
    Racine

    Minimum supported Solaris[tm] Operating Environment versions for the M4000 are as follows:
    OpenSolaris 2009.06
    SPARC64 VI:
    10 - 11/06 (U3) plus required patches (minimum)
    10 - 08/07 (U4) (recommended)
    SPARC64 VII:
    10 - 08/07 (U4) plus required patches (minimum)
    10 - 5/08 (U5) (recommended)
    A full matrix for the M4000 which details the minimum patch requirements to support the h/w is available from https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=1145383.1
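    A quick way to see which processor variant (and therefore which minimum update level) applies is to run the following from an environment that does boot on the box, e.g. the DVD that works:
    # psrinfo -pv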

  • Solaris 8 install on Sparc without CD-ROM

    I would like to install Solaris 8 on a SPARC 20 workstation that does not have a CD-ROM drive. I believe it should be possible to do an installation from another machine over the network. Does anyone have experience doing such an install? The host that I hope to install from is an x86 Linux box, if that makes any difference.
    Any advice / hints would be appreciated.
    joey richards
    [email protected]

    Isn't it easier to temporarily attach a SCSI CD-ROM to the SPARC, just for installing the OS?
    But you can boot/install solaris over the network.
    The boot procedure is documented here:
    http://docs.sun.com:80/ab2/coll.40.6/REFMAN1M/@Ab2PageView/25227
    You'll need:
    - a "rarp" deomon that maps the sun's ethernet addes to an
    ip-address.
    - a "tftp" daemon that has a copy of the boot code for the
    client. Stored under the hex encoded IP address of the
    discless client with it's architecture name appended, like
    this (example for a client at addr 172.20.0.72):
    % pwd
    /tftpboot
    % ls -l AC* inetboot.sun4m.solaris.8
    lrwxrwxrwx 1 root other 24 Jul 23 2000 AC140048 -> inetboot.sun4m.solaris.8
    lrwxrwxrwx 1 root other 8 Jul 23 2000 AC140048.SUN4M -> AC140048
    -rw-r--r-- 1 root other 117204 Jul 16 2000 inetboot.sun4m.solaris.8
    You'll find the "inetboot" somewhere on the solaris 8 boot
    cd. (...lib/fs/nfs/inetboot ?)
    - an rpc.bootparamd server that tells the client where it can NFS-mount its root filesystem.
    An entry for a diskless client named "disclessclient" looks like this:
    like this:
    disclessclient root=server:/cdrom/cdrom0/s0/Solaris_8/Tools/Boot \
    install=server:/cdrom/cdrom0/s0 \
    boottype=:in rootopts=:rsize=32768
    Of course the server must have the "Software 1 of 2" Solaris CD-ROM mounted and must NFS-export it. Note that the diskless client needs "root" access to the filesystem, i.e. do not map the root uid to nobody.
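    If the boot server is itself a Solaris box, the supporting pieces look roughly like this (the MAC address is an example only - a sketch):
    % cat /etc/ethers
    8:0:20:12:34:56 disclessclient
    # share -F nfs -o ro,anon=0 /cdrom/cdrom0/s0
    (anon=0 keeps root from being mapped to nobody, as noted above)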
    Bringing Linux into the equation complicates matters, for sure...

  • CPU message on Solaris 8 install on Sparc U30

    When installing Solaris 8 on an Ultra SPARC 30, I get the following message after "Configuring /dev and /devices":
    panic[cpu0]/ thread=2a1001dd40: BAD TRAP: type31 rp 2a1001cf30 addr=2f150fcb998 mmu_fsr=0
    followed by:
    sched: trap type=0x31
    followed by what appears to be a memory dump
    followed by Watchdog Reset and Externally Initiated Reset
    Is this a media problem or a hardware failure?

    What version of NES/iPlanet are you running?
    thanks
    Jong
    "Ed Itkin" <[email protected]> wrote:
    >
    I'm getting a strange message when the Netscape client starts on Solaris 8:
    cbfe-dev-web01:/opt/weblogic:263> /opt/netscape/netscape
    ERROR: ld.so.1: /opt/netscape/netscape: fatal: relocation error: file
    /opt/netscape/plugins/libproxy.so:
    symbol __nsapi30_table: referenced symbol not found
    Cant load plugin /opt/netscape/plugins/libproxy.so. Ignored.
    Can you give me an idea what happened to NSAPI? It worked before.
    I rebooted the Solaris server - it did not help.
    Just a reminder: libproxy.so was copied from the WebLogic 6.0 SP2 ../lib/solaris directory into /opt/netscape/plugins.
