HOWTO: Create 2-node Solaris Cluster 4.1/Solaris 11.1(x64) using VirtualBox

I did this on VirtualBox 4.1 on Windows 7 and on VirtualBox 4.2 on Linux x64. Basic pre-requisites: 40GB of disk space, 8GB of RAM, and a VirtualBox that can run 64-bit guests.
Please read all the descriptive messages/prompts shown by 'scinstall' and 'clsetup' before answering.
0) Download from OTN
- Solaris 11.1 Live Media for x86 (~966 MB)
- Complete Solaris 11.1 IPS Repository Image (total 7GB)
- Oracle Solaris Cluster 4.1 IPS Repository image (~73MB)
1) Run the VirtualBox console and create VM1 : 3GB RAM, 30GB HDD
2) The new VM1 has 1 NIC; add 2 more NICs (3 in total). Setting the NICs to any type should be okay; 'VirtualBox Host Only Adapter' worked fine for me.
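If you prefer the command line, the extra NICs can also be added with VBoxManage from the host. This is only a sketch; the VM name (solvm1) and host-only network name (vboxnet0) are placeholders for whatever you actually use :-
*$ vboxmanage modifyvm solvm1 --nic2 hostonly --hostonlyadapter2 vboxnet0*
*$ vboxmanage modifyvm solvm1 --nic3 hostonly --hostonlyadapter3 vboxnet0*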
3) Start VM1, point the "Select start-up disk" to the Solaris 11.1 Live Media ISO.
4) Select "Oracle Solaris 11.1" in the GRUB menu. Select Keyboard layout and Language.
VM1 will boot and the Solaris 11.1 Live Desktop screen will appear.
5) Click <Install Oracle Solaris> on the desktop and supply the necessary inputs.
Default Disk Discovery (iSCSI not needed) and Disk Selection are fine.
Disable the "Support Registration" connection info
6) The alternate user created during the install has root privileges (via sudo). Set an appropriate name for VM1.
7) When the VM is rebooted after the installation completes, make sure the Solaris 11.1 Live ISO has been ejected, or else the VM will boot from the Live CD again.
8) Repeat steps 1-6, create VM2 and install Solaris.
9) Copy (secure FTP/scp) the Solaris 11.1 IPS repository ISO and the Solaris Cluster 4.1 IPS repository ISO onto both VMs, e.g. under /home/user1/
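For example, from the host (a sketch only; the file names match the OTN downloads above, while the user name and the <vm-ip> placeholder stand for whatever login and address your VM currently has) :-
*$ scp sol-11_1-repo-full.iso osc-4_1-ga-repo-full.iso user1@<vm-ip>:/home/user1/*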
10) We need to set up both publishers: the Solaris 11.1 repository and the Solaris Cluster 4.1 repository
11) All commands from here on are to be run as root
12) By default the 'solaris' publisher is of type online (pkg.oracle.com); it needs to be updated to point at the local repository ISO we downloaded :-
+$ sudo sh+
+# lofiadm -a /home/user1/sol-11_1-repo-full.iso+
+//output : /dev/lofi/N+
+# mount -F hsfs /dev/lofi/N /mnt+
+# pkg set-publisher -G '*' -M '*' -g /mnt/repo solaris+
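Note that the lofi device and the /mnt mount do not survive a reboot. If you reboot the VM before you are done installing packages, re-create them (the lofi device number N may differ) :-
+# lofiadm -a /home/user1/sol-11_1-repo-full.iso+
+# mount -F hsfs /dev/lofi/N /mnt+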
13) Set up the ha-cluster publisher :-
+# lofiadm -a /home/user1/osc-4_1-ga-repo-full.iso+
+//output : /dev/lofi/N+
+# mkdir /mnt2+
+# mount -F hsfs /dev/lofi/N /mnt2+
+# pkg set-publisher -g file:///mnt2/repo ha-cluster+
14) Verify both publishers are fine :-
+# pkg publisher+
PUBLISHER                   TYPE     STATUS P LOCATION
solaris                     origin   online F file:///mnt/repo/
ha-cluster                  origin   online F file:///mnt2/repo/
15) Install the complete SC4.1 package by installing 'ha-cluster-full'
+# pkg install ha-cluster-full+
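To quickly confirm the cluster software landed where expected (a hedged check; the exact output will vary) :-
+# pkg info ha-cluster-full+
+# ls /usr/cluster/bin+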
16) Repeat steps 12-15 on VM2.
17) Now both VMs have the OS and SC 4.1 installed.
18) By default the 3 NICs are in the "Automatic" network profile and are configured via DHCP. We need to activate the fixed profile and put the 3 NICs into it. Only 1 interface, the public interface, needs to be
configured manually. The other 2 are for the cluster interconnect and will be configured automatically by scinstall. Execute the following commands :-
+# netadm enable -p ncp defaultfixed+
+//verify+
+# netadm list -p ncp defaultfixed+
+//Configure the public interface+
+//Verify none of the interfaces are listed, then add all 3+
+# ipadm show-if+
+//run 'dladm show-phys' or 'dladm show-link' to check the interface names : they must be net0/net1/net2+
+# ipadm create-ip net0+
+# ipadm create-ip net1+
+# ipadm create-ip net2+
+# ipadm show-if+
+//select proper IP and configure the public interface. I have used 192.168.56.171 & 172+
+# ipadm create-addr -T static -a 192.168.56.171/24 net0/publicip+
+//IP plumbed, restart the address+
+# ipadm down-addr -t net0/publicip+
+# ipadm up-addr -t net0/publicip+
+//Verify publicip is fine by pinging the host+
+# ping 192.168.56.1+
+//Verify, net0 should be up, net1/net2 should be down+
+# ipadm+
19) Repeat step 18 on VM2
20) Verify both VMs can ping each other using the public IPs. Add entries for each other to /etc/hosts on both VMs, for example :-
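(the host names and addresses below are the ones used in this walkthrough; adjust to your own)
+192.168.56.171   solvm1+
+192.168.56.172   solvm2+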
Now we are ready to run scinstall and create/configure the 2-node cluster
21)
+# cd /usr/cluster/bin+
+# ./scinstall+
select 1) Create a new cluster ...
select 1) Create a new cluster
select 2) Custom in "Typical or Custom Mode"
Enter cluster name : mycluster1 (e.g)
Add the 2 nodes : solvm1 & solvm2 and press <ctrl-d>
Accept default "No" for <Do you need to use DES authentication>"
Accept default "Yes" for <Should this cluster use at least two private networks>
Enter "No" for <Does this two-node cluster use switches>
Select "1)net1" for "Select the first cluster transport adapter"
If there is a warning about unexpected traffic on "net1", ignore it
Enter "net1" when it asks corresponding adapter on "solvm2"
Select "2)net2" for "Select the second cluster transport adapter"
Enter "net2" when it asks corresponding adapter on "solvm2"
Select "Yes" for "Is it okay to accept the default network address"
Select "Yes" for "Is it okay to accept the default network netmask"Now the IP addresses 172.16.0.0 will be plumbed in the 2 private interfaces
Select "yes" for "Do you want to turn off global fencing"
(These are SATA serial disks, so no fencing)
Enter "Yes" for "Do you want to disable automatic quorum device selection"
(we will add quorum disks later)
Enter "Yes" for "Proceed with cluster creation"
Select "No" for "Interrupt cluster creation for cluster check errors"
The second node will be configured and rebooted.
The first node will then be configured and rebooted.
After both nodes have rebooted, verify the cluster has been created and both nodes have joined.
On both nodes :-
+# cd /usr/cluster/bin+
+# ./clnode status+
+//should show both nodes Online.+
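The output should look roughly like this (illustrative only; formatting differs slightly between releases) :-
=== Cluster Nodes ===
--- Node Status ---
Node Name                          Status
---------                          ------
solvm1                             Online
solvm2                             Online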
At this point there are no quorum disks, so one of the nodes is designated the quorum vote. That node's VM has to be up for the other node to come up and for the cluster to form.
To check the current quorum status, run :-
+# ./clquorum show+
+//one of the nodes will have 1 vote and the other 0 (zero).+
22)
Now the cluster is in 'Installation Mode' and we need to add a quorum disk.
Shut down both the nodes, as we will be adding shared disks to both of them.
23)
Create 2 VirtualBox HDDs (VDI files) on the host, 1 for quorum and 1 for a shared filesystem. I have used a size of 1 GB for each :-
*$ vboxmanage createhd --filename /scratch/myimages/sc41cluster/sdisk1.vdi --size 1024 --format VDI --variant Fixed*
*0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%*
*Disk image created. UUID: 899147b9-d21f-4495-ad55-f9cf1ae46cc3*
*$ vboxmanage createhd --filename /scratch/myimages/sc41cluster/sdisk2.vdi --size 1024 --format VDI --variant Fixed*
*0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%*
*Disk image created. UUID: 899147b9-d22f-4495-ad55-f9cf15346caf*
24)
Attach these disks to both the VMs as shareable :-
*$ vboxmanage storageattach solvm1 --storagectl "SATA" --port 1 --device 0 --type hdd --medium /scratch/myimages/sc41cluster/sdisk1.vdi --mtype shareable*
*$ vboxmanage storageattach solvm1 --storagectl "SATA" --port 2 --device 0 --type hdd --medium /scratch/myimages/sc41cluster/sdisk2.vdi --mtype shareable*
*$ vboxmanage storageattach solvm2 --storagectl "SATA" --port 1 --device 0 --type hdd --medium /scratch/myimages/sc41cluster/sdisk1.vdi --mtype shareable*
*$ vboxmanage storageattach solvm2 --storagectl "SATA" --port 2 --device 0 --type hdd --medium /scratch/myimages/sc41cluster/sdisk2.vdi --mtype shareable*
The disks are attached to SATA ports 1 & 2 of each VM. On my VirtualBox on Linux, the controller type is "SATA", whereas on Windows it is "SATA Controller".
The "--mtype shareable' parameter is important
25)
Mark both disks as shareable :-
*$ vboxmanage modifyhd /scratch/myimages/sc41cluster/sdisk1.vdi --type shareable*
*$ vboxmanage modifyhd /scratch/myimages/sc41cluster/sdisk2.vdi --type shareable*
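To confirm the disks are now shareable (a hedged check) :-
*$ vboxmanage showhdinfo /scratch/myimages/sc41cluster/sdisk1.vdi*
*//the Type field should read "shareable"*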
26) Start both VMs. We need to format the 2 shared disks.
27) From VM1, run 'format'. In my case, the 2 new shared disks show up as 'c7t1d0' and 'c7t2d0'.
+# format+
select disk 1 (c7t1d0)
[disk formatted]
FORMAT MENU
fdisk
Type 'y' to accept default partition
partition
0
<enter>
<enter>
1
995mb
print
label
<yes>
quit
quit
28) Repeat step 27 for the 2nd disk (c7t2d0).
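You can verify the label on each shared disk from either VM (a hedged check; the device names match what 'format' showed above) :-
+# prtvtoc /dev/rdsk/c7t1d0s2+
+# prtvtoc /dev/rdsk/c7t2d0s2+
+//each should list the ~995 MB slice 0 created above+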
29) Make sure the shared disks can be used for quorum :-
On VM1
+# ./cldevice refresh+
+# ./cldevice show+
On VM2
+# ./cldevice refresh+
+# ./cldevice show+
The shared disks should show up with the same DIDs (d2, d3, d4, etc.) on both nodes. Note down the DID that you are going to use for quorum (e.g. d2).
By default, global fencing is enabled for these disks. We need to turn it off for all disks as these are SATA disks :-
+# cldevice set -p default_fencing=nofencing-noscrub d1+
+# cldevice set -p default_fencing=nofencing-noscrub d2+
+# cldevice set -p default_fencing=nofencing-noscrub d3+
+# cldevice set -p default_fencing=nofencing-noscrub d4+
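To verify the setting took effect on the disk you will use for quorum (a hedged check) :-
+# cldevice show d2 | grep default_fencing+
+//should report default_fencing: nofencing-noscrub+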
30) It is better to do one more reboot of both VMs; without it I got an error when adding the quorum disk.
31) Run clsetup to add the quorum disk and to complete the cluster configuration :-
+# ./clsetup+
=== Initial Cluster Setup ===
Enter 'Yes' for "Do you want to continue"
Enter 'Yes' for "Do you want add any quorum devices"
Select '1) Directly Attached Shared Disk' for the type of device
Enter 'Yes' for "Is it okay to continue"
Enter 'd2' (or 'd3') for 'Which global device do you want to use'
Enter 'Yes' for "Is it okay to proceed with the update"
The command 'clquorum add d2' is run
Enter 'No' for "Do you want to add another quorum device"
Enter 'Yes' for "Is it okay to reset "installmode"?"Cluster initialization is complete.!!!
32) Run 'clquorum status' to confirm that both nodes and the quorum disk have 1 vote each
33) Run other cluster commands to explore!
I will cover data services and the shared file system in another post. Basically, the other shared disk
can be used to create a UFS filesystem and mount it globally on all nodes, along the lines of the sketch below.
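A minimal sketch, assuming the second shared disk is DID device d3 and using /global/shared as the mount point (both names are assumptions; substitute your own). For a permanent setup you would also add a corresponding entry with the 'global' mount option to /etc/vfstab on both nodes :-
+# newfs /dev/global/rdsk/d3s0+
+# mkdir -p /global/shared      //on both nodes+
+# mount -o global,logging /dev/global/dsk/d3s0 /global/shared+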

The Solaris Cluster 4.1 Installation and Concepts Guides are available at :-
http://docs.oracle.com/cd/E29086_01/index.html
Thanks.

Similar Messages

  • Creating 4 Node SQL Cluster

    Hi
    I need to create a 4 Node - SQL 2012 Cluster (all in active Mode). 
    Node-1 will be having 1 Instance, Node-2 will be having 2 Instance, Node-3 will be running with 1 instance, and Node-4 will be running with 6 SQL Instance.
    Let me know whether can I install all the 10 Instance on Node-1 and then can i go ahead and add the  node-2 by selecting 10 Instances which is already installed and then node-3 and node-4.
    or Let me know if any other recommended way of Setting up the Cluster.
    Raj

    Hi,
    In the Server 2012 Failover cluster the node maximum is 64, the SQL server also have the maximum node limit, such as SQL 2008 clustering depends on which version of the software
    you purchase, along with which version of the operating system you intend to use. Therefore for the rigid plan I suggest you ask the issue in SQL forum.
    More information:
    SQL Server forum:
    http://social.technet.microsoft.com/Forums/sqlserver/en-us/home?category=sqlserver
    Before Installing Failover Clustering
    http://msdn.microsoft.com/en-us/library/ms189910.aspx
    Thanks for your understanding and support.
    We
    are trying to better understand customer views on social support experience, so your participation in this
    interview project would be greatly appreciated if you have time.
    Thanks for helping make community forums a great place.

  • Solaris Cluster 4 with Solaris 11/11/11 -- LDOM farm

    Hi,
    In the 2011 Openworld, I had the opportunity to meet some of the Oracle cluster experts. In conversations, I found that when configuring LDOMs within a clustered environments, we could pass a complete "/dev/did/*dsk/d<num>" device directly to the guest domain.
    Are there any notes/whitepapers that someone within Oracle could direct me to that elaborates this a little more? I can reach out via our regular pre-sales channels, but I'm posting here since I know the Cluster gurus frequent this watering hole :)

    Hi Hartmut,
    I chose to use the DID namespace because of it's simplicity. I can reference a /dev/did/rdsk/d<> and be consistent across the cluster. Also, since I'm using HA to cluster the LDOMs, I don't have to worry about bringing up resources on the Control domain (since all the FC storage I use is for guest domains). The control domains themselves (which are also the IO domains) have the internal drives of the T4-4 that contain the rpool etc.
    My vds devices look like this --
    <pre>
    VDS
    NAME VOLUME OPTIONS MPGROUP DEVICE
    primary_vds0 sol11 /local/sol-11-1111-text-sparc.iso
    sol10u10 ro /local/sol-10-u10-ga2-sparc-dvd.iso
    primary_shared_vds1 d9 /dev/did/dsk/d9s2
    d11 /dev/did/dsk/d11s2
    d12 /dev/did/dsk/d12s2
    d25 /dev/did/dsk/d25s2
    d27 /dev/did/dsk/d27s2
    d28 /dev/did/dsk/d28s2
    d29 /dev/did/dsk/d29s2
    d30 /dev/did/dsk/d30s2
    d31 /dev/did/dsk/d31s2
    d32 /dev/did/dsk/d32s2
    d33 /dev/did/dsk/d33s2
    d34 /dev/did/dsk/d34s2
    d35 /dev/did/dsk/d35s2
    d36 /dev/did/dsk/d36s2
    d37 /dev/did/dsk/d37s2
    </pre>
    Edited by: implicate_order on May 11, 2012 2:11 PM
    Also, I have a script that extracts the EMC array ID, scsi id, ctd name and size etc from the DID framework.
    Edited by: implicate_order on May 11, 2012 2:11 PM

  • Solaris cluster 3.2 Sparc

    Hi folks
    First things first. I may not have great knowledge about Solaris clusters, so please be merciful :)
    Here it is what I have:
    - 2 x Netra T1 AC200 each with 1GB Ram, 2x18GB disks, 500 MHZ Sparc Cpu, 4 port ethernet card
    - 1 array netra d130 3x36 GB
    -- cable et all, switches , you name it
    So, I set up the OS, all ok. I set up the cluster, all SEEMS to be ok.
    But when I define my resources and stuff like that all goes fine, except when I try top bring the resource group on line.
    On another configuration I teste the shared logical hostname and works fine.
    Group Name Resources
    Resources: ingresc nodec ingresr
    -- Resource Groups --
    Group Name Node Name State Suspended
    Group: ingresc node2 Unmanaged No
    Group: ingresc node1 Unmanaged No
    -- Resources --
    Resource Name Node Name State Status Message
    Resource: nodec node2 Offline Offline
    Resource: nodec node1 Offline Offline
    Resource: ingresr node2 Offline Offline
    Resource: ingresr node1 Offline Offline
    scswitch: (C969069) Request failed because resource group ingresc is in ERROR_STOP_FAILED state and requires operator attention
    Now, in /var/adm/messsages I spotted this :
    Mar 6 17:09:03 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_stop> for resource <nodec>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
    Mar 6 17:09:03 node2 Cluster.RGM.rgmd: [ID 510020 daemon.notice] 46 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_stop>:tag=<IngresNCG.nodec.1>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
    A little bit of research points in the direction of a bug (see CR 6565601)
    Here it is what I see as my options:
    1 - reinstall Solaris OS, but not the Solaris Cluster 3.2, instead using Solaris Express 10/07 or 2/08. But will this combination work ? Or will it work only in the combination Solaris Cluster Express and Solaris Express Developer Edition ? If the later, which versions will work together ?
    2 - Beg for a Solaris Cluster 3.2 patch, although in my humble opinion, this should be free since it looks to me that once you write your own stuff, you run in the bug, and after all it is education
    Any ideas, help, greatly appreciated
    Many thanks
    Armand

    Although names are different since I used two setups, this is the relevant part of /var/adm/messages.
    It looks to me Ingres resource is failing:
    Mar  6 17:08:03 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_prenet_start> for resource <nodec>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
    Mar  6 17:08:03 node2 Cluster.RGM.rgmd: [ID 510020 daemon.notice] 46 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_prenet_start>:tag=<IngresNCG.nodec.10>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
    Mar  6 17:08:05 node2 svc.startd[8]: [ID 652011 daemon.warning] svc:/system/cluster/scsymon-srv:default: Method "/usr/cluster/lib/svc/method/svc_scsymon_srv start" failed with exit status 96.
    Mar  6 17:08:05 node2 svc.startd[8]: [ID 748625 daemon.error] system/cluster/scsymon-srv:default misconfigured: transitioned to maintenance (see 'svcs -xv' for details)
    Mar  6 17:08:09 node2 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hafoip_prenet_start> completed successfully for resource <nodec>, resource group <IngresNCG>, node <node2>, time used: 1% of timeout <300 seconds>
    Mar  6 17:08:09 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_PRENET_STARTED
    Mar  6 17:08:09 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_STARTING
    Mar  6 17:08:09 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_start> for resource <nodec>, resource group <IngresNCG>, node <node2>, timeout <500> seconds
    Mar  6 17:08:09 node2 Cluster.RGM.rgmd: [ID 510020 daemon.notice] 46 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_start>:tag=<IngresNCG.nodec.0>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
    Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource nodec status on node node2 change to R_FM_ONLINE
    Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource nodec status msg on node node2 change to <LogicalHostname online.>
    Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hafoip_start> completed successfully for resource <nodec>, resource group <IngresNCG>, node <node2>, time used: 0% of timeout <500 seconds>
    Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_JUST_STARTED
    Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_ONLINE_UNMON
    Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource IngresNCR state on node node2 change to R_STARTING
    Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_MON_STARTING
    Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource IngresNCR status on node node2 change to R_FM_UNKNOWN
    Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Starting>
    Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <bin/ingres_server_start> for resource <IngresNCR>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
    Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_monitor_start> for resource <nodec>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
    Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 510020 daemon.notice] 46 fe_rpc_command: cmd_type(enum):<1>:cmd=</global/disk2s0/ing_nc_1/ingresclu/bin/ingres_server_start>:tag=<IngresNCG.IngresNCR.0>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
    Mar  6 17:08:11 node2 Cluster.RGM.rgmd: [ID 268902 daemon.notice] 45 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_monitor_start>:tag=<IngresNCG.nodec.7>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
    Mar  6 17:08:12 node2 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hafoip_monitor_start> completed successfully for resource <nodec>, resource group <IngresNCG>, node <node2>, time used: 0% of timeout <300 seconds>
    Mar  6 17:08:12 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_ONLINE
    Mar  6 17:08:13 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Bringing Ingres DBMS server online.>
    Mar  6 17:08:30 node2 sendmail[534]: [ID 702911 mail.alert] unable to qualify my own domain name (node2) -- using short name
    Mar  6 17:08:30 node2 sendmail[535]: [ID 702911 mail.alert] unable to qualify my own domain name (node2) -- using short name
    Mar  6 17:08:31 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Bringing Ingres DBMS server offline.>
    Mar  6 17:08:45 node2 SC[Ingres.ingres_server,IngresNCG,IngresNCR,stop]: [ID 147958 daemon.error] ERROR : HA-Ingres failed to stop.
    Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource IngresNCR status on node node2 change to R_FM_FAULTED
    Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Ingres DBMS server faulted.>
    Mar  6 17:08:46 node2 SC[Ingres.ingres_server,IngresNCG,IngresNCR,start]: [ID 335575 daemon.error] ERROR : Stop method failed for the HA-Ingres data service.
    Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 938318 daemon.error] Method <bin/ingres_server_start> failed on resource <IngresNCR> in resource group <IngresNCG> [exit code <1>, time used: 11% of timeout <300 seconds>]
    Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource IngresNCR state on node node2 change to R_START_FAILED
    Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 529407 daemon.notice] resource group IngresNCG state on node node2 change to RG_PENDING_OFF_START_FAILED
    Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource IngresNCR state on node node2 change to R_STOPPING
    Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_MON_STOPPING
    Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource IngresNCR status on node node2 change to R_FM_UNKNOWN
    Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Stopping>
    Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <bin/ingres_server_stop> for resource <IngresNCR>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
    Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_monitor_stop> for resource <nodec>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
    Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 510020 daemon.notice] 46 fe_rpc_command: cmd_type(enum):<1>:cmd=</global/disk2s0/ing_nc_1/ingresclu/bin/ingres_server_stop>:tag=<IngresNCG.IngresNCR.1>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
    Mar  6 17:08:46 node2 Cluster.RGM.rgmd: [ID 268902 daemon.notice] 45 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_monitor_stop>:tag=<IngresNCG.nodec.8>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
    Mar  6 17:08:47 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Bringing Ingres DBMS server offline.>
    Mar  6 17:08:48 node2 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hafoip_monitor_stop> completed successfully for resource <nodec>, resource group <IngresNCG>, node <node2>, time used: 0% of timeout <300 seconds>
    Mar  6 17:08:48 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_ONLINE_UNMON
    Mar  6 17:09:00 node2 SC[Ingres.ingres_server,IngresNCG,IngresNCR,stop]: [ID 147958 daemon.error] ERROR : HA-Ingres failed to stop.
    Mar  6 17:09:02 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource IngresNCR status on node node2 change to R_FM_FAULTED
    Mar  6 17:09:02 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Ingres DBMS server faulted.>
    Mar  6 17:09:03 node2 Cluster.RGM.rgmd: [ID 938318 daemon.error] Method <bin/ingres_server_stop> failed on resource <IngresNCR> in resource group <IngresNCG> [exit code <2>, time used: 5% of timeout <300 seconds>]
    Mar  6 17:09:03 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource IngresNCR state on node node2 change to R_STOP_FAILED
    Mar  6 17:09:03 node2 Cluster.RGM.rgmd: [ID 529407 daemon.notice] resource group IngresNCG state on node node2 change to RG_PENDING_OFF_STOP_FAILED
    Mar  6 17:09:03 node2 Cluster.RGM.rgmd: [ID 424774 daemon.error] Resource group <IngresNCG> requires operator attention due to STOP failure
    Mar  6 17:09:03 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_STOPPING
    Mar  6 17:09:03 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource nodec status on node node2 change to R_FM_UNKNOWN
    Mar  6 17:09:03 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource nodec status msg on node node2 change to <Stopping>
    Mar  6 17:09:03 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_stop> for resource <nodec>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
    Mar  6 17:09:03 node2 Cluster.RGM.rgmd: [ID 510020 daemon.notice] 46 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_stop>:tag=<IngresNCG.nodec.1>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
    Mar  6 17:09:04 node2 ip: [ID 678092 kern.notice] TCP_IOC_ABORT_CONN: local = 192.168.005.085:0, remote = 000.000.000.000:0, start = -2, end = 6
    Mar  6 17:09:04 node2 ip: [ID 302654 kern.notice] TCP_IOC_ABORT_CONN: aborted 0 connection
    Mar  6 17:09:04 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource nodec status on node node2 change to R_FM_OFFLINE
    Mar  6 17:09:04 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource nodec status msg on node node2 change to <LogicalHostname offline.>
    Mar  6 17:09:04 node2 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hafoip_stop> completed successfully for resource <nodec>, resource group <IngresNCG>, node <node2>, time used: 0% of timeout <300 seconds>
    Mar  6 17:09:04 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_OFFLINE
    Mar  6 17:09:04 node2 Cluster.RGM.rgmd: [ID 529407 daemon.notice] resource group IngresNCG state on node node2 change to RG_ERROR_STOP_FAILED
    Mar  6 17:09:04 node2 Cluster.RGM.rgmd: [ID 424774 daemon.error] Resource group <IngresNCG> requires operator attention due to STOP failure
    Mar  6 17:09:04 node2 Cluster.RGM.rgmd: [ID 663692 daemon.error] failback attempt failed on resource group <IngresNCG> with error <resource group in ERROR_STOP_FAILED state requires operator attention>
    Mar  6 17:09:10 node2 java[1652]: [ID 807473 user.error] pkcs11_softtoken: Keystore version failure.Thank you
    Armand

  • Automatic restart of services on a 1 node rac cluster with Clusterware

    How do we enable a service to automaticly start-up when the db starts up?
    Thanks,
    Dave

    srvctl enable service -d DBThanks for your reply M. Nauman. I researched that command and found we do have it enabled and that it only works if the database instance was previously taken down. Since the database does not go down on an Archiver Hung error as we are using FRA with an alt location, this never kicks in and brings up the service. What we are looking for something that will trigger off of when the archive logs error and switch from FRA(Flash Recovery Area) to our Alternate disk location. Or more presicely, when it goes back to a Valid status(on the FRA - after we've run an archive log backup to clear it).
    I found out from our 2 senior dba's that our other 2 node rac environment does not suffer from this problem, only the newly created 1 node rac cluster environment. The problem is we don't know what that is(a parameter on the db or cluster or what) and how do we set it?
    Anyone know?
    Thanks,
    Gib
    Message was edited by:
    Gib2008
    Message was edited by:
    Gib2008

  • 2 node Webcenter Cluster setup

    Hi,
    I am trying to configure 2 node webcenter cluster, so far i was able to accomplish this:
    i) create 2 node weblogic cluster with webcenter [ WLS_Spaces1,WLS_Spaces2] , [WLS_Services1,WLS_Services2], [WLS_Portlet1,WLS_Portlet2]
    ii)configured the weblogic to Authenticate using AD, and configured Jive Admin to login using AD
    iii) configured JOC[java object cache] for webcenter spaces
    iv) and I am having issues configuring the WS-Security for spaces to acess discussion services.
    v) and also i am not able to see any policy store, and while trying to add a user to admin role for webcenter , i see it empty.
    in single node i am able to assign admin roles to users from AD with no issues.
    can some one help me in accomplishing the tasks for configuring the backend services of webcenter spaces when configured in cluster mode.
    Thank you
    A/
    * here is the doc link i am using for the setup
    http://sqltech.cl/doc/oas11gR1/core.1111/e12037/extend_wc.htm
    Edited by: user10696627 on Oct 7, 2010 10:28 PM

    We're running Server 2012 Data Center on the cluster nodes.
    I was thinking the same about the 3rd party software to do what I'd like it to do.   The data  is mostly security camera video from our security system.  Since its not really critical data, i'm just looking for a way to maximize
    the available hard drive space, and make it addressable as one volume or network share...
    -Eric
    You can build Storage Spaces (simple, not clustered as it would waste 50% of your capacity, MSFT can do mirror and parity with R2 for clustered only) from iSCSI LUs. Dog slow and unsupported but you'll have linear spanned space. See:
    Rough Guide To Setting Up A Scale-Out File Server
    http://www.aidanfinn.com/?p=13176
    Creating Virtual SoFS with shared VHDX
    http://www.aidanfinn.com/?p=15145
    you don;t need SoFS (obviously) but in this article Aidan creates Storage Spaces from iSCSI LUNs.
    Good luck!
    StarWind VSAN [Virtual SAN] clusters Hyper-V without SAS, Fibre Channel, SMB 3.0 or iSCSI, uses Ethernet to mirror internally mounted SATA disks between hosts.

  • Grid installation: root.sh failed on the first node on Solaris cluster 4.1

    Hi all,
    I'm trying to install the Grid (11.2.0.3.0) on the 2 node-clusters (OSC 4.1).
    When I run the root.sh on the first node, I got the out put as follow:
    xha239080-root-5.11# root.sh
    Performing root user operation for Oracle 11g
    The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME= /Grid/CRShome
    Enter the full pathname of the local bin directory: [/usr/local/bin]:
    /usr/local/bin is read only. Continue without copy (y/n) or retry (r)? [y]:
    Warning: /usr/local/bin is read only. No files will be copied.
    Creating /var/opt/oracle/oratab file...
    Entries will be added to the /var/opt/oracle/oratab file as needed by
    Database Configuration Assistant when a database is created
    Finished running generic part of root script.
    Now product-specific root actions will be performed.
    Using configuration parameter file: /Grid/CRShome/crs/install/crsconfig_params
    Creating trace directory
    User ignored Prerequisites during installation
    OLR initialization - successful
    root wallet
    root wallet cert
    root cert export
    peer wallet
    profile reader wallet
    pa wallet
    peer wallet keys
    pa wallet keys
    peer cert request
    pa cert request
    peer cert
    pa cert
    peer root cert TP
    profile reader root cert TP
    pa root cert TP
    peer pa cert TP
    pa peer cert TP
    profile reader pa cert TP
    profile reader peer cert TP
    peer user cert
    pa user cert
    Adding Clusterware entries to inittab
    CRS-2672: Attempting to start 'ora.mdnsd' on 'xha239080'
    CRS-2676: Start of 'ora.mdnsd' on 'xha239080' succeeded
    CRS-2672: Attempting to start 'ora.gpnpd' on 'xha239080'
    CRS-2676: Start of 'ora.gpnpd' on 'xha239080' succeeded
    CRS-2672: Attempting to start 'ora.cssdmonitor' on 'xha239080'
    CRS-2672: Attempting to start 'ora.gipcd' on 'xha239080'
    CRS-2676: Start of 'ora.cssdmonitor' on 'xha239080' succeeded
    CRS-2676: Start of 'ora.gipcd' on 'xha239080' succeeded
    CRS-2672: Attempting to start 'ora.cssd' on 'xha239080'
    CRS-2672: Attempting to start 'ora.diskmon' on 'xha239080'
    CRS-2676: Start of 'ora.diskmon' on 'xha239080' succeeded
    CRS-2676: Start of 'ora.cssd' on 'xha239080' succeeded
    ASM created and started successfully.
    Disk Group DATA created successfully.
    clscfg: -install mode specified
    Successfully accumulated necessary OCR keys.
    Creating OCR keys for user 'root', privgrp 'root'..
    Operation successful.
    CRS-4256: Updating the profile
    Successful addition of voting disk 9cdb938773bc4f16bf332edac499fd06.
    Successful addition of voting disk 842907db11f74f59bf65247138d6e8f5.
    Successful addition of voting disk 748852d2a5c84f72bfcd50d60f65654d.
    Successfully replaced voting disk group with +DATA.
    CRS-4256: Updating the profile
    CRS-4266: Voting file(s) successfully replaced
    ## STATE File Universal Id File Name Disk group
    1. ONLINE 9cdb938773bc4f16bf332edac499fd06 (/dev/did/rdsk/d10s6) [DATA]
    2. ONLINE 842907db11f74f59bf65247138d6e8f5 (/dev/did/rdsk/d8s6) [DATA]
    3. ONLINE 748852d2a5c84f72bfcd50d60f65654d (/dev/did/rdsk/d9s6) [DATA]
    Located 3 voting disk(s).
    Start of resource "ora.cssd" failed
    CRS-2672: Attempting to start 'ora.cssdmonitor' on 'xha239080'
    CRS-2672: Attempting to start 'ora.gipcd' on 'xha239080'
    CRS-2676: Start of 'ora.cssdmonitor' on 'xha239080' succeeded
    CRS-2676: Start of 'ora.gipcd' on 'xha239080' succeeded
    CRS-2672: Attempting to start 'ora.cssd' on 'xha239080'
    CRS-2672: Attempting to start 'ora.diskmon' on 'xha239080'
    CRS-2676: Start of 'ora.diskmon' on 'xha239080' succeeded
    CRS-2674: Start of 'ora.cssd' on 'xha239080' failed
    CRS-2679: Attempting to clean 'ora.cssd' on 'xha239080'
    CRS-2681: Clean of 'ora.cssd' on 'xha239080' succeeded
    CRS-2673: Attempting to stop 'ora.gipcd' on 'xha239080'
    CRS-2677: Stop of 'ora.gipcd' on 'xha239080' succeeded
    CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'xha239080'
    CRS-2677: Stop of 'ora.cssdmonitor' on 'xha239080' succeeded
    CRS-5804: Communication error with agent process
    CRS-4000: Command Start failed, or completed with errors.
    Failed to start Oracle Grid Infrastructure stack
    Failed to start Cluster Synchorinisation Service in clustered mode at /Grid/CRShome/crs/install/crsconfig_lib.pm line 1211.
    /Grid/CRShome/perl/bin/perl -I/Grid/CRShome/perl/lib -I/Grid/CRShome/crs/install /Grid/CRShome/crs/install/rootcrs.pl execution failed
    xha239080-root-5.11# history
    checking the ocssd.log, I see some thing as follow:
    2013-09-16 18:46:24.238: [    CSSD][1]clssscmain: Starting CSS daemon, version 11.2.0.3.0, in (clustered) mode with uniqueness value 1379371584
    2013-09-16 18:46:24.239: [    CSSD][1]clssscmain: Environment is production
    2013-09-16 18:46:24.239: [    CSSD][1]clssscmain: Core file size limit extended
    2013-09-16 18:46:24.248: [    CSSD][1]clssscmain: GIPCHA down 1
    2013-09-16 18:46:24.249: [    CSSD][1]clssscGetParameterOLR: OLR fetch for parameter logsize (8) failed with rc 21
    2013-09-16 18:46:24.250: [    CSSD][1]clssscExtendLimits: The current soft limit for file descriptors is 65536, hard limit is 65536
    2013-09-16 18:46:24.250: [    CSSD][1]clssscExtendLimits: The current soft limit for locked memory is 4294967293, hard limit is 4294967293
    2013-09-16 18:46:24.250: [    CSSD][1]clssscGetParameterOLR: OLR fetch for parameter priority (15) failed with rc 21
    2013-09-16 18:46:24.250: [    CSSD][1]clssscSetPrivEnv: Setting priority to 4
    2013-09-16 18:46:24.253: [    CSSD][1]clssscSetPrivEnv: unable to set priority to 4
    2013-09-16 18:46:24.253: [    CSSD][1]SLOS: cat=-2, opn=scls_mem_lockdown, dep=11, loc=mlockall
    unable to lock memory
    2013-09-16 18:46:24.253: [    CSSD][1](:CSSSC00011:)clssscExit: A fatal error occurred during initialization
    Do anyone have any idea what going on and how can I fix it ?

    Hi,
    solaris has several issues with DISM, e.g.:
    Solaris 10 and Solaris 11 Shared Memory Locking May Fail (Doc ID 1590151.1)
    Sounds like Solaris Cluster  has a similar bug. A "workaround" is to reboot the (cluster) zone, that "fixes" the mlock error. This bug was introduced with updates in september, atleast to our environment (Solaris 11.1). Prior i did not have the issue and now i have to restart the entire zone, whenever i stop crs.
    With 11.2.0.3 the root.sh script can be rerun without prior cleaning up, so you should be able to continue installation at that point after the reboot. After the root.sh completes some configuration assistants need to be run, to complete the installation. You need to execute this manually as you wipe your oui session
    Kind Regards
    Thomas

  • Solaris Cluster 3 nodes

    Dears
    I have 3 nodes solaris cluster running with Oracle 9i database , my plan to upgrade to to Oracle RAC 11g.
    I have shutdown one node but i did not remove it from solaris cluster, i have installed Oracle 11g on it and will add second node.
    My question is : If I remove the second node from suncluster , Is the cluster will work with only one node??
    In the other meaning , I will remove 2 nodes out of 3 nodes cluser , is it possible?
    Thanks
    Ehab

    I have 3 nodes cluser and 1 x storage 6580,
    i created two quorom devices from the storage.
    I will remove two nodes, Is cluster will still work on single node ?
    Thanks
    Ehab

  • Creating a new Essbase Cluster on the same Solaris server

    Hi All,
    I have two servers:
    Server1: Foundation services, APS, EAS
    Server2: Essbase Server, Essbase Studio Server on epminstance_1
    Due to business requirements I need to "rename" the Essbase cluster from "EssbaseCluster-1" to something else.. I know this is not possible and from the below document I understand that I need to create a new instance on Server2 and configure Essbase and Essbase Studio on it.
    "How to Rename Essbase Cluster (Doc ID 1434439.1)"
    Goal: In EPM System Release 11.1.2.x, is it possible to rename the Essbase instance and cluster names after they are configured?
    Solution: No, it is not possible to rename the Essbase cluster or instance names after the initial configuration. If you need to change the instance and cluster names, create new instance and cluster. Export the applications from the old cluster and import them into the new cluster.
    My doubt lies with configuring the 2nd Essbase server as I am not clear how a single environment with two Essbase standalone instances on the same physical Solaris server, with each belonging to their own cluster will behave. I know that they are independent clusters and the concept of active/passive and active/active clusters are for Essbase instances part of the same cluster.
    I plan to create a new epminstance_2 instance on Server2 and configure the 2nd Essbase server as follows: give it a *new* cluster name and not assign it to the existing Essbase cluster and deploy it in standalone mode.
    1. Now I plan to use the 1st instance only as a backup option.. In an event where for some reason the new instance were to fail I would start up the services on the older essbase instance. Is this possible without any additional configuration or changes to OPMN?
    2. Alternatively, say we want to remove the older instance. Kindly suggest ways in which I can safely "remove" the older cluster (other than uninstalling). Also when users log in using SmartView, they would see the older and new cluster.. Is there anyways I can get rid of the older cluster without having to uninstall everthing on Server2 and start fresh?
    Thanks!

    Thanks John!
    I have a doubt which I hope you can throw some light now..
    I created a new instance and configured Essbase Server and Studio server on it. So now I have two EPM instances on the same physical server both having Essbase server and Essbase studio server - both Essbase servers belong to different Essbase clusters. Now from the EPM docs I do not find any mention that we can or cannot have multiple instances of Essbase Studio on the same server. Kindly correct me if I am wrong..
    In the deployment report I see two identical entries for the Essbase Studio server in the older epminstance but I do not see Essbase Studio of the 2nd Instance.
    epmsystem_1 (Server2)
                    Essbase Studio Server - 9080
    epmsystem_1 (Server2)
                    Essbase Studio Server - 9080
    After I start up the 2nd Essbase Studio server I tried connecting to it from Essbase Studio Console and got the error "Read permission denied to object folder:\'system'."
    Similarly when I run start_BPMS_bpms1_CommandLineClient.sh and issue the reinit command I get the same error.
    So I stopped the 2nd Essbase Studio server on epmsystem_2 and started up the first Essbase Studio server on epmsystem_1. I was able to connect to it fine from Essbase Studio console and reinit worked too.. I ran a drill through report which worked fine.
    -> So is it that we can have only one instance of Essbase Studio on a single server?
    Also say I want to use the Essbase Studio on the 2nd instance.. Could I re-configure just Essbase server on epmsystem_1 and re-configure Essbase server and Essbase studio server on epmsystem_2 ?
    Thanks,
    Kent

  • Prerequisites : 2-node Solaris Cluster 4.1 using VirtualBox.

    Hi,
    I am going to try building a 2-node Solaris Cluster 4.1 using VirtualBox. I have downloaded Solaris 11.1 ISO. Can someone please help me with the right configuration for the 2 nodes/guests, particularly the NICs and shared storage?
    Thanks,
    Shankar

    https://blogs.oracle.com/TF/entry/new_white_paper_practicing_solaris
    it's a bit dated but should still get you there.

  • Installing SOA Suite 10.1.3.4.0 on a Solaris cluster (2 nodes)

    Hi All,
    I have been looking for guidance on the installation of SOA Suite 10.1.3.4.0 on a Solaris cluster, and have been unable to find any. Does anyone have any info on this task? Or as an alternate question, how different is the cluster setup on Solaris from Linux?
    Thanks
    Sami

    There is no difference on installing a cluster on Solaris or Linux. The main difference are the required O/S packages. The you could follow my approach on installing a cluster:
    http://orasoa.blogspot.com/2009/04/soa-cluster-installation.html
    Marc

  • SAP 7.0 on SUN Cluster 3.2 (Solaris 10 / SPARC)

    Dear All;
    i'm installing a two nodes cluster (SUN Cluster 3.2 / Solaris 10 / SPARC), for a HA SAP 7.0 / Oracle 10g DataBase
    SAP and Oracle softwares were successfully installed and i could successfully cluster the Oracle DB and it is tested and working fine.
    for the SAP i did the following configurations
    # clresource create -g sap-ci-res-grp -t SUNW.sap_ci_v2 -p SAPSID=PRD -p Ci_instance_id=01 -p Ci_services_string=SCS -p Ci_startup_script=startsap_01 -p Ci_shutdown_script=stopsap_01 -p resource_dependencies=sap-hastp-rs,ora-db-res sap-ci-scs-res
    # clresource create -g sap-ci-res-grp -t SUNW.sap_ci_v2 -p SAPSID=PRD -p Ci_instance_id=00 -p Ci_services_string=ASCS -p Ci_startup_script=startsap_00 -p Ci_shutdown_script=stopsap_00 -p resource_dependencies=sap-hastp-rs,or-db-res sap-ci-Ascs-res
    and when trying to bring the sap-ci-res-grp online # clresourcegroup online -M sap-ci-res-grp
    it executes the startsap scripts successfully as following
    Sun Microsystems Inc.     SunOS 5.10     Generic     January 2005
    stty: : No such device or address
    stty: : No such device or address
    Starting SAP-Collector Daemon
    11:04:57 04.06.2008 LOG: Effective User Id is root
    Starting SAP-Collector Daemon
    11:04:57 04.06.2008 LOG: Effective User Id is root
    * This is Saposcol Version COLL 20.94 700 - V3.72 64Bit
    * Usage: saposcol -l: Start OS Collector
    * saposcol -k: Stop OS Collector
    * saposcol -d: OS Collector Dialog Mode
    * saposcol -s: OS Collector Status
    * Starting collector (create new process)
    * This is Saposcol Version COLL 20.94 700 - V3.72 64Bit
    * Usage: saposcol -l: Start OS Collector
    * saposcol -k: Stop OS Collector
    * saposcol -d: OS Collector Dialog Mode
    * saposcol -s: OS Collector Status
    * Starting collector (create new process)
    saposcol on host eccprd01 started
    Starting SAP Instance ASCS00
    Startup-Log is written to /export/home/prdadm/startsap_ASCS00.log
    saposcol on host eccprd01 started
    Running /usr/sap/PRD/SYS/exe/run/startj2eedb
    Trying to start PRD database ...
    Log file: /export/home/prdadm/startdb.log
    Instance Service on host eccprd01 started
    Jun 4 11:05:01 eccprd01 SAPPRD_00[26054]: Unable to open trace file sapstartsrv.log. (Error 11 Resource temporarily unavailable) [ntservsserver.cpp 1863]
    /usr/sap/PRD/SYS/exe/run/startj2eedb completed successfully
    Starting SAP Instance SCS01
    Startup-Log is written to /export/home/prdadm/startsap_SCS01.log
    Instance Service on host eccprd01 started
    Jun 4 11:05:02 eccprd01 SAPPRD_01[26111]: Unable to open trace file sapstartsrv.log. (Error 11 Resource temporarily unavailable) [ntservsserver.cpp 1863]
    Instance on host eccprd01 started
    Instance on host eccprd01 started
    and the it repeats the following warnings on the /var/adm/messages till it fails to the other node
    Jun 4 12:26:22 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:26:25 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:26:25 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:26:28 eccprd01 last message repeated 1 time
    Jun 4 12:26:28 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:26:31 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:26:31 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:26:34 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:26:34 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:26:37 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:26:37 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:26:40 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:26:40 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:26:43 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:26:43 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:26:46 eccprd01 last message repeated 1 time
    Jun 4 12:26:46 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:26:49 eccprd01 last message repeated 1 time
    Jun 4 12:26:49 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:26:52 eccprd01 last message repeated 1 time
    Jun 4 12:26:52 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:26:55 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:26:55 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:26:58 eccprd01 last message repeated 1 time
    Jun 4 12:26:58 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:01 eccprd01 last message repeated 1 time
    Jun 4 12:27:01 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:04 eccprd01 last message repeated 1 time
    Jun 4 12:27:04 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:07 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:07 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:10 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:10 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:13 eccprd01 last message repeated 1 time
    Jun 4 12:27:13 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:16 eccprd01 last message repeated 1 time
    Jun 4 12:27:16 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:19 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:19 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:22 eccprd01 last message repeated 1 time
    Jun 4 12:27:22 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:25 eccprd01 last message repeated 1 time
    Jun 4 12:27:25 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:28 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:28 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:31 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:31 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:34 eccprd01 last message repeated 1 time
    Jun 4 12:27:34 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:37 eccprd01 last message repeated 1 time
    Jun 4 12:27:37 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:40 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:40 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:43 eccprd01 last message repeated 1 time
    Jun 4 12:27:43 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
    Jun 4 12:27:46 eccprd01 last message repeated 1 time
    Jun 4 12:27:46 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dis
    can anyone one help me if there is any error on configurations or what is the cause of this problem.....thanks in advance
    ARSSES

    Hi all.
    I am having a similar issue with a Sun Cluster 3.2 and SAP 7.0
    Scenrio:
    Central Instance (not incluster) : Started on one node
    Dialog Instance (not in cluster): Started on the other node
    When I create the resource for SUNW.sap_as like
    clrs create --g sap-rg -t SUNW.sap_as .....etc etc
    in the /var/adm/messages I got lots of WAITING FOR DISPACHER TO COME UP....
    Then after timeout it gives up.
    Any clue? What does is try to connect or waiting for? I hve notest that it's something before the startup script....
    TIA

  • Cluster Setup on SOlaris 10 in Zone Environment

    Hi
    I would like to implement Sun CLuster 3.2 on Single Server, by creating 2 zones as nodes, In the same ref. can anyone provide me the detailed steps.
    Thanks
    Rajan

    For the single node cluster installation you can use the procedure explained at:
    http://opensolaris.org/os/community/ha-clusters/ohac/Documentation/SCXdocs/installsinglenode/
    While this is for Solaris Cluster Express, it will be the same procedure for Solaris 10 / Solaris Cluster 3.2.
    Once the single node cluster is installed, you simply configure and install two native non-global zones. Nothing cluster specific about that - refer to the standard Solaris Zones configuration.
    Lets assume you have configured two zones, names "z1" and "z1", your nodename for the global zone is "single-node". Then you can configure a resource group like:
    # clrg create -n single-node:z1,single-node:z2 my-rgAnd you can create your resources within my-rg. This resource group can then failover between z1 and z2 on that single node.
    Regards
    Thorsten

  • Solaris Cluster 3.3 on VMware ESX 4.1

    Hi there,
    I am trying to setup Solaris Cluster 3.3 on Vmware ESX 4.1
    My first question is: Is there anyone out there setted up Solaris Cluster on vmware accross boxes?
    My tools:
    Solaris 10 U9 x64
    Solaris Cluster 3.3
    Vmware ESX 4.1
    HP DL 380 G7
    HP P2000 Fibre Channel Storage
    When I try to set up the cluster (just next, next, next), it completes successfully. It reboots the second node first and then itself.
    After the second node comes up at the login screen, ping stops after about 5 seconds. The same happens on both nodes!
    I am trying to understand why it does that. I have tried everything to complete this job. I set up the quorum as an RDM from VMware, so Solaris now has direct access to the quorum disk.
    I am new to Solaris and I am getting the errors below. If someone would like to help me, it would be much appreciated!
    Please explain in detail, as I am a newbie in Solaris :) Thanks!
    I especially need help with the error: /proc fails to mount periodically during reboots.
    Here are the error messages. Has anyone out there set up Solaris Cluster on ESX 4.1?
    * cluster check (ver 1.0)
    Report Date: 2011.02.28 at 16.04.46 EET
    2011.02.28 at 14.04.46 GMT
    Command run on host:
    39bc6e2d- sun1
    Checks run on nodes:
    sun1
    Unique Checks: 5
    ===========================================================================
    * Summary of Single Node Check Results for sun1
    ===========================================================================
    Checks Considered: 5
    Results by Status
    Violated : 0
    Insufficient Data : 0
    Execution Error : 0
    Unknown Status : 0
    Information Only : 0
    Not Applicable : 2
    Passed : 3
    Violations by Severity
    Critical : 0
    High : 0
    Moderate : 0
    Low : 0
    * Details for 2 Not Applicable Checks on sun1
    * Check ID: S6708606 ***
    * Severity: Moderate
    * Problem Statement: Multiple network interfaces on a single subnet have the same MAC address.
    * Applicability: Scan output of '/usr/sbin/ifconfig -a' for more than one interface with an 'ether' line. Check does not apply if zero or only one ether line.
    * Check ID: S6708496 ***
    * Severity: Moderate
    * Problem Statement: Cluster node (3.1 or later) OpenBoot Prom (OBP) has local-mac-address? variable set to 'false'.
    * Applicability: Applicable to SPARC architecture only.
    * Details for 3 Passed Checks on sun1
    * Check ID: S6708605 ***
    * Severity: Critical
    * Problem Statement: The /dev/rmt directory is missing.
    * Check ID: S6708638 ***
    * Severity: Moderate
    * Problem Statement: Node has insufficient physical memory.
    * Check ID: S6708642 ***
    * Severity: Critical
    * Problem Statement: /proc fails to mount periodically during reboots.
    ===========================================================================
    * End of Report 2011.02.28 at 16.04.46 EET
    ===========================================================================
    Note: Please ignore the memory error; I have installed 5 GB of memory and it says it requires a minimum of 1 GB. I think it is a bug!

    @TimRead
    Hi, thanks for reply,
    I have already followed the steps in your links, but no joy on this.
    What I noticed here is that the cluster check seems to be buggy, because I tried to install Cluster 3.3 on physical hardware and it gave me the exact same error messages! Interesting, isn't it?
    Please see the errors below, which I got both on top of VMware and on a physical Solaris hardware installation:
    ERROR1:
    Comment: I have tried installing different amounts of memory each time, but it keeps reporting this silly error.
    problem_statement : *Node has insufficient physical memory.
    <analysis>5120 MB of memory is installed on this node.The current release of Solaris Cluster requires a minimum of 1024 MB of physical memory in each node. Additional memory required for various Data Services.</analysis>
    <recommendations>Add enough memory to this node to bring its physical memory up to the minimum required level.</recommendations>
    ERROR2
    Comment: Despite the /dev/rmt directory being there, I got the error below from cluster check.
    <problem_statement>The /dev/rmt directory is missing.
    <analysis>The /dev/rmt directory is missing on this Solaris Cluster node. The current implementation of scdidadm(1M) relies on the existence of /dev/rmt to successfully execute 'scdidadm -r'. The /dev/rmt directory is created by Solaris regardless of the existence of the actual underlying devices. The expectation is that the user will never delete this directory. During a reconfiguration reboot to add new disk devices, if /dev/rmt is missing scdidadm will not create the new devices and will exit with the following error: 'ERR in discover_paths : Cannot walk /dev/rmt' The absence of /dev/rmt might prevent a failover to this node and result in a cluster outage. See BugIDs 4368956 and 4783135 for more details.</analysis>
    ERROR3
    Comment: All NICs have different MAC addresses though, and I have also done what it suggests. No joy here as well!
    <problem_statement>Cluster node (3.1 or later) OpenBoot Prom (OBP) has local-mac-address? variable set to 'false'.
    <analysis>The local-mac-address? variable must be set to 'true.' Proper operation of the public networks depends on each interface having a different MAC address.</analysis>
    <recommendations>Change the local-mac-address? variable to true:
    1) From the OBP (ok> prompt):
       ok> setenv local-mac-address? true
       ok> reset
    2) Or as root:
       # /usr/sbin/eeprom local-mac-address?=true
       # init 0
       ok> reset</recommendations>
    ERROR4
    Comment: No comment on this; I have done what it says, no joy...
    <problem_statement>/proc fails to mount periodically during reboots.
    <analysis>Something is trying to access /proc before it is normally mounted during the boot process. This can cause /proc not to mount. If /proc isn't mounted, some Solaris Cluster daemons might fail on startup, which can cause the node to panic. The following lines were found:</analysis>
    Thanks!
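    For ERROR4, before chasing what touches /proc early in boot, it may help to confirm that the /proc setup itself looks sane, i.e. that the vfstab entry exists and the filesystem is currently mounted; a minimal sketch:
    # grep proc /etc/vfstab
    # mount | grep /proc
    # ls /proc | head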

  • Error when creating ssh keys for Oracle RAC on Solaris 10

    I'm in the process of configuring a 2-node Oracle cluster running on Sun Cluster 3.2 / Solaris 10 OS.
    I have followed this Oracle guide for creating the keys (Oracle document no. B14205-01).
    But I'm having a problem when executing this step:
    bash-3.00$ scp authorized_keys tsavo-east:/oracle/.ssh/
    ssh: connect to host tsavo-east port 22: Connection timed out
    lost connection
    bash-3.00$
    Please advise
    Thanks
    Francis Mwangi

    Robert, thanks a lot for your reply. Have a look at what I found out:
    ps -ef | grep sshd
    root 4270 4267 0 16:22:46 ? 0:00 /usr/lib/ssh/sshd
    root 759 1 0 Nov 09 ? 0:00 /usr/lib/ssh/sshd
    root 4267 759 0 16:22:41 ? 0:00 /usr/lib/ssh/sshd
    root 4372 4311 0 18:56:52 pts/3 0:00 grep sshd
    ===================================
    netstat -a | grep ssh
    *.ssh *.* 0 0 49152 0 LISTEN
    tsavo-west.ssh 10.30.210.213.2241 63668 51 49640 0 ESTABLISHED
    *.ssh *.* 49152 0 LISTEN
    6002b4aec88 stream-ord 6002b5a0740 00000000 /tmp/ssh-mgPl3398/agent.3398
    ==============================
    telnet tsavo-west 22
    Trying 10.20.3.151...
    Connected to tsavo-west.
    Escape character is '^]'.
    SSH-2.0-Sun_SSH_1.1.1
    # has hung here for the last 10 minutes
    Thanks. What can you conclude from the above? Also, from the console monitoring of both nodes,
    I saw messages that end with:
    LINK-3-UPDOWN: Interface Dot11Radio0 , Changed state to up
    #another line here
    LINK-3-UPDOWN: Interface Dot11Radio0 , Changed state to down
    #another line here
    LINK-3-UPDOWN: Interface Dot11Radio0 , Changed state to up
    This happens on both nodes.
    Any idea?
    Did I use the right document? I have other steps that work (anyway, I'm not sure if this error is related to the steps for configuring ssh).
    Please help if you can...
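    Since the telnet to port 22 shows the SSH banner and then hangs, it may be worth confirming that the ssh SMF service is healthy on both nodes and that name resolution for tsavo-east points where you expect; a minimal sketch (the FMRI is the standard Solaris 10 ssh service):
    # svcs -xv ssh
    # svcadm restart svc:/network/ssh:default
    # getent hosts tsavo-east
    # ping tsavo-east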
