Cluster on VM sparc.

Hi all,
I have two SPARC T4 servers and one 2540 storage array. Is it necessary to install Solaris Cluster software for an Oracle database in each VM?
Thanks,
jari.

Hi,
It depends on what you would like to achieve. A good starting point is
SPARC: Oracle Solaris Cluster Topologies - Oracle Solaris Cluster Concepts Guide
regards
Walter

Solaris cluster 3.2 Sparc

Hi folks
First things first. I may not have great knowledge about Solaris clusters, so please be merciful :)
Here is what I have:
- 2 x Netra T1 AC200, each with 1 GB RAM, 2 x 18 GB disks, a 500 MHz SPARC CPU and a 4-port Ethernet card
- 1 x Netra D130 array, 3 x 36 GB
- cables, switches, you name it
So, I set up the OS, all OK. I set up the cluster, and all SEEMS to be OK.
When I define my resources and so on, everything goes fine until I try to bring the resource group online.
On another configuration I tested the shared logical hostname and it works fine.
            Group Name   Resources
 Resources: ingresc      nodec ingresr
-- Resource Groups --
            Group Name   Node Name   State       Suspended
     Group: ingresc      node2       Unmanaged   No
     Group: ingresc      node1       Unmanaged   No
-- Resources --
            Resource Name   Node Name   State     Status Message
  Resource: nodec           node2       Offline   Offline
  Resource: nodec           node1       Offline   Offline
  Resource: ingresr         node2       Offline   Offline
  Resource: ingresr         node1       Offline   Offline
scswitch: (C969069) Request failed because resource group ingresc is in ERROR_STOP_FAILED state and requires operator attention
Now, in /var/adm/messages I spotted this:
Mar 6 17:09:03 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_stop> for resource <nodec>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
Mar 6 17:09:03 node2 Cluster.RGM.rgmd: [ID 510020 daemon.notice] 46 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_stop>:tag=<IngresNCG.nodec.1>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
A little bit of research points in the direction of a bug (see CR 6565601)
Here is what I see as my options:
1 - Reinstall the Solaris OS, but not Solaris Cluster 3.2, using Solaris Express 10/07 or 2/08 instead. But will this combination work? Or does it only work when Solaris Cluster Express is combined with Solaris Express Developer Edition? If the latter, which versions work together?
2 - Beg for a Solaris Cluster 3.2 patch. In my humble opinion the patch should be free, since it looks to me that you run into the bug as soon as you write your own stuff, and after all this is for education.
Any ideas or help greatly appreciated.
Many thanks
Armand

Although names are different since I used two setups, this is the relevant part of /var/adm/messages.
It looks to me like the Ingres resource is failing:
Mar 6 17:08:03 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_prenet_start> for resource <nodec>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
Mar 6 17:08:03 node2 Cluster.RGM.rgmd: [ID 510020 daemon.notice] 46 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_prenet_start>:tag=<IngresNCG.nodec.10>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Mar 6 17:08:05 node2 svc.startd[8]: [ID 652011 daemon.warning] svc:/system/cluster/scsymon-srv:default: Method "/usr/cluster/lib/svc/method/svc_scsymon_srv start" failed with exit status 96.
Mar 6 17:08:05 node2 svc.startd[8]: [ID 748625 daemon.error] system/cluster/scsymon-srv:default misconfigured: transitioned to maintenance (see 'svcs -xv' for details)
Mar 6 17:08:09 node2 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hafoip_prenet_start> completed successfully for resource <nodec>, resource group <IngresNCG>, node <node2>, time used: 1% of timeout <300 seconds>
Mar 6 17:08:09 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_PRENET_STARTED
Mar 6 17:08:09 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_STARTING
Mar 6 17:08:09 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_start> for resource <nodec>, resource group <IngresNCG>, node <node2>, timeout <500> seconds
Mar 6 17:08:09 node2 Cluster.RGM.rgmd: [ID 510020 daemon.notice] 46 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_start>:tag=<IngresNCG.nodec.0>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Mar 6 17:08:11 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource nodec status on node node2 change to R_FM_ONLINE
Mar 6 17:08:11 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource nodec status msg on node node2 change to <LogicalHostname online.>
Mar 6 17:08:11 node2 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hafoip_start> completed successfully for resource <nodec>, resource group <IngresNCG>, node <node2>, time used: 0% of timeout <500 seconds>
Mar 6 17:08:11 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_JUST_STARTED
Mar 6 17:08:11 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_ONLINE_UNMON
Mar 6 17:08:11 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource IngresNCR state on node node2 change to R_STARTING
Mar 6 17:08:11 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_MON_STARTING
Mar 6 17:08:11 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource IngresNCR status on node node2 change to R_FM_UNKNOWN
Mar 6 17:08:11 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Starting>
Mar 6 17:08:11 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <bin/ingres_server_start> for resource <IngresNCR>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
Mar 6 17:08:11 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_monitor_start> for resource <nodec>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
Mar 6 17:08:11 node2 Cluster.RGM.rgmd: [ID 510020 daemon.notice] 46 fe_rpc_command: cmd_type(enum):<1>:cmd=</global/disk2s0/ing_nc_1/ingresclu/bin/ingres_server_start>:tag=<IngresNCG.IngresNCR.0>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Mar 6 17:08:11 node2 Cluster.RGM.rgmd: [ID 268902 daemon.notice] 45 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_monitor_start>:tag=<IngresNCG.nodec.7>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Mar 6 17:08:12 node2 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hafoip_monitor_start> completed successfully for resource <nodec>, resource group <IngresNCG>, node <node2>, time used: 0% of timeout <300 seconds>
Mar 6 17:08:12 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_ONLINE
Mar 6 17:08:13 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Bringing Ingres DBMS server online.>
Mar 6 17:08:30 node2 sendmail[534]: [ID 702911 mail.alert] unable to qualify my own domain name (node2) -- using short name
Mar 6 17:08:30 node2 sendmail[535]: [ID 702911 mail.alert] unable to qualify my own domain name (node2) -- using short name
Mar 6 17:08:31 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Bringing Ingres DBMS server offline.>
Mar 6 17:08:45 node2 SC[Ingres.ingres_server,IngresNCG,IngresNCR,stop]: [ID 147958 daemon.error] ERROR : HA-Ingres failed to stop.
Mar 6 17:08:46 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource IngresNCR status on node node2 change to R_FM_FAULTED
Mar 6 17:08:46 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Ingres DBMS server faulted.>
Mar 6 17:08:46 node2 SC[Ingres.ingres_server,IngresNCG,IngresNCR,start]: [ID 335575 daemon.error] ERROR : Stop method failed for the HA-Ingres data service.
Mar 6 17:08:46 node2 Cluster.RGM.rgmd: [ID 938318 daemon.error] Method <bin/ingres_server_start> failed on resource <IngresNCR> in resource group <IngresNCG> [exit code <1>, time used: 11% of timeout <300 seconds>]
Mar 6 17:08:46 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource IngresNCR state on node node2 change to R_START_FAILED
Mar 6 17:08:46 node2 Cluster.RGM.rgmd: [ID 529407 daemon.notice] resource group IngresNCG state on node node2 change to RG_PENDING_OFF_START_FAILED
Mar 6 17:08:46 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource IngresNCR state on node node2 change to R_STOPPING
Mar 6 17:08:46 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_MON_STOPPING
Mar 6 17:08:46 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource IngresNCR status on node node2 change to R_FM_UNKNOWN
Mar 6 17:08:46 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Stopping>
Mar 6 17:08:46 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <bin/ingres_server_stop> for resource <IngresNCR>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
Mar 6 17:08:46 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_monitor_stop> for resource <nodec>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
Mar 6 17:08:46 node2 Cluster.RGM.rgmd: [ID 510020 daemon.notice] 46 fe_rpc_command: cmd_type(enum):<1>:cmd=</global/disk2s0/ing_nc_1/ingresclu/bin/ingres_server_stop>:tag=<IngresNCG.IngresNCR.1>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Mar 6 17:08:46 node2 Cluster.RGM.rgmd: [ID 268902 daemon.notice] 45 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_monitor_stop>:tag=<IngresNCG.nodec.8>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Mar 6 17:08:47 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Bringing Ingres DBMS server offline.>
Mar 6 17:08:48 node2 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hafoip_monitor_stop> completed successfully for resource <nodec>, resource group <IngresNCG>, node <node2>, time used: 0% of timeout <300 seconds>
Mar 6 17:08:48 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_ONLINE_UNMON
Mar 6 17:09:00 node2 SC[Ingres.ingres_server,IngresNCG,IngresNCR,stop]: [ID 147958 daemon.error] ERROR : HA-Ingres failed to stop.
Mar 6 17:09:02 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource IngresNCR status on node node2 change to R_FM_FAULTED
Mar 6 17:09:02 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource IngresNCR status msg on node node2 change to <Ingres DBMS server faulted.>
Mar 6 17:09:03 node2 Cluster.RGM.rgmd: [ID 938318 daemon.error] Method <bin/ingres_server_stop> failed on resource <IngresNCR> in resource group <IngresNCG> [exit code <2>, time used: 5% of timeout <300 seconds>]
Mar 6 17:09:03 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource IngresNCR state on node node2 change to R_STOP_FAILED
Mar 6 17:09:03 node2 Cluster.RGM.rgmd: [ID 529407 daemon.notice] resource group IngresNCG state on node node2 change to RG_PENDING_OFF_STOP_FAILED
Mar 6 17:09:03 node2 Cluster.RGM.rgmd: [ID 424774 daemon.error] Resource group <IngresNCG> requires operator attention due to STOP failure
Mar 6 17:09:03 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_STOPPING
Mar 6 17:09:03 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource nodec status on node node2 change to R_FM_UNKNOWN
Mar 6 17:09:03 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource nodec status msg on node node2 change to <Stopping>
Mar 6 17:09:03 node2 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_stop> for resource <nodec>, resource group <IngresNCG>, node <node2>, timeout <300> seconds
Mar 6 17:09:03 node2 Cluster.RGM.rgmd: [ID 510020 daemon.notice] 46 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_stop>:tag=<IngresNCG.nodec.1>: Calling security_clnt_connect(..., host=<node2>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Mar 6 17:09:04 node2 ip: [ID 678092 kern.notice] TCP_IOC_ABORT_CONN: local = 192.168.005.085:0, remote = 000.000.000.000:0, start = -2, end = 6
Mar 6 17:09:04 node2 ip: [ID 302654 kern.notice] TCP_IOC_ABORT_CONN: aborted 0 connection
Mar 6 17:09:04 node2 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource nodec status on node node2 change to R_FM_OFFLINE
Mar 6 17:09:04 node2 Cluster.RGM.rgmd: [ID 922363 daemon.notice] resource nodec status msg on node node2 change to <LogicalHostname offline.>
Mar 6 17:09:04 node2 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hafoip_stop> completed successfully for resource <nodec>, resource group <IngresNCG>, node <node2>, time used: 0% of timeout <300 seconds>
Mar 6 17:09:04 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] resource nodec state on node node2 change to R_OFFLINE
Mar 6 17:09:04 node2 Cluster.RGM.rgmd: [ID 529407 daemon.notice] resource group IngresNCG state on node node2 change to RG_ERROR_STOP_FAILED
Mar 6 17:09:04 node2 Cluster.RGM.rgmd: [ID 424774 daemon.error] Resource group <IngresNCG> requires operator attention due to STOP failure
Mar 6 17:09:04 node2 Cluster.RGM.rgmd: [ID 663692 daemon.error] failback attempt failed on resource group <IngresNCG> with error <resource group in ERROR_STOP_FAILED state requires operator attention>
Mar 6 17:09:10 node2 java[1652]: [ID 807473 user.error] pkcs11_softtoken: Keystore version failure.
Thank you
Armand
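When a log this long needs triage, it can help to pull out only the rgmd lines that report a failed method. The following is a minimal Java sketch and not part of Solaris Cluster: the names RgmFailureScan and failedMethods are made up for illustration, and the pattern assumes the "Method <...> failed on resource <...>" wording shown in the log above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: lists the methods that rgmd reported as failed.
final class RgmFailureScan {
    // Assumed line format, taken from the /var/adm/messages excerpt above.
    private static final Pattern FAILED = Pattern.compile(
        "Method <([^>]+)> failed on resource <([^>]+)> "
        + "in resource group <([^>]+)> \\[exit code <(\\d+)>");

    // Returns one "group/resource: method exit N" entry per matching line.
    static List<String> failedMethods(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            Matcher m = FAILED.matcher(line);
            if (m.find()) {
                out.add(m.group(3) + "/" + m.group(2) + ": "
                        + m.group(1) + " exit " + m.group(4));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Two lines copied from the log above; only the first one matches.
        List<String> sample = Arrays.asList(
            "Mar 6 17:09:03 node2 Cluster.RGM.rgmd: [ID 938318 daemon.error] "
            + "Method <bin/ingres_server_stop> failed on resource <IngresNCR> "
            + "in resource group <IngresNCG> [exit code <2>, time used: 5% of timeout <300 seconds>]",
            "Mar 6 17:09:03 node2 Cluster.RGM.rgmd: [ID 443746 daemon.notice] "
            + "resource IngresNCR state on node node2 change to R_STOP_FAILED");
        for (String entry : failedMethods(sample)) {
            System.out.println(entry);
        }
    }
}
```

Run against the log above, a scan like this would surface the two Ingres entries (ingres_server_start and ingres_server_stop), which points at the HA-Ingres stop script rather than at the logical hostname resource.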

OracleAS R2 - Cluster Mixed Solaris SPARC/x86?

Is a mixed Solaris SPARC/x86 active-active cluster environment supported for OracleAS R2? What I mean by this is, can I put together a supported environment where an Identity Management node (node 1) is running as SPARC, and a second Identity Management node (node 2) is running Solaris x86?
Both OS's would be as identically configured as possible (both OS version & patch levels).
Cheers, Brad

Metalink note 429995.1 says:
Goal
Is it supported to install Application Server Oracle Homes on different operating systems or different versions of the same operating system?
Example 1: AS Infrastructure is installed on a Solaris 8 server and a Business Intelligence and Forms Middle Tier is installed on a Solaris 10 server.
Example 2: AS Infrastructure is installed on a Red Hat linux server and Business Intelligence and Forms Middle Tiers are installed on Windows 2003 servers.
Solution
It is completely supported to install each Application Server oracle home onto a different operating system or onto different versions of the same operating system.
Both of the above example scenarios are supported.
The only restriction is that members of a Middle-tier DCM-Managed OracleAS Cluster must be on the same operating system 'flavour'. As per the High Availability Guide:
All Oracle Application Server instances that are to be members of a DCM-Managed OracleAS Cluster must be installed on the same flavour operating system. For example, different variants of UNIX are clusterable together, but they are not clusterable with Windows systems.
Greetings

GI installation on a single-node cluster error.

Hello, I am trying to install GI on a single-node cluster (Solaris 10 / SPARC), but the root.sh script fails with the following error (this is not a GI installation for a Standalone Server):
root@selvac./dev/ASM/OCRVTD_DG # /app/oracle/grid/11.2/root.sh
Running Oracle 11g root script...
The following environment variables are set as:
ORACLE_OWNER= grid
ORACLE_HOME= /app/oracle/grid/11.2
Enter the full pathname of the local bin directory: [usr/local/bin]:
Copying dbhome to /usr/local/bin ...
Copying oraenv to /usr/local/bin ...
Copying coraenv to /usr/local/bin ...
Creating /var/opt/oracle/oratab file...
Entries will be added to the /var/opt/oracle/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /app/oracle/grid/11.2/crs/install/crsconfig_params
Creating trace directory
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
OLR initialization - successful
root wallet
root wallet cert
root cert export
peer wallet
profile reader wallet
pa wallet
peer wallet keys
pa wallet keys
peer cert request
pa cert request
peer cert
pa cert
peer root cert TP
profile reader root cert TP
pa root cert TP
peer pa cert TP
pa peer cert TP
profile reader pa cert TP
profile reader peer cert TP
peer user cert
pa user cert
Adding daemon to inittab
ACFS-9200: Supported
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9312: Existing ADVM/ACFS installation detected.
ACFS-9314: Removing previous ADVM/ACFS installation.
ACFS-9315: Previous ADVM/ACFS components successfully removed.
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9309: ADVM/ACFS installation correctness verified.
CRS-2672: Attempting to start 'ora.mdnsd' on 'selvac'
CRS-2676: Start of 'ora.mdnsd' on 'selvac' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'selvac'
CRS-2676: Start of 'ora.gpnpd' on 'selvac' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'selvac'
CRS-2672: Attempting to start 'ora.gipcd' on 'selvac'
CRS-2676: Start of 'ora.cssdmonitor' on 'selvac' succeeded
CRS-2676: Start of 'ora.gipcd' on 'selvac' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'selvac'
CRS-2672: Attempting to start 'ora.diskmon' on 'selvac'
CRS-2676: Start of 'ora.diskmon' on 'selvac' succeeded
CRS-2676: Start of 'ora.cssd' on 'selvac' succeeded
ASM created and started successfully.
Disk Group OCRVTD_DG created successfully.
The ora.asm resource is not ONLINE
Did not succssfully configure and start ASM at /app/oracle/grid/11.2/crs/install/crsconfig_lib.pm line 6465.
/app/oracle/grid/11.2/perl/bin/perl -I/app/oracle/grid/11.2/perl/lib -I/app/oracle/grid/11.2/crs/install /app/oracle/grid/11.2/crs/install/rootcrs.pl execution failed
I also got the "PRVF-5150: Path OCRL:DISK1 is not a valid path on all nodes" error, but I ignored it because I have read that it is a bug.
I think my ASM disk group for OCR and voting is OK: it is accessible by the grid user with mode 660. It seems ASM does not start, or does not start in time.
Any help is welcome.
Thanks in advance.

Thanks a lot for the hint. I had already checked this doc, but I think it is not the problem. Actually, the error "ora.asm is not ONLINE" is not correct. After root.sh fails, ora.asm is ONLINE:
root@selvac./app/oracle/grid/11.2/bin # ./crsctl check resource ora.asm -init
root@selvac./app/oracle/grid/11.2/bin # ./crsctl stat resource ora.asm -init
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE
STATE=ONLINE on selvac
The last part of the /app/oracle/grid/11.2/cfgtoollogs/crsconfig/rootcrs_selvac.log file reads :
>
ASM created and started successfully.
Disk Group OCRVTD_DG created successfully.
End Command output
2011-04-14 13:24:16: Executing cmd: /app/oracle/grid/11.2/bin/crsctl check resource ora.asm -init
2011-04-14 13:24:17: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
2011-04-14 13:24:17: Command output:
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE
STATE=OFFLINE
End Command output
2011-04-14 13:24:17: Checking the status of ora.asm
2011-04-14 13:24:22: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
2011-04-14 13:24:22: Command output:
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE
STATE=OFFLINE
End Command output
2011-04-14 13:24:22: Checking the status of ora.asm
2011-04-14 13:24:27: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
2011-04-14 13:24:28: Command output:
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE
STATE=OFFLINE
End Command output
2011-04-14 13:24:28: Checking the status of ora.asm
2011-04-14 13:24:33: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
2011-04-14 13:24:33: Command output:
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE
STATE=OFFLINE
End Command output
2011-04-14 13:24:33: Checking the status of ora.asm
2011-04-14 13:24:38: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
2011-04-14 13:24:38: Command output:
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE
STATE=OFFLINE
End Command output
2011-04-14 13:24:38: Checking the status of ora.asm
2011-04-14 13:24:43: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
2011-04-14 13:24:43: Command output:
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE
STATE=OFFLINE
End Command output
2011-04-14 13:24:43: Checking the status of ora.asm
2011-04-14 13:24:48: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
2011-04-14 13:24:49: Command output:
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE
STATE=OFFLINE
End Command output
2011-04-14 13:24:49: Checking the status of ora.asm
2011-04-14 13:24:54: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
2011-04-14 13:24:54: Command output:
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE
STATE=OFFLINE
End Command output
2011-04-14 13:24:54: Checking the status of ora.asm
2011-04-14 13:24:59: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
2011-04-14 13:24:59: Command output:
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE
STATE=OFFLINE
End Command output
2011-04-14 13:24:59: Checking the status of ora.asm
2011-04-14 13:25:04: Executing cmd: /app/oracle/grid/11.2/bin/crsctl status resource ora.asm -init
2011-04-14 13:25:04: Command output:
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE
STATE=OFFLINE
End Command output
2011-04-14 13:25:04: Checking the status of ora.asm
2011-04-14 13:25:09: The ora.asm resource is not ONLINE
2011-04-14 13:25:09: Running as user grid: /app/oracle/grid/11.2/bin/cluutil -ckpt -oraclebase /app/grid -writeckpt -name ROOTCRS_BOOTCFG -state FAIL
2011-04-14 13:25:09: s_run_as_user2: Running /bin/su grid -c ' /app/oracle/grid/11.2/bin/cluutil -ckpt -oraclebase /app/grid -writeckpt -name ROOTCRS_BOOTCFG -state FAIL '
2011-04-14 13:25:10: Removing file /var/tmp/mbahSaGPn
2011-04-14 13:25:10: Successfully removed file: /var/tmp/mbahSaGPn
2011-04-14 13:25:10: /bin/su successfully executed
2011-04-14 13:25:10: Succeeded in writing the checkpoint:'ROOTCRS_BOOTCFG' with status:FAIL
2011-04-14 13:25:10: ###### Begin DIE Stack Trace ######
2011-04-14 13:25:10: Package File Line Calling
2011-04-14 13:25:10: --------------- -------------------- ---- ----------
2011-04-14 13:25:10: 1: main rootcrs.pl 322 crsconfig_lib::dietrap
2011-04-14 13:25:10: 2: crsconfig_lib crsconfig_lib.pm 6465 main::__ANON__
2011-04-14 13:25:10: 3: crsconfig_lib crsconfig_lib.pm 6390 crsconfig_lib::perform_initial_config
2011-04-14 13:25:10: 4: main rootcrs.pl 671 crsconfig_lib::perform_init_config
2011-04-14 13:25:10: ####### End DIE Stack Trace #######
2011-04-14 13:25:10: 'ROOTCRS_BOOTCFG' checkpoint has failed
So this must be a bug. During root.sh execution ora.asm is OFFLINE, but after the failure it is ONLINE. It might be a question of waiting, repeating or a timeout: I see that the "Checking the status of ora.asm" step is repeated several times during root.sh, but perhaps not often enough. Now root.sh has failed and the installation has halted, yet ASM is ONLINE.
Any other ideas?
Thanks again.
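The log above shows about ten status checks at roughly five-second intervals before root.sh gives up, which is why a resource that comes ONLINE slightly later is still reported as failed. The general "poll until the wanted state or give up" pattern can be sketched as follows. This is only an illustration in Java and not the logic in crsconfig_lib.pm: StatePoller, waitForState and the probe are hypothetical names, and the probe stands in for whatever reads the resource state.

```java
import java.util.function.Supplier;

// Hypothetical sketch of a bounded polling loop.
final class StatePoller {
    // Calls probe up to maxAttempts times, pausing pauseMillis between calls.
    // Returns the 1-based attempt on which the wanted state was seen,
    // or -1 if it was never seen (or the wait was interrupted).
    static int waitForState(Supplier<String> probe, String wanted,
                            int maxAttempts, long pauseMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (wanted.equals(probe.get())) {
                return attempt;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return -1;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Simulated resource that needs three probes before it reports ONLINE.
        int[] calls = {0};
        Supplier<String> probe = () -> ++calls[0] < 3 ? "OFFLINE" : "ONLINE";
        int seenAt = waitForState(probe, "ONLINE", 10, 10L);
        System.out.println("attempt on which ONLINE was seen: " + seenAt);
    }
}
```

With a fixed attempt count, the total wait is bounded by about maxAttempts times pauseMillis, so a resource that needs longer than that window to start will always look like a failure to the caller.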

Latest Patch Cluster

Hi,
Can someone help me find and download the latest patch cluster package for a SPARC machine?
Thanks
Karthik

Moderator Action:
Not a hardware question. It is an OS question.
Your post has been moved from the hardware forum you had put it into
to a Solaris forum.
(We guessed Solaris 10 -- you didn't bother to mention what you are using.)
Suggestion:
Since patches and patch clusters are only available to those with service contract login credentials to Oracle Support, you might try to log into MOS and search over there.

Problem removing Multi-owner_SVM

Hi,
I had QFS on my cluster (3.2u1 SPARC). I wanted to remove it after testing.
1. unmounted samfs type filesystems
2. deleted entries from vfstab
3. deleted NFS resource group, containing QFS resource
4. removed the SUNW qfs packages
Now I want to remove multi-owner SVM diskset from a cluster.
[root@callisto:~]# metaset
Multi-owner Set name = qfsset, Set number = 1, Master = callisto
Host                Owner          Member
callisto           multi-owner   Yes
leda               multi-owner   Yes
Drive Dbase
d18   Yes
d41   Yes
[root@callisto:~]# cldg show
=== Device Groups ===
Device Group Name:                              qfsset
Type:                                            Multi-owner_SVM
failback:                                        false
Node List:                                       callisto, leda
preferenced:                                     false
numsecondaries:                                  0
diskset name:                                    qfsset
[root@callisto:~]# metaclear -s qfsset -a
Proxy command to: leda
metaclear: leda: qfsset/d411: metadevice is open
This fails for every metadevice when I try to delete them one by one. I checked with lsof and found no references to the metadevices. I want to get rid of this SVM multi-owner diskset. Please advise what I should do.
Edited by: aleksf on Apr 23, 2008 7:07 AM

I vaguely remember hearing my colleague mention that he had a similar problem but it went away on a retry of the command. I would try and metaclear all the individual metadevices working in the reverse order and method from which they were created.
If it still won't clear, please raise a ticket with Sun support.
Regards,
Tim
---

Limit of 4 nodes for x86_64 clusters ?

Hello,
In the 3.2 2/08 documentation, Sun Cluster Concepts Guide for Solaris OS, the manual states
"x86: Sun Cluster software supports up to four nodes in a cluster. "
Can anyone explain this limitation ?
Thanks,
Jim

The section "Cluster Nodes" in Chapter 2. "Key Concepts for Hardware Service Providers" in the Sun Cluster Concepts Guide for Solaris OS in the Sun Cluster 3.2 2/08 Software Collection for Solaris OS (http://docs.sun.com/app/docs/doc/820-2554/bacbbigh?l=en&a=view) states:
Cluster Nodes
A cluster node is a machine that is running both the Solaris Operating System and the Sun Cluster software. A cluster node is either a current member of the cluster (a cluster member) or a potential member.
* SPARC: Sun Cluster software supports from one to sixteen nodes in a cluster. Different hardware configurations impose additional limits on the maximum number of nodes that you can configure in a cluster composed of SPARC based systems. See SPARC: Sun Cluster Topologies for SPARC for the supported node configurations.
* x86: If a cluster runs Oracle Real Application Clusters (RAC), Sun Cluster software supports from one to eight nodes in that cluster. If a cluster does not run Oracle RAC, Sun Cluster software supports from one to four nodes in that cluster. See x86: Sun Cluster Topologies for x86 for the supported node configurations.
So, AFAIK, one must be running Oracle RAC to create a cluster with Sun Cluster 3.2 2/08 that contains up to 8 nodes. Please let me know if that statement is incorrect for the Sun Cluster 3.2 2/08 release, and I'll add a correction to the Release Notes.
In the next release of Sun Cluster, the statement for x86 as shown above is expected to change to the following statement:
* x86: Sun Cluster software supports from one to eight Solaris hosts in a cluster. Different hardware configurations impose additional limits on the maximum number of hosts that you can configure in a cluster composed of x86 based systems. See x86: Sun Cluster Topologies for the supported configurations.
Note: "Solaris host" and "host" in the next release equals "node" in previous releases.
In other words, the Oracle RAC restriction is expected to be removed, although removal of this restriction is not guaranteed.
-Brian Keith
-Sun Cluster Technical Writer
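The limits quoted above for Sun Cluster 3.2 2/08 reduce to a small lookup. Here is a minimal Java sketch, with NodeLimits and maxNodes as hypothetical names. It encodes only the software maximums quoted from the Concepts Guide; as the guide notes for SPARC, the hardware configuration may impose lower limits.

```java
// Hypothetical lookup of the documented Sun Cluster 3.2 2/08 node maximums.
final class NodeLimits {
    // platform is "SPARC" or "x86"; runsOracleRac only matters on x86.
    static int maxNodes(String platform, boolean runsOracleRac) {
        switch (platform) {
            case "SPARC":
                return 16;                      // one to sixteen nodes
            case "x86":
                return runsOracleRac ? 8 : 4;   // eight with Oracle RAC, otherwise four
            default:
                throw new IllegalArgumentException("unknown platform: " + platform);
        }
    }

    public static void main(String[] args) {
        System.out.println("x86 without RAC: " + maxNodes("x86", false));
        System.out.println("x86 with RAC:    " + maxNodes("x86", true));
        System.out.println("SPARC:           " + maxNodes("SPARC", false));
    }
}
```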

Form-based authentication problem with weblogic

Hi Everyone,
The following problem related to form-based authentication
was posted one week ago and got no response. Can someone give it
a shot? One more thing to add: when I do the same thing on the J2EE
server, I don't get this error message, and I am redirected to the
home page.
Thanks.
-John
I am using weblogic5.1 and RDBMSRealm as the security realm. I am having the following problem with the form-based authentication login mechanism. Does anyone have an idea what the problem is and how to solve it?
When I log in to my application and log out following the normal procedure, everything is OK. But if I log in, use the browser's BACK button to go back to the login page, and try to log in as a new user, I get the following error message:
"Form based authentication failed. Could not find session."
When I check the LOG file, it gives me the following message,
"Form based authentication failed. One of the following reasons could cause it: HTTP sessions are disabled. An old session ID was stored in the browser."
Normally, if you are logged in and try to log in again without logging out first, you are supposed to be directed to the existing user session. I don't understand why I get this error instead. I also checked my properties file, and HTTP sessions appear to be enabled:
weblogic.httpd.session.enable=true

Hi...
Hehe... I actually did implement it the way you did. My login.jsp checks whether the user is already authenticated. If so, it forwards to the home page. In addition, I used ServletAuthentication to solve the problem mentioned by Cameron, where "Form based authentication failed" usually occurs on the first login attempt. I was also getting this error occasionally. Using ServletAuthentication totally eliminates the occurrence of this problem.
I'm not using j_security_check anymore. ServletAuthentication does all the work. It also uses RDBMSRealm to authenticate the user. The biggest disadvantage I can see with ServletAuthentication is that the requested resource is not returned after authentication, because the page returned after authenticating the user is hard-coded (in my case, it's home.jsp).
cheers...
Jerson
"John Wang" <[email protected]> wrote:
>
Hi Jerson,
I tried your code this weekend, but it didn't work in my case. However,
I solved my specific problem another way. The idea behind my problem is that the user tries to log in again when he is already logged in. Therefore, I just redirect the user to another page when he reaches the login page by hitting the BACK button, rather than re-authenticating the user the way you did.
But I think your idea is very helpful if it could work. Problems such as multiple concurrent logins can be solved by pre-processing.
In your new code, you solved the problem with a new approach. I am just wondering, do you still implement it with your login.jsp file? In other words, is the action in your login.jsp still "Authenticate"? Where do you put the URL "j_security_check"?
Thanks.
-John
"Jerson Chua" <[email protected]> wrote:
I've solved the problem by using ServletAuthentication. So far I'm not getting the error message. One of the side effects is that it doesn't return the requested URI after authentication, it will always return the home page.
Jerson
package com.cyberj.catalyst.web;
import weblogic.servlet.security.*;
import javax.servlet.*;
import javax.servlet.http.*;
import java.io.*;
public class Authenticate extends HttpServlet {
    // Reads the credentials from the standard form fields.
    private ServletAuthentication sa = new ServletAuthentication("j_username", "j_password");
    public void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, java.io.IOException {
        int authenticated = sa.weak(request, response);
        if (authenticated == ServletAuthentication.NEEDS_CREDENTIALS ||
                authenticated == ServletAuthentication.FAILED_AUTHENTICATION) {
            response.sendRedirect("fail_login.jsp");
        } else {
            // Always goes to the home page; the originally requested URI is not restored.
            response.sendRedirect("Home.jsp");
        }
    }
    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, java.io.IOException {
        // Handle GET the same way as POST.
        doPost(request, response);
    }
}
"Jerson Chua" <[email protected]> wrote:
The problem is still there even if I use page redirection. Grrr... My boss wants me to solve this problem, so what alternatives do I have? Are there any other ways of authenticating the user? In my web tier I'm using isUserInRole and getRemoteUser, and the web tier actually connects to EJBs. If I implement my own custom authentication, I won't be able to use these features.
Has anyone solved this problem? I've tried the example itself and the same problem occurs.
Jerson
"Cameron Purdy" <[email protected]> wrote:
Jerson,
First try it redirected (raw) to see if that indeed is the problem ... then
if it works you can "fix" it the way you want.
Peace,
Cameron Purdy
Tangosol, Inc.
http://www.tangosol.com
+1.617.623.5782
WebLogic Consulting Available
"Jerson Chua" <[email protected]> wrote in message
news:[email protected]...
Hi...
Thanks for your suggestion... I've actually thought of that solution. But using page redirection will expose the user's password. I'm thinking of another indirection where I redirect to another servlet but the password is encrypted.
What do you think?
thanks....
Jerson
"Cameron Purdy" <[email protected]> wrote:
Maybe redirect to the current URL after killing the session to let the request clean itself up. I don't think that a lot of the request (such as remote user) will be affected by killing the session until the next request comes in.
Peace,
Cameron Purdy
"Jerson Chua" <[email protected]> wrote in message
news:[email protected]...
Hello guys...
I have a solution, but it doesn't work yet, so I need your help. One of the reasons for getting "Form based authentication failed" is an already authenticated user trying to log in again. For example, in the case John mentioned, the user goes back to the login page with the BACK button, and when the user logs in again, this error occurs.
So here's my solution:
Instead of submitting the page to j_security_check, submit it to a servlet which checks whether the user is logged in. If so, it invalidates the session and forwards to j_security_check. But there's a problem with this solution: even though session.invalidate() (which actually logs the user out) is executed before forwarding to j_security_check, the user isn't logged out immediately. How do I know this? Because after calling session.invalidate(), I tried calling request.getRemoteUser() and it doesn't return null. So I'm still getting the error. What I want to ask you guys is: how do I force a logout before j_security_check is called?
Here's the code that login.jsp actually submits to:
import javax.servlet.*;
import javax.servlet.http.*;
import java.io.*;
public class Authenticate extends HttpServlet {
    public void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, java.io.IOException {
        // Already logged in: drop the old session before authenticating again.
        if (request.getRemoteUser() != null) {
            HttpSession session = request.getSession(false);
            System.out.println(session.isNew());
            session.invalidate();
            Cookie[] cookies = request.getCookies();
            for (int i = 0; i < cookies.length; i++) {
                cookies[i].setMaxAge(0);
            }
        }
        getServletContext().getRequestDispatcher("/j_security_check").forward(request, response);
    }
    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, java.io.IOException {
        doPost(request, response);
    }
}
let's help each other to solve this problem. thanks.
Jerson
"Jerson Chua" <[email protected]> wrote:
I thought this problem would be solved in SP6 but, to my disappointment, it is still there. I'm also using RDBMSRealm, same as John.
Jerson
"Cameron Purdy" <[email protected]> wrote:
John,
1. You are using a single WL instance (i.e. not clustered) on that NT box and doing so without a proxy (e.g. specifying http://localhost:7001), correct?
2. BEA will pay more attention to the problem if you upgrade to SP6. If you don't have a reason NOT to (e.g. a particular regression), then you should upgrade. That will save you one go-around with support: "Hi, I am on SP5 and I have a problem.", "Upgrade to SP6 to see if that fixes it. Call back if that doesn't work."
3. Make sure that you are not doing anything special before or after J_SECURITY_CHECK ... make sure that you have everything configured and done by the book.
4. Email BEA a bug report at [email protected] ... see what they say.
Peace,
Cameron Purdy
"John Wang" <[email protected]> wrote in message
news:[email protected]...
Cameron,
It seems to me that the problem I encountered is a little different from what you have, even though the error message is the same in the end. Every time I go through the steps, I get that error.
I am using WebLogic 5.1 with SP5 on NT 4.0. Do you have any solutions to work around this problem? If it is a BUG as you pointed out, is there a way we can report it to WebLogic technical support and let them take a look?
Thanks.
-John
"Cameron Purdy" <[email protected]> wrote:
John,
I will verify that I have seen this error now (after having read about it here for a few months) and it had the following characteristics:
1) It was intermittent, and appeared to be self-curing
2) It was not predictable, only seemed to occur at the first login attempt, and may have been timing related
3) This was on Sun Solaris on a cluster of 2 Sparc 2xx's; the proxy was Apache (Stronghold)
4) After researching the newsgroups, it appears that this "bug" may have gone away temporarily (?) in SP5 (although Jerson Chua <[email protected]> mentioned that he still got it in SP5)
I was able to reproduce it most often by deleting the tmpwar and tmp_deployments directories while the cluster was not running, then restarting the cluster. The first login attempt would fail (roughly 90% of the time?) and that server instance would then be ignored by the proxy for a while (60 seconds?) -- meaning that the proxy would send all traffic, regardless of the number of "clients", to the other server in the cluster.
As far as I can tell, it is a bug in WebLogic, and probably has been there for quite a while.
Peace,
Cameron Purdy
"John Wang" <[email protected]> wrote in message
news:[email protected]...
Hi Everyone,
The following problem related to form-based authentication was posted one week ago and got no response. Can someone give it a shot? One more thing to add: when I try the same thing on the J2EE server, I don't encounter this error message, and I am redirected to the home page.
Thanks.
-John
I am using WebLogic 5.1 with RDBMSRealm as the security realm. I am having the following problem with the form-based authentication login mechanism. Does anyone have an idea what the problem is and how to solve it?
When I log in to my application and log out following the normal procedure, it is OK. But if I log in, use the browser's BACK button to return to the login page, and try to log in as a new user, I get the following error message:
"Form based authentication failed. Could not find session."
When I check the LOG file, it gives me the following message:
"Form based authentication failed. One of the following reasons could cause it: HTTP sessions are disabled. An old session ID was stored in the browser."
Normally, if you log in and want to log in again without logging out first, it is supposed to direct you to the existing user session. But I don't understand why it gave me this error. I also checked my properties file, and it appears that HTTP sessions are enabled as follows:
weblogic.httpd.session.enable=true
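The workaround this thread converges on (never run the login form again for a user who is already authenticated, and send that user to the home page instead) comes down to a single decision. The following is only a minimal sketch of that decision, not code from any of the posters: the class and method names are made up for illustration, the page names are the ones quoted in the thread, and the remoteUser argument is assumed to be whatever request.getRemoteUser() returned. It deliberately avoids the servlet API so it can be run on its own.

```java
// Hypothetical helper: decides which page to show when the login page is requested.
final class LoginRouting {
    static final String HOME_PAGE = "Home.jsp";    // page name used in the thread's servlet
    static final String LOGIN_PAGE = "login.jsp";  // page name used in the thread

    // remoteUser is assumed to be the value of request.getRemoteUser();
    // null (or empty) means nobody is authenticated on this session.
    static String pageForLoginRequest(String remoteUser) {
        boolean alreadyLoggedIn = remoteUser != null && !remoteUser.isEmpty();
        return alreadyLoggedIn ? HOME_PAGE : LOGIN_PAGE;
    }

    public static void main(String[] args) {
        System.out.println(pageForLoginRequest(null));      // anonymous visitor
        System.out.println(pageForLoginRequest("jerson"));  // user who pressed BACK after logging in
    }
}
```

In a real login.jsp or servlet the returned name would be passed to a forward or redirect; whether that avoids the "Form based authentication failed" message on a given WebLogic service pack is something only testing can show.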

SAP 7.0 on SUN Cluster 3.2 (Solaris 10 / SPARC)

Dear All;
I'm installing a two-node cluster (Sun Cluster 3.2 / Solaris 10 / SPARC) for an HA SAP 7.0 / Oracle 10g database.
The SAP and Oracle software was installed successfully, and I have successfully clustered the Oracle DB; it is tested and working fine.
For SAP I did the following configuration:
# clresource create -g sap-ci-res-grp -t SUNW.sap_ci_v2 -p SAPSID=PRD -p Ci_instance_id=01 -p Ci_services_string=SCS -p Ci_startup_script=startsap_01 -p Ci_shutdown_script=stopsap_01 -p resource_dependencies=sap-hastp-rs,ora-db-res sap-ci-scs-res
# clresource create -g sap-ci-res-grp -t SUNW.sap_ci_v2 -p SAPSID=PRD -p Ci_instance_id=00 -p Ci_services_string=ASCS -p Ci_startup_script=startsap_00 -p Ci_shutdown_script=stopsap_00 -p resource_dependencies=sap-hastp-rs,or-db-res sap-ci-Ascs-res
When I try to bring sap-ci-res-grp online:
# clresourcegroup online -M sap-ci-res-grp
it executes the startsap scripts successfully, as follows:
Sun Microsystems Inc. SunOS 5.10 Generic January 2005
stty: : No such device or address
stty: : No such device or address
Starting SAP-Collector Daemon
11:04:57 04.06.2008 LOG: Effective User Id is root
Starting SAP-Collector Daemon
11:04:57 04.06.2008 LOG: Effective User Id is root
* This is Saposcol Version COLL 20.94 700 - V3.72 64Bit
* Usage: saposcol -l: Start OS Collector
* saposcol -k: Stop OS Collector
* saposcol -d: OS Collector Dialog Mode
* saposcol -s: OS Collector Status
* Starting collector (create new process)
* This is Saposcol Version COLL 20.94 700 - V3.72 64Bit
* Usage: saposcol -l: Start OS Collector
* saposcol -k: Stop OS Collector
* saposcol -d: OS Collector Dialog Mode
* saposcol -s: OS Collector Status
* Starting collector (create new process)
saposcol on host eccprd01 started
Starting SAP Instance ASCS00
Startup-Log is written to /export/home/prdadm/startsap_ASCS00.log
saposcol on host eccprd01 started
Running /usr/sap/PRD/SYS/exe/run/startj2eedb
Trying to start PRD database ...
Log file: /export/home/prdadm/startdb.log
Instance Service on host eccprd01 started
Jun 4 11:05:01 eccprd01 SAPPRD_00[26054]: Unable to open trace file sapstartsrv.log. (Error 11 Resource temporarily unavailable) [ntservsserver.cpp 1863]
/usr/sap/PRD/SYS/exe/run/startj2eedb completed successfully
Starting SAP Instance SCS01
Startup-Log is written to /export/home/prdadm/startsap_SCS01.log
Instance Service on host eccprd01 started
Jun 4 11:05:02 eccprd01 SAPPRD_01[26111]: Unable to open trace file sapstartsrv.log. (Error 11 Resource temporarily unavailable) [ntservsserver.cpp 1863]
Instance on host eccprd01 started
Instance on host eccprd01 started
and then it repeats the following warnings in /var/adm/messages until it fails over to the other node:
Jun 4 12:26:22 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:26:25 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:26:25 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:26:28 eccprd01 last message repeated 1 time
Jun 4 12:26:28 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:26:31 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:26:31 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:26:34 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:26:34 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:26:37 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:26:37 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:26:40 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:26:40 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:26:43 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:26:43 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:26:46 eccprd01 last message repeated 1 time
Jun 4 12:26:46 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:26:49 eccprd01 last message repeated 1 time
Jun 4 12:26:49 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:26:52 eccprd01 last message repeated 1 time
Jun 4 12:26:52 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:26:55 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:26:55 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:26:58 eccprd01 last message repeated 1 time
Jun 4 12:26:58 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:01 eccprd01 last message repeated 1 time
Jun 4 12:27:01 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:04 eccprd01 last message repeated 1 time
Jun 4 12:27:04 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:07 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:07 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:10 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:10 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:13 eccprd01 last message repeated 1 time
Jun 4 12:27:13 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:16 eccprd01 last message repeated 1 time
Jun 4 12:27:16 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:19 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:19 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:22 eccprd01 last message repeated 1 time
Jun 4 12:27:22 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:25 eccprd01 last message repeated 1 time
Jun 4 12:27:25 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:28 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:28 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:31 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:31 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:34 eccprd01 last message repeated 1 time
Jun 4 12:27:34 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:37 eccprd01 last message repeated 1 time
Jun 4 12:27:37 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:40 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:40 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:43 eccprd01 last message repeated 1 time
Jun 4 12:27:43 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-scs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dispatcher to come up.
Jun 4 12:27:46 eccprd01 last message repeated 1 time
Jun 4 12:27:46 eccprd01 SC[SUNW.sap_ci_v2,sap-ci-res-grp,sap-ci-Ascs-res,sap_ci_svc_start]: [ID 183934 daemon.notice] Waiting for SAP Central Instance main dis
Can anyone help me find out whether there is an error in the configuration, or what the cause of this problem is? Thanks in advance.
ARSSES

Hi all.
I am having a similar issue with Sun Cluster 3.2 and SAP 7.0.
Scenario:
Central Instance (not in cluster): started on one node
Dialog Instance (not in cluster): started on the other node
When I create the resource for SUNW.sap_as like
clrs create -g sap-rg -t SUNW.sap_as .....etc etc
I get lots of WAITING FOR DISPATCHER TO COME UP.... messages in /var/adm/messages.
Then after the timeout it gives up.
Any clue? What is it trying to connect to, or waiting for? I have noticed that it's something before the startup script....
TIA

Upgrade from Solaris 8 SPARC with Sun cluster 3.1u3 to Solaris 10 SPARC

Dear All,
We are planning an upgrade of the OS from Solaris 8 SPARC to Solaris 10 SPARC on a two-node active-standby clustered system.
The major software currently on the Solaris 8 system is:
1: Sun Cluster 3.1u3
2: Oracle 9i 9.2.0.8
3: Veritas File System Vxfs v4.0
4: Sun Solaris 8 2/04 SPARC
Any pointers as to what sequence and how the upgrade should be done?
Thanks in advance.
Regards,
Ray

Yes, I know it can be quite complicated, but Sun provided us with detailed documentation; at least in our case (Solaris 9 to 10) it was very helpful.
You might get better help in the cluster forum http://forums.sun.com/forum.jspa?forumID=842
-- Nick

Solaris 10 SPARC Recommended Patch Cluster for 2008 quarter 3 version

Dear All,
Could anyone please tell me where I can download the "Solaris 10 SPARC Recommended Patch Cluster" for the 2008 quarter 3 version?
I have checked sunsolve.sun.com, but I am only able to find the latest release.
please guide me.
Thanks and regards,
veera

Ok,
Here's a cute little formula that uses your system's parameters to give you a "heads-up" on the estimated time to complete the patch run.
You will need a few things to prepare:
1. Number of "real" CPUs (not hyperthreads)
2. Speed of each CPU in MHz as a whole number, i.e. 2.87Ghz = 2,867
3. The total number of patches in the cluster, i.e. Sept 15th = 381 (Solaris 10)
4. Network factor if using NFS = 2.5
5. Local cluster file factor = 1.5
6. Patch cluster on CDROM factor = 3.25
Now, combine all of those elements in this equation:
{ ( #1 * #2 ) / #3 * factor(#4 or #5 or #6) }
This will yield a number in minutes of patch run time. Then all you will need to add is the standard boot up time to get the total of all patching and reboot.
Of course you could write a script to extract all of this information and feed it to "bc -l" to get a quick figure.
Example, on one of my Solaris 10 boxes with the information filled in to the equation:
#> echo "((4*2660)/381)*1.5"|bc -l
41.88976377952755905511
It actually took 44.2 minutes to complete the patching of this box plus another 12 minutes to reboot. But all in all a pretty fair estimate I think.
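For anyone who would rather not type the numbers into bc each time, the same rule of thumb can be written as a small function. This is only a sketch of the formula exactly as given above, { ( #1 * #2 ) / #3 * factor }, with the three factors from the list; the class, method and constant names are made up for illustration, and the result is no more than the rough estimate the poster describes.

```java
// Sketch of the poster's rule-of-thumb estimate; names are illustrative only.
final class PatchRunEstimate {
    static final double FACTOR_NFS = 2.5;     // patch cluster read over NFS
    static final double FACTOR_LOCAL = 1.5;   // patch cluster on local disk
    static final double FACTOR_CDROM = 3.25;  // patch cluster on CD-ROM

    // (realCpus * cpuMhz) / patchCount * sourceFactor, in minutes, as in the post.
    static double estimateMinutes(int realCpus, int cpuMhz, int patchCount, double sourceFactor) {
        if (patchCount <= 0) {
            throw new IllegalArgumentException("patchCount must be positive");
        }
        return ((double) realCpus * cpuMhz) / patchCount * sourceFactor;
    }

    public static void main(String[] args) {
        // Same inputs as the bc example: 4 CPUs at 2660 MHz, 381 patches, local files.
        double minutes = estimateMinutes(4, 2660, 381, FACTOR_LOCAL);
        System.out.println("Estimated patch run (minutes): " + minutes);
    }
}
```

With the inputs from the bc example this gives the same figure of roughly 41.9 minutes; reboot time still has to be added on top, as noted above.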

Problems installing Cluster 3.3.2-SPARC

Attempting to install Solaris Cluster 3.3.2-SPARC on an UltraSPARC T5140 w/ Firmware 7.2.10.a running Solaris 10 (SunOS 5.10 Generic_147147-26 sun4v sparc)
cd /cdrom/cdrom0/Solaris_sparc/
./installer
"The installer you have invoked can run only in platform. Please invoke the installer for Solaris_sparc platform."
./installer -nodisplay
"The installer you have invoked can run only in platform. Please invoke the installer for Solaris_sparc platform."
/cdrom/cdrom0/Solaris_sparc/installer
Usage: dirname [path]
/cdrom/cdrom0/Solaris_sparc/installer: /bin/basename,/usr/bin/basename,/usr/usb/basename: not found.
/cdrom/cdrom0/Solaris_sparc/installer: /bin/basename,/usr/bin/basename,/usr/usb/basename: not found.
/cdrom/cdrom0/Solaris_sparc/installer: /bin/cut,/usr/bin/cut: not found.
/cdrom/cdrom0/Solaris_sparc/installer: /bin/cut,/usr/bin/cut: not found.
You must be root to run this script.
id
uid=0(root) gid=0(root)
Copied the entire contents of the disc to /Solaris\ Cluster\ 3.3.2-SPARC
chown -R root:root /Solaris\ Cluster\ 3.3.2-SPARC
Did the exact same process as detailed above with the exact same results (except it says /Solaris Cluster 3.3.2-SPARC/Solaris_sparc/installer: ... for error messages)
Can't for the life of me figure out why it won't do what it's supposed to... Any thoughts?

prtdiag -v - Pastebin.com
df -k - Pastebin.com
zoneadm list -cv
  ID NAME             STATUS     PATH                           BRAND    IP
   0 global           running    /                              native   shared
which basename
/usr/bin/basename
ls -la /usr/bin/basename
-r-xr-xr-x   1 root     bin        10064 Jun 30  2006 /usr/bin/basename
The osc_4_2_repo_full entry is the Solaris Cluster 4.2 full repo; I downloaded it out of desperation, realizing only after the fact that it won't work with Solaris 10.

Hardware recommendations for learning Solaris Cluster on Sparc (at home)

On a low budget, I'd like to put together a Solaris Cluster on Sparc (at home). At "work" in the next year we will be implementing a Solaris Cluster to run Tomcat and a custom CORBA server. (These apps will be migrated from very old hardware and VCS) The CORBA server is a Sparc binary, hence the need for Sparc. I'd like my home-office cluster to be similar in function to what I have at work. At work we have (2) T5120 Servers and a 2540 (2500-M2) Array waiting. From looking at the Solaris Cluster docs, it looks like you use a 2540 in a Direct-Connect configuration. We will be going to Solaris Cluster training eventually, but not soon. In the meantime, I'd like to keep/gain some skills/experience.
Potential (cheap) Home Cluster:
(2) SunFire V245 or (2) T1000 or (2) something_cheap
connected to
(1) Storedge D2 or (1) Storedge S1
My main desire is for the interconnects and failover on this Home Cluster to behave the same way as the T5120s with the 2540 Array. For example, if I yank (or replace) a HD, I'd like it to give messages very similar to what I will face at work in the future. I'd like the creation of ZFS pools etc. to work similarly. I'd like SCSI cards (HBAs or whatever) and cabling to be cheap.
Any recommendations on hardware? Servers? Arrays? SCSI cards/cabling?
Thanks,
Scott

I settled on:
(2) Sunfire V210
Storedge 3120
Connected by VHDCI
All used equipment at a cheap price. Should be a great little testbed.

Helps for WLP9.2 in a cluster: deploy errors, propagation errors

I wrote a document for our portal project's production support that details how to fix several recurring problems with WLP9.2 (GA) in a cluster. I'm sharing this with the community as a help.
Email me if you'd like updates or have additional scenarios / fixes or you'd like the word doc version.
Curt Smith, Atlanta WLP consultant, [email protected], put HELP WLP as a part of the subject please.
1     WLP 9.2 Environment
The cluster environment where these symptoms are frequently seen is described by the following:
1.     Solaris 9 on 4-CPU SPARC boxes. 16 GB RAM. The VM is given -Xmx1034m.
2.     Bea private patch: RSGT, fixed cluster deployments.
3.     Bea private patch: BG74, fixed session affinity, changed the session tracking cookie name back to the standard name: JSESSIONID.
4.     IBM UDB (DB2) using the Bea DB2 driver. FYI: I don't feel the problems described below have any relationship to the DB used for a portal.
2     Common problems and their fixes
2.1     Failed deploy (Install) of a new portal application via the cluster console.
Problem symptoms:
After clicking Finish (or) Commit the console displays that there were errors. All errors require this procedure.
Fix steps:
1.     Shut down the whole cluster. Using Bea's shutdown script takes too long, or doesn't work if the Admin is down or hung.
a.     Find the PID to kill: lsof -C | grep <listen port number>
b.     kill <pid>
c.     Run kill again and if the pid is still running then do: kill -9 <pid>
2.     Admin server:
a.     Remove file: <domain>/config/config.lok
b.     Edit file: <domain>/config/config.xml
c.     Remove all elements of: <app-deployment> ... </app-deployment>
d.     Start the Admin server
e.     Make sure it comes up in the running state. Use the console: servers - admin - control - resume if needed.
3.     Both Managed servers:
a.     cd <domain>/config
b.     rm -rf *
This forces the re-download of the cleaned up config.xml.
c.     cd <domain>/servers/<managed_name>
d.     rm -rf tmp cache stage data
This cleans up stale or jammed sideways deploys. Be sure to delete all four directories: tmp, cache, stage and data.
e.     Start each managed server.
f.     Make sure the managed instance comes up in the running state and not admin. Go to server-<instance>-control-Resume to set the run state to running.
You can now use the console to Install your applications.
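Since step 3 deletes directories with rm -rf, it can be worth printing exactly which paths are going to be removed on each managed server before running it. The following is a dry-run sketch only: it builds the four directory paths named in step 3d and prints them, and deletes nothing. The class and method names, the example domain path and the server name are all assumptions for illustration and need to be replaced with your own values.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Dry-run helper: lists the directories that step 3d would remove. Nothing is deleted.
final class ManagedServerCleanupPlan {
    // Directory names taken from step 3d of the procedure.
    static final String[] STALE_DIRS = {"tmp", "cache", "stage", "data"};

    // Returns <domainDir>/servers/<managedName>/{tmp,cache,stage,data}.
    static List<Path> staleDirs(String domainDir, String managedName) {
        Path serverDir = Paths.get(domainDir, "servers", managedName);
        List<Path> targets = new ArrayList<>();
        for (String name : STALE_DIRS) {
            targets.add(serverDir.resolve(name));
        }
        return targets;
    }

    public static void main(String[] args) {
        // Hypothetical domain directory and managed server name.
        for (Path p : staleDirs("/opt/bea/domains/portal", "managed1")) {
            System.out.println("would remove: " + p);
        }
    }
}
```

Checking the printed list against the real directory layout first helps make sure the rm -rf is run from the managed server's directory and not from somewhere higher up in the domain.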
2.2     The portal throws framework / container exceptions on one managed instance.
Problem symptoms:
You see exceptions from the classloader regarding a framework class not found, a serialization failure or error, etc. In general the symptom is that the container is not stable, is not running correctly or behaving sensibly, or your application works on one managed instance but not on the other.
Fix steps:
1.     Shut down the problem managed instance. Using Bea's shutdown script takes too long, or doesn't work if the Admin is down or hung.
a.     Find the PID to kill: lsof -C | grep <listen port number>
b.     kill <pid>
c.     Run kill again and if the pid is still running then do: kill -9 <pid>
2.     Perform these clean up steps on one or both managed instances:
a.     cd <domain>/config
b.     rm -rf *
This forces the re-download of the cleaned up config.xml.
c.     cd <domain>/servers/<managed_name>
d.     rm -rf tmp cache stage data
This cleans up stale or jammed sideways deploys. Be sure to delete all four directories: tmp, cache, stage and data.
e.     Start each managed server.
f.     Make sure the managed instance comes up in the running state and not admin. Go to server-<instance>-control-Resume to set the run state to running.
3.     The libraries and applications should auto deploy as the managed instance comes up. Once the managed instance goes into the running state, or you Resume it into the running state, your application should be accessible. Sometimes it takes a few seconds after going into the running state for all applications to be instantiated.
2.3     Content propagation fails on the commit step.
Problem symptoms:
In the log of the managed instance you specified in the propagation ant script you'll see exceptions regarding not being able to create or instantiate a dynamic delegated role.
There is an underlying bug / robustness issue with WLP9.2 (GA) where periodically you can't create delegated roles either with the PortalAdmin or via the propagation utility.
Important issue:
This procedure, which was supplied by Bea, removes your custom / created roles from the internal LDAP and the portal DB. This leaves your cluster in the new-installation state with just the default users and roles: weblogic and portaladmin. The implication is that you'll have to boot your cluster with the console user weblogic / weblogic. You can then add back your secure console user/password, but you'll have to do this over and over as propagations fail. The observed failure rate is once every 2-3 weeks if you do propagations daily.
Note:
The following assumes that you left the password of the default console user weblogic at its default: weblogic. The procedure deletes the local LDAP but leaves the rows in the DB.users table, including the SHA-1 hashed passwords. It should still work if you changed the password for weblogic, but it probably won't work if you try to substitute your secure console user/pw, because there will be no delegated authorization roles mapping to your custom console user. You might experiment with this scenario.
Fix steps:
1.     Shut down the whole cluster. Using Bea's shutdown script takes too long, or doesn't work if the Admin is down or hung.
a.     Find the PID to kill: lsof -C | grep <listen port number>
b.     kill <pid>
c.     Run kill again and if the pid is still running then do: kill -9 <pid>
2.     Admin server:
a.     Remove directory: <domain>/servers/AdminServer/data/ldap
b.     Run this SQL script after you edit it for your schema:
delete from yourschema.P13N_DELEGATED_HIERARCHY;
delete from yourschema.P13N_ENTITLEMENT_POLICY;
delete from yourschema.P13N_ENTITLEMENT_RESOURCE;
delete from yourschema.P13N_ENTITLEMENT_ROLE;
delete from yourschema.P13N_ENTITLEMENT_APPLICATION;
commit;
c.     Start the Admin server
d.     Make sure it comes up in the running state. Use the console: servers - admin - control - resume if needed.
3.     Both Managed servers:
a.     cd <domain>/servers/<managed_name>
b.     rm -rf data
This forces the re-download of the LDAP directory.
c.     Start each managed server.
d.     Make sure the managed instance comes up in the running state and not admin. Go to server-<instance>-control-Resume to set the run state to running.
2.4     The enterprise portal DB fails and needs to be restored OR switch DB instances
Restoring a portal DB or switching existing DBs are similar scenarios. The issues that you?ll face with WLP9.2 since it now uses a JDBCauthenticator to authenticate and authorize the console / boot user you need to first be able to connect to the DB before the admin and managed instances can boot. If you haven?t properly encrypted the DB user?s password in the <domain>/config/jdbc/*.xml files, then you?ll not be able to boot the admin server since you won?t be able to create a JDBC connection to the DB. The boot messages are not clear as to what the failure is.
You?ll need to know an Admin role user and password that?s in the DB you?re wanting to connect to, to put into boot.properties and on the managed instances in their boot.properties or startManaged scripts. Don?t forget that the managed instances have local credentials which is new for 9.2. They are in the startManaged script in clear text or a local boot.properties.
Note:
The passwords in the DB are SHA-1 hashed and there is no SHA-1 hash generator tool so you can?t change a password via SQL, but you can move the password from one DB to another. This is possible because the domain encryption salt is not used to generate the SHA-1 string. As it turns out, the SHA-1 string is compatible with all 9.2 cluster domains. IE the domain DES3 salt has nothing to do with password verification. The same password SHA-1 string taken from different domains or even same domain but different users will be different, this is just a randomization put into the algorithm yet every domain will be able to validate the given password against the DB?s SHA-1 string. Because of this, I?ve not had any problem moving DB instances between clusters, especially if I?ve given up on security and use weblogic/weblogic as the console user and this user / pw is in every DB.
Steps:
The assumption for restoring a DB is that the DB has been restored but it?s an older version and doesn?t have the console user/pw that is in boot.properties. At this point swaping a DB is the same as restoring an old version of the portal DB.
1.     Edit the console user and password as clear text into: <domain>/servers/AdminServers/security/boot.properties
This is where you may give in and use weblogic/weblogic.
2.     Set the correct DB access password encryption in the jdbc/*.xml files.
a.     cd <domain>/bin
b.     . ./setDomainEnv.sh or if you?re on windows just run: setDomainEnv.cmd
c.     java weblogic.security.Encrypt <the_db_password>
d.     Edit the returned string into every <domain>/config/jdbc/*.xml
e.     Make sure the *.xml files point to the correct DB host, port, schema, DB name, DB user.
3.     Start the Admin server. I should come up. If it doesn?t it has to be not being able to create a connection to the DB, which depends on the *.xml having the correct user and DES-3 encrypted password.
4.     Edit the new console user/password on the managed instance bin/startManaged script or the local boot.properties.
5.     Start the managed instances.
2.5     Install patches on a host that does not have internet access
The short description is to run smart update on a host that does have internet access. Fetch out of the <domain>/utils/bsu/cache_dir     directory the downloaded patch jar and xml. Manually apply the patch to your non-internet accessible hosts.

I wrote a document for our portal project's production support that details how to fix several recurring problems with WLP 9.2 (GA) in a cluster. I'm sharing it with the community in case it helps.
Email me if you'd like updates or have additional scenarios / fixes or you'd like the word doc version.
Curt Smith, Atlanta WLP consultant, [email protected], put HELP WLP as a part of the subject please.
1     WLP 9.2 Environment
The cluster environment where these symptoms are frequently seen is described by the following:
1.     Solaris 9 on 4-CPU SPARC boxes with 16 GB RAM. The JVM is given -Xmx1034m.
2.     BEA private patch: RSGT, fixed cluster deployments.
3.     BEA private patch: BG74, fixed session affinity, changed the session tracking cookie name back to the standard name: JSESSIONID.
4.     IBM UDB (DB2) using the BEA DB2 driver. FYI: I don't believe the problems described below have any relationship to the DB used for a portal.
2     Common problems and their fixes
2.1     Failed deploy (Install) of a new portal application via the cluster console.
Problem symptoms:
After clicking Finish (or) Commit the console displays that there were errors. All errors require this procedure.
Fix steps:
1.     Shut down the whole cluster. Using BEA's shutdown script takes too long, or doesn't work if the Admin server is down or hung.
a.     Find the PID to kill: lsof -C | grep <listen port number>
b.     kill <pid>
c.     Run kill again and if the pid is still running then do: kill -9 <pid>
2.     Admin server:
a.     Remove file: <domain>/config/config.lok
b.     Edit file: <domain>/config/config.xml
c.     Remove all elements of: <app-deployment> ... </app-deployment>
d.     Start the Admin server.
e.     Make sure it comes up in the running state. Use the console: servers - admin - control - resume if needed.
3.     Both Managed servers:
a.     cd <domain>/config
b.     rm -rf *
This forces the re-download of the cleaned-up config.xml.
c.     cd <domain>/servers/<managed_name>
d.     rm -rf tmp cache stage data
This cleans up stale or jammed sideways deploys. Be sure to delete all four directories: tmp, cache, stage and data.
e.     Start each managed server.
f.     Make sure the managed instance comes up in the running state and not admin. Go to server-<instance>-control-Resume to set the run state to running.
You can now use the console to Install your applications.
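The Admin server steps 2a to 2c above can be sketched as a small shell function. This is only a sketch under stated assumptions: the function name clean_admin_config and the backup file name are made up for illustration, and the sed range delete only works if every <app-deployment> and </app-deployment> tag sits on its own line. Check the result by eye before starting the Admin server.

```shell
# Sketch only, not a BEA-supplied tool.
# clean_admin_config DOMAIN_DIR   (function name and argument are hypothetical)
clean_admin_config() {
    cfg="$1/config/config.xml"
    [ -f "$cfg" ] || { echo "no config.xml under $1/config" >&2; return 1; }
    # Keep a copy outside the config directory so the edit can be undone.
    cp "$cfg" "$1/config.xml.before-clean" || return 1
    # Step 2a: remove the stale lock file if it exists.
    rm -f "$1/config/config.lok"
    # Steps 2b/2c: drop every line from an opening <app-deployment> tag
    # through the matching closing tag (assumes one tag per line).
    sed '/<app-deployment>/,/<\/app-deployment>/d' "$1/config.xml.before-clean" > "$cfg"
}
```

Usage would be something like: clean_admin_config /path/to/your/domain, run only after the whole cluster is down.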
2.2     The portal throws framework / container exceptions on one managed instance.
Problem symptoms:
Look for exceptions from the classloader about a framework class not found, serialization failures or errors, etc. In general the symptom is that the container is not stable, is not running correctly, behaves in ways that don't make sense, or your application works on one managed instance but not on the other.
Fix steps:
1.     Shut down the problem managed instance. Using BEA's shutdown script takes too long, or doesn't work if the Admin server is down or hung.
a.     Find the PID to kill: lsof -C | grep <listen port number>
b.     kill <pid>
c.     Run kill again and if the pid is still running then do: kill -9 <pid>
2.     Perform these clean up steps on one or both managed instances:
a.     cd <domain>/config
b.     rm -rf *
This forces the re-download of the cleaned-up config.xml.
c.     cd <domain>/servers/<managed_name>
d.     rm -rf tmp cache stage data
This cleans up stale or jammed sideways deploys. Be sure to delete all four directories: tmp, cache, stage and data.
e.     Start each managed server.
f.     Make sure the managed instance comes up in the running state and not admin. Go to server-<instance>-control-Resume to set the run state to running.
3.     The libraries and applications should auto deploy as the managed instance comes up. Once the managed instance goes into the running state, or you Resume it into the running state, your application should be accessible. Sometimes it takes a few seconds after going into the running state for all applications to be instantiated.
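The managed instance clean-up in step 2 can be wrapped in a function so that "rm -rf *" is never run from the wrong directory. A minimal sketch, assuming the domain layout described above; the function name clean_managed and its two arguments are hypothetical.

```shell
# Sketch only: clean one managed server so it re-downloads config and deployments.
# clean_managed DOMAIN_DIR MANAGED_NAME   (both names are hypothetical)
clean_managed() {
    domain="$1"
    name="$2"
    # Refuse empty arguments so the rm commands can never point at "/".
    [ -n "$domain" ] && [ -n "$name" ] || { echo "usage: clean_managed DOMAIN_DIR MANAGED_NAME" >&2; return 1; }
    # Refuse to run if the expected directories are not there.
    [ -d "$domain/config" ] && [ -d "$domain/servers/$name" ] || { echo "unexpected layout under $domain" >&2; return 1; }
    # Same effect as "cd <domain>/config; rm -rf *" without depending on the
    # current directory. Like "rm -rf *", it leaves dot-files alone.
    rm -rf "$domain/config"/*
    # Remove the four per-server working directories.
    for sub in tmp cache stage data; do
        rm -rf "$domain/servers/$name/$sub"
    done
}
```

Run it on the managed server host only, never on the Admin server, since the Admin server's config directory holds the master copy of config.xml.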
2.3     Content propagation fails on the commit step.
Problem symptoms:
In the log of the managed instance you specified in the propagation ant script you'll see exceptions about not being able to create or instantiate a dynamic delegated role.
There is an underlying bug / robustness issue with WLP 9.2 (GA) where periodically you can't create delegated roles, either with the PortalAdmin or via the propagation utility.
Important issue:
This procedure was supplied by BEA. It removes your custom / created roles from the internal LDAP and the portal DB, leaving your cluster in the new-installation state with just the default users and roles: weblogic and portaladmin. The implication is that you'll have to boot your cluster with console user: weblogic / weblogic. You can then add back your secure console user/password, but you'll have to do this over and over as propagations fail. The observed failure rate is once every 2-3 weeks if you do propagations daily.
Note:
The following assumes that you left the default console user weblogic with the default password: weblogic. The procedure deletes the local LDAP but leaves the rows in the DB.users table, including the SHA-1 hashed passwords. It should still work if you changed the password for weblogic, but it probably won't work if you try to substitute your secure console user/pw, because there will be no delegated authorization roles mapping to your custom console user. You might experiment with this scenario.
Fix steps:
1.     Shut down the whole cluster. Using BEA's shutdown script takes too long, or doesn't work if the Admin server is down or hung.
a.     Find the PID to kill: lsof -C | grep <listen port number>
b.     kill <pid>
c.     Run kill again and if the pid is still running then do: kill -9 <pid>
2.     Admin server:
a.     Remove directory: <domain>/servers/AdminServer/data/ldap
b.     Run this SQL script after you edit it for your schema:
delete from yourschema.P13N_DELEGATED_HIERARCHY;
delete from yourschema.P13N_ENTITLEMENT_POLICY;
delete from yourschema.P13N_ENTITLEMENT_RESOURCE;
delete from yourschema.P13N_ENTITLEMENT_ROLE;
delete from yourschema.P13N_ENTITLEMENT_APPLICATION;
commit;
c.     Start the Admin server
d.     Make sure it comes up in the running state. Use the console: servers - admin - control - resume if needed.
3.     Both Managed servers:
a.     cd <domain>/servers/<managed_name>
b.     rm -rf data
This forces the re-download of the LDAP directory.
c.     Start each managed server.
d.     Make sure the managed instance comes up in the running state and not admin. Go to server-<instance>-control-Resume to set the run state to running.
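The "kill, check, then kill -9" sequence in step 1 can be sketched as one function. This is a sketch only: the function name stop_pid and the 30 second default wait are assumptions for illustration, and the PID still has to be found first as described in step 1a.

```shell
# Sketch only: ask a server process to stop, and force it only if it ignores the request.
# stop_pid PID [WAIT_SECONDS]   (name and default wait are hypothetical)
stop_pid() {
    pid="$1"
    wait_s="${2:-30}"
    [ -n "$pid" ] || { echo "usage: stop_pid PID [WAIT_SECONDS]" >&2; return 1; }
    # Plain kill sends SIGTERM. If it fails, the process is already gone
    # (or is not ours to signal), so there is nothing more to do.
    kill "$pid" 2>/dev/null || return 0
    i=0
    while [ "$i" -lt "$wait_s" ]; do
        # kill -0 sends no signal; it only tests whether the PID still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        i=$((i + 1))
    done
    # Still there after the wait: fall back to SIGKILL, as in step 1c.
    kill -9 "$pid" 2>/dev/null
    return 0
}
```

For example, stop_pid 12345 60 would wait up to a minute for PID 12345 before resorting to kill -9.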
2.4     The enterprise portal DB fails and needs to be restored OR switch DB instances
Restoring a portal DB or switching existing DBs are similar scenarios. The issue you'll face with WLP 9.2 is that it now uses a JDBCauthenticator to authenticate and authorize the console / boot user, so you need to be able to connect to the DB before the admin and managed instances can boot. If you haven't properly encrypted the DB user's password in the <domain>/config/jdbc/*.xml files, you won't be able to boot the admin server, since it can't create a JDBC connection to the DB. The boot messages are not clear as to what the failure is.
You'll need to know an Admin role user and password that's in the DB you want to connect to. Put it into boot.properties on the admin server and, on the managed instances, into their boot.properties or startManaged scripts. Don't forget that the managed instances have local credentials, which is new for 9.2. They are in the startManaged script in clear text or in a local boot.properties.
Note:
The passwords in the DB are SHA-1 hashed and there is no SHA-1 hash generator tool, so you can't change a password via SQL, but you can move the password from one DB to another. This is possible because the domain encryption salt is not used to generate the SHA-1 string. As it turns out, the SHA-1 string is compatible with all 9.2 cluster domains, i.e. the domain DES3 salt has nothing to do with password verification. The SHA-1 string for the same password will differ between domains, or even between users in the same domain; this is just randomization built into the algorithm, and every domain can still validate the given password against the DB's SHA-1 string. Because of this, I've not had any problem moving DB instances between clusters, especially if I've given up on security and use weblogic/weblogic as the console user and this user / pw is in every DB.
Steps:
The assumption for restoring a DB is that the DB has been restored but it's an older version and doesn't have the console user/pw that is in boot.properties. At this point swapping a DB is the same as restoring an old version of the portal DB.
1.     Edit the console user and password as clear text into: <domain>/servers/AdminServer/security/boot.properties
This is where you may give in and use weblogic/weblogic.
2.     Set the correct DB access password encryption in the jdbc/*.xml files.
a.     cd <domain>/bin
b.     . ./setDomainEnv.sh or, if you're on Windows, just run: setDomainEnv.cmd
c.     java weblogic.security.Encrypt <the_db_password>
d.     Edit the returned string into every <domain>/config/jdbc/*.xml
e.     Make sure the *.xml files point to the correct DB host, port, schema, DB name, DB user.
3.     Start the Admin server. It should come up. If it doesn't, the cause has to be a failure to create a connection to the DB, which depends on the *.xml files having the correct user and DES-3 encrypted password.
4.     Edit the new console user/password on the managed instance bin/startManaged script or the local boot.properties.
5.     Start the managed instances.
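Steps 1 and 4 both come down to writing a two-line boot.properties with the console user and password in clear text. A minimal sketch, assuming the standard username= / password= keys; the function name write_boot_properties and its arguments are hypothetical. WebLogic normally replaces the clear-text values with encrypted ones after a successful boot, but the file is created owner-only here in case the boot fails.

```shell
# Sketch only: write a clear-text boot.properties for one server.
# write_boot_properties DOMAIN_DIR SERVER_NAME USER PASSWORD   (hypothetical names)
write_boot_properties() {
    [ "$#" -eq 4 ] || { echo "usage: write_boot_properties DOMAIN_DIR SERVER_NAME USER PASSWORD" >&2; return 1; }
    sec_dir="$1/servers/$2/security"
    props="$sec_dir/boot.properties"
    mkdir -p "$sec_dir" || return 1
    # Create the file empty and owner-only before any credential is written.
    : > "$props" || return 1
    chmod 600 "$props" || return 1
    printf 'username=%s\npassword=%s\n' "$3" "$4" > "$props"
}
```

For example: write_boot_properties /path/to/your/domain AdminServer weblogic weblogic, then the same call with the managed server name on each managed host.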
2.5     Install patches on a host that does not have internet access
The short description is to run Smart Update on a host that does have internet access. Fetch the downloaded patch jar and xml out of the <domain>/utils/bsu/cache_dir directory. Manually apply the patch to your non-internet accessible hosts.

ODSM installation failing on Solaris Sparc

Hi Guys,
we are trying to install ODSM on a Solaris server (Solaris Sparc 11). However the installer is throwing the following error while creating domain -
[2013-01-10T15:50:52.888-06:00] [as] [TRACE] [] [oracle.as.provisioning] [tid: 12] [ecid: 0000JkaSLdW7a695Nf4Eye1GvnD8000003,0] [SRC_CLASS: oracle.as.idm.install.config.event.IdMProvisionEventListener] [SRC_METHOD: onConfigurationStatus] onConfigurationStatus: 92185386-b8be-44a0-9a5f-0d0bc9657eb4
[2013-01-10T15:50:52.888-06:00] [as] [TRACE] [] [oracle.as.provisioning] [tid: 12] [ecid: 0000JkaSLdW7a695Nf4Eye1GvnD8000003,0] [SRC_CLASS: oracle.as.idm.install.config.event.IdMProvisionEventListener] [SRC_METHOD: onConfigurationStatus] [OOB IDM CONFIG EVENT] onConfigurationStatus -> Description: Starting Domain.
[2013-01-10T15:50:52.888-06:00] [as] [TRACE] [] [oracle.as.provisioning] [tid: 12] [ecid: 0000JkaSLdW7a695Nf4Eye1GvnD8000003,0] [SRC_CLASS: oracle.as.idm.install.config.event.IdMProvisionEventListener] [SRC_METHOD: onConfigurationStatus] [OOB IDM CONFIG EVENT] onConfigurationStatus -> State: START
[2013-01-10T15:50:52.889-06:00] [as] [TRACE] [] [oracle.as.provisioning] [tid: 12] [ecid: 0000JkaSLdW7a695Nf4Eye1GvnD8000003,0] [SRC_CLASS: oracle.as.idm.install.config.event.IdMProvisionEventListener] [SRC_METHOD: onConfigurationStatus] [OOB IDM CONFIG EVENT] onConfigurationStatus -> Component Name : StartDomain
[2013-01-10T15:50:52.889-06:00] [as] [TRACE] [] [oracle.as.provisioning] [tid: 12] [ecid: 0000JkaSLdW7a695Nf4Eye1GvnD8000003,0] [SRC_CLASS: oracle.as.idm.install.config.event.IdMProvisionEventListener] [SRC_METHOD: onConfigurationStatus] [OOB IDM CONFIG EVENT] onConfigurationStatus -> Component Type : WLSDomain
[2013-01-10T15:50:52.889-06:00] [as] [TRACE] [] [oracle.as.provisioning] [tid: 12] [ecid: 0000JkaSLdW7a695Nf4Eye1GvnD8000003,0] [SRC_CLASS: oracle.as.idm.install.config.event.IdMProvisionEventListener] [SRC_METHOD: onConfigurationStatus] ________________________________________________________________________________
[2013-01-10T15:50:52.889-06:00] [as] [TRACE] [] [oracle.as.provisioning] [tid: 12] [ecid: 0000JkaSLdW7a695Nf4Eye1GvnD8000003,0] [SRC_CLASS: oracle.as.idm.install.config.event.IdMProvisionEventListener] [SRC_METHOD: onConfigurationStatus] [OOB IDM CONFIG EVENT] onConfigurationStatus ->92185386-b8be-44a0-9a5f-0d0bc9657eb4 StatusMsg:Starting Domain.
[2013-01-10T15:50:52.890-06:00] [as] [NOTIFICATION] [] [oracle.as.provisioning] [tid: 12] [ecid: 0000JkaSLdW7a695Nf4Eye1GvnD8000003,0] reportStartConfigAction: EXIT........
[2013-01-10T17:04:38.884-06:00] [as] [ERROR] [] [oracle.as.provisioning] [tid: 12] [ecid: 0000JkaSLdW7a695Nf4Eye1GvnD8000003,0]
[2013-01-10T17:04:38.886-06:00] [as] [ERROR] [] [oracle.as.provisioning] [tid: 12] [ecid: 0000JkaSLdW7a695Nf4Eye1GvnD8000003,0] [[
oracle.as.provisioning.util.ConfigException:
Error while starting the domain.
Cause:
Starting the Admin_Server timed out.
Action:
See logs for more details.
at oracle.as.provisioning.util.ConfigException.createConfigException(ConfigException.java:123)
at oracle.as.provisioning.weblogic.ASDomain.startDomain(ASDomain.java:3150)
at oracle.as.provisioning.weblogic.ASDomain.startDomain(ASDomain.java:3040)
at oracle.as.provisioning.engine.WorkFlowExecutor._startAdminServer(WorkFlowExecutor.java:1645)
at oracle.as.provisioning.engine.WorkFlowExecutor._createDomain(WorkFlowExecutor.java:635)
at oracle.as.provisioning.engine.WorkFlowExecutor.executeWLSWorkFlow(WorkFlowExecutor.java:391)
at oracle.as.provisioning.engine.Config.executeConfigWorkflow_WLS(Config.java:866)
at oracle.as.idm.install.config.BootstrapConfigManager.doExecute(BootstrapConfigManager.java:690)
at oracle.as.install.engine.modules.configuration.client.ConfigAction.execute(ConfigAction.java:371)
at oracle.as.install.engine.modules.configuration.action.TaskPerformer.run(TaskPerformer.java:88)
at oracle.as.install.engine.modules.configuration.action.TaskPerformer.startConfigAction(TaskPerformer.java:105)
at oracle.as.install.engine.modules.configuration.action.ActionRequest.perform(ActionRequest.java:15)
at oracle.as.install.engine.modules.configuration.action.RequestQueue.perform(RequestQueue.java:64)
at oracle.as.install.engine.modules.configuration.standard.StandardConfigActionManager.start(StandardConfigActionManager.java:160)
at oracle.as.install.engine.modules.configuration.boot.ConfigurationExtension.kickstart(ConfigurationExtension.java:81)
at oracle.as.install.engine.modules.configuration.ConfigurationModule.run(ConfigurationModule.java:86)
at java.lang.Thread.run(Thread.java:662)
[2013-01-10T17:04:38.888-06:00] [as] [TRACE] [] [oracle.as.provisioning] [tid: 12] [ecid: 0000JkaSLdW7a695Nf4Eye1GvnD8000003,0] [SRC_CLASS: oracle.as.idm.install.config.event.IdMProvisionEventListener] [SRC_METHOD: onConfigurationError] [OOB IDM CONFIG EVENT] onConfigurationError -> configGUID 92185386-b8be-44a0-9a5f-0d0bc9657eb4
[2013-01-10T17:04:38.889-06:00] [as] [TRACE] [] [oracle.as.provisioning] [tid: 12] [ecid: 0000JkaSLdW7a695Nf4Eye1GvnD8000003,0] [SRC_CLASS: oracle.as.idm.install.config.event.IdMProvisionEventListener] [SRC_METHOD: onConfigurationError] [OOB IDM CONFIG EVENT] ErrorID: 35091
[2013-01-10T17:04:38.889-06:00] [as] [TRACE] [] [oracle.as.provisioning] [tid: 12] [ecid: 0000JkaSLdW7a695Nf4Eye1GvnD8000003,0] [SRC_CLASS: oracle.as.idm.install.config.event.IdMProvisionEventListener] [SRC_METHOD: onConfigurationError] [OOB IDM CONFIG EVENT] Description: [[
Error while starting the domain.
Cause:
An error occurred while starting the domain.
Action:
See logs for more details.
[2013-01-10T17:04:38.891-06:00] [as] [TRACE] [] [oracle.as.provisioning] [tid: 12] [ecid: 0000JkaSLdW7a695Nf4Eye1GvnD8000003,0] [SRC_CLASS: oracle.as.idm.install.config.event.IdMProvisionEventListener] [SRC_METHOD: onConfigurationError] ________________________________________________________________________________
[2013-01-10T17:04:38.892-06:00] [as] [TRACE] [] [oracle.as.provisioning] [tid: 12] [ecid: 0000JkaSLdW7a695Nf4Eye1GvnD8000003,0] [SRC_CLASS: oracle.as.idm.install.config.event.IdMProvisionEventListener] [SRC_METHOD: onConfigurationError] [OOB IDM CONFIG EVENT] onConfigurationError -> eventResponse ==oracle.as.provisioning.engine.ConfigEventResponse@50cb14aa
[2013-01-10T17:04:38.892-06:00] [as] [NOTIFICATION] [] [oracle.as.provisioning] [tid: 12] [ecid: 0000JkaSLdW7a695Nf4Eye1GvnD8000003,0] [OOB IDM CONFIG EVENT] onConfigurationError -> Configuration Status: -1
[2013-01-10T17:04:38.892-06:00] [as] [NOTIFICATION] [] [oracle.as.provisioning] [tid: 12] [ecid: 0000JkaSLdW7a695Nf4Eye1GvnD8000003,0] [OOB IDM CONFIG EVENT] onConfigurationError -> Asking User for RETRY or ABORT
[2013-01-10T17:04:38.893-06:00] [as] [NOTIFICATION] [] [oracle.as.provisioning] [tid: 12] [ecid: 0000JkaSLdW7a695Nf4Eye1GvnD8000003,0] [OOB IDM CONFIG EVENT] onConfigurationError -> ActionStep:Create_Domain
[2013-01-10T17:04:38.895-06:00] [as] [TRACE] [] [oracle.as.provisioning] [tid: 12] [ecid: 0000JkaSLdW7a695Nf4Eye1GvnD8000003,0] [SRC_CLASS: oracle.as.idm.install.config.event.IdMProvisionEventListener] [SRC_METHOD: onConfigurationError] [OOB IDM CONFIG EVENT] onConfigurationError -> wait for User Input ....
[2013-01-10T17:21:34.980-06:00] [as] [NOTIFICATION] [] [oracle.as.install.engine.modules.statistics] [tid: 11] [ecid: 0000JkaOCte7a695Nf4Eye1GvnD8000002,0] Writing profile to file:/u01/app/oraInventory/logs/installProfile2013-01-10_03-31-48PM.log
[2013-01-10T17:21:34.981-06:00] [as] [NOTIFICATION] [] [oracle.as.install.engine.modules.statistics] [tid: 11] [ecid: 0000JkaOCte7a695Nf4Eye1GvnD8000002,0] outputFile:/u01/app/oraInventory/logs/installProfile2013-01-10_03-31-48PM.log
[2013-01-10T17:21:34.981-06:00] [as] [NOTIFICATION] [] [oracle.as.install.engine.modules.statistics] [tid: 11] [ecid: 0000JkaOCte7a695Nf4Eye1GvnD8000002,0] in writeProfile method..
[2013-01-10T17:21:34.982-06:00] [as] [NOTIFICATION] [] [oracle.as.install.engine.modules.statistics] [tid: 11] [ecid: 0000JkaOCte7a695Nf4Eye1GvnD8000002,0] Adding Element:INTERVIEW_TIME_ID for writing.
[2013-01-10T17:21:34.983-06:00] [as] [NOTIFICATION] [] [oracle.as.install.engine.modules.statistics] [tid: 11] [ecid: 0000JkaOCte7a695Nf4Eye1GvnD8000002,0] Adding Element:COPY_TIME_ID for writing.
[2013-01-10T17:21:34.983-06:00] [as] [NOTIFICATION] [] [oracle.as.install.engine.modules.statistics] [tid: 11] [ecid: 0000JkaOCte7a695Nf4Eye1GvnD8000002,0] Adding Element:LINK_TIME_ID for writing.
[2013-01-10T17:21:34.983-06:00] [as] [NOTIFICATION] [] [oracle.as.install.engine.modules.statistics] [tid: 11] [ecid: 0000JkaOCte7a695Nf4Eye1GvnD8000002,0] Adding Element:CONFIGURATION_TIME_ID for writing.
We couldn't find any way forward for this error. Can anyone please advise whether they have seen this error in their environment and what the way forward is? Thanks

khaleel2 wrote:
Hi Gurus,
Too frequent INS-32025 errors. Tried everything possible, finally found in oraInstall2012-05-06_07-50-25PM.err file......
---# Begin Stacktrace #---------------------------
ID: oracle.install.driver.oui.OUISetupDriver:13
oracle.cluster.verification.VerificationException: An internal error occurred within cluster verification framework
<Line 206, Column 12>: XML-20211: (Fatal Error) '--' is not allowed in comments.
<Line 206, Column 12>: XML-20211: (Fatal Error) '--' is not allowed in comments.
at oracle.ops.verification.framework.util.VerificationUtil.isPreReqSupported(VerificationUtil.java:4505)
at oracle.ops.verification.framework.util.VerificationUtil.isPreReqSupported(VerificationUtil.java:4443)
at oracle.cluster.verification.ClusterVerification.isPreReqSupported(ClusterVerification.java:6382)
at oracle.install.driver.oui.OUISetupDriver.verifyEnvironment(OUISetupDriver.java:299)
at oracle.install.driver.oui.OUISetupDriver.load(OUISetupDriver.java:422)
Please help soon. I'd appreciate the main points rather than document links.
Errors indicate that cluster (RAC) is involved.
At which step in cluster configuration, does this failure occur?
