Sun Cluster failed when switching; mounting /global/ gives an I/O error

Hi all,
I am having a problem switching between two Sun Cluster nodes.
Environment:
Two nodes with Solaris 8 (Generic_117350-27), 2 Sun D2 arrays, VxVM 3.2 and Sun Cluster 3.0.
Problem description:
scswitch failed, so I ran scshutdown and booted both nodes. One node failed to boot because of a VxVM boot failure.
The other node boots normally but cannot mount the /global directories. Mounting a volume manually works fine.
# mount /global/stripe01
mount: I/O error
mount: cannot mount /dev/vx/dsk/globdg/stripe-vol01
# vxdg import globdg
# vxvol -g globdg startall
# mount /dev/vx/dsk/globdg/mirror-vol03 /mnt
# echo $?
0
port:root:/global/.devices/node@1/dev/vx/dsk 169# mount /global/stripe01
mount: I/O error
mount: cannot mount /dev/vx/dsk/globdg/stripe-vol01
Need help urgently
Jeff

I would check your patch levels. I seem to remember there was a linker patch that caused an issue with mounting /global/.devices/node@X.
Tim
---
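
To act on the patch-level suggestion, one option is to compare the installed revision of a given patch on both nodes. The following is only a minimal sketch, not something from the thread: the function name patch_rev is made up for illustration, it assumes a POSIX shell and a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk), and it assumes showrev -p prints lines that start with "Patch: <id>-<rev>". The thread does not name the patch, so the patch ID is left as a parameter.

```shell
# Hypothetical helper: print the highest installed revision of patch ID $1,
# reading `showrev -p` style output from standard input.
patch_rev() {
    awk -v id="$1" '
        $1 == "Patch:" {
            split($2, p, "-")      # "123456-11" -> p[1]="123456", p[2]="11"
            if (p[1] == id && p[2] + 0 > max) max = p[2] + 0
        }
        END { if (max) print max; else print "not installed" }'
}
```

Running something like "showrev -p | patch_rev <patch-id>" on each node and comparing the two results should show whether the nodes are at different revisions.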

Similar Messages

  • Sun Cluster 3.1u4 - cannot mount global mount points

    I have the following in my /etc/vfstab:
    /dev/md/controlm/dsk/d20 /dev/md/controlm/rdsk/d20 /global/ctmprod ufs 2 no global,logging
    /dev/md/controlm/dsk/d30 /dev/md/controlm/rdsk/d30 /global/ctmprod/oracle ufs 2 no global,logging
    when I try to mount them I get the following message:
    mount: /dev/md/controlm/dsk/d20 or /global/controlm, no such file or directory
    mount: /dev/md/controlm/dsk/d30 or /global/controlm, no such file or directory
    If I comment out the lines in /etc/vfstab, they all work fine. They just will not mount with the lines in /etc/vfstab.
    Any help would be great.
    Thank you

    Thank you for your response. I went back and double-checked everything, and it turned out that I had created the mount points on one server but not the other.
    Again, thank you.
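
Since the cause here was a mount point missing on one node, a quick check like the one below can be run on every node. It is only a sketch under some assumptions: the function name missing_global_mountpoints is invented for illustration, it assumes a POSIX shell (ksh or bash) and a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk), and it relies on the standard seven-field vfstab layout where field 3 is the mount point and field 7 holds the mount options.

```shell
# Hypothetical helper: list mount points of global file systems in a
# vfstab-format file that do not exist as directories on this node.
# Usage: missing_global_mountpoints [vfstab-file]   (default: /etc/vfstab)
missing_global_mountpoints() {
    vfstab="${1:-/etc/vfstab}"
    # Skip comment lines; keep entries whose options include "global".
    awk '$1 !~ /^#/ && NF >= 7 && $7 ~ /(^|,)global(,|$)/ { print $3 }' "$vfstab" |
    while read -r mp; do
        [ -d "$mp" ] || echo "missing: $mp"
    done
}
```

Calling missing_global_mountpoints with no argument on each cluster node prints one "missing:" line per absent directory, and prints nothing when every global mount point is present.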

  • QFS Meta data resource on sun cluster failed

    Hi,
    I'm trying to configure QFS in a cluster environment, and I get errors when configuring the metadata resource. I tried different types of QFS file systems, and none of them worked.
    [root @ n1u331]
    ~ # scrgadm -a -j mds -g qfs-mds-rg -t SUNW.qfs:5 -x QFSFileSystem=/sharedqfs
    n1u332 - shqfs: Invalid priority (0) for server n1u332FS shqfs: validate_node() failed.
    (C189917) VALIDATE on resource mds, resource group qfs-mds-rg, exited with non-zero exit status.
    (C720144) Validation of resource mds in resource group qfs-mds-rg on node n1u332 failed.
    [root @ n1u331]
    ~ # scrgadm -a -j mds -g qfs-mds-rg -t SUNW.qfs:5 -x QFSFileSystem=/global/haqfs
    n1u332 - Mount point /global/haqfs does not have the 'shared' option set.
    (C189917) VALIDATE on resource mds, resource group qfs-mds-rg, exited with non-zero exit status.
    (C720144) Validation of resource mds in resource group qfs-mds-rg on node n1u332 failed.
    [root @ n1u331]
    ~ # scrgadm -a -j mds -g qfs-mds-rg -t SUNW.qfs:5 -x QFSFileSystem=/global/hasharedqfs
    n1u332 - has: No /dsk/ string (nodev) in device.Inappropriate path in FS has device component: nodev.FS has: validate_qfsdevs() failed.
    (C189917) VALIDATE on resource mds, resource group qfs-mds-rg, exited with non-zero exit status.
    (C720144) Validation of resource mds in resource group qfs-mds-rg on node n1u332 failed.
    Any QFS experts here?

    Hi,
    Yes, we have 5.2. Here is the wiki link: http://wikis.sun.com/display/SAMQFSDocs52/Home
    I added the file system through the web console, and it is mounted and working fine.
    After creating the file system I tried to put it under Sun Cluster's management, but that requires a metadata resource, and creating the metadata resource gives the errors mentioned above.
    I need to use the QFS file system in a non-RAC environment, just mounting and using the file system. I could mount it on two machines in both shared mode and highly available mode. In both cases, writes on the second node are 3 times slower than on the node that hosts the metadata server, while the read speed is the same. Could you please let me know whether you see the same in your environment? If so, what do you think is the reason? Both sides write to the storage directly, so why is it so slow on one node?
    Regards,

  • Sun Cluster failed to switchover

    Hi,
    I have configured a two-node Sun Cluster, and it was working fine until now.
    Since yesterday, I have been unable to fail over the cluster to the second node.
    Instead, the resources are stopped and started again on the first node.
    When I run the command "scswitch -z -g oracle_failover_rg -h MFIN-SOL02" on the first node, I get these messages on the console:
    Sep 28 17:53:16 MFIN-SOL01 ip: [ID 678092 kern.notice] TCP_IOC_ABORT_CONN: local = 010.010.007.120:0, remote = 000.000.000.000:0, start = -2, end = 6
    Sep 28 17:53:16 MFIN-SOL01 ip: [ID 302654 kern.notice] TCP_IOC_ABORT_CONN: aborted 0 connection
    Please suggest how to solve this problem.

    Those messages aren't important here. I think that might be related to the fault monitor being stopped.
    As I said in the previous post, you need to diagnose this bit by bit. Try the procedure manually, i.e. stop Oracle on node 1, manually switch-over the disks and storage to node 2, mount the file system, bring up the logical address, start the database.
    I expect there is something wrong with your configuration, e.g. incorrect listener configuration.
    There is also a way of increasing the debug level for the Oracle agent. This is documented in the manuals IIRC.
    Regards,
    Tim
    ---

  • Brconnect fails when switching from SAPR3 user to SAPSR3

    I have a problem when refreshing our development environments. Our development instance was recently upgraded from 9.2.0.5 to 10g. This upgrade was performed by our team in France. During the upgrade they switched the data from SAPR3 to a new owner called SAPSR3. The problem occurs when we try to use the oradbusr.sql script to create our OPS$ users for brconnect. Here are the failures and the things I have already tried. I used the tool as intended: @<path>\oradbusr.sql SAPSR3 NT <domain><SID>. This works and creates two users. The problem is when I issue the following command:
    brconnect -u / -c -f dbstart (error message I receive):
    C:\Documents and Settings\c11adm>brconnect -u \ -c -f dbstart
    BR0801I BRCONNECT 7.00 (13)
    BR0263I Enter password for database user '\' (maximum 30 characters):
    BR0280I BRCONNECT time stamp: 2007-10-18 15.48.58
    BR0301E SQL error -1017 at location db_connect-2
    ORA-01017: invalid username/password; logon denied
    BR0310E Connect to database instance C11 failed
    BR0280I BRCONNECT time stamp: 2007-10-18 15.48.58
    BR0804I BRCONNECT terminated with errors
    So I tried the following:
    brconnect -f chpass
    Here are the results:
    C:\Documents and Settings\c11adm>brconnect -f chpass
    BR0801I BRCONNECT 7.00 (13)
    BR0280I BRCONNECT time stamp: 2007-10-18 15.56.53
    BR0263I Enter password for database user 'SAPSR3' (maximum 30 characters):
    BR0280I BRCONNECT time stamp: 2007-10-18 15.56.59
    BR0263I Reenter password for database user 'SAPSR3' (maximum 30 characters):
    BR0280I BRCONNECT time stamp: 2007-10-18 15.57.02
    BR0829I Password changed successfully in database for user SAPSR3
    BR0830I Password inserted successfully into table OPS$C11ADM.SAPUSER for user SAPSR3
    BR0280I BRCONNECT time stamp: 2007-10-18 15.57.02
    BR0802I BRCONNECT completed successfully
    It seems that it is looking for the password of the "\" user. I thought this was the sys or system user in Oracle, so I changed the passwords:
    alter user system identified by <password>. I did the same for sys. When I issue the brconnect -u \ -c -f dbstart command I still get the same problem. Any ideas which user it is looking for, or how to correct this problem?

    R3trans -x output
    4 ETW000 R3trans version 6.05 (release 46D - 27.03.05 - 14:30:00).
    4 ETW000 ===============================================
    4 ETW000
    4 ETW000 control file: <no ctrlfile>
    4 ETW000 R3trans was called as follows: R3trans -x
    4 ETW000 date&time   : 18.10.2007 - 19:15:24
    4 ETW000  trace at level 2 opened for a given file pointer
    4 ETW000  [developertra,00000]  Thu Oct 18 19:15:26 2007                           35378  0.035378
    4 ETW000  [developertra,00000]  db_con_init called                                    41  0.035419
    4 ETW000  [developertra,00000]  create_con (con_name=R/3)                             27  0.035446
    4 ETW000  [developertra,00000]  Loading DB library 'dboraslib.dll' ...                40  0.035486
    4 ETW000  [developertra,00000]  load shared library (dboraslib.dll), hdl 0         79159  0.114645
    4 ETW000  [developertra,00000]  Library 'dboraslib.dll' loaded                        16  0.114661
    4 ETW000  [developertra,00000]  function DbSlExpFuns loaded from library dboraslib.dll
    4 ETW000                                                                              16  0.114677
    4 ETW000  [developertra,00000]  Version of library 'dboraslib.dll' is "46D.00", patchlevel (0.2238)
    4 ETW000                                                                           80510  0.195187
    4 ETW000  [developertra,00000]  function dsql_db_init loaded from library dboraslib.dll
    4 ETW000                                                                              16  0.195203
    4 ETW000  [developertra,00000]  function dbdd_exp_funs loaded from library dboraslib.dll
    4 ETW000                                                                            1040  0.196243
    4 ETW000  [developertra,00000]  New connection 0 created                              16  0.196259
    4 ETW000  [developertra,00000]  db_con_connect (con_name=R/3)                         15  0.196274
    4 ETW000  [developertra,00000]  find_con found the following connection for reuse:
    4 ETW000                                                                              13  0.196287
    4 ETW000  [developertra,00000]  -->oci_initialize                                  40022  0.236309
    4 ETW000  [developertra,00000]  Got TNS_ADMIN=
    SBRA12\SAPMNT\C11\sys\profile\oracle from environment
    4 ETW000                                                                           26753  0.263062
    4 ETW000  [developertra,00000]  Got ORACLE_SID=C11 from environment                   12  0.263074
    4 ETW000  [developertra,00000]  Got NLS_LANG=AMERICAN_AMERICA.WE8DEC from environment
    4 ETW000                                                                              16  0.263090
    4 ETW000  [developertra,00000]  Logon as OPS$-user to get SAPSR3's password           10  0.263100
    4 ETW000  [developertra,00000]  Connecting as /@C11 on connection 0 ...                9  0.263109
    4 ETW000  [developertra,00000]  -->oci_logon(con_hdl=0, user='', dbname='C11')      8272  0.271381
    4 ETW000  [developertra,00000]  Thu Oct 18 19:15:27 2007                          211214  0.482595
    4 ETW000  [dboci.c     ,00000]  *** ERROR => OCI-call 'olog' failed: rc = 1017        25  0.482620
    4 ETW000  [dboci.c     ,00000]  *** ERROR => CONNECT failed with sql error '1017'
    4 ETW000                                                                              23  0.482643
    4 ETW000  [developertra,00000]  Try to connect with default password                   9  0.482652
    4 ETW000  [developertra,00000]  Connecting as SAPSR3/<pwd>@C11 on connection 0 ...
    4 ETW000                                                                              15  0.482667
    4 ETW000  [developertra,00000]  -->oci_logon(con_hdl=0, user='SAPSR3', dbname='C11')
    4 ETW000                                                                              13  0.482680
    4 ETW000  [developertra,00000]  Now I'm connected to ORACLE using OCI_7 API        24270  0.506950
    4 ETW000  [developertra,00000]  Database instance c11 is running on SBRA12 with ORACLE version 10.2.0.2.0 since 20071017
    4 ETW000                                                                           18782  0.525732
    4 ETW000  [developertra,00000]  Connection 0 opened                                41566  0.567298
    4 ETW000 Connected to database.
    4 ETW000  [developertra,00000]  Disconnecting from ALL connections:                23200  0.590498
    4 ETW000  [developertra,00000]  Disconnecting from connection 0 ...                   15  0.590513
    4 ETW000  [developertra,00000]  -->oci_logoff(con_hdl=0)                              90  0.590603
    4 ETW000  [developertra,00000]  Now I'm disconnected from ORACLE                    1001  0.591604
    4 ETW000  [developertra,00000]  Disconnected from connection 0                        23  0.591627
    4 ETW000  [developertra,00000]  statistics db_con_commit (com_total=0, com_forced=0, com_tx=0)
    4 ETW000                                                                              15  0.591642
    4 ETW000  [developertra,00000]  statistics db_con_rollback (roll_total=0, roll_forced=0, roll_tx=0)
    4 ETW000                                                                              13  0.591655
    4 ETW000 Disconnected from database.
    4 ETW000 End of Transport (0000).
    4 ETW000 date&time: 18.10.2007 - 19:15:27

  • Chrome failing when switching users

    This also happens when powering down. Chrome windows shrink and I cannot open them.
    Cheers
    SteveW

    Hi there.
    I had a similar problem with the Dock. Try my solution mentioned here (use Terminal to 'kill' the Dock).
    https://discussions.apple.com/message/26090676#26090676
    Killing the Dock might be enough to bring everything else back into line (your background picture looks similar to how mine used to look -- cut off).
    Enjoy!

  • LDOM SUN Cluster Interconnect failure

    I am building a test Sun Cluster on Solaris 10 in LDOM 1.3.
    In my environment I have a T5120. I set up two guest domains, installed the Sun Cluster software, and ran scinstall, which failed.
    Node 2 comes up, but node 1 throws the following messages:
    Boot device: /virtual-devices@100/channel-devices@200/disk@0:a File and args:
    SunOS Release 5.10 Version Generic_139555-08 64-bit
    Copyright 1983-2009 Sun Microsystems, Inc. All rights reserved.
    Use is subject to license terms.
    Hostname: test1
    Configuring devices.
    Loading smf(5) service descriptions: 37/37
    /usr/cluster/bin/scdidadm: Could not load DID instance list.
    /usr/cluster/bin/scdidadm: Cannot open /etc/cluster/ccr/did_instances.
    Booting as part of a cluster
    NOTICE: CMM: Node test2 (nodeid = 1) with votecount = 1 added.
    NOTICE: CMM: Node test1 (nodeid = 2) with votecount = 0 added.
    NOTICE: clcomm: Adapter vnet2 constructed
    NOTICE: clcomm: Adapter vnet1 constructed
    NOTICE: CMM: Node test1: attempting to join cluster.
    NOTICE: CMM: Cluster doesn't have operational quorum yet; waiting for quorum.
    NOTICE: clcomm: Path test1:vnet1 - test2:vnet1 errors during initiation
    NOTICE: clcomm: Path test1:vnet2 - test2:vnet2 errors during initiation
    WARNING: Path test1:vnet1 - test2:vnet1 initiation encountered errors, errno = 62. Remote node may be down or unreachable through this path.
    WARNING: Path test1:vnet2 - test2:vnet2 initiation encountered errors, errno = 62. Remote node may be down or unreachable through this path.
    clcomm: Path test1:vnet2 - test2:vnet2 errors during initiation
    I created the virtual switches and vnets on the primary domain like this:
    ldm add-vsw mode=sc cluster-vsw0 primary
    ldm add-vsw mode=sc cluster-vsw1 primary
    ldm add-vnet vnet2 cluster-vsw0 test1
    ldm add-vnet vnet3 cluster-vsw1 test1
    ldm add-vnet vnet2 cluster-vsw0 test2
    ldm add-vnet vnet3 cluster-vsw1 test2
    Primary domain:
    bash-3.00# dladm show-dev
    vsw0 link: up speed: 1000 Mbps duplex: full
    vsw1 link: up speed: 0 Mbps duplex: unknown
    vsw2 link: up speed: 0 Mbps duplex: unknown
    e1000g0 link: up speed: 1000 Mbps duplex: full
    e1000g1 link: down speed: 0 Mbps duplex: half
    e1000g2 link: down speed: 0 Mbps duplex: half
    e1000g3 link: up speed: 1000 Mbps duplex: full
    bash-3.00# dladm show-link
    vsw0 type: non-vlan mtu: 1500 device: vsw0
    vsw1 type: non-vlan mtu: 1500 device: vsw1
    vsw2 type: non-vlan mtu: 1500 device: vsw2
    e1000g0 type: non-vlan mtu: 1500 device: e1000g0
    e1000g1 type: non-vlan mtu: 1500 device: e1000g1
    e1000g2 type: non-vlan mtu: 1500 device: e1000g2
    e1000g3 type: non-vlan mtu: 1500 device: e1000g3
    bash-3.00#
    Node 1:
    -bash-3.00# dladm show-link
    vnet0 type: non-vlan mtu: 1500 device: vnet0
    vnet1 type: non-vlan mtu: 1500 device: vnet1
    vnet2 type: non-vlan mtu: 1500 device: vnet2
    -bash-3.00# dladm show-dev
    vnet0 link: unknown speed: 0 Mbps duplex: unknown
    vnet1 link: unknown speed: 0 Mbps duplex: unknown
    vnet2 link: unknown speed: 0 Mbps duplex: unknown
    -bash-3.00#
    Node 2:
    -bash-3.00# dladm show-link
    vnet0 type: non-vlan mtu: 1500 device: vnet0
    vnet1 type: non-vlan mtu: 1500 device: vnet1
    vnet2 type: non-vlan mtu: 1500 device: vnet2
    -bash-3.00#
    -bash-3.00#
    -bash-3.00# dladm show-dev
    vnet0 link: unknown speed: 0 Mbps duplex: unknown
    vnet1 link: unknown speed: 0 Mbps duplex: unknown
    vnet2 link: unknown speed: 0 Mbps duplex: unknown
    -bash-3.00#
    And this is the configuration I gave while running scinstall:
    >>> Cluster Transport Adapters and Cables <<<
    You must identify the two cluster transport adapters which attach
    this node to the private cluster interconnect.
    For node "test1",
    What is the name of the first cluster transport adapter [vnet1]?
    Will this be a dedicated cluster transport adapter (yes/no) [yes]?
    All transport adapters support the "dlpi" transport type. Ethernet
    and Infiniband adapters are supported only with the "dlpi" transport;
    however, other adapter types may support other types of transport.
    For node "test1",
    Is "vnet1" an Ethernet adapter (yes/no) [yes]?
    Is "vnet1" an Infiniband adapter (yes/no) [yes]? no
    For node "test1",
    What is the name of the second cluster transport adapter [vnet3]? vnet2
    Will this be a dedicated cluster transport adapter (yes/no) [yes]?
    For node "test1",
    Name of the switch to which "vnet2" is connected [switch2]?
    For node "test1",
    Use the default port name for the "vnet2" connection (yes/no) [yes]?
    For node "test2",
    What is the name of the first cluster transport adapter [vnet1]?
    Will this be a dedicated cluster transport adapter (yes/no) [yes]?
    For node "test2",
    Name of the switch to which "vnet1" is connected [switch1]?
    For node "test2",
    Use the default port name for the "vnet1" connection (yes/no) [yes]?
    For node "test2",
    What is the name of the second cluster transport adapter [vnet2]?
    Will this be a dedicated cluster transport adapter (yes/no) [yes]?
    For node "test2",
    Name of the switch to which "vnet2" is connected [switch2]?
    For node "test2",
    Use the default port name for the "vnet2" connection (yes/no) [yes]?
    I have set up the configuration like this.
    ldm list -l nodename
    Node 1:
    NETWORK
    NAME SERVICE ID DEVICE MAC MODE PVID VID MTU LINKPROP
    vnet1 primary-vsw0@primary 0 network@0 00:14:4f:f9:61:63 1 1500
    vnet2 cluster-vsw0@primary 1 network@1 00:14:4f:f8:87:27 1 1500
    vnet3 cluster-vsw1@primary 2 network@2 00:14:4f:f8:f0:db 1 1500
    ldm list -l nodename
    Node 2:
    NETWORK
    NAME SERVICE ID DEVICE MAC MODE PVID VID MTU LINKPROP
    vnet1 primary-vsw0@primary 0 network@0 00:14:4f:f9:a1:68 1 1500
    vnet2 cluster-vsw0@primary 1 network@1 00:14:4f:f9:3e:3d 1 1500
    vnet3 cluster-vsw1@primary 2 network@2 00:14:4f:fb:03:83 1 1500
    ldm list-services
    VSW
    NAME LDOM MAC NET-DEV ID DEVICE LINKPROP DEFAULT-VLAN-ID PVID VID MTU MODE INTER-VNET-LINK
    primary-vsw0 primary 00:14:4f:f9:25:5e e1000g0 0 switch@0 1 1 1500 on
    cluster-vsw0 primary 00:14:4f:fb:db:cb 1 switch@1 1 1 1500 sc on
    cluster-vsw1 primary 00:14:4f:fa:c1:58 2 switch@2 1 1 1500 sc on
    ldm list-bindings primary
    VSW
    NAME MAC NET-DEV ID DEVICE LINKPROP DEFAULT-VLAN-ID PVID VID MTU MODE INTER-VNET-LINK
    primary-vsw0 00:14:4f:f9:25:5e e1000g0 0 switch@0 1 1 1500 on
    PEER MAC PVID VID MTU LINKPROP INTERVNETLINK
    vnet1@gitserver 00:14:4f:f8:c0:5f 1 1500
    vnet1@racc2 00:14:4f:f8:2e:37 1 1500
    vnet1@test1 00:14:4f:f9:61:63 1 1500
    vnet1@test2 00:14:4f:f9:a1:68 1 1500
    NAME MAC NET-DEV ID DEVICE LINKPROP DEFAULT-VLAN-ID PVID VID MTU MODE INTER-VNET-LINK
    cluster-vsw0 00:14:4f:fb:db:cb 1 switch@1 1 1 1500 sc on
    PEER MAC PVID VID MTU LINKPROP INTERVNETLINK
    vnet2@test1 00:14:4f:f8:87:27 1 1500
    vnet2@test2 00:14:4f:f9:3e:3d 1 1500
    NAME MAC NET-DEV ID DEVICE LINKPROP DEFAULT-VLAN-ID PVID VID MTU MODE INTER-VNET-LINK
    cluster-vsw1 00:14:4f:fa:c1:58 2 switch@2 1 1 1500 sc on
    PEER MAC PVID VID MTU LINKPROP INTERVNETLINK
    vnet3@test1 00:14:4f:f8:f0:db 1 1500
    vnet3@test2 00:14:4f:fb:03:83 1 1500
    Any ideas, team? I believe the cluster interconnect adapters were not set up successfully.
    I need any guidance or clue on how to correct the private interconnect for clustering between two guest LDOMs.

    You don't have to stick to the default IPs or subnet. You can change to whatever IPs and subnet mask you need, and you can even change the private names.
    You can do all this during install or even after install.
    Read the cluster install doc at docs.sun.com.

  • Cluster Transport Adapter Error - Sun Cluster

    I am installing Sun Cluster 3.0 and it gives me an error saying:
    failed to add cluster transport adapter - unknown adapter of transport type, trtype=dlpi...
    My network card is a SysKonnect; the interface is skge0.
    What is wrong? Thanks

    Hi,
    I have a similar problem.
    I get the same error with Sun Cluster 3.0; the card is a Phobos quad-port.
    Did you find a solution, or did you have to shell out a few hundred bucks for Sun cards?

  • Fail when config "Internet Directory Configuration Assistant"

    I installed Oracle9iAS (Infrastructure) on AIX 4.3.3.
    It failed while configuring the "Internet Directory Configuration Assistant".
    error message:
    No output aviaible for this tool

    The solution was:
    1- I removed the Network cable
    2- I formatted the server
    3- I Installed the Application Server
    4- I Connected the server with network and I gave a static IP to the server
    Best of Luck
    Edited by: user13011851 on Jan 11, 2011 2:40 AM

  • My ipad fails when trying to update

    Why does my iPad fail when I try to update?

    What is the error?
    Are you doing the update over Wi-Fi, or connected to iTunes on a computer? (Computer recommended.)

  • 2 node Sun Cluster 3.2, resource groups not failing over.

    Hello,
    I am currently running two V490s connected to a Sun StorageTek 6540 array. After I attempted to install the latest OS patches, the cluster seems nearly destroyed. I backed out the patches, and right now only one node can process the resource groups properly. The other node appears to take over the Veritas disk groups but will not mount them automatically. I have been working on this for over a month and have learned a lot and fixed a lot of other issues that came up, but the cluster is just not working properly. Here is some output.
    bash-3.00# clresourcegroup switch -n coins01 DataWatch-rg
    clresourcegroup: (C776397) Request failed because node coins01 is not a potential primary for resource group DataWatch-rg. Ensure that when a zone is intended, it is explicitly specified by using the node:zonename format.
    bash-3.00# clresourcegroup switch -z zcoins01 -n coins01 DataWatch-rg
    clresourcegroup: (C298182) Cannot use node coins01:zcoins01 because it is not currently in the cluster membership.
    clresourcegroup: (C916474) Request failed because none of the specified nodes are usable.
    bash-3.00# clresource status
    === Cluster Resources ===
    Resource Name        Node Name           State      Status Message
    ftp-rs               coins01:zftp01      Offline    Offline
                         coins02:zftp01      Offline    Offline - LogicalHostname offline.
    xprcoins             coins01:zcoins01    Offline    Offline
                         coins02:zcoins01    Offline    Offline - LogicalHostname offline.
    xprcoins-rs          coins01:zcoins01    Offline    Offline
                         coins02:zcoins01    Offline    Offline - LogicalHostname offline.
    DataWatch-hasp-rs    coins01:zcoins01    Offline    Offline
                         coins02:zcoins01    Offline    Offline
    BDSarchive-res       coins01:zcoins01    Offline    Offline
                         coins02:zcoins01    Offline    Offline
    I am really at a loss here. Any help appreciated.
    Thanks

    My advice is to open a service call, provided you have a service contract with Oracle. There is much more information required to understand that specific configuration and to analyse the various log files. This is beyond what can be done in this forum.
    From your description I can guess that you want to failover a resource group between non-global zones. And it looks like the zone coins01:zcoins01 is reported to not be in cluster membership.
    Obviously node coins01 needs to be a cluster member. If it is reported as online and has joined the cluster, then you need to verify if the zone zcoins01 is really properly up and running.
    Specifically you need to verify that it reached the multi-user milestone and all cluster related SMF services are running correctly (ie. verify "svcs -x" in the non-global zone).
    You mention Veritas diskgroups. Note that VxVM diskgroups are handled at the global cluster level (i.e. in the global zone). The VxVM diskgroup is not imported for a non-global zone. However, with SUNW.HAStoragePlus you can ensure that file systems on top of VxVM diskgroups can be mounted into a non-global zone. But again, more information would be required to see how you configured things and why they don't work as you expect.
    Regards
    Thorsten
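
The zone check described above could be scripted roughly as follows. This is a sketch only, not part of the original reply: the function name zone_state is invented for illustration, it assumes a POSIX shell and a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk), and it assumes the colon-separated output of zoneadm list -cp, where the second field is the zone name and the third is its state.

```shell
# Hypothetical helper: print the state of zone $1, reading `zoneadm list -cp`
# output (zoneid:zonename:state:zonepath...) from standard input.
zone_state() {
    awk -F: -v z="$1" '
        $2 == z { print $3; found = 1 }
        END     { if (!found) print "unknown" }'
}
```

On coins01, "zoneadm list -cp | zone_state zcoins01" should report "running" before the resource group can be hosted there. If it does, "zlogin zcoins01 svcs -x" runs the SMF check mentioned above inside the zone.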

  • Sun Cluster 3.2 - Global File Systems

    Sun Cluster has a Global Filesystem (GFS) that supports read-only access throughout the cluster. However, only one node has write access.
    In Linux, a GFS filesystem can be mounted by multiple nodes for simultaneous READ/WRITE access. Shouldn't this be the same for Solaris as well?
    From the documentation that I have read,
    "The global file system works on the same principle as the global device feature. That is, only one node at a time is the primary and actually communicates with the underlying file system. All other nodes use normal file semantics but actually communicate with the primary node over the same cluster transport. The primary node for the file system is always the same as the primary node for the device on which it is built"
    The GFS is also known as Cluster File System or Proxy File system.
    Our client believes that they can have their application "scaled" so that all nodes in the cluster can write to the globally mounted file system. My belief was that the only way this can occur is when the application has failed over, and then the "write" would occur from the "primary" node that is mastering the application at that time. Any input or clarification will be greatly appreciated. Thanks in advance.
    Ryan

    Thank you very much, this helped :)
    And how seamless is remounting of the block device LUN if one server dies?
    Should some clustered services (FS clients such as app servers) be restarted
    in case when the master node changes due to failover? Or is it truly seamless
    as in a bit of latency added for duration of mounting the block device on another
    node, with no fatal interruptions sent to the clients?
    And, is it true that this solution is gratis, i.e. may legally be used for free
    unless the customer wants support from Sun (authorized partners)? ;)
    //Jim
    Edited by: JimKlimov on Aug 19, 2009 4:16 PM

  • Cluster errors when switching messaging resources

    Hi
    I have two nodes configured in Sun Cluster, running Sun Messaging Server.
    1) cluster version
    root@bglbbpp1 # scinstall -pvv
    Sun Cluster 3.1u1 for Solaris 9 sparc
    2) bglbbpp2 / # scstat -g
    -- Resource Groups and Resources --
    Group Name Resources
    Resources: mstorebgl-rg mstorebgl-lh-res mstorebgl-hastp-res mstorebgl-msg-res
    -- Resource Groups --
    Group Name Node Name State
    Group: mstorebgl-rg bglbbpp1 Online
    Group: mstorebgl-rg bglbbpp2 Offline
    -- Resources --
    Resource Name Node Name State Status Message
    Resource: mstorebgl-lh-res bglbbpp1 Online Online - LogicalHostname online.
    Resource: mstorebgl-lh-res bglbbpp2 Offline Offline - LogicalHostname offline.
    Resource: mstorebgl-hastp-res bglbbpp1 Online Online
    Resource: mstorebgl-hastp-res bglbbpp2 Offline Offline
    Resource: mstorebgl-msg-res bglbbpp1 Online Online - Start succeeded.
    Resource: mstorebgl-msg-res bglbbpp2 Offline Offline - Stop Succeeded
    bglbbpp2/ #
    3) When I run scswitch to switch the messaging resource from pp1 to pp2, I get the errors below and it fails back to pp1.
    # scswitch -z -h bglbbpp2-a-fixed.dataone.in -g mstorebgl-rg
    scswitch: Resource group mstorebgl-rg failed to start on chosen node and may fail over to other node(s)
    /var/adm/messages on pp2
    bglbbpp2 Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <hafoip_prenet_start> for resource <mstorebgl-lh-res>, resource group <mstorebgl-rg>, timeout <300> seconds
    bglbbpp2 Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <hafoip_prenet_start> completed successfully for resource <mstorebgl-lh-res>, resource group <mstorebgl-rg>, time used: 0% of timeout <300 seconds>
    bglbbpp2 Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <hastorageplus_prenet_start> for resource <mstorebgl-hastp-res>, resource group <mstorebgl-rg>, timeout <1800> seconds
    bglbbpp2 SC[SUNW.HAStoragePlus:2,mstorebgl-rg,mstorebgl-hastp-res,hastorageplus_prenet_start_private]: [ID 582276 daemon.warning] This node has a lower preference to node 1 for global service datadg associated with path /messaging. Device switchover can still be done to this node.
    bglbbpp2 SC[SUNW.HAStoragePlus:2,mstorebgl-rg,mstorebgl-hastp-res,hastorageplus_prenet_start_private]: [ID 582276 daemon.warning] This node has a lower preference to node 1 for global service datadg associated with path /MMP. Device switchover can still be done to this node.
    bglbbpp2 SC[SUNW.HAStoragePlus:2,mstorebgl-rg,mstorebgl-hastp-res,hastorageplus_prenet_start_private]: [ID 582276 daemon.warning] This node has a lower preference to node 1 for global service datadg associated with path /backup. Device switchover can still be done to this node.
    bglbbpp2 Cluster.Framework: [ID 801593 daemon.notice] stdout: becoming primary for datadg
    bglbbpp2 Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <hastorageplus_prenet_start> completed successfully for resource <mstorebgl-hastp-res>, resource group <mstorebgl-rg>, time used: 0% of timeout <1800 seconds>
    bglbbpp2 Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <hafoip_start> for resource <mstorebgl-lh-res>, resource group <mstorebgl-rg>, timeout <500> seconds

    bglbbpp2 Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <hafoip_start> completed successfully for resource <mstorebgl-lh-res>, resource group <mstorebgl-rg>, time used: 0% of timeout <500 seconds>
    bglbbpp2 Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <hastorageplus_start> for resource <mstorebgl-hastp-res>, resource group <mstorebgl-rg>, timeout <90> seconds
    bglbbpp2 Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <hafoip_monitor_start> for resource <mstorebgl-lh-res>, resource group <mstorebgl-rg>, timeout <300> seconds
    bglbbpp2 Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <hastorageplus_start> completed successfully for resource <mstorebgl-hastp-res>, resource group <mstorebgl-rg>, time used: 0% of timeout <90 seconds>
    bglbbpp2 Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <hastorageplus_monitor_start> for resource <mstorebgl-hastp-res>, resource group <mstorebgl-rg>, timeout <90> seconds
    bglbbpp2 Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <ims_svc_start> for resource <mstorebgl-msg-res>, resource group <mstorebgl-rg>, timeout <300> seconds
    bglbbpp2 Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <hastorageplus_monitor_start> completed successfully for resource <mstorebgl-hastp-res>, resource group <mstorebgl-rg>, time used: 0% of timeout <90 seconds>
    bglbbpp2 Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <hafoip_monitor_start> completed successfully for resource <mstorebgl-lh-res>, resource group <mstorebgl-rg>, time used: 0% of timeout <300 seconds>
    bglbbpp2 SC[SUNW.ims,mstorebgl-rg,mstorebgl-msg-res,ims_svc_start]: [ID 855581 daemon.error] Failed to get the configuration info
    bglbbpp2 Unable to initialize the MTA; compiled configuration version mismatch; recompile the configuration with imsimta cnbuild
    bglbbpp2 SC[SUNW.ims,mstorebgl-rg,mstorebgl-msg-res,ims_svc_start]: [ID 450358 daemon.error] Error retrieving services from iMS configuration
    bglbbpp2 Cluster.RGM.rgmd: [ID 938318 daemon.error] Method <ims_svc_start> failed on resource <mstorebgl-msg-res> in resource group <mstorebgl-rg> [exit code <1>, time used: 0% of timeout <300 seconds>]
    bglbbpp2 Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <hafoip_monitor_stop> for resource <mstorebgl-lh-res>, resource group <mstorebgl-rg>, timeout <300> seconds
    bglbbpp2 Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <hastorageplus_monitor_stop> for resource <mstorebgl-hastp-res>, resource group <mstorebgl-rg>, timeout <90> seconds
    bglbbpp2 Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <ims_svc_stop> for resource <mstorebgl-msg-res>, resource group <mstorebgl-rg>, timeout <300> seconds
    bglbbpp2 Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <hastorageplus_monitor_stop> completed successfully for resource <mstorebgl-hastp-res>, resource group <mstorebgl-rg>, time used: 0% of timeout <90 seconds>
    bglbbpp2 Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <hafoip_monitor_stop> completed successfully for resource <mstorebgl-lh-res>, resource group <mstorebgl-rg>, time used: 0% of timeout <300 seconds>
    bglbbpp2 Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <ims_svc_stop> completed successfully for resource <mstorebgl-msg-res>, resource group <mstorebgl-rg>, time used: 0% of timeout <300 seconds>
    bglbbpp2 Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <hastorageplus_stop> for resource <mstorebgl-hastp-res>, resource group <mstorebgl-rg>, timeout <1800> seconds
    bglbbpp2 Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <hastorageplus_stop> completed successfully for resource <mstorebgl-hastp-res>, resource group <mstorebgl-rg>, time used: 0% of timeout <1800 seconds>
    bglbbpp2 Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <hafoip_stop> for resource <mstorebgl-lh-res>, resource group <mstorebgl-rg>, timeout <300> seconds
    bglbbpp2 ip: [ID 683231 kern.notice] TCP_IOC_ABORT_CONN: local = 010.016.018.026:0, remote = 000.000.000.000:0, start = -2, end = 6
    bglbbpp2 ip: [ID 440816 kern.notice] TCP_IOC_ABORT_CONN: aborted 0 connection
    bglbbpp2 Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <hafoip_stop> completed successfully for resource <mstorebgl-lh-res>, resource group <mstorebgl-rg>, time used: 0% of timeout <300 seconds>
    bglbbpp2 Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <hastorageplus_postnet_stop> for resource <mstorebgl-hastp-res>, resource group <mstorebgl-rg>, timeout <1800> seconds
    bglbbpp2 Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <hastorageplus_postnet_stop> completed successfully for resource <mstorebgl-hastp-res>, resource group <mstorebgl-rg>, time used: 0% of timeout <1800 seconds>
    bglbbpp2 Cluster.Framework: [ID 801593 daemon.notice] stdout: no longer primary for datadg
    Please advise, and shed some light on where things are going wrong.
    Thanks
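In a log this long, the failing step is easier to find once the informational lines are filtered out. The following is only a minimal sketch, assuming the messages have been saved to a file; the function name `show_rgm_errors` and the file path passed to it are hypothetical, not part of Sun Cluster.

```shell
# Hypothetical helper: show only the error-level syslog lines and name the
# RGM method that was reported as failed. "$1" is an assumed path to a file
# holding the saved cluster messages.
show_rgm_errors() {
    log="$1"
    # Lines tagged daemon.error carry the actual failure.
    grep 'daemon\.error\]' "$log"
    # Pull the method and resource out of
    # "Method <name> failed on resource <res>".
    sed -n 's/.*Method <\([^>]*\)> failed on resource <\([^>]*\)>.*/failed method: \1 (resource: \2)/p' "$log"
}

# Example call (file name is an assumption):
#   show_rgm_errors /var/tmp/messages.txt
```

Applied to the log above, this would single out the `ims_svc_start` failure on `mstorebgl-msg-res`. The untagged line that follows the first error in the original log ("compiled configuration version mismatch; recompile the configuration with imsimta cnbuild") is not matched by the filter, but it names the likely remedy.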

  • Apache with PHP Fails to Validate in Sun Cluster

    Greetings,
    I have Sun Cluster 3.2u2 running with two nodes and have Apache 2.2.11 running successfully in failover mode on shared storage. However, I just installed PHP 5.2.10 and added the line "LoadModule php5_module modules/libphp5.so" to httpd.conf. I am now getting "Command {/global/data/local/apache/bin/apachectl configtest >/dev/null 2>&1} failed: httpd cannot parse httpd.conf, Failed to validate configuration." when I try to start the resource. I can start Apache just fine outside of the cluster, and when I run configtest manually, it replies "Syntax OK".
    Anyone have any ideas why the Cluster software doesn't like the PHP module even though configtest passes with Syntax OK?
    Many thanks,
    Tim

    Found it. Sun Cluster was apparently smart enough to know I was missing the correct PHP AddType lines in httpd.conf.
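For readers hitting the same message: PHP 5 setups on Apache 2 commonly pair the `LoadModule` line with a mapping such as `AddType application/x-httpd-php .php`. The sketch below is a small pre-flight check under that assumption; the function name `check_php_addtype` is hypothetical, and the exact mapping Tim added is not stated in the thread.

```shell
# Hypothetical pre-flight check: warn when httpd.conf loads the PHP 5 module
# but has no uncommented AddType mapping for PHP files.
check_php_addtype() {
    conf="$1"
    if grep -q '^[[:space:]]*LoadModule[[:space:]]\{1,\}php5_module' "$conf" &&
       ! grep -q '^[[:space:]]*AddType[[:space:]].*x-httpd-php' "$conf"; then
        echo "php5_module is loaded but no AddType line for PHP was found"
        return 1
    fi
    return 0
}

# Example call (the conf/ location under the Apache prefix is an assumption):
#   check_php_addtype /global/data/local/apache/conf/httpd.conf
```

Running the validation command from the error message without the `>/dev/null 2>&1` redirection should also show whatever the cluster agent actually sees.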

  • Sun Cluster 3.2 upgrade fail

    Dear mate,
    When I upgrade the cluster from 3.1 to 3.2 using Live Upgrade, the following error message appears. Any hints or ideas will be appreciated.
    The PBE (sol8) is Solaris 8 with Sun Cluster 3.1. The ABE (sol10) is Solaris 10 and will be upgraded to Sun Cluster 3.2.
    # cd /Solaris_sparc/Product/sun_cluster/Solaris_10/Tools
    # ./scinstall -u update -R /sol10
    scinstall: "SUNWesu" is not installed in "/sol10".
    scinstall: scinstall did NOT complete successfully!
    # luupgrade -p -n sol10 -s /mnt/Solaris_10/Product SUNWesu
    Validating the contents of the media </mnt/Solaris_10/Product>.
    Mounting the BE <sol10>.
    ERROR: The boot environment <sol10> supports non-global zones.The current boot environment does not support non-global zones. Releases prior to Solaris 10 cannot be used to maintain Solaris 10 and later releases that include support for non-global zones. You may only execute the specified operation on a system with Solaris 10 (or later) installed.
    cat: cannot open /tmp/.liveupgrade.6951.16469/.lmz.list
    Thanks and Regards,
    Donald

    Hi Tim,
    Thanks for the information.
    I got the following result.
    # pkginfo -R /sol10 SUNWesu
    ERROR: information for "SUNWesu" was not found
    # pkginfo SUNWesu
    system SUNWesu Extended System Utilities
    The PBE has SUNWesu while the ABE is missing this package. Does that mean SUNWesu was not carried over from the PBE to the ABE? If so, what is the alternative way to install it?
    Thanks and regards,
    Donald
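The two `pkginfo` checks above can be wrapped into one helper when several packages need comparing between boot environments. This is only a sketch: the function name `compare_pkg` is hypothetical, and it relies on the `-R` form shown above plus the `-q` (quiet) option of `pkginfo`, which reports the result through the exit status.

```shell
# Hypothetical helper: report whether a package is registered in the running
# boot environment and under an alternate root (for example a mounted ABE).
compare_pkg() {
    pkg="$1"
    altroot="$2"
    if pkginfo -q "$pkg"; then
        echo "current BE: $pkg installed"
    else
        echo "current BE: $pkg missing"
    fi
    if pkginfo -q -R "$altroot" "$pkg"; then
        echo "$altroot: $pkg installed"
    else
        echo "$altroot: $pkg missing"
    fi
}

# Example call, using the names from the post:
#   compare_pkg SUNWesu /sol10
```

With the state Donald describes, this would report SUNWesu as installed in the current BE and missing under `/sol10`.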
