Sun Cluster.. Why?

What are the advantages of installing RAC 10.2.0.3 on Sun Cluster? Are there any benefits?

From Oracle 10g onward, there is no burning requirement for Sun Cluster (or any third-party cluster) as long as you are using all Oracle technologies for your Oracle RAC database. You can run Oracle RAC with ASM for shared storage, and that does not require any third-party cluster.
Bear in mind that you may need to install Sun Cluster in the following scenarios:
1) There is an application running within the cluster, alongside the Oracle RAC database, that you want to configure for HA, and Sun Cluster provides ready-made (easy-to-use) cluster resources to manage and monitor the application. This can be achieved with Oracle Clusterware as well, but you will have to write your own cluster resource for it (see the sketch after this list).
2) You want to use a cluster file system such as shared QFS; in that case you will need to install Sun Cluster. If the cluster is only running the Oracle RAC database, you can rely on Oracle technologies such as ASM or raw devices without installing Sun Cluster.
3) Any certification conflicts.
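As an illustration of point 1, here is a rough sketch (SC 3.2-style commands; the paths, resource names, and the choice of the SUNW.gds Generic Data Service are only an example, not a recommendation) of how Sun Cluster can wrap an arbitrary application in a ready-made resource:
# register the generic data service type and build a failover group around the app
clresourcetype register SUNW.gds
clresourcegroup create app-rg
clresource create -g app-rg -t SUNW.gds \
  -p Start_command="/opt/myapp/bin/start.sh" \
  -p Stop_command="/opt/myapp/bin/stop.sh" \
  -p Probe_command="/opt/myapp/bin/probe.sh" \
  -p Network_aware=false \
  myapp-rs
clresourcegroup online -M app-rg
With Oracle Clusterware alone you would have to write the equivalent start/stop/probe action scripts and register them yourself.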
Any corrections are welcome.
-Harish Kumar Kalra

Similar Messages

  • Didadm: unable to determine hostname.  error on Sun cluster 4.0 - Solaris11

    Trying to install Sun Cluster 4.0 on Sun Solaris 11 (x86-64).
    iSCSI shared quorum disks are available in /dev/rdsk/. I ran:
    devfsadm
    cldevice populate
    But I don't see the DID devices getting populated in /dev/did.
    Also, when scdidadm -L is issued, I get the following error. Has anyone seen the same error?
    - didadm: unable to determine hostname.
    I found that in Cluster 3.2 there was Bug 6380956: didadm should exit with an error message if it cannot determine the hostname.
    The Sun Cluster command didadm (didadm -l in particular) requires the hostname to function correctly. It uses the standard C library function gethostname() to obtain it.
    Early in the cluster boot, prior to the service svc:/system/identity:node coming online, gethostname() returns an empty string. This breaks didadm.
    Can anyone point me in the right direction to get past this issue with the shared quorum disk DID devices?

    Let's step back a bit. First, what hardware are you installing on? Is it a supported platform or is it some guest VM? (That might contribute to the problems).
    Next, after you installed Solaris 11, did the system boot cleanly and all the services come up? (svcs -x). If it did boot cleanly, what did 'uname -n' return? Do commands like 'getent hosts <your_hostname>' work? If there are problems here, Solaris Cluster won't be able to get round them.
    If the Solaris install was clean, what were the results of the above host name commands after OSC was installed? Do the hostnames still resolve? If not, you need to look at why that is happening first.
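    Roughly, those first checks boil down to something like this (run on each node; the exact output will obviously vary):
    svcs -x                          # any services in maintenance?
    uname -n                         # does the node report its own hostname?
    getent hosts `uname -n`          # does that hostname resolve?
    svcs svc:/system/identity:node   # is the identity service online?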
    Regards,
    Tim
    ---

  • Invalid node name in Sun Cluster 3.1 installation

    Dear all,
    I need your advice in Sun Cluster 3.1 8/05 installation.
    My colleague was installing Sun Cluster 3.1 8/05 on 2 Sun Netra 440 servers that were given the hostnames 01-in-01 and 01-in-02. But when he wanted to configure the cluster, a problem occurred.
    The error message is:
    running scinstall: invalid node name
    And when we changed the hostnames to in-01 and in-02, the cluster could be configured fine.
    Why did this problem happen?
    Is it related to the hostname beginning with a numeric character? If yes, can you point me to the documentation that states this?
    Or maybe you have another explanation?
    Thank you for your help.
    regards,
    Henry

    A bug is being logged against this (though obviously you could manually fix the shell script yourself if you were in a hurry).
    The problem partly stems from the restriction on hostnames having been relaxed: RFC 1123 relaxed RFC 952's limitation of the first character to alphabetic characters only. See man hosts for more info. I guess our code didn't catch up :-)
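    Just to illustrate the old rule (this is not the actual scinstall code, only a sketch of the kind of check involved):
    case "`uname -n`" in
      [A-Za-z]*) echo "hostname starts with a letter (old RFC 952 style)" ;;
      *)         echo "hostname starts with a digit; older scripts may reject it" ;;
    esac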
    Tim
    ---

  • Failed to create resource - Error in Sun cluster 3.2

    Hi All,
    I have a 2-node cluster in place. When I try to create a resource, I get the following error.
    Can anybody tell me why? I have Sun Cluster 3.2 on Solaris 10.
    I have created zpool called testpool.
    clrs create -g test-rg -t SUNW.HAStoragePlus -p Zpools=testpool hasp-testpool-res
    clrs: sun011:test011z - : no error
    clrs: (C189917) VALIDATE on resource hasp-testpool-res, resource group test-rg, exited with non-zero exit status.
    clrs: (C720144) Validation of resource hasp-testpool-res in resource group test-rg on node sun011:test011z failed.
    clrs: (C891200) Failed to create resource "hasp-testpool-res".
    Regards
    Kumar

    Thorsten,
    The testpool was created on one of the cluster nodes and is accessible from both nodes in the cluster. But it can be imported on only one node at a time, and while imported there it cannot be accessed from the other node. If the other node needs access, we have to export testpool and import it on that node.
    The storage LUNs allocated to testpool are accessible from all nodes in the cluster, and we are able to import and export testpool from all nodes in the cluster.
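    For what it's worth, a quick manual check along these lines (pool name as above; run the steps on each node in turn) usually shows whether HAStoragePlus will be able to take over the pool; the pool must not be left imported anywhere when the resource is created:
    zpool export testpool     # on the node that currently has the pool imported
    zpool import testpool     # on the other node; should succeed without -f
    zpool status testpool
    zpool export testpool     # release it again so the resource can manage the import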
    Regards
    Kumar

  • RAW disks for Oracle 10R2 RAC NO SUN CLUSTER

    Yes, you read it correctly: no Sun Cluster. Then why am I on the forum, right? Well, we have one Sun Cluster and another cluster that is RAC-only, for testing. Between Oracle and Sun, neither accepts any fault for problems with their perfectly honed products. Currently, I have multipathed fibre HBAs to a StorEdge 3510, and I've tried to get Oracle to use a raw LUN for the OCR and voting disks. It doesn't see the disk. I've made sure they are stamped for oracle:dba, and tried oracle:oinstall. When presenting /dev/rdsk/c7t<long number>d0s6 for the OCR, I get a "can not find disk path." Does Oracle raw mean SVM raw? Should I create metadevices?
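    For reference, the usual preparation for raw OCR/voting devices on Solaris looks roughly like this (a sketch only; the long target name is a placeholder and the exact ownership/permissions should be checked against the 10gR2 installation guide):
    chown root:oinstall /dev/rdsk/c7t<long number>d0s6     # OCR device
    chmod 640 /dev/rdsk/c7t<long number>d0s6
    chown oracle:oinstall /dev/rdsk/c7t<long number>d0s7   # voting disk device
    chmod 660 /dev/rdsk/c7t<long number>d0s7
    ls -lL /dev/rdsk/c7t<long number>d0s*                  # -L checks the underlying device node, not the symlink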

    "Between Oracle and Sun, neither accept any fault for problems with their perfectly honed products"...more specific:
    Not that the word "fault" is a characterization of any liability; it is a technical characterization of acting like a responsible stakeholder when you sell your product to a corporation. I've been working on the same project for a year, as an engineer. Notwithstanding a huge expanse of management issues over the project, whenever technical gray areas have been reached and our team has tried to get information to solve the issue, the area has become a big bouncing hot potato. Specifically, when Oracle has a problem reading a storage device, according to Oracle that is a Sun issue. According to Sun, they didn't certify the software on that piece of equipment, so go talk to Oracle. In the Sun Cluster arena, if starting the database causes a node eviction from the cluster, good luck getting any specific team to say "that's our problem." Sun will say that Oracle writes crappy cluster-verify scripts, and Oracle will say that Sun has not properly certified the device for use with their product. Man, I've seen it. The first time I said "OK, how do we avoid this in the future?"; the second time I said "how did I let this happen again?"; and after more issues, money spent, hours lost, and customers pissed off -- do the math. I've even gone as far as saying "find me a plug-and-play production model for this specific environment," but good luck getting the two companies to sign the specs for it; neither wants to stamp their name on the product due to the liability. Yes, you're right, I should beat up the account team, but as an engineer, man, that's not my area, and I have other problems that I was hired to deal with. I could go on. What is really a slap in the face is that no one wants to work on these projects, if given the choice of doing a Windows deployment instead, because there they can pop out mind-bending numbers of builds while we plod along figuring out why Clusterware doesn't like slice 6 of a /devices/scsi_vhci/ path. Try finding good documentation on that. ~You can deploy faster, but you can't pay more!

  • Veritas required for Oracle RAC on Sun Cluster v3?

    Hi,
    We are planning a 2 node Oracle 9i RAC cluster on Sun Cluster 3.
    Can you please explain these 2 questions?
    1)
    If we have a hardware disk array RAID controller with LUNs etc, then why do we need to have Veritas Volume Manager (VxVM) if all the LUNS are configured at a hardware level?
    2)
    Do we need to have VxFS? All our Oracle database files will be on raw partitions.
    Thanks,
    Steve

    > We are planning a 2 node Oracle 9i RAC cluster on Sun Cluster 3.
    Good. This is a popular configuration.
    > Can you please explain these 2 questions?
    > 1) If we have a hardware disk array RAID controller with LUNs etc, then why do we need to have Veritas Volume Manager (VxVM) if all the LUNs are configured at a hardware level?
    VxVM is not required to run RAC. VxVM has an option (separately licensable) which is specifically designed for OPS/RAC. But if you have a highly reliable, multi-pathed, hardware RAID platform, you are not required to have VxVM.
    > 2) Do we need to have VxFS? All our Oracle database files will be on raw partitions.
    No.
    IMHO, simplify is a good philosophy. Adding more software
    and layers into a highly available design will tend to reduce
    the availability. So, if you are going for maximum availability,
    you will want to avoid over-complicating the design. KISS.
    In the case of RAC, or Oracle in general, many people do use
    raw and Oracle has the ability to manage data in raw devices
    pretty well. Oracle 10g further improves along these lines.
    A tenet in the design of highly available systems is to keep
    the data management as close to the application as possible.
    Oracle, and especially 10g, are following this tenet. The only
    danger here is that they could try to get too clever, and end up
    following policies which are suboptimal as the underlying
    technologies change. But even in this case, the policy is
    coming from the application rather than the supporting platform.
    -- richard

  • What are typical failover times for application X on Sun Cluster

    Our company does not yet have any hands-on experience with clustering anything on Solaris, although we do with Veritas and Microsoft. My experience with MS is that it is as close to seamless (instantaneous) as possible. Veritas clustering takes a little bit longer to activate the standbys. A new application we are bringing in house soon runs on Sun Cluster (it is some BEA Tuxedo/WebLogic/Oracle monster). They claim the time it takes to flip from the active node to the standby node is ~30 minutes. This seems a bit insane to us, since they are calling this "HA". Is this type of failover time typical in Sun land? Thanks for any numbers or references.

    This is a hard question to answer because it depends on the cluster agent/application.
    On one hand, you may have a simple Sun Cluster application that fails over in seconds because it has to do a limited amount of work (umount here, mount there, plumb a network interface, etc.) to actually fail over.
    On the other hand these operations may, depending on the application, take longer than another application due to the very nature of that application.
    An Apache web server failover may take 10-15 seconds but an Oracle failover may take longer. There are many variables that control what happens from the time that a node failure is detected to the time that an application appears on another cluster node.
    If the failover time is 30 minutes I would ask your vendor why that is exactly.
    Not in a confrontational way, but as an "I don't get how this is high availability" question, since the assumption is that up to 30 minutes could elapse from the time your application goes down to it coming back on another node.
    A better solution might be a different application vendor (I know, I know) or a scalable application that can run on more than one cluster node at a time.
    The logic with the scalable approach is that if a failover takes 30 minutes or so to complete, failover becomes an expensive operation, so I would rather have my application use multiple nodes at once than eat a 30-minute failover if one node dies in a two-node cluster:
    serverA > 30 minute failover > serverB
    seems to be less desirable than
    serverA, serverB, serverC, etc concurrently providing access to the application so that failover only happens when we get down to a handful of nodes
    Either one is probably more desirable than having an application outage(?)

  • Sun cluster and oracle RAC

    My client is insisting on putting Oracle RAC on top of Sun Cluster. The thing is, I don't know what the benefit of creating such an architecture is. I tried to search on the internet but couldn't understand the benefits or the advantages. If someone has implemented such an architecture, kindly let us know the use and the benefits.

    Hi,
    Here in my environment, there is Sun Cluster only for single instances. Oracle Clusterware on its own provides high availability and load balancing for your database; you don't need Sun Cluster in that case, and no advantages are gained. The only thing controlled by Sun Cluster in my RAC environment is QFS (the cluster file system from Sun).
    If you search on this forum, there are many discussions of this topic (Sun Cluster vs. RAC), like this post: Re: Sun Cluster.. Why?
    Regards,
    Rodrigo Mufalani

  • QFS Meta data resource on sun cluster failed

    Hi,
    I'm trying to configure QFS in a cluster environment, and while configuring the metadata server resource I ran into errors. I tried with different types of QFS; none of them worked.
    [root @ n1u331]
    ~ # scrgadm -a -j mds -g qfs-mds-rg -t SUNW.qfs:5 -x QFSFileSystem=/sharedqfs
    n1u332 - shqfs: Invalid priority (0) for server n1u332
    FS shqfs: validate_node() failed.
    (C189917) VALIDATE on resource mds, resource group qfs-mds-rg, exited with non-zero exit status.
    (C720144) Validation of resource mds in resource group qfs-mds-rg on node n1u332 failed.
    [root @ n1u331]
    ~ # scrgadm -a -j mds -g qfs-mds-rg -t SUNW.qfs:5 -x QFSFileSystem=/global/haqfs
    n1u332 - Mount point /global/haqfs does not have the 'shared' option set.
    (C189917) VALIDATE on resource mds, resource group qfs-mds-rg, exited with non-zero exit status.
    (C720144) Validation of resource mds in resource group qfs-mds-rg on node n1u332 failed.
    [root @ n1u331]
    ~ # scrgadm -a -j mds -g qfs-mds-rg -t SUNW.qfs:5 -x QFSFileSystem=/global/hasharedqfs
    n1u332 - has: No /dsk/ string (nodev) in device.
    Inappropriate path in FS has device component: nodev.
    FS has: validate_qfsdevs() failed.
    (C189917) VALIDATE on resource mds, resource group qfs-mds-rg, exited with non-zero exit status.
    (C720144) Validation of resource mds in resource group qfs-mds-rg on node n1u332 failed.
    any QFS expert here?
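    From what I've read, for the second error above ("does not have the 'shared' option set") a shared QFS file system needs the shared mount option in /etc/vfstab on every node, something like the line below (haqfs is just my family set name; the exact column layout is from memory, so please verify against the SAM-QFS docs):
    # device  device-to-fsck  mount-point    FS-type  pass  at-boot  options
    haqfs     -               /global/haqfs  samfs    -     no       shared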

    Hi,
    Yes, we have 5.2. Here is the wiki link: http://wikis.sun.com/display/SAMQFSDocs52/Home
    I added the file system through the web console, and it's mounted and working fine.
    After creating the file system I tried to put it under Sun Cluster's management, but it asked for a metadata resource, and when creating the metadata resource I got the errors mentioned above.
    I need to use the QFS file system in a non-RAC environment, just mounting and using the file system. I could mount it on two machines in shared and highly available mode; in both cases, writes on the second node are 3 times slower than on the node which hosts the metadata server, while read speed is the same. Could you please let me know whether it's the same in your environment? If so, what do you think the reason is? I see both sides writing to the storage directly, so why is it so slow on one node?
    regards,

  • Sun Cluster & 6130/6140 thru switch with cross-connections not supported?

    Hi:
    I noticed that the 6140 does not support cross-connecting the 2 controllers to 2 switches for higher availability when using Sun Cluster:
    http://docs.sun.com/source/819-7497-10/chapter3.html
    Does anyone know why this restriction is there?
    Thanks!

    Since there was no real answer to the question in this forum, I cross posted this issue to the cluster forum.
    See http://forum.java.sun.com/thread.jspa?threadID=5261282&tstart=0 for the full thread.
    Basically, the restriction against cross-connections is no longer valid and the documentation should be updated to remove the note.
    This is all a good thing, because I had my 6140s wired into my Sun Cluster environment via the 'cross-connections' method diagrammed in figure 3-4. :-)

  • Sun Cluster and Interconnect IP ranges

    Can someone explain why Sun Cluster requires such large subnets for its interconnects?
    Yes, they use non-routable IPs, but there are some cases where even these collide with corporate admin networks. On one cluster I had to use the Microsoft automatic IP (link-local, 169.254.0.0/16) network range to avoid IP conflicts with corporate networks.

    You don't have to stick to the default IPs or subnet. You can change to whatever IPs you need, whatever subnet mask you need, and even change the private hostnames.
    You can do all of this during install or even after install.
    Read the cluster install doc at docs.sun.com
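    From memory, the post-install change is done with something like the following while the nodes are booted in noncluster mode (the command and property names are from memory; verify the exact syntax against the cluster(1CL) man page before using it):
    cluster set-netprops -p private_netaddr=172.16.0.0 -p private_netmask=255.255.240.0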

  • Errors after initial Sun Cluster install

    - SunOS conch 5.10 Generic_118833-36 sun4u sparc SUNW,Sun-Fire-V210
    - Sun Cluster 3.2
    I've gone through the scinstall process using the standard answers to the questions. The only exception is that when it came to quorum, I answered that I would set it up later, as I want to try the quorum server. There's no shared storage - I'm seeing if it's possible to create a cluster using IP-based replication.
    I'm getting these error messages every 30 seconds (they look like a result of this legacy service):
    # svcs lrc:/etc/rc3_d/S91initgchb_resd
    STATE STIME FMRI
    legacy_run 16:19:29 lrc:/etc/rc3_d/S91initgchb_resd
    Feb 8 16:38:59 conch Cluster.GCHB_resd: Unable to open door descriptor /var/run/rgmd_receptionist_door
    Feb 8 16:38:59 conch Cluster.GCHB_resd: GCHB system error: scha_cluster_open failed with 18
    Feb 8 16:38:59 conch : Bad file number
    Feb 8 16:39:29 conch Cluster.GCHB_resd: Unable to open door descriptor /var/run/rgmd_receptionist_door
    Feb 8 16:39:29 conch Cluster.GCHB_resd: GCHB system error: scha_cluster_open failed with 18
    Feb 8 16:39:29 conch : Bad file number
    Feb 8 16:39:59 conch Cluster.GCHB_resd: Unable to open door descriptor /var/run/rgmd_receptionist_door
    Feb 8 16:39:59 conch Cluster.GCHB_resd: GCHB system error: scha_cluster_open failed with 18
    Feb 8 16:39:59 conch : Bad file number
    Feb 8 16:40:29 conch Cluster.GCHB_resd: Unable to open door descriptor /var/run/rgmd_receptionist_door
    Feb 8 16:40:29 conch Cluster.GCHB_resd: GCHB system error: scha_cluster_open failed with 18
    Feb 8 16:40:29 conch : Bad file number
    There are no file system errors, and I'm at a complete loss as to why there appears to be this problem. Can anyone offer any advice?
    Cheers,
    Iain

    Hi,
    there are 2 issues here.
    1. The error messages that you see. I get them on my freshly installed cluster as well. What did I do? I used the JES installer and installed SC 3.2 and SC Geo 3.2 - to be configured later. I think it should only install the packages and not configure any part of them, but it seems it does otherwise. To me, gchb sounds like global cluster heartbeat. I'll follow up with the developers to get this clarified.
    2. Replication within a cluster and no shared storage. This has several aspects. I, too, see more and more customer demand for this. If you get it to work, let us know. I am not sure, though, why you installed the SC Geo edition to achieve this, as I do not think it will help you here.
    In any case, I can only recommend setting up the quorum server before proceeding; otherwise your whole cluster will panic as soon as you do a single reboot. That is by design.
    Regards
    Hartmut

  • Does SUN CLUSTER WARE support ASM?

    Does SUN CLUSTER WARE support ASM?
    Where can I find the answer ? Thanks.

    I am not an expert, but here it goes. Sun Cluster is used for clustering machines and processes, but NOT really for clustering disks. That has to be done through third-party software like Veritas. BUT why use Sun Cluster for the machines when you can cluster via Oracle CRS? CRS then uses ASM for the clustered disks. Bingo: you save money on Sun Cluster software and Veritas software.
    One day Oracle will rule the world!

  • Apache with PHP Fails to Validate in Sun Cluster

    Greetings,
    I have Sun Cluster 3.2u2 running with two nodes and have Apache 2.2.11 running successfully in failover mode on shared storage. However, I just installed PHP 5.2.10 and added the line "LoadModule php5_module modules/libphp5.so" to httpd.conf. I am now getting "Command {/global/data/local/apache/bin/apachectl configtest >/dev/null 2>&1} failed: httpd cannot parse httpd.conf, Failed to validate configuration." when I try to start the resource. I can start Apache just fine outside of the cluster, and when I run configtest manually, it replies "Syntax OK".
    Anyone have any ideas why the Cluster software doesn't like the PHP module even though configtest passes with Syntax OK?
    Many thanks,
    Tim

    Found it. Sun Cluster was apparently smart enough to know I was missing the correct PHP AddType lines in httpd.conf.
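    For anyone hitting the same thing, the httpd.conf lines in question are typically along these lines (standard PHP 5 on Apache 2.2 directives, nothing Sun Cluster specific; adjust to your PHP build):
    LoadModule php5_module modules/libphp5.so
    AddType application/x-httpd-php .php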

  • Sun Cluster 3.0 and VxVM 3.2 problems at boot

    I have a little problem with a two-node cluster (2 x 480R + 2 x 3310 with a single RAID controller).
    Each 3310 has 3 (RAID 5) LUNs.
    I've mirrored these 3 LUNs with VxVM, and I've also mirrored the 2 internal (OS) disks.
    One of the disks of the first 3310 is the quorum disk.
    Every time I boot the nodes, I see an error at "block 0" of the quorum disk, and then a tedious synchronization of the mirrors starts (sometimes also of the OS mirror).
    Why does this happen?
    Thanks.
    Regards,
    Mauro.

    We did another test today, and again the resource group went into a STOP_FAILED state. On this occasion, the export of the corresponding ZFS pool timed out. We were able to successfully bring the resource group online on the desired cluster node, and subsequent failovers worked fine. There's something strange happening when the zpool is being exported (error correction, perhaps?). Once the zpool is exported, further imports of it seem to work fine.
    When we first had the problem, we were able to manually export and import the zpools, though they did take quite some time to export/import.
    "zpool list" shows we have a total of 7 zpools.
    "zfs list" shows we have a total of 27 zfs file systems.
    Are there any specific Sun (or other) links about known problems with Sun Cluster and ZFS?
