Novell Cluster Services on VMWARE
I am currently running a 3 node cluster (oes2sp2 Linux) connected to an EMC SAN. This cluster was commissioned, fully patched, etc 250 days ago. It's been running smoothly and hasn't missed a beat :-) The nodes are bare-metal installs.
Now I've been tasked with investigating the possibility of virtualizing the nodes using VMWare (and making use of VMotion).
This will give us the ability to have 6 virtual nodes rather than 3 physical nodes.
What is the general feeling of the community with regards to:
1. Virtualizing cluster nodes
2. How complex is the setup on VMWare
3. Could I do a rolling migration from physical to virtual
Many thanks in advance for any comments, tips, advice, etc
On 11.03.2011 14:06, laurabuckley wrote:
>
> Now I've been tasked with investigating the possibility of virtualizing
> the nodes using VMWare (and making use of VMotion).
>
> This will give us the ability to have 6 virtual nodes rather than 3
> physical nodes.
>
> What is the general feeling of the community with regards to:
>
> 1. Virtualizing cluster nodes
> 2. How complex is the setup on VMWare
> 3. Could I do a rolling migration from physical to virtual
There has been a lot of discussion about this, and there are many things to consider. Here are at least a few:
If you have (or buy) VMware HA and vMotion, do you even need Novell
Cluster Services anymore?
If you want vMotion, you need to use vmdk for storage, which might be an
issue for large volumes... or not?
For large storage you could use RDM LUNs directly from SAN. But then you
cannot use vMotion or run many nodes (of the same cluster) on one
physical server.
We are running multiple three node OES2 NCS clusters on three VMware
servers, using RDM LUNs from EMC SAN. Did a rolling migration from
physical Netware clusters using the same LUNs.
-sk
Similar Messages
-
CFMX7 - Linux - Novell Cluster Services
I have a 2 node cluster set up with Suse Linux and Novell
Cluster Services. There is a cluster resource of apache that is set
up in the cluster, as well as a shared volume as a cluster
resource. THe apache running as the cluster resource is using the
shared volume for the web root home. Furthermore, apache is running
individually on each node. Can anyone offer any opinions on setting
up CFMX7 in this environment. We are not worried about load
balancing, this is strictly needed for failover. My concerns are
which install config I could be using, Individual Server,
MultiServer, or J2EE? Can we use 1 CFADMIN for both nodes to
minimize the overhead of keeping the configs in sync? During the
install, can we use the cluster resource instance of apache to
install to? My concern here is when we install it on the second
node and point to the shared volume for the web root home, it's
going to have a problem since it already exists? Any input is
appreciated. Thanks. -
Novell Cluster Services - help shape the roadmap
Dear Community Members,
We have a NCS survey going for two weeks - take a look and give us your feedback. The survey link is at the bottom of the blog post.
Important Notice
Thanks,
Glen
Glen,
It appears that in the past few days you have not received a response to your
posting. That concerns us, and has triggered this automated reply.
Has your problem been resolved? If not, you might try one of the following options:
- Visit http://support.novell.com and search the knowledgebase and/or check all
the other self support options and support programs available.
- You could also try posting your message again. Make sure it is posted in the
correct newsgroup. (http://forums.novell.com)
Be sure to read the forum FAQ about what to expect in the way of responses:
http://forums.novell.com/faq.php
If this is a reply to a duplicate posting, please ignore and accept our apologies
and rest assured we will issue a stern reprimand to our posting bot.
Good luck!
Your Novell Product Support Forums Team
http://forums.novell.com/ -
I am experiencing this error in one of our cluster environments. Can anyone help me with this issue?
The Cluster Service function call 'ClusterResourceControl' failed with error code '1008(An attempt was made to reference a token that does not exist.)' while verifying the file path. Verify that your failover cluster is configured properly.
Thanks,
Venu S.
Venugopal S ----------------------------------------------------------- Please click the Mark as Answer button if a post solves your problem!
Hi Venu S,
Based on my research, you may be encountering a known issue. Please try the hotfix in this KB:
http://support.microsoft.com/kb/928385
Meanwhile, since there is little information about this issue, please provide the following before we investigate further:
The version of Windows Server you are using
The result of SELECT @@VERSION
The scenario when you get this error
If anything is unclear, please let me know.
Regards,
Tom Li -
Here is the description of the PRD cluster scenario (Windows 2008 + Oracle).
We have 2 nodes.
1. host-erpn01 (has ASCS, Database instance, Enqueue and Dialog
Instance installed)
2. host-erp02 (has Central Instance, Dialog Instance and Enqueue installed)
When we move the "SAP SID" service from one node to another using the Failover Cluster Management tool, it fails, and we have to manually bring the "SAP SID cluster service" and "SAP SID cluster instance" online.
Both the service and the instance came online after manual selection. However, after some time, the MMC console on node 2 showed the SAP instances hosted on node 1 with a red cross and the error "cannot connect to sap service dcom interface error 800706BA".
We replaced sapstartsrv.exe in the CI executable directory with the copy from the working directory of the ASCS instance.
Now disp+work is stopped for the CI instance. Also, in the CI instance executable directory we can see five files named sapstartsrv, i.e.
sapstartsrv.exe.new, sapstartsrv.exe.tmp, sapstartsrv.new, sapstartsrv.pdb and the actual sapstartsrv.exe file.
Here is sapstartsrv.log from the CI work directory on node 2.
trc file: "sapstartsrv.log", trc level: 0, release: "701"
pid 1968
Mon Oct 11 15:55:33 2010
SAP HA Trace: Build in SAP Microsoft Cluster library '701, patch 32, changelist 1046543' initialized
Initializing SAPControl Webservice
SapSSLInit failed => https support disabled
Starting WebService Named Pipe thread
Starting WebService thread
Webservice named pipe thread started, listening on port \\.\pipe\sapcontrol_01
Webservice thread started, listening on port 50113
GCCIA\csrvadmin is starting SAP System at 2010/10/11 16:09:07
SAP HA Trace: FindClusterResource: SAP resource not found [sapwinha.cpp, line 334]
SAP HA Trace: SAP_HA_FindSAPInstance returns: SAP_HA_NOT_CLUSTERED [sapwinha.cpp, line 907]
You can also view other logs from the work directory dump at
http://s000.tinyupload.com/index.php?file_id=45384422007535688902
Now when we try to start the SAPSID_00 service manually, it gives the error "The SAPSID_00 service failed to start due to the following error: The system cannot find the path specified."
Please advise.
Regards
Hi Sunil,
On node 1 there is no listener.trc in the /oracle_home/network/trace folder. Here is the listener.log file in case it is helpful.
TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 10:37:37
Copyright (c) 1991, 2007, Oracle. All rights reserved.
System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
Log messages written to D:\oracle\GCP\102\network\log\listener.log
Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
Trace level is currently 0
Started with pid=3116
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=gccia-erpn01.gccia.com.sa)(PORT=1527)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 11:59:37
Copyright (c) 1991, 2007, Oracle. All rights reserved.
System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
Log messages written to D:\oracle\GCP\102\network\log\listener.log
Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
Trace level is currently 0
Started with pid=5036
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
10-OCT-2010 12:00:31 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=60592)) * establish * GCP * 0
10-OCT-2010 12:00:31 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=60593)) * establish * GCP * 0
10-OCT-2010 12:00:31 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=60594)) * establish * GCP * 0
10-OCT-2010 12:00:31 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=60595)) * establish * GCP * 0
10-OCT-2010 12:00:31 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=60596)) * establish * GCP * 0
10-OCT-2010 13:01:19 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61336)) * establish * GCP * 0
10-OCT-2010 13:01:37 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61340)) * establish * GCP * 0
10-OCT-2010 13:01:37 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61341)) * establish * GCP * 0
10-OCT-2010 13:01:37 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61342)) * establish * GCP * 0
10-OCT-2010 13:01:37 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61343)) * establish * GCP * 0
10-OCT-2010 13:01:37 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61344)) * establish * GCP * 0
10-OCT-2010 13:08:27 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61485)) * establish * GCP * 0
10-OCT-2010 13:08:42 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61489)) * establish * GCP * 0
10-OCT-2010 13:08:42 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61490)) * establish * GCP * 0
10-OCT-2010 13:08:42 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61491)) * establish * GCP * 0
10-OCT-2010 13:08:42 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61492)) * establish * GCP * 0
10-OCT-2010 13:08:42 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61493)) * establish * GCP * 0
TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 13:09:57
Copyright (c) 1991, 2007, Oracle. All rights reserved.
System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
Log messages written to D:\oracle\GCP\102\network\log\listener.log
Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
Trace level is currently 0
Started with pid=2336
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 13:14:34
Copyright (c) 1991, 2007, Oracle. All rights reserved.
System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
Log messages written to D:\oracle\GCP\102\network\log\listener.log
Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
Trace level is currently 0
Started with pid=4948
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 13:38:12
Copyright (c) 1991, 2007, Oracle. All rights reserved.
System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
Log messages written to D:\oracle\GCP\102\network\log\listener.log
Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
Trace level is currently 0
Started with pid=2456
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 14:03:35
Copyright (c) 1991, 2007, Oracle. All rights reserved.
System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
Log messages written to D:\oracle\GCP\102\network\log\listener.log
Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
Trace level is currently 0
Started with pid=2756
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 14:10:42
Copyright (c) 1991, 2007, Oracle. All rights reserved.
System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
Log messages written to D:\oracle\GCP\102\network\log\listener.log
Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
Trace level is currently 0
Started with pid=4812
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 11-OCT-2010 09:34:05
Copyright (c) 1991, 2007, Oracle. All rights reserved.
System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
Log messages written to D:\oracle\GCP\102\network\log\listener.log
Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
Trace level is currently 0
Started with pid=1920
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 11-OCT-2010 21:12:29
Copyright (c) 1991, 2007, Oracle. All rights reserved.
System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
Log messages written to D:\oracle\GCP\102\network\log\listener.log
Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
Trace level is currently 0
Started with pid=1952
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE -
The Cluster service is shutting down because quorum was lost
Hi, we recently experienced the above issue and after looking for explanations I haven't been able to find any satisfying answers when other people have posted this issue.
Our problem is as follows:
2 node 2008R2 cluster running SQL 2012
Each node is a HP BL460c running in a HP C7000 Blade Chassis.
We were updating the flexfabric cards on one of the chassis. The other chassis had been patched the previous week with no problems.
During the update process the flexfabric cards, which hold the Ethernet and FC connections, reboot, so before work began all active cluster services had been failed over to the node in the chassis not being worked on. Despite this, the Cluster service shut down on this one particular cluster. All other clusters running across these 2 chassis continued to run as expected.
As other people have posted before we saw the following errors in the system log.
1564: File share witness resource 'File Share Witness' failed to arbitrate for the file share
1069: Cluster resource 'File Share Witness' in clustered service or application 'Cluster Group' failed.
1172: The Cluster service is shutting down because quorum was lost. This could be due to the loss of network connectivity between some or all nodes in the cluster, or a failover of the witness disk.
Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.
However, we can't understand what could cause this when the service is running on the node in the chassis not being updated, especially when the same update was performed the week before with no issues. How can both nodes lose connectivity to the File Share Witness at the same time?
Cluster Validation tests run fine and don't highlight any issues. The file share witness is accessible from both servers.
Hi,
Please confirm that you have installed the recommended hotfixes and updates for Windows Server 2008 R2 SP1 failover clusters, especially the following hotfixes.
The network location profile changes from "Domain" to "Public" in Windows 7 or in Windows Server 2008 R2
http://support.microsoft.com/kb/2524478/EN-US
A hotfix is available that adds two new cluster control codes to help you determine which cluster node is blocking a GUM update in Windows Server 2008 R2 and Windows Server 2012
http://support.microsoft.com/kb/2779069/EN-US
Hope this helps.
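The vote arithmetic behind event 1172 can be sketched as follows. This is a minimal illustration, assuming the cluster uses a node-plus-file-share-witness quorum model with one vote per node and one for the witness (the thread's events suggest this but do not state it); the function name is hypothetical and is not part of any Windows API.

```python
def has_quorum(votes_held: int, total_votes: int) -> bool:
    """A partition keeps quorum only while it holds a strict majority of votes."""
    return votes_held * 2 > total_votes


# Assumption: two nodes plus one file share witness = three votes in total.
TOTAL_VOTES = 3

# Surviving node still holds the witness while its partner is unreachable:
# 2 of 3 votes, so the cluster stays up.
print(has_quorum(2, TOTAL_VOTES))

# Surviving node fails to arbitrate for the witness (event 1564) while its
# partner is unreachable: 1 of 3 votes, so the Cluster service shuts down.
print(has_quorum(1, TOTAL_VOTES))
```

Under these assumptions it is not necessary for both nodes to lose the witness at once: if the active node cannot see its partner and also fails witness arbitration at that moment, it holds a single vote and stops. Whether that is what happened here would need to be confirmed from the cluster log.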
-
Cluster services UNKNOWN state
Hi,
I have a two-node cluster database, and I have some doubts.
If cluster services go into the UNKNOWN state on the first node, will existing connections fail over to the second node?
Will new connections still try to connect to the first node?
user2017273 wrote:
> Hi,
> I have a two-node cluster database, and I have some doubts.
Quit doubting and TEST it for yourself. Actually reading the documentation will also help.
> If cluster services go into the UNKNOWN state on the first node, will existing connections fail over to the second node?
Maybe...
> Will new connections still try to connect to the first node?
If nodex is down, any connection attempt should go to the remaining nodes. -
Error in coherence-- stopping cluster service.
I found this error in one of my Coherence server log files. Can someone explain what it means?
Coherence Logger@9272718 3.4.2/411 ERROR 2009-06-01 16:08:31.396/1217.130 Oracle Coherence GE 3.4.2/411 <Error> (thread=Cluster, member=3): Received cluster heartbeat from the senior Member(Id=7, Timestamp=2009-04-24 12:29:25.802, Address=xx.xxx.xx.xxx:8093, MachineId=55400, Location=machine:server72,process:11324, Role=WeblogicServer) that does not contain this Member(Id=3, Timestamp=2009-06-01 15:48:09.18, Address=xx.xxx.xxx.xx:8091, MachineId=47428, Location=site:ops.company.org,machine:cohserverbox1,process:14401, Role=CoherenceServer); stopping cluster service.
Thanks much
Hi,
This error essentially means what it says: The process received a cluster heartbeat that did not include the process as a member of the cluster. The process, therefore, stops its cluster service and will attempt to join the cluster again when appropriate. There are a few reasons that the senior member may not have included the process in its heartbeat. Based on the timestamps and roles, I would first want to confirm the intent to cluster these processes. If the intent is not to cluster these processes, I would adjust their configurations appropriately (e.g. use a distinct port) to form separate clusters. If the intent is to cluster these processes and the error (with the timestamp spread) reproduces, I would want to examine the network topology and look for reasons the members are being dropped from the cluster.
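The behaviour described above can be modelled in a few lines. This is only a toy sketch of the idea, not Coherence's implementation: the `Member` class and its method are hypothetical names, and the member IDs 3 and 7 are taken from the log line in the question.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """Toy model of a cluster member reacting to the senior member's heartbeat."""
    member_id: int
    running: bool = True

    def on_senior_heartbeat(self, listed_ids: set) -> None:
        # If the senior's member list omits us, stop the cluster service.
        if self.member_id not in listed_ids:
            self.running = False


member = Member(member_id=3)

# Heartbeat from senior member 7 that still lists member 3: nothing happens.
member.on_senior_heartbeat({7, 3})
print(member.running)

# Heartbeat that no longer lists member 3: the cluster service stops.
member.on_senior_heartbeat({7})
print(member.running)
```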
Regards,
Harv -
Configure the ADMIN and CLUSTER service connections to be SSL
Can you configure the ADMIN and CLUSTER service connections to be SSL
rather than tcp?
I was wondering about the present or future ability to secure other
connection services with SSL. Can you now or are there future plans
to configure the ADMIN and CLUSTER service connections to be SSL
rather than tcp? I suppose I should add the PORTMAPPER to that list.
My primary interest is for an SSLCLUSTER service in the case where
two brokers are connected over a non-trusted network. It may
not be too difficult to secure all the services the same way, but
perhaps that is on the TODO list.
A related question is if there are plans to add SSL with client
authentication as a stronger authentication mechanism than 'simple'
username and password. I believe you could get the username from
the client certificate's DN and continue to use the same LDAP user
repository for access control. I think this is similar to the way
that BEA's Weblogic server does it.
Finally should it be possible to deploy the HTTP tunnel servlet to
a webserver (such as iPlanet Web Server) configured to do SSL with
client authentication as a work-around to get stronger authentication
with the current release of the product? Or am I perhaps missing some
obvious and important detail? :) I guess I would like to know it's been
done already or is at least possible before I try and do it myself.
3 scenarios involving SSL are:
1: JMS client <------- SSL -------> iMQ broker
2: iMQ admin <------- SSL -------> iMQ broker
3: iMQ broker <------- SSL -------> iMQ broker (i.e. clusters)
(1) is currently supported in iMQ 2.0.
(2) and (3) are not supported in iMQ 2.0. There are no concrete plans yet to support
them in the near future, but we'll definitely consider doing so if we
hear a lot of demand for it.
]A related question is if there are plans to add SSL with client
]authentication as a stronger authentication mechanism than 'simple'
]username and password. I believe you could get the username from
]the client certificate's DN and continue to use the same LDAP user
]repository for access control. I think this is similar to the way
]that BEA's Weblogic server does it.
This is on our todo list, but due to other more pressing issues we
have not been able to address it. We will continue to keep it
on our potential list of new features.
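The idea in the quoted question, taking the username from the client certificate's DN, can be sketched as follows. This is an illustration of the concept only and is not something iMQ 2.0 provides, as the reply above makes clear. It assumes the certificate is available as a dict in the shape returned by Python's `ssl.SSLSocket.getpeercert()`; the function name and the sample certificate are hypothetical.

```python
from typing import Optional


def username_from_cert(cert: dict) -> Optional[str]:
    """Return the commonName from a certificate's subject DN, or None if absent."""
    # The subject is a tuple of RDNs; each RDN is a tuple of (name, value) pairs.
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                return value
    return None


# Hypothetical peer certificate, shaped like ssl.SSLSocket.getpeercert() output.
peer_cert = {
    "subject": (
        (("countryName", "US"),),
        (("organizationName", "Example Org"),),
        (("commonName", "alice"),),
    )
}

print(username_from_cert(peer_cert))
```

The extracted name could then be looked up in the same LDAP user repository for access control, as the question suggests.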
Sorry if I sound pretty wishy-washy in my responses above, but the fact
is that the things you mentioned above had to take a backseat
to other more critical features. That and the usual time/resource
constraints caused them not to be implemented.
]Finally should it be possible to deploy the HTTP tunnel servlet to
]a webserver (such as iPlanet Web Server) configured to do SSL with
]client authentication as a work-around to get stronger authentication
]with the current release of the product? Or am I perhaps missing some
]obvious and important detail? :) I guess I would like to know it's been
]done already or is at least possible before I try and do it myself.
Yes, this should be possible (although I don't believe we've tried it here).
The client authentication here is really only between the JMS client and the
web server (not between the tunnel servlet and the iMQ broker) and should
be similar in setup to any other java application talking to iPlanet Web
Server. -
Why virtual interfaces added to ManagementOS not visible to Cluster service?
Hello All,
I'm starting this new thread since the previous one was answered by our friend Udo. My problem in short is the following. The diagram should be enough to explain what I'm trying to achieve. I've set up this lab to learn Hyper-V clustering with 2 nodes running Hyper-V Server 2012. Both nodes have 3x physical NICs; 1 in each node is dedicated to managing the node. The other two are used to create a NIC team. On top of that NIC team, a virtual switch is created with -AllowManagementOS $False. Next I created and added the following virtual interfaces to the host partition and plugged them into the virtual switch created on top of the teamed interface. These virtual interfaces should serve the various networks required.
For the SAN I'm running a Linux VM which has an iSCSI target server, and the clustering service has no problem with that. All tests pass OK.
The problem is that those virtual interfaces, once added to the hosts, do not appear as available networks to the cluster service; instead it only shows the management NIC as the available network to use.
This is making it difficult to understand how to set up a cluster of 2x Hyper-V Server nodes. Can someone help please?
Regards,
Shahzad.
Shahzad,
I've read this thread a couple of times and I don't think I'm clear on the exact question you're asking.
When the clustering service goes out to look for "Networks", what it does is scan the IP addresses on each node. Every time it finds an IP in a unique subnet, that subnet is listed as a network. It can't see virtual switches and doesn't care about
virtual vs. teamed vs. physical adapters or anything like that. It's just looking at IP addresses. This is why I'm confused when you say, "it won't show virtual interfaces available as networks". "Networks" in this context are IP subnets.
I'm not aware of any context where a singular interface would be treated like a network.
If you've got virtual adapters attached to the management operating system
and have assigned IPs to them, the cluster should have discovered those networks. If you have multiple adapters on the same node using IPs in the same subnet, that network will only appear once and the cluster service will only use
one adapter from that subnet on that node. The one it picked will be visible on the "Network Connections" tab at the bottom of Failover Cluster Manager when you're on the Networks section.
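The discovery rule described above (one "network" per unique IP subnet found across the nodes) can be approximated with a short sketch. This is not the cluster service's code; the function name is hypothetical and the addresses are placeholder documentation ranges, not the ones from this thread.

```python
import ipaddress
from collections import defaultdict


def discover_networks(node_addresses: dict) -> dict:
    """Group every node's interface addresses by the IP subnet they fall in."""
    networks = defaultdict(list)
    for node, addresses in node_addresses.items():
        for addr in addresses:
            iface = ipaddress.ip_interface(addr)
            # iface.network is the subnet derived from the address and its mask.
            networks[iface.network].append((node, str(iface.ip)))
    return dict(networks)


# Hypothetical addresses: two subnets, one adapter per subnet on each node.
nodes = {
    "node1": ["192.0.2.10/24", "198.51.100.10/24"],
    "node2": ["192.0.2.20/24", "198.51.100.20/24"],
}

for network, members in discover_networks(nodes).items():
    print(network, members)
```

Each key in the result corresponds to one entry that would be listed as a cluster network under this rule, regardless of whether the adapter behind the address is physical, teamed, or virtual.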
Eric Siron Altaro Hyper-V Blog
I am an independent blog contributor, not an Altaro employee. I am solely responsible for the content of my posts.
"Every relationship you have is in worse shape than you think."
Hello Eric and friends,
Eric, I much appreciate your interest in the issue, and yes, I agree with you when you said... "When the clustering service goes out to look for "Networks",
what it does is scan the IP addresses on each node. Every time it finds an IP in a unique subnet, that subnet is listed as a network. It can't see virtual switches and doesn't care about virtual vs. teamed vs. physical adapters or anything like that. It's
just looking at IP addresses. This is why I'm confused when you say, "it won't show virtual interfaces available as networks". "Networks" in this context are IP subnets. I'm not aware of any context where a singular interface would be treated
like a network."
By networks I meant to say subnets. Let me explain what I've configured so far:
Node 1 & Node 2 installed with 3x NICs. All 3 NICs/node plugged into same switch.
Node1: 131.107.0.50/24
Node2: 131.107l.0.150/24
A Core Domain controller VM running on Node 1: 131.107.0.200/24
A JUMPBOX (WS 2012 R2 Std.) VM running on Node 1: 131.107.0.100/24
A Linux SAN VM running on Node 2: 10.1.1.100/8
I planned to configure the following networks:
(1) Cluster traffic: 10.0.0.50/24 (IP given to the virtual interface for cluster traffic on Node1)
Cluster traffic: 10.0.0.150/24 (IP given to the virtual interface for cluster traffic on Node2)
(2) SAN traffic: 10.1.1.50/8 (IP given to the virtual interface for SAN traffic on Node1)
SAN traffic: 10.1.1.150/8 (IP given to the virtual interface for SAN traffic on Node2)
Note: The Cluster service has no problem accessing the SAN VM (10.1.1.100) over this network; it validates the SAN settings and comes back OK. This indicates that the virtual interface is working fine.
(3) Migration traffic: 172.168.0.50/8 (IP given to the virtual interface for migration traffic on Node1)
Migration traffic: 172.168.0.150/8 (IP given to the virtual interface for migration traffic on Node2)
All these networks (virtual interfaces) are made available through two virtual switches, which are configured EXACTLY identically on Node1 and Node2.
Now, after finishing the cluster validation steps (which all come back OK), when the Create Cluster wizard starts, it shows only one network: that of the physical Layer 2 switch, 131.107.0.0/24.
I wonder why it won't show the IPs of the other networks (10.0.0.0/8, 10.1.1.0/8 and 172.168.0.0/8)?
Regards,
Shahzad -
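As a quick sanity check on the addressing plan in the question above, the prefixes can be expanded with Python's standard `ipaddress` module. This is only a sketch: the labels (`physical`, `cluster`, `san`, `migration`) are made up for illustration, and nothing in the thread establishes that the result explains what the Create Cluster wizard displays. It does show that a /8 prefix makes 10.1.1.50 part of 10.0.0.0/8, which contains the 10.0.0.0/24 cluster subnet, and that 172.168.0.50/8 belongs to 172.0.0.0/8.

```python
import ipaddress
from itertools import combinations

# Addresses exactly as listed for Node1 in the post; the labels are illustrative only.
planned = {
    "physical": "131.107.0.50/24",
    "cluster": "10.0.0.50/24",
    "san": "10.1.1.50/8",
    "migration": "172.168.0.50/8",
}

# ip_interface() keeps the host address; .network gives the subnet it belongs to.
networks = {name: ipaddress.ip_interface(cidr).network for name, cidr in planned.items()}

# Pairs of planned subnets that share address space.
overlaps = [
    (a, b)
    for a, b in combinations(networks, 2)
    if networks[a].overlaps(networks[b])
]

for name, net in networks.items():
    print(f"{name:10s} {net}")
print("overlapping pairs:", overlaps)
```

If the cluster and SAN networks are meant to be distinct subnets, giving each a non-overlapping prefix (for example a /24 each) would at least remove this ambiguity.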
DAG issue - Unable to start cluster service
Hello,
Let me brief my environment.
- 2 Sites
- 1 DAG, 4 Servers
- 2 Servers each site
Situation:
I have just updated the OS on all the servers in the DAG and updated all the Exchange versions to SP3 RU3. One of the servers in the DAG/cluster is down/unavailable. I found that the Cluster service on that server is disabled and cannot be started.
Errors:
Event ID:
- 7024 - The Cluster Service service terminated with service-specific error. The system cannot find the file specified..
- 7031 -
- in to ensure that this machine is a member of a cluster. If you intend to add this machine to an existing cluster use the Add Node Wizard. Alternatively, if this machine has been configured as a member of a cluster, it will be necessary to restore the
missing configuration data that is necessary for the Cluster Service to identify that it is a member of a cluster. Perform a System State Restore of this machine in order to restore the configuration data.
Please help meee....
Hi,
From the error you provided above, it seems that a node (one server in the DAG you mentioned above) no longer belongs to the existing cluster. I recommend clearing the stale cluster configuration on this node with the
cluster node nodename /forcecleanup command, rejoining the node to the cluster, and then restarting the Cluster service.
Best regards,
Belinda
Belinda Ma
TechNet Community Support -
How do I restart Cluster services?
Can someone tell me how to restart Cluster Services?
Name            Type         Target  State    Host
ora....DB1.srv  application  ONLINE  OFFLINE
ora....MSDB.cs  application  ONLINE  OFFLINE
ora....B1.inst  application  ONLINE  ONLINE   fms-db1
ora....B2.inst  application  ONLINE  ONLINE   fms-db2
ora.FMSDB.db    application  ONLINE  ONLINE   fms-db2
ora....B1.lsnr  application  ONLINE  ONLINE   fms-db1
ora....db1.gsd  application  ONLINE  OFFLINE
ora....db1.ons  application  ONLINE  ONLINE   fms-db1
ora....db1.vip  application  ONLINE  ONLINE   fms-db1
ora....B2.lsnr  application  ONLINE  ONLINE   fms-db2
ora....db2.gsd  application  ONLINE  OFFLINE
ora....db2.ons  application  ONLINE  ONLINE   fms-db2
ora....db2.vip  application  ONLINE  ONLINE   fms-db2
????
What do you mean by Cluster Service?
If you mean Oracle Cluster:
1. You must be the root user.
2. Use the crsctl command-line tool:
./crsctl stop crs
./crsctl start crs
Your database and listener are registered in Oracle Cluster, so they go down with it.
If you mean the database service, you can use the srvctl command-line tool to help you. -
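Before and after restarting, it can help to see which resources are meant to be up but are not. The following is a minimal sketch that parses a listing shaped like the one in the question (columns Name, Type, Target, State, Host) and reports rows whose Target is ONLINE while State is anything else. The function names are hypothetical, the sample is a shortened copy of the poster's listing, and the parsing assumes whitespace-separated columns with Host left empty for offline rows.

```python
# Shortened copy of the listing from the question; Host is blank for OFFLINE rows.
LISTING = """\
Name Type Target State Host
ora....DB1.srv application ONLINE OFFLINE
ora....MSDB.cs application ONLINE OFFLINE
ora....B1.inst application ONLINE ONLINE fms-db1
ora.FMSDB.db application ONLINE ONLINE fms-db2
ora....db1.gsd application ONLINE OFFLINE
ora....db2.gsd application ONLINE OFFLINE
"""


def parse_listing(text):
    """Turn the whitespace-separated listing into a list of dicts (header row skipped)."""
    rows = []
    for line in text.strip().splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore anything that does not look like a resource row
        rows.append({
            "name": fields[0],
            "type": fields[1],
            "target": fields[2],
            "state": fields[3],
            "host": fields[4] if len(fields) > 4 else None,
        })
    return rows


def not_at_target(rows):
    """Names of resources that should be ONLINE but currently are not."""
    return [r["name"] for r in rows if r["target"] == "ONLINE" and r["state"] != "ONLINE"]


rows = parse_listing(LISTING)
print(not_at_target(rows))
```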
Cluster Service 1146 & 1230 event id
Dear Team,
I am facing a cluster problem on Server 2012 R2; it is showing error event IDs 1146 and 1230.
I am not able to start my cluster service, and my production is totally down. Please help.
Here is the log at this link:
https://onedrive.live.com/redir?resid=4A228E11EF76B735!193&authkey=!AKCOUxUeE4FEu8A&ithint=file%2ctxt
Ravi Tandon
8400414038
Hi,
The log is incomplete. The error 1146 or 1230 is not included in the log file you uploaded.
According to my research, errors 1146 and 1230 can be caused by a crashing DLL. You can search your local log file to see whether it contains an entry like this:
Error server.domain.com 1230 Microsoft-Windows-FailoverClustering Cluster resource 'AA_BBBB' (resource type '', DLL 'XXXXX.dll') either crashed or deadlocked.
If so, look up the DLL file to see if you can find any detailed information about it. Sometimes it belongs to a third-party application, and you can try uninstalling that application to see the result. If it belongs to a Role or Service, you can try to repair or reinstall it.
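One way to search an exported log for such entries is a regular expression keyed on the wording of the sample line above. This is a sketch under assumptions: the pattern is derived only from that one sample entry, real log exports may be formatted differently, and the function name is hypothetical.

```python
import re

# Pattern built from the sample 1230 entry quoted above; real logs may differ.
CRASH_PATTERN = re.compile(
    r"Cluster resource '(?P<resource>[^']*)' "
    r"\(resource type '(?P<rtype>[^']*)', DLL '(?P<dll>[^']*)'\) "
    r"either crashed or deadlocked"
)


def find_crashed_dlls(lines):
    """Return (resource, dll) pairs for every line that matches the crash message."""
    hits = []
    for line in lines:
        match = CRASH_PATTERN.search(line)
        if match:
            hits.append((match.group("resource"), match.group("dll")))
    return hits


sample = [
    "Information server.domain.com some unrelated entry",
    "Error server.domain.com 1230 Microsoft-Windows-FailoverClustering "
    "Cluster resource 'AA_BBBB' (resource type '', DLL 'XXXXX.dll') "
    "either crashed or deadlocked.",
]
print(find_crashed_dlls(sample))
```

In practice the `lines` argument would come from reading the exported log file line by line; the DLL names it returns are the ones to investigate.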
And as Tim said, analyzing logs on the TechNet forum is a little difficult, because log files are large and almost all of them contain company information. You can try submitting a case to Microsoft for a more efficient response.
If you have any feedback on our support, please send to [email protected] -
Cluster Service Monitoring - Is There An Alert When a Volume is Available?
We've seen some alerts that show a shared volume is no longer available. They look something like this
Alert: Shared Volume IO is paused
Source: Cluster Service
Path: Host.domain.com
Last modified by: System
Last modified time: 2/14/2011 7:16:10 AM
Alert description: Cluster Shared Volume 'Volume4' ('Exchange Mail Data') is no longer available on this node because of 'STATUS_CONNECTION_DISCONNECTED(c000020c)'. All I/O will temporarily be queued until a path to the volume is reestablished.
We're wondering if there is a way to generate an alert that tells us the volume is available again.
Orange County District Attorney
Hi,
Based on my research, this monitor is based on the Cluster Shared Volume related Events:
Event Log Rules
http://technet.microsoft.com/en-us/library/dd491018.aspx
Please also see the Events listed:
Cluster Shared Volume Functionality
http://technet.microsoft.com/en-us/library/ee830309(WS.10).aspx
However, I could not find an event that means the “cluster shared volume is available again”; therefore, I suspect this cannot be monitored based on the Event Log.
In addition, I just noticed the status of a cluster shared volume can be queried by PowerShell script. Hope this can give you some hints:
Get-ClusterSharedVolume
http://technet.microsoft.com/en-us/library/ee460981.aspx
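If the volume state is polled periodically (for example by running Get-ClusterSharedVolume on a schedule), a "volume is available again" notification can be derived by comparing each reading with the previous one. The sketch below shows only that comparison logic. How the readings are collected is not shown, the function name is hypothetical, and the state strings ("Online", "Failed") are assumptions about what the polling step reports.

```python
def recovery_alerts(samples):
    """Given (volume_name, state) readings in time order, return a message for
    every transition from a non-Online state back to Online."""
    last_state = {}
    alerts = []
    for name, state in samples:
        previous = last_state.get(name)
        # Alert only on a real recovery, not on the first reading of a healthy volume.
        if previous is not None and previous != "Online" and state == "Online":
            alerts.append(f"{name} is available again")
        last_state[name] = state
    return alerts


# Illustrative readings: Volume4 drops out and comes back, Volume1 stays healthy.
readings = [
    ("Volume4", "Online"),
    ("Volume1", "Online"),
    ("Volume4", "Failed"),
    ("Volume1", "Online"),
    ("Volume4", "Online"),
]
print(recovery_alerts(readings))
```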
Thanks.
Nicholas Li - MSFT
Please remember to click “Mark as Answer” on the post that helps you, and to click “Unmark as Answer” if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread. -
Dear Technet,
Windows could not start the Cluster Service on Local computer. For more information, review the System Event Log. If this is a non-Microsoft service, contact the service vendor, and refer to service-specific error code 2.
My cluster suddenly disappeared, and I tried to restart the Cluster service. When I try to restart the service, the error above comes up.
I even tried to remove the cluster through PowerShell, but that failed because the Cluster service is not running.
Please help me. Thank you.
Regards
Shamil
Hi,
Could you confirm which account the Cluster service starts under? The Cluster service is a service that requires a domain user account.
The server cluster Setup program changes the local security policy for this account by granting a set of user rights to the account. Additionally, this account is made a member of the local Administrators group.
If one or more of these user rights are missing, the Cluster service may stop immediately during startup or later, depending on when the Cluster service requires the particular user right.
Hope this helps.
We are trying to better understand customer views on social support experience, so your participation in this interview project would be greatly appreciated if you have time.
Thanks for helping make community forums a great place.
Maybe you are looking for
-
Cannot login my Windows 8.1 Pro with my live ID, don't have the Microsoft Logon Option
My laptop was installed with a Windows 8.1 Pro, and it worked very well before, i can login with my domain account and Windows Live ID, but today when i startup my laptop and i cannot find the option "Login with Microsoft/Live ID", i only can login w
-
How to get the SR NO in a report
Hi all, I am new to report and was wondering how I get the SrNO in a report like below example. SrNO Name Dept 1 ABC Sales 2 xyz Admin Should i use CF or CP etc ??? Regards Sunny
-
256k MP3 files; I want them as 192kbps AAC files in iTunes
I thought when I imported files via the FILE > IMPORT menu, that it would convert the files according to my Import Settings, but it doesn't. This is also true for ADVANCED > CONVERT SELECTION TO AAC... If I don't want 256kbps, how do I force iTunes t
-
Internal error: Too many points in plot
Labview keeps crashing and upon relaunch I get a message saying "... an internal error or crash occurred at plotsupp.cpp, line 4049" The failure log generated is: #Date: Thu, Sep 13, 2007 3:37:07 PM #OSName: Windows NT #OSVers: 5.1 #AppName: LabVIEW
-
HELP! Pre is crashing when I try to delete clips - project due in 4 hrs
I have a 20 min SD video that I need to cut down to 10 minutes - I am trying to remove clips from the beginning of the project, but after I delete 1 clip, pre crashes - what to do?? Eric