Cluster Service 1146 & 1230 event id
Dear Team,
I am facing a cluster problem on Windows Server 2012 R2. It is showing error event IDs 1146 and 1230.
I am not able to start my cluster service and my production is totally down. Please help.
Here is the log at this link:
https://onedrive.live.com/redir?resid=4A228E11EF76B735!193&authkey=!AKCOUxUeE4FEu8A&ithint=file%2ctxt
Ravi Tandon
8400414038
Hi,
The log is incomplete. The error 1146 or 1230 is not included in the log file you uploaded.
According to my search results, errors 1146 and 1230 can be caused by a DLL crash. Search your local log file to see whether it contains an entry like this:
Error server.domain.com 1230 Microsoft-Windows-FailoverClustering Cluster resource 'AA_BBBB' (resource type '', DLL 'XXXXX.dll') either crashed or deadlocked.
If so, search for the DLL file to see whether you can find more detailed information. Sometimes it belongs to a third-party application, which you can try uninstalling to see the result. If it belongs to a role or service, you can try to repair or reinstall it.
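The search for an event 1230 entry can be scripted once the System log has been exported to text. The following is only a minimal sketch: it assumes each event sits on one line in the same shape as the sample entry above, and the names EVENT_1230 and find_crashed_dlls are made up for this illustration.

```python
import re

# Matches the resource name, resource type and DLL name in an event 1230 message.
# Assumption: the exported log keeps each event on a single line, as in the sample.
EVENT_1230 = re.compile(
    r"\b1230\b.*Cluster resource '(?P<resource>[^']*)' "
    r"\(resource type '(?P<rtype>[^']*)', DLL '(?P<dll>[^']*)'\)"
)

def find_crashed_dlls(lines):
    """Return (resource, dll) pairs for every event 1230 line found."""
    hits = []
    for line in lines:
        match = EVENT_1230.search(line)
        if match:
            hits.append((match.group("resource"), match.group("dll")))
    return hits

sample = [
    "Error server.domain.com 1230 Microsoft-Windows-FailoverClustering "
    "Cluster resource 'AA_BBBB' (resource type '', DLL 'XXXXX.dll') "
    "either crashed or deadlocked.",
]
print(find_crashed_dlls(sample))
```

The DLL names it returns are the ones to look up, as described above.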
And as Tim said, analyzing logs on the TechNet forum is difficult because log files are large and almost all of them contain company information. You can submit a case to Microsoft for a more efficient response.
Similar Messages
-
Dear Technet,
Windows could not start the Cluster Service on Local computer. For more information, review the System Event Log. If this is a non-Microsoft service, contact the service vendor, and refer to service-specific error code 2.
My cluster suddenly disappeared, and I tried to restart the Cluster service. When I try to restart the service, the error above comes up.
I even tried to remove the cluster through PowerShell, but that failed because the Cluster service is not running.
Help me please.. thank you.
Regards
Shamil
Hi,
Could you confirm which account is used to start the Cluster service? The Cluster service requires a domain user account.
The server cluster Setup program changes the local security policy for this account by granting it a set of user rights. Additionally, this account is made a member of the local Administrators group.
If one or more of these user rights are missing, the Cluster service may stop immediately during startup or later, depending on when the Cluster service requires the particular user right.
Hope this helps.
-
Here is a description of the PRD cluster scenario (Windows 2008 + Oracle).
We have 2 nodes:
1. host-erpn01 (has ASCS, Database Instance, Enqueue and Dialog Instance installed)
2. host-erp02 (has Central Instance, Dialog Instance and Enqueue installed)
When we move the "SAP SID" service from one node to another using the Failover Cluster Management tool, it fails, and we have to manually bring the "SAP SID cluster service" and "SAP SID cluster instance" online.
Both the service and the instance came online after manual selection. However, after some time, the MMC console on node 2 showed the SAP instances hosted on node 1 with a red cross and the error "cannot connect to sap service dcom interface error 800706BA".
We replaced sapstartsrv.exe in the CI executable directory with the copy from the working directory of the ASCS instance.
Now disp+work is stopped for the CI instance. Also, in the CI instance executable directory we can see five files named sapstartsrv:
sapstartsrv.exe.new, sapstartsrv.exe.tmp, sapstartsrv.new, sapstartsrv.pdb and the actual sapstartsrv.exe file.
Here is sapstartsrv.log from the CI work directory on node 2.
trc file: "sapstartsrv.log", trc level: 0, release: "701"
pid 1968
Mon Oct 11 15:55:33 2010
SAP HA Trace: Build in SAP Microsoft Cluster library '701, patch 32, changelist 1046543' initialized
Initializing SAPControl Webservice
SapSSLInit failed => https support disabled
Starting WebService Named Pipe thread
Starting WebService thread
Webservice named pipe thread started, listening on port \\.\pipe\sapcontrol_01
Webservice thread started, listening on port 50113
GCCIA\csrvadmin is starting SAP System at 2010/10/11 16:09:07
SAP HA Trace: FindClusterResource: SAP resource not found [sapwinha.cpp, line 334]
SAP HA Trace: SAP_HA_FindSAPInstance returns: SAP_HA_NOT_CLUSTERED [sapwinha.cpp, line 907]
Or you can view other logs from the work directory dump at
http://s000.tinyupload.com/index.php?file_id=45384422007535688902
Now when we try to start the SAPSID_00 service manually, it gives the error "The SAPSID_00 service failed to start due to the following error: The system cannot find the path specified."
Please advise.
Regards
Hi Sunil,
On node 1 there is no listener.trc in the /oracle_home/network/trace folder. Here is the listener.log file in case it is helpful.
TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 10:37:37
Copyright (c) 1991, 2007, Oracle. All rights reserved.
System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
Log messages written to D:\oracle\GCP\102\network\log\listener.log
Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
Trace level is currently 0
Started with pid=3116
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=gccia-erpn01.gccia.com.sa)(PORT=1527)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 11:59:37
Copyright (c) 1991, 2007, Oracle. All rights reserved.
System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
Log messages written to D:\oracle\GCP\102\network\log\listener.log
Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
Trace level is currently 0
Started with pid=5036
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
10-OCT-2010 12:00:31 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=60592)) * establish * GCP * 0
10-OCT-2010 12:00:31 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=60593)) * establish * GCP * 0
10-OCT-2010 12:00:31 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=60594)) * establish * GCP * 0
10-OCT-2010 12:00:31 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=60595)) * establish * GCP * 0
10-OCT-2010 12:00:31 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=60596)) * establish * GCP * 0
10-OCT-2010 13:01:19 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61336)) * establish * GCP * 0
10-OCT-2010 13:01:37 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61340)) * establish * GCP * 0
10-OCT-2010 13:01:37 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61341)) * establish * GCP * 0
10-OCT-2010 13:01:37 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61342)) * establish * GCP * 0
10-OCT-2010 13:01:37 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61343)) * establish * GCP * 0
10-OCT-2010 13:01:37 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61344)) * establish * GCP * 0
10-OCT-2010 13:08:27 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61485)) * establish * GCP * 0
10-OCT-2010 13:08:42 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61489)) * establish * GCP * 0
10-OCT-2010 13:08:42 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61490)) * establish * GCP * 0
10-OCT-2010 13:08:42 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61491)) * establish * GCP * 0
10-OCT-2010 13:08:42 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61492)) * establish * GCP * 0
10-OCT-2010 13:08:42 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61493)) * establish * GCP * 0
TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 13:09:57
Copyright (c) 1991, 2007, Oracle. All rights reserved.
System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
Log messages written to D:\oracle\GCP\102\network\log\listener.log
Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
Trace level is currently 0
Started with pid=2336
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 13:14:34
Copyright (c) 1991, 2007, Oracle. All rights reserved.
System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
Log messages written to D:\oracle\GCP\102\network\log\listener.log
Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
Trace level is currently 0
Started with pid=4948
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 13:38:12
Copyright (c) 1991, 2007, Oracle. All rights reserved.
System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
Log messages written to D:\oracle\GCP\102\network\log\listener.log
Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
Trace level is currently 0
Started with pid=2456
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 14:03:35
Copyright (c) 1991, 2007, Oracle. All rights reserved.
System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
Log messages written to D:\oracle\GCP\102\network\log\listener.log
Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
Trace level is currently 0
Started with pid=2756
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 14:10:42
Copyright (c) 1991, 2007, Oracle. All rights reserved.
System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
Log messages written to D:\oracle\GCP\102\network\log\listener.log
Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
Trace level is currently 0
Started with pid=4812
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 11-OCT-2010 09:34:05
Copyright (c) 1991, 2007, Oracle. All rights reserved.
System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
Log messages written to D:\oracle\GCP\102\network\log\listener.log
Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
Trace level is currently 0
Started with pid=1920
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 11-OCT-2010 21:12:29
Copyright (c) 1991, 2007, Oracle. All rights reserved.
System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
Log messages written to D:\oracle\GCP\102\network\log\listener.log
Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
Trace level is currently 0
Started with pid=1952
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE -
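A listener.log excerpt like the one above is easier to read once the listener starts are pulled out as a list, since frequent restarts are the first thing to look for. This is a minimal sketch rather than an Oracle tool: it assumes the banner and pid lines look exactly like those in the excerpt, and the names BANNER, PID and listener_starts are invented for the example.

```python
import re

# Start banner, e.g. "TNSLSNR for 64-bit Windows: ... Production on 10-OCT-2010 10:37:37"
BANNER = re.compile(r"^TNSLSNR for .* on (\d{2}-[A-Z]{3}-\d{4} \d{2}:\d{2}:\d{2})")
# Pid line that follows each banner, e.g. "Started with pid=3116"
PID = re.compile(r"^Started with pid=(\d+)")

def listener_starts(lines):
    """Pair each listener start banner with the pid logged after it."""
    starts = []
    pending = None  # timestamp of a banner still waiting for its pid line
    for line in lines:
        line = line.strip()
        banner = BANNER.match(line)
        if banner:
            pending = banner.group(1)
            continue
        pid = PID.match(line)
        if pid and pending is not None:
            starts.append((pending, int(pid.group(1))))
            pending = None
    return starts

sample = [
    "TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 10:37:37",
    "Trace level is currently 0",
    "Started with pid=3116",
    "TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 11:59:37",
    "Trace level is currently 0",
    "Started with pid=5036",
]
for timestamp, pid in listener_starts(sample):
    print(timestamp, pid)
```

Run against the full file, the length of the returned list gives the number of listener restarts in the period covered by the log.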
DAG issue - Unable to start cluster service
Hello,
Let me brief my environment.
- 2 Sites
- 1 DAG, 4 Servers
- 2 Servers each site
Situation:
I have just updated the OS on all the servers in the DAG and updated all the Exchange versions to SP3 RU3. One of the servers in the DAG/cluster is down/unavailable. I found that the Cluster service on that server is disabled and cannot be started.
Errors:
Event ID:
- 7024 - The Cluster Service service terminated with service-specific error. The system cannot find the file specified..
- 7031 -
- in to ensure that this machine is a member of a cluster. If you intend to add this machine to an existing cluster use the Add Node Wizard. Alternatively, if this machine has been configured as a member of a cluster, it will be necessary to restore the missing configuration data that is necessary for the Cluster Service to identify that it is a member of a cluster. Perform a System State Restore of this machine in order to restore the configuration data.
Please help me...
Hi,
From the error you provided above, it seems that a node (the one server in the DAG you mentioned) doesn't belong to the existing cluster. I recommend you clean up the cluster configuration on this node with the command cluster node nodename /forcecleanup (a cluster.exe command rather than a cmdlet), then rejoin the node to the cluster and restart the Cluster service.
Best regards,
Belinda
Belinda Ma
TechNet Community Support -
Cluster Service Monitoring - Is There An Alert When a Volume is Available?
We've seen some alerts that show a shared volume is no longer available. They look something like this
Alert: Shared Volume IO is paused
Source: Cluster Service
Path: Host.domain.com
Last modified by: System
Last modified time: 2/14/2011 7:16:10 AM
Alert description: Cluster Shared Volume 'Volume4' ('Exchange Mail Data') is no longer available on this node because of 'STATUS_CONNECTION_DISCONNECTED(c000020c)'. All I/O will temporarily be queued until a path to the volume is reestablished.
We're wondering if there is a way to generate an alert that tells us the volume is available again.
Orange County District Attorney
Hi,
Based on my research, this monitor is based on the Cluster Shared Volume related Events:
Event Log Rules
http://technet.microsoft.com/en-us/library/dd491018.aspx
Please also see the Events listed:
Cluster Shared Volume Functionality
http://technet.microsoft.com/en-us/library/ee830309(WS.10).aspx
However, I could not find an event that means “the cluster shared volume is available again”; therefore, I suspect this cannot be monitored from the Event Log.
In addition, I noticed that the status of a cluster shared volume can be queried with a PowerShell cmdlet. I hope this gives you some hints:
Get-ClusterSharedVolume
http://technet.microsoft.com/en-us/library/ee460981.aspx
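If the volume state is polled on a schedule (for example by running Get-ClusterSharedVolume and recording the result), an "available again" alert can be derived from the change in state between two samples. The sketch below shows only that comparison. It assumes the samples arrive as (volume name, state) pairs in time order and that a healthy volume reports the state "Online"; both the input shape and the function name recovery_alerts are assumptions made for this illustration.

```python
def recovery_alerts(samples):
    """Return one message each time a volume goes from not Online to Online.

    samples: iterable of (volume_name, state) pairs in time order.
    Assumption: a healthy volume reports the state string "Online".
    """
    last_state = {}
    alerts = []
    for volume, state in samples:
        previous = last_state.get(volume)
        # Alert only on a transition, not on every healthy sample.
        if previous is not None and previous != "Online" and state == "Online":
            alerts.append(f"{volume} is available again")
        last_state[volume] = state
    return alerts

history = [
    ("Volume4", "Online"),
    ("Volume4", "Offline"),
    ("Volume4", "Offline"),
    ("Volume4", "Online"),
]
print(recovery_alerts(history))
```

Tracking the previous state per volume keeps the alert from firing on the very first sample or while the volume simply stays online.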
Thanks.
Nicholas Li - MSFT
-
I am experiencing this error in one of our cluster environments. Can anyone help me with this issue?
The Cluster Service function call 'ClusterResourceControl' failed with error code '1008(An attempt was made to reference a token that does not exist.)' while verifying the file path. Verify that your failover cluster is configured properly.
Thanks,
Venu S.
Hi Venu S,
Based on my research, you might be encountering a known issue. Please try the hotfix in this KB:
http://support.microsoft.com/kb/928385
Meanwhile, since there is little information about this issue, please provide the following before we investigate further:
The version of Windows Server you are using
The result of SELECT @@VERSION
The scenario when you get this error
If anything is unclear, please let me know.
Regards,
Tom Li -
How to open a service order using event handling
How do I open a service order using event handling?
Hi,
Can you explain your requirement in more detail?
As I understand it, you want to open the service order creation page based on some event (maybe a submit button).
Technically, you can use navigation->goto_page('Provide the URL') for that,
or you can use the inbound plug and outbound plug concept for navigation.
Regards,
Devender V -
The Cluster service is shutting down because quorum was lost
Hi, we recently experienced the above issue and after looking for explanations I haven't been able to find any satisfying answers when other people have posted this issue.
Our problem is as follows:
2 node 2008R2 cluster running SQL 2012
Each node is a HP BL460c running in a HP C7000 Blade Chassis.
We were updating the flexfabric cards on one of the chassis. The other chassis had been patched the previous week with no problems.
During the update process the flexfabric cards, which hold the Ethernet and FC connections, reboot, so before work began all active cluster services were failed over to the node in the chassis not being worked on. Despite this, the cluster service shut down on this one particular cluster. All other clusters running across these 2 chassis continued to run as expected.
As other people have posted before we saw the following errors in the system log.
1564: File share witness resource 'File Share Witness' failed to arbitrate for the file share
1069: Cluster resource 'File Share Witness' in clustered service or application 'Cluster Group' failed.
1172: The Cluster service is shutting down because quorum was lost. This could be due to the loss of network connectivity between some or all nodes in the cluster, or a failover of the witness disk.
Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.
However, we can't understand what could cause this when the service is running on the node in the chassis not being updated, especially when the same update was performed the week before with no issues. How can both nodes lose connectivity to the File Share Witness at the same time?
Cluster Validation tests run fine and don't highlight any issues. The file share witness is accessible from both servers.
Hi,
Please confirm you have installed the recommended hotfixes and updates for Windows Server 2008 R2 SP1 failover clusters, especially the following hotfixes.
The network location profile changes from "Domain" to "Public" in Windows 7 or in Windows Server 2008 R2
http://support.microsoft.com/kb/2524478/EN-US
A hotfix is available that adds two new cluster control codes to help you determine which cluster node is blocking a GUM update in Windows Server 2008 R2 and Windows Server 2012
http://support.microsoft.com/kb/2779069/EN-US
Hope this helps.
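As background to event 1172, the arithmetic behind "quorum was lost" can be written down in a few lines. This is a simplified sketch, assuming a majority-based configuration in which each of the 2 nodes and the file share witness holds one vote; it does not model how the Cluster service actually arbitrates for the witness, and has_quorum is a name made up for the example.

```python
def has_quorum(total_votes, reachable_votes):
    """A partition keeps quorum only with a strict majority of all votes."""
    return reachable_votes > total_votes // 2

# Assumed layout: 2 nodes + 1 file share witness = 3 votes.
TOTAL_VOTES = 3

# A node that still reaches the witness holds its own vote plus the witness vote.
print(has_quorum(TOTAL_VOTES, 2))  # True

# A node that has lost both its partner and the witness holds only its own vote.
print(has_quorum(TOTAL_VOTES, 1))  # False
```

Under these assumptions, the surviving node keeps the cluster running only while it can still reach the witness, which fits events 1564 and 1069 being logged just before 1172.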
-
Web Services Round Robin Service Load Balancer Event Endpoint Failure
I keep seeing these errors in the UlsTraceLogs:
SharePoint Web Services Round Robin Service Load Balancer Event: EndpointFailure Process Name: OWSTIMER Process ID: 3748 AppDomain Name: DefaultDomain AppDomain ID: 1 Service Application Uri: urn:schemas-microsoft-com:sharepoint:service:9b3095eda69947b299d2f873bbfee5ad#authority=urn:uuid:a01381a61b244525ab4fec30cde9dc5f&authority=https://ApplicationServerName:port/Topology/topology.svc
Active Endpoints: 2 Failed Endpoints:1 Affected Endpoint:
http://WFEserverName:port/9b3095eda69947b299d2f873bbfee5ad/ProfileService.svc
What do these errors mean?
OK, thanks, I'll have a look at that.
Going back to my issue... Since I stopped the User Profile Service on the Application server, now I'm getting these non-stop messages in the log:
SharePoint Web Services Round Robin Service Load Balancer Event: EndpointFailure Process Name: w3wp Process ID: 6088 AppDomain Name: /LM/W3SVC/261708640/ROOT-1-130709594108226406 AppDomain ID: 2 Service Application Uri: urn:schemas-microsoft-com:sharepoint:service:9b3095eda69947b299d2f873bbfee5ad#authority=urn:uuid:a01381a61b244525ab4fec30cde9dc5f&authority=https://ApplicationServerName:port/Topology/topology.svc
Active Endpoints: 2 Failed Endpoints:1 Affected Endpoint:
http://ApplicationServerName:port/9b3095eda69947b299d2f873bbfee5ad/ProfileService.svc
SharePoint Web Services Round Robin Service Load Balancer Event: EndpointFailure Process Name: OWSTIMER Process ID: 8304 AppDomain Name: DefaultDomain AppDomain ID: 1 Service Application Uri: urn:schemas-microsoft-com:sharepoint:service:9b3095eda69947b299d2f873bbfee5ad#authority=urn:uuid:a01381a61b244525ab4fec30cde9dc5f&authority=https://ApplicationServerName:port/Topology/topology.svc
Active Endpoints: 2 Failed Endpoints:1 Affected Endpoint:
http://ApplicationServerName:port/9b3095eda69947b299d2f873bbfee5ad/ProfileService.svc
This time, the messages are referring to the same server - the Application Server. In my original question, I should've differentiated the server names when I pasted the message. Originally the message was referring to the Application Server and Affected Endpoint was referring to a WFE. I'll edit my original post to make it correct. -
Call GET_SEARCH_RESULTS service from scheduler event filter in UCM
Hi,
In our application, a mail should be sent to the content author on the content's revised date. For that, we have written a scheduler event filter component that is invoked every five minutes. In the filter class I want to call the GET_SEARCH_RESULTS service to get the list of content items and their authors so that I can send the mail.
Can anybody please tell me how to call the GET_SEARCH_RESULTS service from a scheduler event filter?
Thanks in advance.
Hi Nitin
Why not try writing a custom query and a custom service?
Please refer to the Idoc Script reference guide for the parameters to pass when using GET_SEARCH_RESULTS. -
Errors for excel - excel service unavailable. Event Viewer has error event ids - 5239 and 5231.
We restart the Excel Services application and that solves it. We are looking for a permanent solution.
Regards,
Kunal
To resolve the issue, do a simple restart.
Restart the server
Before restarting, verify that this problem occurs often. It may be an intermittent problem that is automatically corrected and does not require you to restart the server.
If the problem occurs often, restart the server running Excel Services Application.
If the problem continues to occur often, and restarting the server did not correct the problem, confirm that the hardware of the server is functioning correctly, or reinstall Excel Services Application and re-add the server to the server farm.
Here's the article with the explanation: Error communicating with Excel Services Application - Events 5231 5239 5240
-
Novell Cluster Services on VMWARE
I am currently running a 3 node cluster (oes2sp2 Linux) connected to an EMC SAN. This cluster was commissioned, fully patched, etc 250 days ago. It's been running smoothly and hasn't missed a beat :-) The nodes are bare-metal installs.
Now I've been tasked with investigating the possibility of virtualizing the nodes using VMWare (and making use of VMotion).
This will give us the ability to have 6 virtual nodes rather than 3 physical nodes.
What is the general feeling of the community with regards to:
1. Virtualizing cluster nodes
2. How complex is the setup on VMWare
3. Could I do a rolling migration from physical to virtual
Many thanks in advance for any comments, tips, advice, etc
On 11.03.2011 14:06, laurabuckley wrote:
>
> Now I've been tasked with investigating the possibility of virtualizing
> the nodes using VMWare (and making use of VMotion).
>
> This will give us the ability to have 6 virtual nodes rather than 3
> physical nodes.
>
> What is the general feeling of the community with regards to:
>
> 1. Virtualizing cluster nodes
> 2. How complex is the setup on VMWare
> 3. Could I do a rolling migration from physical to virtual
There has been a lot of talk about this and there are many things to consider. Here are a few, at least:
If you have (or buy) VMware HA and vMotion, do you even need Novell
Cluster Services anymore?
if you want vMotion, you need to use vmdk for storage, which might be an
issue for large volumes.. or not?
For large storage you could use RDM LUNs directly from SAN. But then you
cannot use vMotion or run many nodes (of the same cluster) on one
physical server.
We are running multiple three node OES2 NCS clusters on three VMware
servers, using RDM LUNs from EMC SAN. Did a rolling migration from
physical Netware clusters using the same LUNs.
-sk -
Cluster services UNKNOWN state
Hi,
I have a two-node cluster database and some doubts.
If cluster services go into the UNKNOWN state on the first node, will existing connections fail over to the second node?
Will new connections try to connect to the first node?
user2017273 wrote:
> Hi,
> I have a two-node cluster database and some doubts.
Quit doubting and TEST it for yourself. Actually reading the documentation will also help.
> If cluster services go into the UNKNOWN state on the first node, will existing connections fail over to the second node?
Maybe...
> Will new connections try to connect to the first node?
If node x is down, any connection attempt should go to the remaining nodes. -
Error in coherence-- stopping cluster service.
I found this error in one of my Coherence server log files. Can someone explain what it means?
Coherence Logger@9272718 3.4.2/411 ERROR 2009-06-01 16:08:31.396/1217.130 Oracle Coherence GE 3.4.2/411 <Error> (thread=Cluster, member=3): Received cluster heartbeat from the senior Member(Id=7, Timestamp=2009-04-24 12:29:25.802, Address=xx.xxx.xx.xxx:8093, MachineId=55400, Location=machine:server72,process:11324, Role=WeblogicServer) that does not contain this Member(Id=3, Timestamp=2009-06-01 15:48:09.18, Address=xx.xxx.xxx.xx:8091, MachineId=47428, Location=site:ops.company.org,machine:cohserverbox1,process:14401, Role=CoherenceServer); stopping cluster service.
Thanks Much
Hi,
This error essentially means what it says: The process received a cluster heartbeat that did not include the process as a member of the cluster. The process, therefore, stops its cluster service and will attempt to join the cluster again when appropriate. There are few reasons that the senior member may not have included the process in its heartbeat. Based on the timestamps and roles, I would first want to confirm the intent to cluster these processes. If the intent is not to cluster these processes, I would adjust their configurations appropriately (eg. use a distinct port) to form separate clusters. If the intent is to cluster these processes and the error (with the timestamp spread) reproduces, I would want to examine the network topology and look for reasons the members are being dropped from the cluster.
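To compare the timestamps and roles mentioned above, it can help to pull the two Member(...) descriptions out of the error line. The following is a minimal sketch that assumes the fields appear in the same order as in the message quoted in the question; MEMBER and parse_members are names invented for this example, not part of Coherence.

```python
import re

# Assumption: fields appear as Id, Timestamp, Address, ..., Role, as in the quoted error.
MEMBER = re.compile(
    r"Member\(Id=(?P<id>\d+), Timestamp=(?P<timestamp>[^,]+), "
    r"Address=(?P<address>[^,]+), .*?Role=(?P<role>\w+)\)"
)

def parse_members(log_line):
    """Return one dict per Member(...) description found in the line."""
    return [match.groupdict() for match in MEMBER.finditer(log_line)]

line = (
    "Received cluster heartbeat from the senior Member(Id=7, "
    "Timestamp=2009-04-24 12:29:25.802, Address=xx.xxx.xx.xxx:8093, "
    "MachineId=55400, Location=machine:server72,process:11324, "
    "Role=WeblogicServer) that does not contain this Member(Id=3, "
    "Timestamp=2009-06-01 15:48:09.18, Address=xx.xxx.xxx.xx:8091, "
    "MachineId=47428, Location=site:ops.company.org,machine:cohserverbox1,"
    "process:14401, Role=CoherenceServer); stopping cluster service."
)
for member in parse_members(line):
    print(member["id"], member["timestamp"], member["role"])
```

Here the senior member has the role WeblogicServer and joined weeks before the CoherenceServer process, which is the kind of spread that prompts the question of whether these processes were meant to share a cluster at all.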
Regards,
Harv -
Configure the ADMIN and CLUSTER service connections to be SSL
Can you configure the ADMIN and CLUSTER service connections to be SSL
rather than tcp?
I was wondering about the present or future ability to secure other
connection services with SSL. Can you now or are there future plans
to configure the ADMIN and CLUSTER service connections to be SSL
rather than tcp? I suppose I should add the PORTMAPPER to that list.
My primary interest is for an SSLCLUSTER service in the case where
two brokers are connected over a non-trusted network. It may
not be too difficult to secure all the services the same way, but
perhaps that is on the TODO list.
A related question is if there are plans to add SSL with client
authentication as a stronger authentication mechanism than 'simple'
username and password. I believe you could get the username from
the client certificate's DN and continue to use the same LDAP user
repository for access control. I think this is similar to the way
that BEA's Weblogic server does it.
Finally should it be possible to deploy the HTTP tunnel servlet to
a webserver (such as iPlanet Web Server) configured to do SSL with
client authentication as a work-around to get stronger authentication
with the current release of the product? Or am I perhaps missing some
obvious and important detail? :) I guess I would like to know it's been
done already or is at least possible before I try and do it myself.
3 scenarios involving SSL are:
1: JMS client <------- SSL -------> iMQ broker
2: iMQ admin <------- SSL -------> iMQ broker
3: iMQ broker <------- SSL -------> iMQ broker (i.e clusters)
(1) is currently supported in iMQ 2.0
(2) and (3) are not supported in iMQ 2.0. There are no concrete plans yet to support
them in the near future, but we'll definitely consider doing so if we
hear a lot of demand for it.
]A related question is if there are plans to add SSL with client
]authentication as a stronger authentication mechanism than 'simple'
]username and password. I believe you could get the username from
]the client certificate's DN and continue to use the same LDAP user
]repository for access control. I think this is similar to the way
]that BEA's Weblogic server does it.
This is on our todo list, but due to other more pressing issues we
have not been able to address it. We will continue to keep it
on our potential list of new features.
Sorry if I sound pretty wishy-washy in my responses above, but the fact
is that the things you mentioned above had to take a backseat
to other more critical features. That and the usual time/resource
constraints caused them not to be implemented.
]Finally should it be possible to deploy the HTTP tunnel servlet to
]a webserver (such as iPlanet Web Server) configured to do SSL with
]client authentication as a work-around to get stronger authentication
]with the current release of the product? Or am I perhaps missing some
]obvious and important detail? :) I guess I would like to know it's been
]done already or is at least possible before I try and do it myself.
Yes, this should be possible (although I don't believe we've tried it here).
The client authentication here is really only between the JMS client and the
web server (not between the tunnel servlet and the iMQ broker) and should
be similar in setup to any other java application talking to iPlanet Web
Server.