Cluster node networking

I have a five-node Windows Server 2008 R2 Hyper-V cluster. I put one node into maintenance mode and all VMs migrated to the other hosts. For testing, I pulled the LAN cables out of that node one at a time (one out, waited a little, put it back, then pulled the second, and so on).
After that I had a lot of cluster errors and some VMs restarted.
I have put nodes into maintenance mode many times and restarted or shut them down without ever having cluster problems. Why did I have problems now, when I pulled out the LAN cables?

Hi antesl,
The failover behavior occurs because the cluster detected the failure of a cluster resource or node, such as a network or storage failure. Please refer to the following articles to confirm that there is no potential single point of failure in your cluster configuration.
Failover Cluster
http://msdn.microsoft.com/en-us/library/ff650328.aspx
Failover Cluster Step-by-Step Guide: Configuring the Quorum in a Failover Cluster
http://technet.microsoft.com/zh-cn/library/cc770620(v=ws.10).aspx
How a Server Cluster Works
http://technet.microsoft.com/en-us/library/cc738051(v=ws.10).aspx
HYPER-V 2008 R2 SP1 Best Practices (In Easy Checklist Form)
http://blogs.technet.com/b/askpfeplat/archive/2012/11/19/hyper-v-2008-r2-sp1-best-practices-in-easy-checklist-form.aspx

Similar Messages

Regarding Pulling out network cable from cluster node

I have two cluster nodes installed with my application.
I pulled the network cable out of the primary node, where my application is running, so the primary is not reachable from a remote box (I cannot ping the primary).
I have found the following error messages
SUNW,hme0 : No response from Ethernet network : Link down -- cable problem?
The device group and resource group are still online on the primary, and Sun Cluster does not fail over to the secondary node. Does Sun Cluster support this scenario?
Or do I need any additional configuration? Can I get clarification on this?

Hi Sudheer,
If you have two interfaces in your IPMP group, I am missing the test address.
http://docs.sun.com/app/docs/doc/819-3000/emybr?l=en&q=ipmp&a=view
gives this example for hostname.hme0:
192.168.85.19 netmask + broadcast + group testgroup1 up \
addif 192.168.85.21 deprecated -failover netmask + broadcast + up
and for hostname.hme1:
192.168.85.20 netmask + broadcast + group testgroup1 up \
addif 192.168.85.22 deprecated -failover netmask + broadcast + up
You can safely replace the addresses with names if they are in /etc/hosts.
In this case the -failover flag on the physical address in your example is wrong.
If you only have one adapter, one line in /etc/hostname.hme0 like the one in your example is correct.
This is from one of my clusters:
deulwork20 group sc_ipmp0 -failover
It is the IPMP group Sun Cluster creates for you if you do not specify anything else. So for a single adapter, one line like "hadev1 group sc_ipmp0 -failover" is correct.
Detlef

SCVMM losing connection to cluster nodes

Hey guys'n girls, I hope this is the right forum for this question. I already opened a ticket at MS support as well because it's indirectly impacting our production environment, but even after a week there's been no contact. I'm losing faith in MS support there.
The problem we're having is that in SCVMM a host enters the 'needs attention' state, with WinRM error 0x80338126. I guess it has something to do with the network or with Kerberos, and I've found some info on it, but I still haven't been able to solve it. Do you guys have any ideas?
Problem summary:
We are seeing an issue on our new hyper-v platform. The platform should have been in production last week, but this issue is delaying our project as we can't seem to get it stable.
The problem we are experiencing is that SCVMM loses the connection to some of the Hyper-V nodes. Not one specific node. Last week it happened to two nodes, and today it happened to another node. I see issues with WinRM, and I suspect it has something to do with Kerberos. See the bottom of this post for background details and software versions.
The host gets the status 'needs attention', and if you look at the status of the machine, WinRM gives an error. The error is:
Error (2916)
VMM is unable to complete the request. The connection to the agent cc1-hyp-10.domaincloud1.local was lost.
WinRM: URL: [http://cc1-hyp-10.domaincloud1.local:5985], Verb: [ENUMERATE], Resource: [http://schemas.microsoft.com/wbem/wsman/1/wmi/root/cimv2/Win32_Service], Filter: [select * from Win32_Service where Name="WinRM"]
Unknown error (0x80338126)
Recommended Action
Ensure that the Windows Remote Management (WinRM) service and the VMM agent are installed and running and that a firewall is not blocking HTTP/HTTPS traffic. Ensure that VMM server is able to communicate with cc1-hyp-10.domaincloud1.local over WinRM by successfully running the following command:
winrm id -r:cc1-hyp-10.domaincloud1.local
This problem can also be caused by a Windows Management Instrumentation (WMI) service crash. If the server is running Windows Server 2008 R2, ensure that KB 982293 (http://support.microsoft.com/kb/982293) is installed on it.
If the error persists, restart cc1-hyp-10.domaincloud1.local and then try the operation again.
Refer to http://support.microsoft.com/kb/2742275 for more details.
Doing a simple test from the VMM server to the problematic cluster node shows this error:
PS C:\> hostname
CC1-VMM-01
PS C:\> winrm id -r:cc1-hyp-10.domaincloud1.local
WSManFault
    Message = WinRM cannot complete the operation. Verify that the specified computer name is valid, that the computer is accessible over the network, and that a firewall exception for the WinRM service is enabled and allows access from this computer. By default, the WinRM firewall exception for public profiles limits access to remote computers within the same local subnet.
Error number: -2144108250 0x80338126
WinRM cannot complete the operation. Verify that the specified computer name is valid, that the computer is accessible over the network, and that a firewall exception for the WinRM service is enabled and allows access from this computer. By default, the WinRM firewall exception for public profiles limits access to remote computers within the same local subnet.
I CAN connect from other hosts to this problematic cluster node:
PS C:\> hostname
CC1-HYP-16
PS C:\> winrm id -r:cc1-hyp-10.domaincloud1.local
IdentifyResponse
    ProtocolVersion = http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd
    ProductVendor = Microsoft Corporation
    ProductVersion = OS: 6.3.9600 SP: 0.0 Stack: 3.0
    SecurityProfiles
        SecurityProfileName = http://schemas.dmtf.org/wbem/wsman/1/wsman/secprofile/http/spnego-kerberos
And I can connect from the vmm server to all other cluster nodes:
PS C:\> hostname
CC1-VMM-01
PS C:\> winrm id -r:cc1-hyp-11.domaincloud1.local
IdentifyResponse
    ProtocolVersion = http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd
    ProductVendor = Microsoft Corporation
    ProductVersion = OS: 6.3.9600 SP: 0.0 Stack: 3.0
    SecurityProfiles
        SecurityProfileName = http://schemas.dmtf.org/wbem/wsman/1/wsman/secprofile/http/spnego-kerberos
So at this point only the test from the cc1-vmm-01 to cc1-hyp-10 seems to be problematic.
I followed the steps on the page https://support.microsoft.com/kb/2742275 (which is referred to above). I tried the VMMCA, but I can't really get it working the way I want, or it seems to give outdated recommendations.
I tried checking for duplicate SPNs by running setspn -x on the affected machines. No results (although I do not understand what an SPN is or how it works). I rebuilt the performance counters.
I tried setting 'sc config winrm type= own' as described in [http://blinditandnetworkadmin.blogspot.nl/2012/08/kb-how-to-troubleshoot-needs-attention.html].
If I reboot this cc1-hyp-10 machine, it starts working perfectly again. However, then I can't troubleshoot the issue, and it will happen again.
I want this problem solved, so that VMM never again loses its connection to the hypervisors it's managing!
Background information:
We've set up a platform with Hyper-V to run a VM workload. The platform consists of the following hardware:
2 Dell R620's with 32GB of RAM, running hyper-v to virtualize the cloud management layer (DC's, VMM, SQL). These machines are called cc1-hyp-01 and cc1-hyp-02. They run the management vm's like cc1-dc-01/02, cc1-sql-01, cc1-vmm-01, etc. The names are self-explanatory.
The VMM machine is NOT clustered.
8 Dell M620 blades with 320GB of RAM, running hyper-v to virtualize the customer workload. The machines are called cc1-hyp-10 through cc1-hyp-17. They are in a cluster.
2 Equallogic units form a SAN (premium storage), and we have a Dell R515 running iscsi target (budget storage).
We have Dell Force10 switches and Cisco C3750X switches to connect everything together (mostly 10GB links).
All hosts run Windows Server 2012 R2 Datacenter edition. The VMM server runs System Center Virtual Machine Manager 2012 R2.
All the latest Windows updates are installed on every host. There are no firewalls between any host (vmm and hypervisors) at this level. Windows firewalls are all disabled. No antivirus software is installed, no symantec software is installed.
The only non-standard software that is installed is the Dell Host Integration Tools 4.7.1, Dell Openmanage Server Administrator, and some small stuff like 7-zip, bginfo, net-snap, etc.
The SCVMM service is running under the domain account DOMAINCLOUD1\scvmm. This machine is in the local administrators group of each cluster node.
On top of this cloud layer we're running the tenant layer with a lot of vm's for a specific customer (although they are all off now).

I think I found the culprit. After an hour of analyzing Wireshark dumps, I found that the VMM server had jumbo frames enabled on its management interface to the hosts, while the underlying infrastructure does not. Now my winrm commands work again.
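For anyone checking for the same kind of MTU mismatch, here is a minimal Python sketch of the arithmetic behind a "don't fragment" ping test. It assumes plain IPv4 without header options (20-byte IP header, 8-byte ICMP echo header); the function names are made up for illustration and are not part of any VMM or WinRM tooling.

```python
IP_HEADER = 20    # bytes, IPv4 header without options (assumption)
ICMP_HEADER = 8   # bytes, ICMP echo request header


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def fits_without_fragmentation(payload: int, path_mtu: int) -> bool:
    """True if a ping with this payload can cross a path with the given MTU."""
    return payload <= max_ping_payload(path_mtu)


if __name__ == "__main__":
    # Standard Ethernet MTU versus a common jumbo-frame MTU.
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: max unfragmented ping payload {max_ping_payload(mtu)} bytes")
```

On Windows, the result can be tried with something like `ping -f -l 1472 cc1-hyp-10.domaincloud1.local` (`-f` sets Don't Fragment, `-l` sets the payload size). If a jumbo-sized payload fails while 1472 bytes passes, some hop on the path is probably limited to a 1500-byte MTU. Small requests can still get through in that situation, which may be why only some operations break.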

QMASTER hints for the usual trouble (QM NOT running, CLUSTERED nodes, networks, etc.)

All, I just posted this with some hints & workaround with very common issues people have on this forum and keep asking concerning the use of APPLE QMASTER with FCP, SHAKE, COMPRESSOR and MOTION. I've had many over the last 2 years and see them coming up frequently.
Perhaps these symptoms are fixed in FCS2 at MAY 2007 (now). However if not here's some ROTS that i used for FCP to compressor via QMASTER cluster for example. NO special order but might help someone get around the stuff with QMASTER V2.3, FCP V5.1.4, compressor.app V2.3
I saw the latest QMASTER UI and usage at NAB2007 and it looked a little more solid with some "EASY SETUP" stuff. I hope it has been reworked underneath.. I guess I will know soon if it has.
For most FCP/COMPRESSOR, SHAKE, MOTION and COMPRESSOR work:
• provide access from ALL nodes to ALL the source and target objects (files) on their VOLUMES. Simply MOUNT those volumes through the APPLE file system (via NFS) using cmd+K or Finder/Go/Connect to Server, OR use an SSAFS such as XSAN™ where the file systems are all shared over FC, not the network. You will notice the CPUs going very busy for a small while. This is the APPLE FILE SYSTEM task; I guess it's doing "spotlight stuff". This goes away after a few minutes.
• set the COMPRESSOR preferences for "CLUSTER OPTIONS" to "Never copy source to Cluster". This means that all nodes can access your source and target objects (files) over NFS (as above). Failure to do this means LENGTHY times to COPY material back and forth, in some cases undermining the pleasure gained from initially using clustering (reduced job times).
• DON'T mix the PHYSICAL or LOGICAL networks in your local cluster. I don't know why, but I could never get this to work. Physical means stick with either ETHERNET or FIREWIRE or your other (AirPort etc., which will generally be way too slow and useless). Logical means keeping all nodes on the SAME subnet. You can do this simply by setting it up in system preferences/QMASTER/advanced tab under "Use Network Interfaces". In my current QUAD I set this to use BUILT IN ETHERNET1 and in the MPBDC's I set this to their BUILTIN ETHERNET.
• LOGICAL NETWORKS (subnet): simply HARDCODE an IP address on the ETHERNET (for example) for your cluster nodes and the service controller. For example 3.1.1.x .... it will all connect fine.
• Physical Networks: As above, (1) DON'T MIX FireWire (IPoFW) and Ethernet (IPoE). (2) If more than one extra service node, USE A HUB or SWITCH. I went and bought a 10 port GbE HUB for about $HK400 (€40) and it worked fine. I was NEVER able to get a stable system of QMASTER mixing FW and ETHERNET. (3) fwiw using IP over FW caused me a LOAD of DISK errors and timeouts (I/O errors) on those DISKs that were FW400 (all gone now), but it showed this was not stable overall.
• for the cluster controller node MAKE SURE you set the CLUSTER STORAGE (system preferences/QMASTER/shared cluster storage) for the CLUSTER CONTROLLER NODE to be ON A SHARED volume (see above). This seems essential for SHAKE to work. (If not, check the Qmaster errors in the console.app [see below].) IF you have an SSAFS like XSAN™ then just add this cluster storage on a shared file path. Note that QMASTER does not permit the cluster storage to be on a NETWORK NODE for some reason. So in short just MOUNT the volume where the SHARED CLUSTER file is maintained for the CLUSTER controller.
• FCP - avoid EXPORT to COMPRESSOR from the TIMELINE - it never seems to work properly (see later). Instead EXPORT FROM SEQUENCE in the BROWSER - consistent results
• FCP - "media missing" messages on EXPORT to COMPRESSOR.. seems a defect in FCP 5.1 when you EXPORT using a sequence that is NOT in the "root" or primary tree in the FCP PROJECT BROWSER. Simply put, if you have browser/bin A (contains Bin B (contains Bin C (contains sequence X))), this will FAIL (won't work) for "EXPORT TO COMPRESSOR" if you use EXPORT to COMPRESSOR in an FCP browser PANE that is separately OPEN. To get around this, simply OPEN/EXPOSE the triangles/trees in the BROWSER PANE for the PROJECT and select the SEQUENCE you want and "EXPORT to COMPRESSOR" from there. This has been documented in a few places in this forum I think.
• FCP -> COMPRESSOR -> .M2V (for DVDSP3): some things here. EXPORTING from an FCP SEQUENCE with CHAPTER MARKERS to an MPEG2 .M2V encoding USING A CLUSTER causes errors in the placement of the chapter markers when it is imported to DVDSP3. In fact CONSISTENTLY, ALL the chapter markers are PLACED AT THE END of the TRACK in DVDSP3 - somewhat useless. This seems to happen ALSO when the source is an FCP reference movie, although inconsistently. A simple workaround, if you have the machines, is to TURN OFF SEGMENTING in the COMPRESSOR ENCODER inspector and let each .M2V transcode run on the same service node. For the jobs at hand just set up a CLUSTER and controller for each machine and then SELECT the cluster (myclusterA, hisclusterb, herclusterc) for each transcode job.. anyway for me.. in the time spent resolving all this I could have TRANSCODED all this on my QUAD and it would all have been done sooner! (LOL)
• CONSOLE logs: IF QMASTER fails, I would suggest your first port of diagnosis should be /Library/Logs/Qmaster. In there you will see (on the controller node) compressor.log, jobcontroller.com.apple.qmaster.cluster.admin.log, and lots of others including service controller.com.apple.qmaster.executorX.log (for each cpu/core and node) and qmasterca.log. All these are worth a look and helped me solve 90% of my qmaster errors and failures.
• MOTION 3 - fwiw.. EXPORT USING COMPRESSOR to a CLUSTER seems to fail EVERY TIME.. seems MOTION is writing stuff out to a /var/spool/qmaster
TROUBLESHOOTING QMASTER: IF QMASTER seems buggered up (hosed), then follow these steps PRIOR to restarting your machines.
Go read the TROUBLESHOOTING sections in the published APPLE docs for COMPRESSOR, SHAKE and "SET UP FOR DISTRIBUTED PROCESSING" and search these forums CAREFULLY.. the answer is usually there somewhere.
ELSE THEN,, try these steps....
You'll feel that QMASTER is in trouble when you
• see that the QMASTER ICON at the top of the screen says 'NO SERVICES" even though that node is started and
• that the APPLE QMASTER ADMINISTRATOR is VERY SLOW after an "APPLY" (like minutes with SPINNING BEACHBALL) or it WON'T LET YOU DELETE a cluster or you see 'undefined' nodes in your cluster (meaning that one was shut down or had a network failure)..... all this means it's going to get worse and worse. SO DON'T submit any more work to QMASTER... best count your gains and follow this list next.
(a) in COMPRESSOR.app / RESET BACKGROUND PROCESSES (it's under the COMPRESSOR name list box) see if things get kick started, but you will lose all the work that has been done up to that point for COMPRESSOR.app
(b) if not OK, then on EACH node in that cluster, STOP the QMASTER (system preferences/QMASTER/setup [set 0 minutes in the prompt and OK]). Then when STOPPED, RESET the shared services by OPTION+CLICKing on the "START" button to reveal "RESET SERVICES". Then click "START" on each node to start the services. This has the action of REMOVING or, in the case where the CLUSTER CONTROLLER node is "RESET", of terminating the cluster that's under its control. If so, simply go to APPLE QMASTER ADMINISTRATOR and REDEFINE it. Go restart your cluster.
(c) if step (b) is no help, consult the QMASTER logs in /Library/Logs/Qmaster (using the console.app) for any FILE MISSING or FILE not found or FILE ERROR. Look carefully for the NODENAME (the machine_name.local) where the error may have occurred. Sometimes it's very chatty. Other times it is not. Also look in the BATCH MONITOR OUTPUT for error messages. Often these are NEVER written (or I can't find them) in the /var/logs... try and resolve any issues you can see (mostly VOLUME or FILE path issues in my experience)
(d) if still no joy then - try removing all the 'dead' cluster files from /var/tmp/qmaster, /var/spool/qmaster and also the file directory that you specified above for the controller to share the clustering. For SHAKE issues, go do the same (note also where the shake shared cluster file path is - it can also be specified in the RENDER FILEOUT nodes prompt).
(e) if all this WON'T help you, it's time to get the BIG hammer out. Simply STOP all nodes if not stopped (if status/mode is "STOPPING" then it [QMASTER] is truly buggered), DISMOUNT the network volumes you had mounted, and RESTART ALL YOUR NODES. This has the effect of RESTARTING all the QMASTERD tasks. Yes, sure you can go in and SUDO restart them, but it is dodgy at best because they never seem to terminate cleanly (kill -9 etc.), or FORCE QUIT.... is what one ends up doing and then STILL having to restart.
(f) after the restart, perform the steps from (b) again and it will usually (but not always) be right after that
Lastly - here are some posts I have made that may help others for QMASTER 2.3 .. and not for the NEW QMASTER as at May 2007...
Topic "qmasterd not running" - how this happened and what we did to fix it. - http://discussions.apple.com/message.jspa?messageID=4168064#4168064
Topic: IP over Firewire AND Ethernet connected cluster? http://discussions.apple.com/message.jspa?messageID=4171772#4171772
Lastly, spend some DEDICATED time using OBJECTIVE keywords to search the FINAL CUT PRO, SHAKE, COMPRESSOR, MOTION and QMASTER forums.
Hope that helps.
G5 QUAD 8GB ram w/3.5TB + 2 x 15in MBPCore Mac OS X (10.4.9) FCS1, SHAKE 4.1

Warwick,
Thanks for joining the forum and for doing all this work and posting your results for our benefit.
As FCP2 arrives in our shop, we will try once again to make sense of it and to see if we can boost our efficiencies in rendering big projects and getting Compressor to embrace five or six idle Macs.
Nonetheless, I am still in "Major Disbelief Mode" that Apple has done so little to make this software actually useful.
bogiesan

INS-40925 - One or more nodes have interfaces not configured with a subnet that is common across all cluster nodes.

Hi All,
I am facing the below error while installing Oracle RAC in Silent Mode.
SEVERE: There are no common subnets represented by network interfaces across all cluster nodes.
SEVERE: [FATAL] [INS-40925] One or more nodes have interfaces not configured with a subnet that is common across all cluster nodes.
   CAUSE: Not all nodes have network interfaces that are configured on subnets that are common to all nodes in the cluster.
   ACTION: Ensure all cluster nodes have a public interface defined with the same subnet accessible by all nodes in the cluster.
My /etc/hosts is given below.
127.0.0.1        localhost    localhost.localdomain
#Public
192.168.1.101      rac1        rac1.localdomain
192.168.1.102    rac2        rac2.localdomain
#Private
192.168.2.101    rac1-priv    rac1-priv.localdomain
192.168.2.102    rac2-priv    rac2-priv.localdomain
#Virtual
192.168.1.103      rac1-vip    rac1-vip.localdomain
192.168.1.104    rac2-vip    rac2-vip.localdomain
#SCAN
192.168.1.105    rac-scan    rac-scan.localdomain
Could you please help me get rid of the error INS-40925? Any ideas?

Hi Ramesh,
Please find the result of ifconfig -a from both nodes RAC1 & RAC2.
ifconfig -a in RAC1
[oracle@rac1 Desktop]$ ifconfig -a
eth0      Link encap:Ethernet HWaddr 08:00:27:17:7A:D5
          inet addr:192.168.1.101 Bcast:192.168.1.255 Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe17:7ad5/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:102 errors:0 dropped:0 overruns:0 frame:0
          TX packets:48 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:25472 (24.8 KiB) TX bytes:3322 (3.2 KiB)
          Interrupt:19 Base address:0xd020
eth1      Link encap:Ethernet HWaddr 08:00:27:C0:AC:DB
          inet addr:192.168.2.101 Bcast:192.168.2.255 Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fec0:acdb/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:4 errors:0 dropped:0 overruns:0 frame:0
          TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:240 (240.0 b) TX bytes:816 (816.0 b)
          Interrupt:16 Base address:0xd240
lo        Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:16436 Metric:1
          RX packets:56 errors:0 dropped:0 overruns:0 frame:0
          TX packets:56 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:6394 (6.2 KiB) TX bytes:6394 (6.2 KiB)
virbr0    Link encap:Ethernet HWaddr 52:54:00:CC:BD:FB
          inet addr:192.168.122.1 Bcast:192.168.122.255 Mask:255.255.255.0
          UP BROADCAST MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
virbr0-nic Link encap:Ethernet HWaddr 52:54:00:CC:BD:FB
          BROADCAST MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500
          RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
ifconfig -a in RAC2
[oracle@rac2 Desktop]$ ifconfig -a
eth0      Link encap:Ethernet HWaddr 08:00:27:C9:38:82
          inet addr:192.168.1.102 Bcast:192.168.1.255 Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fec9:3882/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:122 errors:0 dropped:0 overruns:0 frame:0
          TX packets:59 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:32617 (31.8 KiB) TX bytes:5157 (5.0 KiB)
          Interrupt:19 Base address:0xd020
eth1      Link encap:Ethernet HWaddr 08:00:27:90:B5:A0
          inet addr:192.168.2.102 Bcast:192.168.2.255 Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe90:b5a0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:4 errors:0 dropped:0 overruns:0 frame:0
          TX packets:11 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:240 (240.0 b) TX bytes:746 (746.0 b)
          Interrupt:16 Base address:0xd240
lo        Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:16436 Metric:1
          RX packets:56 errors:0 dropped:0 overruns:0 frame:0
          TX packets:56 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:6390 (6.2 KiB) TX bytes:6390 (6.2 KiB)
virbr0    Link encap:Ethernet HWaddr 52:54:00:CC:BD:FB
          inet addr:192.168.122.1 Bcast:192.168.122.255 Mask:255.255.255.0
          UP BROADCAST MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
virbr0-nic Link encap:Ethernet HWaddr 52:54:00:CC:BD:FB
          BROADCAST MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500
          RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
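
Since INS-40925 is about subnets shared by all nodes, the check can be reproduced by hand. The following is a minimal Python sketch using the standard `ipaddress` module and the IPv4 addresses and netmasks from the ifconfig output above. The names `node_subnets`, `common_subnets` and `NODES` are made up for this illustration, and this is not how the Oracle installer itself performs the check.

```python
import ipaddress


def node_subnets(interfaces):
    """Return the set of IPv4 networks for a list of (address, netmask) pairs."""
    return {ipaddress.ip_interface(f"{addr}/{mask}").network
            for addr, mask in interfaces}


def common_subnets(nodes):
    """Return the networks that have an interface on every node."""
    per_node = [node_subnets(ifaces) for ifaces in nodes.values()]
    return set.intersection(*per_node) if per_node else set()


MASK = "255.255.255.0"
# eth0, eth1 and virbr0 as posted for each node (loopback left out).
NODES = {
    "rac1": [("192.168.1.101", MASK), ("192.168.2.101", MASK), ("192.168.122.1", MASK)],
    "rac2": [("192.168.1.102", MASK), ("192.168.2.102", MASK), ("192.168.122.1", MASK)],
}

if __name__ == "__main__":
    for net in sorted(common_subnets(NODES)):
        print(net)
```

Going by the posted output alone, both nodes do have interfaces on 192.168.1.0/24 and 192.168.2.0/24 (and on the virbr0 network 192.168.122.0/24), so this check by itself does not explain the error. The cause may lie in how the installer discovers the interfaces on the remote node, which the thread does not show.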

Help with a Blind Configuration of a G5 Cluster node

So I bought 2 G5 Cluster Nodes to dedicate some audiovisual processes to them. My only other mac computer is a Core 2 Duo Macbook Pro.
Using Pacifist, I was able to do a clean install of Mac OSX onto the internal drive by putting it into an external enclosure.
Now here is my problem: The cluster nodes have no videocard.
I plan on using them through the OS X Screen Sharing function once they are connected to the network, but I don't know how to do the initial configuration of Mac OS X on them, since I cannot boot my MacBook Pro from a system using the Apple Partition Map, and the cluster node will not boot from the GUID partition scheme.
Can anyone please help me?
Thanks,
Chuck

Assuming you're running Mac OS X Server on the cluster node, just boot the server normally - it will run a special first-time-boot process that sets up a network listener.
You can then install the Server Admin tools on your MacBook Pro and run Server Assistant. Server Assistant will look out over the network and find the new servers, then give you the opportunity to configure them remotely (assign account data, IP address, etc.).
(note you can also do this as part of the initial install process - boot the server from the Install DVD and run the entire OS installation and configuration remotely via Server Assistant)
Note: If you're not running Mac OS X Server on the cluster nodes then the above doesn't apply

How to set time on cluster nodes

We have a setup of two cluster nodes. Both are running with the same time. Now I have to set the time back by 15-20 minutes on both nodes. I want to know whether I can set the date on the individual nodes online with the date command, or whether any precautions need to be taken?
Thanks in advance
sanjay

Use NTP (Network Time Protocol). Set up ntpd correctly to get the time from an external NTP server that is open to the public, or from an NTP pool.

Best Practice: Application runs on Extend Node or Cluster Node

Hello,
I am working within an organization where the standard way of using Coherence is for all applications to run on extend nodes which connect to the cluster via a proxy service. This practice is followed even if the application is a single, dedicated JVM process (perhaps a server, perhaps a data aggregator) which can easily be co-located with the cluster (i.e. on a machine which is on the same network segment as the cluster). The primary motivation behind this practice is to protect the cluster from a poorly designed or implemented application.
I want to challenge this standard procedure. If performance is a critical characteristic then the "proxy hop" can be eliminated by having the application code execute on a cluster node.
Question: Is running an application on a cluster node a bad idea or a good idea?

Hello,
It is common to have application servers join as cluster members as well as Coherence*Extend clients. It is true that there is a bit of extra overhead when using Coherence*Extend because of the proxy server. I don't think there's a hard and fast rule that determines which is a better option. Has the performance of said application been measured using Coherence*Extend, and has it been determined that the performance (throughput, latency) is unacceptable?
Thanks,
Patrick

How to use SVM metadevices with cluster - sync metadb between cluster nodes

Hi guys,
I feel like I've searched the whole internet regarding this matter but found nothing, so hopefully someone here can help me?
Situation:
I have a running server with Sol10 U2. SAN storage is attached to the server but without any virtualization in the SAN network.
The virtualization is done by Solaris Volume Manager.
The customer has decided to extend the environment with a second server to build a cluster. According to our standards we have to use Symantec Veritas Cluster, but I think for my question it doesn't matter which cluster software is used.
The SVM configuration is nothing special. The internal disks are configured with mirroring, the SAN LUNs are partitioned via format and each slice is a metadevice.
d100 p 4.0GB d6
d6 m 44GB d20 d21
d20 s 44GB c1t0d0s6
d21 s 44GB c1t1d0s6
d4 m 4.0GB d16 d17
d16 s 4.0GB c1t0d0s4
d17 s 4.0GB c1t1d0s4
d3 m 4.0GB d14 d15
d14 s 4.0GB c1t0d0s3
d15 s 4.0GB c1t1d0s3
d2 m 32GB d12 d13
d12 s 32GB c1t0d0s1
d13 s 32GB c1t1d0s1
d1 m 12GB d10 d11
d10 s 12GB c1t0d0s0
d11 s 12GB c1t1d0s0
d5 m 6.0GB d18 d19
d18 s 6.0GB c1t0d0s5
d19 s 6.0GB c1t1d0s5
d1034 s 21GB /dev/dsk/c4t600508B4001064300001C00004930000d0s5
d1033 s 6.0GB /dev/dsk/c4t600508B4001064300001C00004930000d0s4
d1032 s 1.0GB /dev/dsk/c4t600508B4001064300001C00004930000d0s3
d1031 s 1.0GB /dev/dsk/c4t600508B4001064300001C00004930000d0s1
d1030 s 5.0GB /dev/dsk/c4t600508B4001064300001C00004930000d0s0
d1024 s 31GB /dev/dsk/c4t600508B4001064300001C00004870000d0s5
d1023 s 512MB /dev/dsk/c4t600508B4001064300001C00004870000d0s4
d1022 s 2.0GB /dev/dsk/c4t600508B4001064300001C00004870000d0s3
d1021 s 1.0GB /dev/dsk/c4t600508B4001064300001C00004870000d0s1
d1020 s 5.0GB /dev/dsk/c4t600508B4001064300001C00004870000d0s0
d1014 s 8.0GB /dev/dsk/c4t600508B4001064300001C00004750000d0s5
d1013 s 1.7GB /dev/dsk/c4t600508B4001064300001C00004750000d0s4
d1012 s 1.0GB /dev/dsk/c4t600508B4001064300001C00004750000d0s3
d1011 s 256MB /dev/dsk/c4t600508B4001064300001C00004750000d0s1
d1010 s 4.0GB /dev/dsk/c4t600508B4001064300001C00004750000d0s0
d1004 s 46GB /dev/dsk/c4t600508B4001064300001C00004690000d0s5
d1003 s 6.0GB /dev/dsk/c4t600508B4001064300001C00004690000d0s4
d1002 s 1.0GB /dev/dsk/c4t600508B4001064300001C00004690000d0s3
d1001 s 1.0GB /dev/dsk/c4t600508B4001064300001C00004690000d0s1
d1000 s 5.0GB /dev/dsk/c4t600508B4001064300001C00004690000d0s0
The problem is the following:
The SVM configuration on the second server (cluster node 2) must be the same for the devices d1000-d1034.
Generally speaking, the metadb needs to be in sync.
- How can I manage this?
- Do I have to use disk sets?
- Will a copy of the md.cf/md.tab and an initialization with metainit do it?
It would be great to have several options for managing this.
Thanks and regards,
Markus

Dear Tim,
Thank you for your answer.
I can confirm that Veritas Cluster doesn't support SVM by default. Of course they want to sell their own volume manager ;o).
But that isn't the big problem. With SVM I expect the same behaviour as with VxVM if I use (or have to use) disk sets,
and for that I can write a custom agent.
My problem is not the cluster implementation. It is more a fundamental problem of syncing the SVM config for a set
of meta devices between two hosts. I am far from adding the devices to the cluster config as long as I don't know
how to let both nodes know about these devices.
Currently only the host that initialized the volumes knows about them. The second node doesn't know anything about the
devices d1000-d1034.
What I need to know in this state is:
- How can I "register" the already initialized meta devices d1000-d1034 on the second cluster node?
- Do I have to use disk sets?
- Can I simply copy and paste the appropriate lines of the md.cf/md.tab?
- Generally speaking: how can one configure SVM so that different hosts see the same meta devices?
Hope that someone can help me!
Thanks,
Markus

Cluster node fails after testing removing both interconnects in a two node

Hi,
A cluster node panics and fails to join the cluster after a test in which both interconnects were removed in a two-node cluster. The cluster is up on one node, but the panicked node fails to rejoin, reporting insufficient quorum and that both cluster interconnects have failed (even after reconnecting the interconnects). The quorum device is a shared disk.
Is this a bug?
Any workaround or solution?
Cluster is 3.2 SPARC
Thanking you
Ushas Symon

Sounds like a networking problem to me. If the failed node genuinely can't communicate with the remaining node then it will not be allowed to join the cluster, hence the quorum message. I would suspect either:
* Misconnected cables
* A switch that has blocked or disabled the port
* A failed auto-negotiation
This is of course without knowing anything about what your network infrastructure actually is!
Tim
---
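
To illustrate why the isolated node reports insufficient quorum, here is a minimal sketch of majority voting. It assumes the usual vote layout for a two-node cluster with one shared quorum disk (one vote per node plus one for the quorum device); the function and variable names are made up for illustration and are not part of any cluster API.

```python
def has_quorum(votes_present: int, total_votes: int) -> bool:
    """Return True when the votes present form a strict majority."""
    return votes_present > total_votes // 2


# Assumed layout: two nodes with one vote each, plus one quorum-device vote.
TOTAL_VOTES = 3

# Surviving node: its own vote plus the quorum device it has reserved.
surviving_node_votes = 2

# Isolated node: only its own vote, because it cannot reach its peer
# and the quorum device is held by the other node.
isolated_node_votes = 1

print("surviving node has quorum:", has_quorum(surviving_node_votes, TOTAL_VOTES))
print("isolated node has quorum:", has_quorum(isolated_node_votes, TOTAL_VOTES))
```

Under these assumptions the surviving node keeps running with 2 of 3 votes, while the node that cannot reach its peer holds only 1 vote and stays out of the cluster until the interconnect works again.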

Add cluster nodes from multiple machines to WebLogic domain in OEM 10.2.0.5

Hello,
I want to monitor a WebLogic domain in Oracle Enterprise Manager 10.2.0.5 with the following layout:
- Admin server on machine 1
- managed server, cluster node a on machine 2
- managed server, cluster node b on machine 3
How can I do this?
When I go to "Add Weblogic Domain", I can enter the admin address (machine 1) and tick the box to say that there is an agent running on another host (where I specify machine 2). However, I do not see a way to discover managed servers on machine 3.
Does anyone know how to do this?
Thanks,
Nadja

LSNRCTL> status
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
Alias LISTENER
Version TNSLSNR for Linux: Version 11.1.0.6.0 - Production
Start Date 28-JAN-2010 00:36:10
Uptime 0 days 17 hr. 11 min. 52 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /oracle/app/oracle/product/11.1.0/db/network/admin/listener.ora
Listener Log File /oracle/app/oracle/diag/tnslsnr/corp1052/listener/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=corp1052)(PORT=1521)))
Services Summary...
Service "+ASM" has 1 instance(s).
Instance "+ASM2", status READY, has 1 handler(s) for this service...
Service "+ASM_XPT" has 1 instance(s).
Instance "+ASM2", status READY, has 1 handler(s) for this service...
Service "dex.example.com" has 2 instance(s).
Instance "dex1", status READY, has 1 handler(s) for this service...
Instance "dex2", status READY, has 2 handler(s) for this service...
Service "dexXDB.example.com" has 2 instance(s).
Instance "dex1", status READY, has 1 handler(s) for this service...
Instance "dex2", status READY, has 1 handler(s) for this service...
Service "dex_XPT.example.com" has 2 instance(s).
Instance "dex1", status READY, has 1 handler(s) for this service...
Instance "dex2", status READY, has 2 handler(s) for this service...
The command completed successfully
The output of SQLPlus:
[oracle@dbhost: db]$ bin/sqlplus dex@DEX
SQL*Plus: Release 11.1.0.6.0 - Production on Thu Jan 28 18:40:11 2010
Copyright (c) 1982, 2007, Oracle. All rights reserved.
Enter password:
Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

Cluster Shared Volume is no longer accessible from cluster node

Hello,
We have a 3-node Hyper-V cluster running Windows Server 2012. Recently we started getting the error below intermittently on one node, and the VMs running on this host and LUN power off.
Alert: Cluster Shared Volume is no longer accessible from cluster node
Source: Cluster Service
Path: HV01.itl.local
Last modified by: System
Last modified time: 12/1/2013 12:27:18 AM
Alert description: Cluster Shared Volume 'Volume1' ('Cluster_Vol1_R6') is no longer accessible from this cluster node because of error 'ERROR_TIMEOUT(1460)'. Please troubleshoot this node's connectivity to the storage device and network connectivity.
The only recent change is that we installed Veeam on a test basis for DR replication. We switched off the Veeam server and stopped the Veeam services on the Hyper-V hosts, but we still have the same issue.
We are using an EMC SAN connected via FC as shared storage, with PowerPath for multipathing. No errors were found on the SAN.
I don't think the issue is related to I/O load, as we also experienced it at midnight on a weekend when no one was working.
Any help would be very much appreciated.
Thanks.
Irfan
Irfan Goolab SALES ENGINEER (Microsoft UC) MCP, MCSA, MCTS, MCITP, MCT

Hi,
Please try installing the following recommended updates.
Recommended hotfixes and updates for Windows Server 2012-based Failover Clusters
http://support.microsoft.com/kb/2784261
Also, please confirm that your VSS provider is the correct version.
A third-party article:
VSS Provider with 2012 HyperV and CSV
https://community.emc.com/thread/170636
Thanks.

Restart of Cluster node

Hi!
I have restarted one of the Windows cluster nodes. Now I cannot log in again.
Should any steps be executed on the other cluster node (such as moving groups)?
If yes, how?
Thank you very much!
regards
Thom

Hi
I have restarted one of the windows cluster.
---- Was it node A or node B? You should still be able to log in to the other node. Did you run a ping test or a traceroute from your system to the active node that is online?
There will be a total of 5 IP addresses assigned to your cluster server. Ping all of them, and if they are available, check at the server console whether the physical server is up and running; try to log in directly at the server rather than over the network.
Then check that all services are up and running; most importantly, that the cluster service is running.
Try all the above steps and report back.
regards
Raj
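
The "ping all IP addresses" step can be scripted. The following is a rough sketch that assumes the system `ping` command is on the PATH; the function names and the addresses in the note below are hypothetical placeholders, not values from this thread.

```python
import platform
import subprocess
from typing import Callable, Dict, Iterable, List


def ping_once(address: str) -> bool:
    """Send one echo request with the system ping command.

    The exit code is only a rough signal: some ping implementations
    return 0 for replies other than a successful echo.
    """
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    result = subprocess.run(
        ["ping", count_flag, "1", address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def check_addresses(
    addresses: Iterable[str],
    probe: Callable[[str], bool] = ping_once,
) -> Dict[str, bool]:
    """Probe each address once and map it to True (answered) or False."""
    return {address: probe(address) for address in addresses}


def unreachable(results: Dict[str, bool]) -> List[str]:
    """List the addresses that did not answer, in the order probed."""
    return [address for address, ok in results.items() if not ok]
```

For example, `unreachable(check_addresses(["192.0.2.10", "192.0.2.11"]))` would return whichever of those placeholder addresses did not answer. The `probe` parameter is there so that a different check (for instance a TCP connect to a known port) can be swapped in where ICMP is filtered.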

2012R2 Cluster "Protected Network" not working / no failover

hi,
I have two 2012 R2 hosts running in a cluster. The hardware is exactly the same. I created a virtual network in Hyper-V Manager on both hosts. Live migration works correctly.
However, I was testing one of the new features, "Protected Network", where the VM should fail over to the other node when the virtual network fails. When I disconnect the cables on the node where the VM is running, I see it go into an error state, but nothing happens (other than the VM no longer being available). I've waited for hours but there is no failover.
Am I missing something?

Hi,
Network health detection and recovery is now available at the virtual machine level for a Hyper-V host cluster. If a network disconnection occurs on a protected virtual network, the cluster live migrates the affected virtual machines to a host where that
external virtual network is available. For this to occur there must be multiple network paths between cluster nodes.
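
As a toy model of the placement decision described above (not the cluster's actual algorithm), the sketch below picks another host on which the required external virtual network is still available. The host and network names are hypothetical.

```python
from typing import Dict, Optional, Set


def pick_migration_target(
    current_host: str,
    required_network: str,
    host_networks: Dict[str, Set[str]],
) -> Optional[str]:
    """Return another host that still has the required network, or None.

    host_networks maps each host name to the set of external virtual
    networks currently available on it. Hosts are tried in name order
    so the result is deterministic.
    """
    for host in sorted(host_networks):
        if host != current_host and required_network in host_networks[host]:
            return host
    return None


# Hypothetical state after the cables are pulled on HOST1.
availability = {
    "HOST1": set(),
    "HOST2": {"VM-External"},
}
print(pick_migration_target("HOST1", "VM-External", availability))
```

If no other host has the network, the function returns `None`, which corresponds to the case where the virtual machine has nowhere to go and stays where it is.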
More information:
What's New in Failover Clustering in Windows Server 2012 R2
http://technet.microsoft.com/en-us/library/dn265972.aspx
Alternatively, please rerun the cluster validation test and post the errors and warnings; that will help locate the issue quickly.
Thanks.

Moving a Cluster Node

Hi,
We have one SunFire 12K and one 15K with 2 clusters configured, but the nodes in each cluster are on the same server (15K or 12K), which we feel is not very good. We need to spread the cluster nodes across the SunFire servers. My questions are:
1) Is it possible to move a node from the 12K to the 15K, provided all the I/O boards are moved from the 12K to the 15K
along with all network and FC interfaces and the root disk?
In this case, do we need to reconfigure the cluster?
2) Is there any other way to implement this with minimal service outage?
Thanks in advance
Raj

I know your post is over 3 years old now, but did you find the cause of this behaviour?
I get this error on two different 2008 R2 two-node clusters holding lots of DFSR resources when failing over or back.
Service packs and hotfixes (including all DFSR ones) are up to date, and everything is set up following Microsoft best practice.
Hardware specs are fine: cluster 1 has 78GB memory and 8 cores, cluster 2 has 128GB memory and 32 cores, with EMC storage connected over multiple fail-safe, load-balanced FC8 connections.
Storage does not see any unusual load when failing over; disk queue length on the cluster nodes is <=1 during failover.
The debug logs show these entries:
+             [Error:9101(0x238d)
FrsReplicator::GetReplicaSetConfiguration frsreplicatorserver.cpp:2836 2892 C The registry key was not found.]
+             [Error:9101(0x238d)
Config::XmlReader::ReadReplicaSetConfig xml.cpp:3034 2892 C The registry key was not found.]
+             [Error:9101(0x238d)
Config::RegReader::ReadReplicaConfigValues reg.cpp:1201 2892 C The registry key was not found.]
+             [Error:9101(0x238d)
Config::RegConfig::TranslateWin32StatusToConfigFrsStatus reg.cpp:650 2892 C The registry key was not found.]
+             [Error:2(0x2)
BaseRegKey::Open regkey.cpp:165 2892 W The system cannot find the file specified.]
+             [Error:9116(0x239c)
FrsReplicator::GetReplicaSetConfiguration frsreplicatorserver.cpp:2831 2892 C The configuration was not found.]
The registry keys do exist, with correct permissions.
There seems to be some kind of timing issue with unloading/loading
the DFSR-related cluster registry entries.
Maybe you found a solution for this problem.
------------------ Roman Fischer, AUT
