Windows 2003 Cluster IP Conflict during failover. Possible Split-Brain Syndrome?

Dear all,
Has anyone had a case of an IP conflict on the virtual cluster IP address resource?
We observed that both Windows cluster nodes presented the virtual IP address resource at the same time.
How can such a thing happen, given that there are many safeguards to prevent this kind of split-brain symptom?
Thanks a lot.

Dear Tim & Alex,
Apologies for the late reply. I forgot about the "alert me" feature...
Our switches are configured with a feature that locks out (disables) any port with a duplicate IP address/MAC address.
Could there actually be a moment, maybe a split second, during which the virtual cluster IP address is presented on the NICs of both nodes?
This incident happened again, but this time during a restart of the passive node. When I restarted the passive node, the resources on the active node attempted to fail over to it. Upon login, I realised that the port on the restarted node was port-locked and the services had stayed on the restarted node, which was strange because there was no reason to fail over (everything was running fine on the other node). I tried to open Cluster Administrator on the node that was not restarted, and it hung, although the service was shown as started. Cluster Administrator on the restarted node indicated that the cluster IP had failed, while ownership of everything was on the restarted node.
I find it curious that Cluster Administrator would be unresponsive on the node that was not restarted, as it had done a failover smoothly just 30 minutes earlier.
Thanks for your time!
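
One way to confirm whether both nodes really answered for the virtual IP is to collect IP/MAC pairs (for example from ARP tables or switch logs) and look for an IP claimed by more than one MAC. The following is only a minimal sketch of that check; the addresses and the `find_ip_conflicts` helper are made up for illustration and are not part of any cluster tooling.

```python
from collections import defaultdict


def find_ip_conflicts(observations):
    """Return {ip: sorted MACs} for every IP seen with more than one MAC."""
    macs_by_ip = defaultdict(set)
    for ip, mac in observations:
        # Normalise case so the same MAC is not counted twice.
        macs_by_ip[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Hypothetical observations: the virtual IP 10.0.0.50 is seen on both nodes' NICs.
observations = [
    ("10.0.0.50", "00:11:22:33:44:01"),  # virtual IP answered by node 1
    ("10.0.0.50", "00:11:22:33:44:02"),  # virtual IP answered by node 2
    ("10.0.0.11", "00:11:22:33:44:01"),  # node 1's own address
]

conflicts = find_ip_conflicts(observations)
print(conflicts)
```

Any IP that shows up in the result was presented by two different NICs during the sampling window, which is the condition the switch port lock-out reacts to.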

Similar Messages

  • OS Migration from Windows 2003 cluster  to Windows 2008 R2 MSCS

    We want to move to Windows 2008 R2 Server (64-bit) MSCS by migrating from a Windows 2003 cluster (32-bit). We are running ECC 6.0 and an Oracle 10.2.0.2 DB.
    Is it possible?
    Is it a traditional system copy? If so, how do we migrate the cluster?
    Regards

    > 2. If the Oracle version is 10.2.0.2, will it support Windows 2008 R2?
    You need at least 10.2.0.4 (you need to install that version from a separate DVD; you can't install from the normal Oracle DVD (10.2.0.1)). See
    Note 1303262 - Oracle on Windows Server 2008
    Be aware that the normal standard support for 10.2.x ends at the end of July (you'll have to pay an extra maintenance fee if you continue to run that version). See
    Note 1110995 - Extended maintenance for Oracle Version 10.2
    So if you are doing a system copy to new hardware with a new OS anyway, I would upgrade to 11.2.
    Markus

  • Oracle 9i installation on Windows 2003 Cluster

    Hi
    I need good documentation for installing Oracle 9i on a Windows 2003 cluster.

    Complete Guide is here
    http://download-uk.oracle.com/docs/cd/B19306_01/install.102/b14207/toc.htm

  • Installing Oracle9i Database on Windows 2003 Cluster

    Can anyone tell me if there are any additional steps needed to install an Oracle DB on a Windows 2003 cluster besides just running the installation and taking the defaults? I was hoping that the software would be cluster aware but it doesn't appear that it is.

    Okay,
    I have gotten a little further with this. I have my 2-node active/passive Windows cluster installed and working. I also installed Oracle Fail Safe and have it set up and working. I then installed my Oracle9i database on node 1, using a local disk for the home directory and putting the DB files, logs, etc. on shared disk (SAN back end). I then mirrored the install on my 2nd node, using the same home name and location on a local disk. What other steps do I need to take to verify that my new Oracle database will function correctly should a node fail and a failover occur? I am not familiar with any of this and am just trying to get to a point where I have my DB installed and working correctly within the cluster before I hand this off to another person. In short, I need to make sure my new DB is cluster aware and will fail over correctly. Any help would be greatly appreciated! Thank you.

  • Disaster Recovery in Windows 2003/Cluster, SQL 2000 and R3

    Hi,
    Can someone share experience/knowledge of disaster recovery scenarios in MSCS/SQL Server/SAP? One of our customers has R3/SQL Server 2000/Win 2003 (Cluster).
    We would like to evaluate the best possible options for disaster recovery that are supported by SAP.
    We have thought about
    1. Log shipping
    2. Standby Database
    3. Restore backup on new cluster
    4. Homogeneous System copy.
    We do not want to go for first two and would like to explore on 3rd and 4th option.
    Any links to documents/blogs will be helpful.
    Thanks,
    Manoj

    > I am confused. Option 3 will be restoring backup
    Yes - but what will you restore? Everything? If you're running on a cluster it's unlikely that both nodes will fail at the same time so there is still one node that can and will run the software, no?
    > and 4 will be sapinst. Isn't it? Are both options supported by SAP?
    Yes.
    > Is there a SAP standard documentation for building cluster from scratch and build SAP system from backup or sapinst for DR?
    The standard installation documentation covers a cluster installation.
    > I am sure there will be installation document if it is a fresh installation. But not sure if there is one for DR.
    If you have a cluster then you have a high availability already. If a node fails, you will "just" reinstall that node and put it back into the cluster.
    What kind of DR scenario are you thinking about?
    Markus

  • Disk Management shows 'unallocated' and 'online' basic disk on a 4 node Windows 2003 Cluster - Is this disk reclaimable?

    We have a 4-node Windows 2003 file share cluster. I logged onto one of the nodes and found a lot of SAN-connected disks that show as 'Unallocated' in Disk Management, as below.
    Could someone please advise whether these disks are unused and reclaimable? From what I heard from the administrator, this is the default behavior of the cluster and the disks will be in use by a different node in the same cluster. If so, is there an easier way to identify which nodes are using these disks? It appears as though these disks are mapped to the server but not being used. Many thanks.

    As expected.
    Things are a bit clearer in current versions of Windows Server, but back in the 2003 days, that was how the shared disk was shown on the nodes that did not own the disk. If you go to each node in the cluster and look at the same thing, you will see the same number of disks. On the node that owns the disk, you will see it represented as you would expect. On the nodes that do not own the disk, you will see it displayed as you have shown in your screenshot.
    . : | : . : | : . tim

  • Windows terminal server 2003 cluster

    We have a FOG with three Solaris 10 x86 servers running SRSS 4.2 and SRWC 2.3 with the latest SRSS 4.2 patches. All Sun Rays have the latest firmware.
    The kiosk settings point to a Windows 2003 terminal server cluster. The two nodes in the Windows cluster have the latest SRWC 2.0.
    The cluster consists of two terminal servers and a session manager server; we use roaming profiles.
    When the kiosk settings point to the cluster FQDN, the Sun Ray sessions are routed to just one server in the Windows cluster. But if we RDP to the cluster's FQDN from a regular desktop, the Windows cluster load balances properly.
    I need help understanding how SRSS handles sessions, specifically when the nodes are behind a cluster with a session manager. Could it be that the session manager is incompatible with SRSS in kiosk mode because it tries to route sessions at the same time that SRSS is doing the same task?
    Thanks,
    RE
    Edited by: estrar on Dec 5, 2010 5:43 PM

    Hi Estrar,
    Short answer: 2003 Session Directory does not do load balancing.
    Long answer (to the above and your other questions):
    The Sun Ray Server doesn't do anything with regards to load balancing the windows sessions. To the Sun Ray Server, or more appropriately the windows connector, it's just making the connection you tell it to.
    With regards to the windows client "working", am I correct in guessing that you ran subsequent instances of MS-RDC to "prove" that it worked?
    Most likely what happened there is that the PC, thus MS-RDC, received different IP addresses back from DNS for the FQDN with each launch. This is basically round robin, not load balancing. But I'd have to understand how you've done the FQDN of the cluster in DNS.
    The most common initial reaction is that the DNS server is set up incorrectly (I'll plead guilty that I've had the same reaction). It is true that some DNS servers can be configured to return the same IP on the first DNS query. This is called cyclic ordering, and it results in the IP address resolved for the name being identical on every client. For example, if a DNS server is set for cyclic ordering, every first request for the IP of xyz.com will resolve to server A on all clients. The second query from the same client should result in server B being resolved; if not, then you do have a DNS problem. However, in the case of Sun Ray Server, the "second query from the same client" would be the second session coming up, since, as far as the DNS server is concerned, all the Sun Ray sessions are coming from the same client. Therefore, even with cyclic ordering, round robin should work even better in a Sun Ray environment than in a "fat" client environment. You can check the documentation for specific DNS servers on how to set a random order for round robin (which should be the default), but chasing down cyclic ordering of RR entries as a cause of this "issue" is basically a red herring in a Sun Ray environment.
    More than likely, the same IP being returned every time is a side effect of the name service caching daemon (nscd). The job of nscd is to speed up name resolution. However, nscd will break any round robin scheme, as it caches the first server returned by the first query from the caller (i.e. the host doing the name lookup, aka the Sun Ray server), and all applications on the caller will use that address for the lifetime of the cache. The default lifetime of the nscd cache is 3600 seconds (an hour). To disable the name service caching daemon, run the following command on every Sun Ray server in the host group (aka FOG): svcadm disable system/name-service-cache
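
To illustrate why a cache on the caller defeats round robin, here is a small self-contained simulation. It is only a toy model under stated assumptions: `RoundRobinDNS` and `CachingResolver` are hypothetical classes written for this example (they are not nscd or any real resolver API), the addresses are documentation-range placeholders, and the 3600-second lifetime is the default mentioned above.

```python
import itertools


class RoundRobinDNS:
    """Toy DNS server that hands out its addresses in rotation."""

    def __init__(self, addresses):
        self._cycle = itertools.cycle(addresses)

    def resolve(self, name):
        return next(self._cycle)


class CachingResolver:
    """Toy stand-in for a name-service cache with a fixed lifetime."""

    def __init__(self, upstream, ttl_seconds=3600):
        self._upstream = upstream
        self._ttl = ttl_seconds
        self._cache = {}  # name -> (address, expiry time)

    def resolve(self, name, now):
        entry = self._cache.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]  # still cached: the DNS server is never asked
        address = self._upstream.resolve(name)
        self._cache[name] = (address, now + self._ttl)
        return address


SERVERS = ["192.0.2.11", "192.0.2.12"]  # hypothetical terminal servers

# Without a cache, successive lookups rotate through the servers.
direct = RoundRobinDNS(SERVERS)
uncached = [direct.resolve("ts.example.com") for _ in range(4)]

# With a cache, every lookup inside the lifetime gets the first answer.
cached_resolver = CachingResolver(RoundRobinDNS(SERVERS), ttl_seconds=3600)
cached = [cached_resolver.resolve("ts.example.com", now=t) for t in (0, 10, 3599)]
after_expiry = cached_resolver.resolve("ts.example.com", now=3600)

print(uncached)
print(cached)
print(after_expiry)
```

In this model every session started within the cache lifetime lands on the same terminal server, and the rotation only advances once the cached entry expires.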
    An über smart colleague mentioned the possibility of another problem that could exist even if nscd is disabled, when the hosts are on the same subnet as the caller. This has to do with the default behavior of the Solaris resolver library, which puts servers that reside on the same subnet as the caller at the top of the sort order. You can read the notes in the man page for "gethostbyname" and also the man page for nss(4) for more information; both mention how this behavior can break round robin. The fix is to edit /etc/default/nss and uncomment (or add) the statement SORT_ADDRS=FALSE
    While those two things will allow a Solaris based Sun Ray server (or any Solaris client) to properly recognize round robin entries, it still holds true that 2003 Session Directory does not do load balancing. The only role of session directory is to ensure that you don't get routed to a different terminal server in case you have an existing session on another server. Round Robin + Session Directory may be all you need, but bear in mind that it is not "load balancing". Users could still connect directly to one of the IP addresses vs the round robin DNS name and that would defeat the round robin scheme in place.
    Load balancing is a separate component in 2003, either using an external device like F5's BigIP or using Windows Network Load Balancing (NLB) service.
    The type of load balancing solution that can be used depends on what is meant by the term "cluster". If the cluster is a real implementation of Windows 2003 Clustering (i.e. the type of cluster you'd use for SQL, Exchange, etc.), then you can't use NLB and would have to use an external load balancer. Sun Ray Windows Connector supports either IP-based or token-based load balancers, though we don't certify any third-party products in this regard.
    "Cluster confusion" occurs because Session Directory itself is supported by Windows 2003 Clustering in order to provide a highly available directory service. Easy to see how this would be desirable with a large number of terminal servers. However, NLB does not work, nor is it supported on Windows 2003 Cluster.
    To further the confusion, when NLB is enabled on the NIC that is handling RDP traffic, one of the configuration tabs is "Cluster Parameters". But it is important to realize that this has nothing to do with Windows 2003 Clustering and is in fact incompatible with it. NLB makes use of independent servers (i.e. not clustered); the primary requirement is that those servers are on the same subnet.
    Finally, it is easy to get confused between 2003 and 2008 session directory when it comes to load balancing, but they are very different. 2008 "session directory" does do load balancing. In fact, to underscore this distinction/feature, Microsoft has renamed Session Directory to TS Session Broker. (See http://technet.microsoft.com/en-us/library/cc772418%28WS.10%29.aspx)
    To summarize, with Windows 2003 you have two choices when it comes to load balancing and session directory.
    1) Have a HA implementation of Session Directory and use an external load balancer
    2) Have non-HA implementation of session directory and use NLB to do load balancing.
    Good how to article on setting NLB on 2003 with terminal services here:
    http://www.brianmadden.com/blogs/brianmadden/archive/2004/11/29/how-to-configure-windows-network-load-balancing-for-pure-terminal-server-environments.aspx

  • Step by Step change of IP's in Two node Cluster WIndows 2003 server

    Hi,
    I am looking for step-by-step information on changing IPs in a two-node Windows 2003 cluster setup. The KB articles on TechNet and on the Windows site do not cover the step-by-step details. If anyone has experience changing IPs in a Windows 2003 cluster environment, please share; it would be a great help.
    Regards,
    Avnish

    If you have done the setup correctly and have two separate networks, changing IPs is pretty straightforward. 
    Make sure both networks can carry cluster communications.
    Change the NICs on ONE network (usually the private network) only.
    Validate the nodes can communicate over that network.
    Change the NICs on the PUBLIC network.
    Change the clustered IP addresses. Note that some clustered services may require a restart to accept the new IP address.
    Geoff N. Hiten Architect Microsoft SQL Server MVP
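
Before starting the steps above, it can help to sanity-check the new addressing plan: the private and public networks should not overlap, and each new clustered IP should fall inside the public subnet. This is only a minimal sketch using Python's standard `ipaddress` module; the subnets, addresses, and helper names are hypothetical examples, not values from the original thread.

```python
import ipaddress


def networks_are_separate(private_cidr, public_cidr):
    """True if the two cluster networks do not share any addresses."""
    private = ipaddress.ip_network(private_cidr, strict=False)
    public = ipaddress.ip_network(public_cidr, strict=False)
    return not private.overlaps(public)


def address_in_network(address, cidr):
    """True if a planned IP address belongs to the given subnet."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr, strict=False)


# Hypothetical plan: heartbeat network and new public network.
PRIVATE_NET = "10.10.10.0/24"
PUBLIC_NET = "192.168.1.0/24"
NEW_CLUSTER_IP = "192.168.1.50"

plan_ok = (
    networks_are_separate(PRIVATE_NET, PUBLIC_NET)
    and address_in_network(NEW_CLUSTER_IP, PUBLIC_NET)
)
print(plan_ok)
```

A check like this does not replace validating that the nodes can actually communicate after each change; it only catches planning mistakes before any NIC is touched.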

  • Win 2003 Cluster + Oracle Fail Safe + Dataguard (physical & Logical)

    Hello,

    This is my first post (sorry for my bad English). I am building a high availability solution for test purposes. So far I have set up the following and it runs OK, but I have a small problem with the logical database:

    Configuration
    ESX Server 2.0 with these machines:
    Windows 2003 Cluster (Enterprise Edition R2, 2 nodes)
    * NODE 1 - Oracle 10gR2 + Patch 9 + Oracle Fail Safe 3.3.4
    * NODE 2 - Oracle 10gR2 + Patch 9 + Oracle Fail Safe 3.3.4
    c:/Windows Software
    e:/Oracle Software/ (pfile -> R:/spfile)

    Virtual SAN
    * Datafiles, redos, etc. are in the virtual SAN.
    R:/ Datafiles & Archivers & dump files & spfile
    S:/ , T:/ ,U:/ -> Redos
    V:/ Undo

    Data Guard
    * NODE3 Physical Database
    * NODE4 Logical Database

    Oracle Fail Safe and the Windows cluster run OK, the switches...
    The physical database runs OK (redo apply, switchover, failover, all OK). The logical database receives the redo OK, but it has a problem when it goes to apply it.

    The error is the following:
    ORA-12801: error signaled in parallel query server P004
    ORA-06550: line 1, column 536:
    PLS-00103: Encountered the symbol "," when expecting one of the following:
    ( - + case mod new not null <an identifier>
    <a double-quoted delimited-identifier> <a bind variable> avg count current exists max min prior sql stddev sum variance execute forall merge time timestamp interval date <a string literal with character set specification> <a number> <a single-quoted SQL string> pipe <an alternatively quoted string literal with character set specification> <an alternativel
    update "SYS"."JOB$" set "LAST_DATE"=TO_DATE('11/09/07','DD/MM/RR'),

    I saw this SQL statement in dba_logstdby_events; it appears together with the error in both the alert log and dba_logstdby_events.

    I'm a bit lost with this error. I don't understand why the logical database can't start applying the redo received from the primary database.

    The database has two tables with two columns, one integer and the other varchar2(25). It has no unusual column types.

    Thanks a lot for any help,
    Roberto Marotta

    I recreated the logical database OK, no problems, no errors.

    Redo apply runs OK. I did logfile switches on the primary database and they were applied on the logical and standby databases. But...

    When I created a tablespace on the primary database and then did a logfile switch on the primary, the changes transferred OK to the standby database, but not to the logical one! The redo files arrive in their path on the logical database OK, but when the process tries to apply them, it reports the same error.

    SQL> select sequence#, first_time, next_time, dict_begin, dict_end, applied from dba_logstdby_log order by 1;

    SEQUENCE# FIRST_TI NEXT_TIM DIC DIC APPLIED
    --------- -------- -------- --- --- -------
    138 14/09/07 14/09/07 NO NO CURRENT
    139 14/09/07 14/09/07 NO NO CURRENT

    SQL> select event_time, status, event from dba_logstdby_events order by event_time, timestamp, commit_scn;

    14/09/07
    ORA-16222: automatic Logical Standby retry of last action
    14/09/07
    ORA-16111: log mining and apply setting up
    14/09/07
    ORA-06550: line 1, column 536:
    PLS-00103: Encountered the symbol "," when expecting one of the following:
    ( - + case mod new not null <an identifier>
    <a double-quoted delimited-identifier> <a bind variable> avg
    count current exists max min prior sql stddev sum variance
    execute forall merge time timestamp interval date
    <a string literal with character set specification>
    <a number> <a single-quoted SQL string> pipe
    <an alternatively-quoted string literal with character set specification>
    <an alternativel
    update "SYS"."JOB$" set "LAST_NAME" = TO_DATE('14/09/07','DD/MM/RR'),

    The alert.log reports the same message as the dba_logstdby_events view.

    Any ideas?

    I'm a bit frustrated. This is the third time I have recreated the logical database successfully and then reproduced the same error when creating a tablespace on the primary database, and I have no idea why.

  • SAP R/3 4.6C with Oracle 9.2 on Windows 2003-Server

    Hi,
    I need to find out whether it's possible to install SAP R/3 4.6C with Oracle 9.2 on a Windows 2003 server.
    Is it only possible to install this SAP system with Oracle 8.1.7 on a Windows 2000 server? Or are there other ways?
    If not, please tell me the options for correctly installing the desired SAP system.
    Awaiting your feedback.
    Kind Regards

    ->SAP R/3 4.6C with Oracle 9.2 on Windows 2003-Server
    GreetZ, AH

  • Windows 2003 server loses security tab for folders created in OS X

    We have a centralised storage system that is running on a Windows 2003 cluster. This server has folders for each department so they can store collaborative work. They can create their own folders and files within their department's folder.
    When files are created using a Mac on this service and the folder properties are then viewed under Windows XP or 2003, the security tab seems to be missing. NTFS permissions still exist on the folder, as items beneath it that inherit permissions still have those inherited permissions applied to them, but the security tab has gone.
    This can also mean that Windows users cannot see these folders, and therefore the whole idea of a collaborative storage server is lost.
    We have investigated this issue as best we can so far. We tried copying folders onto the server with whitespace at the end of the folder name to see if this was the problem; it does not appear to cause the issue.
    We have also tried copying folders up to the server from a Mac using invalid characters. \ / * ? " | were all tried; on the server the symbol is simply substituted with a space and the folder works correctly.
    We have tried copying a folder with a space at the end of its name and a file within it to the server, and this also works correctly.
    The only common factor seems to be that a lot of the folders with a missing security tab have a single space at the end of the folder name; however, this does not seem to cause the issue.
    One way around the problem is to use the subinacl.exe tool at the command prompt as the server administrator to regain ownership of the folder and then delete it; however, this is not a solution for folders that contain data. This workaround is from step 6 of the Microsoft article http://support.microsoft.com/kb/320081.
    If anyone has any idea what may be causing this on the Mac side, it would be good to know.

    We have narrowed down the problem and found the culprit.
    The issue is only repeatable on 10.4.x (we used 10.4.11 for testing, but it may affect earlier versions as well).
    10.4.11 Local folder created with trailing space in the name, then copied to the Windows 2003 server = loss of security tab on this folder
    10.4.11 existing folder on the server has a trailing space added to the folder name = loss of security tab on this folder
    10.4.11 existing folder on the server has the trailing space REMOVED from the end of the folder name = security tab returns with permissions intact
    10.5.x Local folder created with trailing space in the name, then copied to the Windows 2003 server = no problem
    10.5.x existing folder on the server has a trailing space added to the folder name = no problem
    10.5.x existing folder (created under 10.4.11) with trailing space in the folder name has the space removed = Error code 43, unknown error occurred
    Also interesting is that the illegal characters for Windows \ / * : ? " | in the folder name prevent the folder being copied to the Windows 2003 server in 10.4.11, but are allowed in 10.5.x, all except the : which is an illegal character in OS X as well. None of these illegal characters cause issues on the Windows 2003 server, as they are simply translated to a space.
    The solution would be either, as AJ mentioned, to purchase Thursby ADmitMac or to upgrade all the users to 10.5.x (which I'm sure would make Apple happier!)
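
To find affected folders before users trip over them, the names can be screened for a trailing space or for the Windows-illegal characters listed above. This is a minimal sketch, not a tested tool: `check_folder_name` is a hypothetical helper, and the character set is taken only from the list in this post.

```python
# Characters named in the post as illegal for Windows folder names.
WINDOWS_ILLEGAL = set('\\/*:?"|')


def check_folder_name(name):
    """Return a list of problems found in a single folder name."""
    problems = []
    if name != name.rstrip(" "):
        problems.append("trailing space")
    bad = sorted(set(name) & WINDOWS_ILLEGAL)
    if bad:
        problems.append("illegal characters: " + " ".join(bad))
    return problems


# Hypothetical folder names as they might be created on a Mac.
names = ["Reports", "Reports ", "Budget 2008?", "Design:Final "]
report = {name: check_folder_name(name) for name in names}

for name, problems in report.items():
    print(repr(name), problems)
```

The same function could be applied to directory names gathered with `os.walk` on a Mac client before the folders are copied to the server.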

  • Error Export DB 9.2.0.8 Windows 2003

    I have an Oracle 9.2.0.8 DB on a Windows 2003 cluster (active/passive).
    When I export the DB, I get this error:
    EXP-00008: ORACLE error 37002 encountered
    ORA-37002: Oracle OLAP failed to initialize, please contact Oracle Support.
    ORA-06512: at "SYS.DBMS_AW", line 18
    ORA-06512: at "SYS.DBMS_AW", line 38
    ORA-06512: at "SYS.DBMS_AW", line 248
    ORA-06512: at "SYS.DBMS_AW", line 488
    ORA-06512: at "SYS.DBMS_AW_EXP", line 270
    ORA-06512: at line 1
    EXP-00083: The previous problem occurred when calling SYS.DBMS_AW_EXP.schema_info_exp
    Any help?

    OWNER,COUNT(*)
    ADT,28
    ANAGRAFI,2
    ATTOE,1
    CCO,2
    DATAANA,91
    ESELWEB,35
    ICTSG,1
    MAD,7
    ODM,3
    OLAPSYS,108
    PASTIDIETE,1
    SACS,17
    SELIN,2
    SOWDATI,14
    SYS,16
    WEBAPP,83
    WKSYS,41
    exp system/system full=y file=...... log=.......

  • Windows 2003 clustering

    Has anyone used the Xserve RAID as a SAN for a Windows 2003 cluster?
    If so, could you let me know what additional s/w and h/w you used, as we are very keen on implementing a new one.
    Thanks in advance.

    I need to establish whether the Xserve RAID supports Microsoft's "mpio" in Win2003 Server.
    Has anyone got any info regarding this?
    If it does, then there shouldn't be any problems building a Win2k3 cluster with Xserve RAID.

  • Satellite L10-102 windows 2003 drivers

    My laptop does not connect to my wireless ADSL modem on Windows 2003 Server, even with no WPA or WEP password enabled!

    Hello
    In my opinion, you should first try to connect in the simplest way, without any passwords or firewall settings. This unit is not supported for Windows 2003 Server, and there is no driver from Toshiba for it.
    A driver designed for Windows XP should work too, but try to identify which WLAN card is installed and check whether you can find a current driver on the manufacturer's site.
    If you can see your network but cannot connect to it, check the WLAN settings.
    Bye

  • RAC using OCFS on windows 2003

    Hi,
    I am building a 2-node Windows 2003 cluster (Enterprise Edition) and planning to install Oracle RAC on it. What are the steps required to do this? Do I need to use the Microsoft Cluster Service to deploy a cluster and then install Oracle RAC on it, or does Oracle provide a clustering service of its own? Currently I have 2 servers, each running Windows 2003, connected together via a crossover cable.
    What kind of configuration is required for the file system: OCFS or raw partitions?
    thanks

    I don't think MCS is required. Take a look at:
    http://download-west.oracle.com/docs/cd/B19306_01/install.102/b14207/prewin.htm#sthref198
    and check the certified combinations in Metalink.
    For your storage options, read:
    http://download-west.oracle.com/docs/cd/B19306_01/install.102/b14207/storage.htm#sthref252
