Unexpected Call Failures during physical network failure

Our layout:
We have a private network for Mediation Server<->PSTN Gateway traffic.  (This is a legacy that we are moving away from.)
We have a multi-node FE Pool.  The FE Pool has redundant "corp" NICs using Windows Teaming (~failover) that serves clients and our SIP trunk, but a single NIC to the private network for the PSTN Gateway traffic.
The Mediation Service is listening on both "corp" and "pstn" NICs.
What happened:
The PSTN network cable for ONE of the mediation servers ("LyncFE1") was cut inadvertently.
The experience:
Approximately 5 minutes after the cable was cut, "LyncFE1" recognized all gateways were offline. (expected)
Users did not lose connectivity to the server, as the "corp" NICs were unaffected. (expected)
Inbound calls routed around the failure, as the PSTN gateways are configured to do. (expected)
Outbound calls for an indeterminate group of people failed consistently with 503 errors, and the logs clearly indicated no routes available.  These users were not all hosted on LyncFE1 (per a Get-CsUserPoolInfo of some of the affected users). (NOT expected)
I expected that the Mediation Service on "LyncFE1" would recognize that it had no routes, that other servers (let's say "LyncFE2") did have available routes, and that it would then route outbound calls to the other servers via the "corp" NIC.
Thoughts?

Hi,
The available routes are associated with the Lync client policy, so if you want these users to use another route, you need to associate their client policy with that route using Lync Server Control Panel or Lync Server Management Shell.
Best Regards,
Eason Huang
TechNet Community Support
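The failover behaviour the original poster expected can be sketched as follows. This is only a model of that expectation, not a description of how Lync Server actually selects routes; the class, the function, and the gateway names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MediationServer:
    """Hypothetical model of a pool member and the gateways it can reach."""
    name: str
    reachable_gateways: set = field(default_factory=set)

    def has_route(self) -> bool:
        return bool(self.reachable_gateways)


def pick_server_for_outbound(local, pool):
    """Return the server that should place the call, or None (a 503)."""
    if local.has_route():
        return local
    # The local server has no gateway left: hand the call to a peer
    # over the shared "corp" network instead of failing it.
    for peer in pool:
        if peer is not local and peer.has_route():
            return peer
    return None


fe1 = MediationServer("LyncFE1")                    # PSTN cable cut
fe2 = MediationServer("LyncFE2", {"gw1", "gw2"})    # still sees gateways
chosen = pick_server_for_outbound(fe1, [fe1, fe2])
print(chosen.name if chosen else "503 no route")
```

In the incident described above, the observed result corresponds to the `None` branch even though a peer with working routes existed.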

Similar Messages

  • Handling Network failure during data transfer from ECC to PI

    Hi,
    We have created a custom program in ECC to push invoice header information to PI using the proxy call.
    We're able to successfully now push these data to PI without any issues.
    However, the client has asked how an unexpected error, such as a system or network failure while transferring data from ECC to PI, is handled on the ECC side.
    I have no answer yet. I know the messages will be stuck in SMQ1 or SMQ2, but we need to find out how to resend the data to PI after such a failure. Documents in ECC are parked, posted, or paid in real time, and for each ECC event (such as parked, posted, or reversed) we send one message to PI. How can we resend these invoices in case of an unexpected error?
    If we go with an ad hoc process, how do we implement it for each document that is created or rejected?
    Thanks,
    Shamim

    Hi,
    The queue will restart automatically if the problem was a connection issue. In case of a system failure, you need to check the queues and start them manually.
    check http://help.sap.com/saphelp_nw04/helpdata/en/f3/df5f3cb0decb09e10000000a114084/content.htm
    To check for any problems you can use CCMS monitoring.
    check http://help.sap.com/saphelp_470/helpdata/en/52/12f73b7803b009e10000000a114084/content.htm
    Hope this will help you.
    Regards
    Vinit
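The distinction made in the reply (connection problems are retried automatically, anything left over needs a manual restart) can be illustrated with a small sketch. It is a simplified model, not SAP's qRFC implementation: `ConnectionProblem`, `drain_queue`, and the message IDs are hypothetical, and unlike a real queue this sketch keeps processing later messages after one gets stuck.

```python
import time
from collections import deque


class ConnectionProblem(Exception):
    """Transient failure: the receiver could not be reached."""


def drain_queue(queue, send, max_attempts=3, delay_seconds=0.0):
    """Send queued messages in order; return the ones that stay stuck."""
    stuck = []
    while queue:
        message = queue.popleft()
        for attempt in range(1, max_attempts + 1):
            try:
                send(message)
                break  # delivered, move on to the next message
            except ConnectionProblem:
                if attempt == max_attempts:
                    # Retries exhausted: leave it for manual reprocessing.
                    stuck.append(message)
                else:
                    time.sleep(delay_seconds)
    return stuck
```

Messages returned in `stuck` correspond to the entries an administrator would have to check and restart by hand.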

  • After Effects can't continue: unexpected failure during application startup

    Im running 10.10, Mac OSX, it was running fine before a minor update came out, and now I can not open AE CS6. I have uninstalled, reinstalled it, installed the trial of CC 2014. No messages beside "After Effects can’t continue: unexpected failure during application startup" come up.
    System log:
    Jun 22 15:30:49 Users-iMac.local AdobeCrashDaemon[976]: WARNING: The Gestalt selector gestaltSystemVersion is returning 10.9.0 instead of 10.10.0. Use NSProcessInfo's operatingSystemVersion property to get correct system version number.
                Call location:
    Jun 22 15:30:49 Users-iMac.local AdobeCrashDaemon[976]: 0   CarbonCore                          0x00007fff82bb637d ___Gestalt_SystemVersion_block_invoke + 113
    Jun 22 15:30:49 Users-iMac.local AdobeCrashDaemon[976]: 1   libdispatch.dylib                   0x00007fff8bf70fa2 _dispatch_client_callout + 8
    Jun 22 15:30:49 Users-iMac.local AdobeCrashDaemon[976]: 2   libdispatch.dylib                   0x00007fff8bf70f00 dispatch_once_f + 79
    Jun 22 15:30:49 Users-iMac.local AdobeCrashDaemon[976]: 3   CarbonCore                          0x00007fff82b5e932 _Gestalt_SystemVersion + 987
    Jun 22 15:30:49 Users-iMac.local AdobeCrashDaemon[976]: 4   CarbonCore                          0x00007fff82b5e51f Gestalt + 144
    Jun 22 15:30:49 Users-iMac.local AdobeCrashDaemon[976]: 5   AdobeCrashDaemon                    0x0000000100002e69 -[MyDaemon GetOSVersionMajor] + 33
    Jun 22 15:30:49 Users-iMac.local AdobeCrashDaemon[976]: 6   AdobeCrashDaemon                    0x0000000100002d4a -[MyDaemon isRunningOnLeopard] + 25

    Well, looks like a bug in Yosemite. You might want to read the pertinent announcements, anyway. At this point Adobe apps are not compatible with OSX 10.10.
    Mylenium

  • Thanks this was very useful.. I have a brand new iphone 4s unlocked version from singapore and I am having the same problem constant network fluctuations, no service, call failed, invalid sim, sim failure, bars fluctuating from 5 - 3 - 2 - 1, no signal/ba

    I have a brand new unlocked iPhone 4S from Singapore and I am experiencing constant network fluctuations: no service, call failed, invalid SIM, SIM failure, bars fluctuating from 5 to 3 to 2 to 1, no signal, searching, etc. I feel miserable after trying all the options: I changed SIMs multiple times, tried multiple carriers, restored the phone, and set it up as a new device, but the problem is not solved. It is so frustrating that after investing in the world's most expensive phone you cannot enjoy the calling features, which are the main reason to own the phone. Everything else is working. The funniest part is that I am not able to make or receive calls but can access the internet. I have tried every possible thing and am really frustrated with this iPhone. I hope the problem gets solved. Does anyone have a solution? Does it mean one should never update the software? I hear that the problem started after updating to iOS 5.1, and others have had the same experience. Apple says it is a hardware failure, which is very difficult to accept, as sometimes I get 5 bars and am able to make and receive calls, but 95% of the time I am unable to. If anyone has a solution, I will be very happy if he or she can share it. Many thanks. On top of this, Apple does not support a global warranty for the phone, so you need to go to the country where you bought it for a replacement. Truly a sad state of affairs; very, very disappointing.

    Have a look at this it might help
    http://support.apple.com/kb/TS4148

  • System failure, during call of function module RSWR_RFC_SERVICE_TEST

    Hi Team,
         I am working with BW and Portal Integration, with the Netweaver
    2004s SP 11 version. I have an issue when I run the RSPOR_SETUP program
    to test the configuration.
    The error is on status 5 and 12, where I get the following error:
    System failure, during call of function module RSWR_RFC_SERVICE_TEST,
    and when I look at the dev_jrfc.trc log file, I see the following
    error:
    Exception thrown [Tue Jul 10 16:12:16,687]:Exception thrown by
    application running in JCo Server
    java.lang.RuntimeException: call FM RSWR_RFC_SERVICE_TEST to ProgId
    smxpedvc_PORTAL_EPD on host smxpedvc.grupoempresarialangeles.com.mx
    with SSO not authorized: No login module succeeded.
            at
    com.sap.engine.services.rfcengine.RFCDefaultRequestHandler.handleRequest
    (RFCDefaultRequestHandler.java:79)
            at com.sap.engine.services.rfcengine.RFCJCOServer.handleRequest
    (RFCJCOServer.java:156)
            at com.sap.mw.jco.JCO$Server.dispatchRequest(JCO.java:7785)
            at com.sap.mw.jco.MiddlewareJRfc$Server.dispatchRequest
    (MiddlewareJRfc.java:2405)
            at com.sap.mw.jco.MiddlewareJRfc$Server.listen
    (MiddlewareJRfc.java:1728)
            at com.sap.mw.jco.JCO$Server.listen(JCO.java:8145)
            at com.sap.mw.jco.JCO$Server.work(JCO.java:8265)
            at com.sap.mw.jco.JCO$Server.loop(JCO.java:8212)
            at com.sap.mw.jco.JCO$Server.run(JCO.java:8128)
            at com.sap.engine.core.thread.impl3.ActionObject.run
    (ActionObject.java:37)
            at java.security.AccessController.doPrivileged(Native Method)
            at com.sap.engine.core.thread.impl3.SingleThread.execute
    (SingleThread.java:100)
            at com.sap.engine.core.thread.impl3.SingleThread.run
    (SingleThread.java:170)
    Could you help me to solve this issue.
    Thanks so much.

    Have you checked the user that is used to connect from J2EE back to ABAP?
    I had a similar problem, went into the Visual Administrator and found the incorrect password (or possibly outdated password) was being used to communicate back to ABAP, and updating that sorted out my problem.
    Hope this helps.
    Cheers,
    Andrew

  • Error in BPM: "COMMUNICATION FAILURE" during JCo call. Error opening an RFC

    Hello experts,
    I am receiving the above error in a BPM scenario that has a Transform step and a synchronous RFC step.
    I referred to a few threads discussing such problems and checked whether my mapping is correct; I tested my mappings with my payload and they look good.
    When I looked into the mapping trace of the BPM under "Show container", I found the error ""COMMUNICATION FAILURE" during JCo call. Error opening an RFC connection". It seems that the error is thrown when the BPM attempts to call the interface mapping.
    Thanks in advance.
    Regards
    rajeev

    Hi,
    I think there is no problem with the mapping part.
    A JCo connection is required when XI tries to establish the connection with the adapter.
    Please look at the link provided:
    Setup and test SAP Java Connector outbound connection
    Please also check the following parameters in the exchange profile:
    com.sap.aii.rwb.server.centralmonitoring.r3.ashost
    com.sap.aii.rwb.server.centralmonitoring.r3.client
    com.sap.aii.rwb.server.centralmonitoring.r3.sysnr
    com.sap.aii.rwb.server.centralmonitoring.httpport
    These parameters must be set properly.
    If everything is OK and the problem still exists, try restarting the system.
    For us, it worked fine after a restart.
    regards,
    navneet

  • Problem while determining receivers using interface mapping: "SYSTEM FAILURE" during JCo call. Bean SMPP_CALL_JAVA_RUNTIME3 not found

    We have a SOAP to PROXY scenario Which is in Production.
    We keep getting the Error:
    " Problem while determining receivers using interface mapping: "SYSTEM FAILURE" during JCo call. Bean SMPP_CALL_JAVA_RUNTIME3 not found on host XXXXXX, ProgId =AI_RUNTIME_XXX.
    We are using Standard Receiver Determination with single receiver without any condition. And no mapping being used in interface determination.
    What are the possible situations in which we could face this issue in Production?

    Please check the SAP note
    1706936 - messages fails with error java.lang.RuntimeException Bean SMPP_CALL_JAVA_RUNTIME3 not found
    1944248 - PI unstable due to JCO_SYSTEM_FAILURE mapping issues

  • System failure during call of function module RSWR_RFC_SERVICE_TEST in BW

    Hello All,
    I am configuring the integration between BW and the EP portal. When we run the program RSPOR_SETUP, step 12 (Maintain User Assignment in Portal) gives this error: "System failure during call of function module RSWR_RFC_SERVICE_TEST". SSO is working fine.
    The RFC connection between BW and the portal is working fine.
    When I execute the FM RSWR_RFC_SERVICE_TEST, it gives the error SYSTEM_FAILURE, and the log file on the portal side shows the error below.
    dev_jrfc.trc
    Exception thrown [Fri Sep 05 15:14:57,928]:Exception thrown by application running in JCo Server
    java.lang.RuntimeException: Bean RSWR_RFC_SERVICE_TEST not found on host (Hostname), ProgId=EPP_PORTAL_SID: Object not found in lookup of RSWR_RFC_SERVICE_TEST.
            at com.sap.engine.services.rfcengine.RFCDefaultRequestHandler.handleRequest(RFCDefaultRequestHandler.java:138)
            at com.sap.engine.services.rfcengine.RFCJCOServer$J2EEApplicationRunnable.run(RFCJCOServer.java:269)
            at com.sap.engine.core.thread.impl3.ActionObject.run(ActionObject.java:37)
            at java.security.AccessController.doPrivileged(Native Method)
            at com.sap.engine.core.thread.impl3.SingleThread.execute(SingleThread.java:104)
            at com.sap.engine.core.thread.impl3.SingleThread.run(SingleThread.java:176)
    Please help me find the solution.
    Thanks
    Gurpal

    Hi Gurpal,
    Please go through the below SAP notes
    878455 - Information broadcasting: No portal user exists
    814083 - Inf. broadcasting: Changed RFC settings as of NW 04 SP11
    1135947 - Bean not found
    Hope this solves your problem.
    Regards,
    Prithviraj.

  • "SYSTEM FAILURE" during JCo call.java.lang.reflect.UndeclaredThrowableExcep

    Hi All
    I have developed Java mapping program where I am calling three BAPI in sequence and trying to map all three bapi data to single Target XML file or Multiple target xml files depends on the in coming data.
    Now I want the file names to be generated dynamically, so I have used the Dynamic Configuration code below in my Java mapping program.
    // Requires java.text.DateFormat, java.text.SimpleDateFormat, java.util.TimeZone
    // and the Dynamic Configuration classes from com.sap.aii.mapping.api.
    try {
         // Date part uses the default time zone; time part is forced to CET.
         DateFormat dFormat = new SimpleDateFormat("yyyyMMdd");
         DateFormat tFormat = new SimpleDateFormat("HHmmss");
         tFormat.setTimeZone(TimeZone.getTimeZone("CET"));
         java.util.Date date = new java.util.Date();
         String pubDate = dFormat.format(date) + tFormat.format(date);
         String ext = ".xml";
         String event = "-1_1-";
         trace.addInfo("********  Before  Dynamic Configuration ***************");
         DynamicConfiguration conf = (DynamicConfiguration) container.getTransformationParameters().get(StreamTransformationConstants.DYNAMIC_CONFIGURATION);
         DynamicConfigurationKey key = DynamicConfigurationKey.create("http://sap.com/xi/XI/System/File", "FileName");
         trace.addInfo("********  After  Dynamic Configuration ***************");
         // ponum is set earlier in the mapping (not shown here).
         String tempFileName = "NL09-" + event + ponum + "-" + pubDate + ext;
         trace.addInfo("The name of the file is  : " + tempFileName);
         // Pass the variable, not the string literal "tempFileName".
         conf.put(key, tempFileName);
    } catch (Exception e) {
         trace.addWarning("Error While creating File Name" + e.getMessage());
         throw new Exception("Error While creating File Name", e);
    }
    Now the problem is when I am using above code I am getting following error
    "SYSTEM FAILURE" during JCo call.
    java.lang.reflect.UndeclaredThrowableException
    - <SAP:Error xmlns:SAP="http://sap.com/xi/XI/Message/30" xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/" SOAP:mustUnderstand="">
      <SAP:Category>XIServer</SAP:Category>
      <SAP:Code area="MAPPING">JCO_SYSTEM_FAILURE</SAP:Code>
      <SAP:P1>java.lang.reflect.UndeclaredThrowableException</SAP:P1>
      <SAP:P2 />
      <SAP:P3 />
      <SAP:P4 />
      <SAP:AdditionalText />
      <SAP:Stack>&quot;SYSTEM FAILURE&quot; during JCo call.
    java.lang.reflect.UndeclaredThrowableException</SAP:Stack>
      <SAP:Retry>A</SAP:Retry>
      </SAP:Error>
    Could please tell me why I am facing this problem only when I am using  Dynamic configuration code.
    If I dont use  Dynamic configuration code then I am not getting any error .But my requirement is to generate dynamic file name (Note I have tried with Variable Substution also, It is also not solving my problem as I need time stamp of ("CET") time zone).

    Hi Abhishek,
    Yes, I have appended throws StreamTransformationException and imported the relevant StreamTransformationException class as well.
    The strange thing is that I get ""SYSTEM FAILURE" during JCo call" only when I add the Dynamic Configuration code. But the JCo call is something internal to the system and should not be related to Dynamic Configuration.
    If I run my code without any Dynamic Configuration code, it runs without any errors.
    But I need the Dynamic Configuration in order to generate the dynamic file name.
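Independent of the JCo error, the naming scheme itself (prefix, event marker, PO number, CET timestamp, extension) can be checked in isolation. The sketch below is in Python rather than Java and assumes the file name is meant to be `NL09-` + event + ponum + `-` + timestamp + `.xml`; the function name and the sample PO number are hypothetical. Unlike the code in the question, it puts both the date and the time part in CET.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def build_file_name(ponum, now, tz=None, event="-1_1-", ext=".xml"):
    """Return NL09-<event><ponum>-<yyyyMMddHHmmss><ext>, stamped in tz."""
    if tz is None:
        # "CET" is looked up in the system time zone database.
        tz = ZoneInfo("CET")
    local = now.astimezone(tz)          # 'now' must be timezone-aware
    stamp = local.strftime("%Y%m%d%H%M%S")
    return "NL09-" + event + ponum + "-" + stamp + ext


# Usage with a fixed UTC instant and an explicit UTC+1 offset.
utc_now = datetime(2008, 3, 26, 23, 30, 15, tzinfo=timezone.utc)
print(build_file_name("4500001234", utc_now, tz=timezone(timedelta(hours=1))))
```

Passing the clock and the time zone in as parameters keeps the function deterministic, which makes the date rollover around midnight easy to test.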

  • "SYSTEM FAILURE" during JCo call. max no of 100 conversations exceeded

    Hi Experts,
    My scenario is: RFC -> File (asynchronous).
    When I test this scenario in Configuration (ID) -> Tools -> Test Configuration, I get the error below.
            Interface Mapping
    Runtime error
    "SYSTEM FAILURE" during JCo call. max no of 100 conversations exceeded / CPIC-CALL: 'ThSAPCMRCV' : cmRc=17 thRc=45
    1) I have raised the limit from 100 to 300 by setting the environment variable CPIC_MAX_CONV=300.
    2) In the sender RFC communication channel, all parameters are correct: gateway service, program ID, client number, password, user ID, etc.
    Please help me out.
    thanks
    siva grandhi

    Hi,
    See this:
    /thread/174978 [original link is broken]
    Look at SAP notes 314530 and 316877.
    Set this environment variable:
    set CPIC_MAX_CONV=500 (Windows; no spaces around the equals sign)
    setenv CPIC_MAX_CONV 500 (Unix)
    Also check the RFC destinations JCO_RUNTIME_JCOSERVER and
    AI_RUNTIME_JCOSERVER. If you have any error in these RFC destinations, please refer to this link:
    http://help.sap.com/saphelp_nw04s/helpdata/en/9b/da0f41026df223e10000000a155106/frameset.htm
    Thanks
    Rodrigo
    ps:reward points if useful
    Edited by: Rodrigo Pertierra on Mar 26, 2008 11:59 AM
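To make the error message concrete, here is a toy model of a conversation limit read from `CPIC_MAX_CONV`. It is not how the SAP gateway or the CPIC library is implemented: the class and exception names are hypothetical, and the default of 100 is an assumption taken only from the error text quoted in this thread.

```python
import os


class TooManyConversations(RuntimeError):
    """Raised when the configured conversation limit is reached."""


class ConversationLimiter:
    def __init__(self, env=None):
        env = os.environ if env is None else env
        # Default of 100 is assumed from the error text in the thread.
        self.limit = int(env.get("CPIC_MAX_CONV", "100"))
        self.open_count = 0

    def acquire(self):
        """Open one conversation or fail if the limit is reached."""
        if self.open_count >= self.limit:
            raise TooManyConversations(
                f"max no of {self.limit} conversations exceeded")
        self.open_count += 1

    def release(self):
        """Close one conversation, freeing a slot."""
        if self.open_count > 0:
            self.open_count -= 1
```

In this model, conversations that are opened but never released fill the limit regardless of its size, so if the error returns after raising the value, it may be worth checking whether connections are being closed.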

  • I installed the CC trial with an error window saying: Could not create the file '/Users/dranim/Library/Preferences/Adobe/After Effects/13.2/dummy'.  That was the first window.   Heres the second: After Effects can't continue: unexpected failure during app

    I installed the CC trial and got an error window saying: Could not create the file '/Users/dranim/Library/Preferences/Adobe/After Effects/13.2/dummy'. That was the first window. Here's the second: After Effects can’t continue: unexpected failure during application startup. I paid for the month subscription of 29.99. It claims it is downloading again?! Am I missing something here? Why is this process so complicated? I need to get this resolved ASAP and start working.

    I originally had Adobe Photoshop Extended, then upgraded to the Production Suite. I ran the Adobe Cleaner, and that uninstalled most Adobe products, including my existing Adobe install, and then I re-installed everything with the same error code. Since CS4 came with CS5, I've installed AE CS4, but would really like to upgrade because I'm new to Creative Suite, and not sure how CS4 integrates with CS5...CS4 After Effects installed perfectly. I do have a 64 bit system, and installing to an OCZ Vertex 2....every other suite installs perfectly, except AE. And I think that is the coolest program in the Suite. I thank you all so much for taking the time to help, I really want to get AECS5 running...I did try to install after doing the recommended items Adobe suggests for Exit Codes 6 and 7, including turning off many startups...
    I'm baffled....
    Ben

  • How to design a grid to withstand a partial network failure

    Hi,
    We are evaluating Coherence for a mission-critical system where we want to test partial network failure scenario. We want to run 4 physical hosts, 8 JVMs with 2 JVM on each host. The evaluation criteria is to connect 2 machines on either side of a router, kill one side during a load test, thereby disconnecting the 2 machines and run with the remaining two. In order to have a fail-safe behavior in this scenario, I guess we must ascertain that the back-ups for the objects on one side of a router are always made on the other side. Can Coherence detect such a network set up and store backups accordingly? Or is there a way to configure this by overriding the default behavior?
    Pls advise
    Thanks,
    Sairam

    Hi Sairam
    If you use scenario 1) then your test will work. As this scenario only has two machines then the primary node for a piece of data will be on one machine and Coherence will make sure the backup is on the other machine. If you then break the link between the machines or lose a machine you will not have lost data.
    If, however, you have more than 2 machines and you break the link between them, you have what is known as a split-brain, which means you have effectively split your cluster in two. Each side only knows that it can no longer see the other part of the cluster and assumes it must be the remaining working part. In this case, though, you will have lost data from the cluster, as some of the backups for each part of the cluster will be on the other part. There is nothing you can do about this; you cannot control which machines backups are allocated to.
    Increasing the backup count to 3 does not give you any more resilience than having a backup count of 2. As far as I know, Coherence only guarantees that the first backup is placed on another machine.
    I am not quite sure what you are trying to test, as a Coherence cluster cannot automatically survive a network failure that splits the cluster. There are things in 3.6 that you might be able to do with Quorums to mitigate the damage while you recover, and there are things you can do to make recovery easier, but you will have to recover lost data.
    JK
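The point about backups landing on the wrong side of the router can be shown with a toy placement model. This is not Coherence's actual partition assignment strategy; it only assumes what the reply states, namely that the single backup sits on a different machine from the primary. Machine names and both functions are hypothetical.

```python
def assign_partitions(machines, partition_count):
    """Place each partition's primary and one backup on different machines."""
    placement = {}
    for partition in range(partition_count):
        i = partition % len(machines)
        primary = machines[i]
        backup = machines[(i + 1) % len(machines)]  # next machine in the ring
        placement[partition] = (primary, backup)
    return placement


def lost_partitions(placement, surviving):
    """Partitions with neither primary nor backup on a surviving machine."""
    return [p for p, (primary, backup) in placement.items()
            if primary not in surviving and backup not in surviving]


# Four hosts, A and B on one side of the router, C and D on the other.
placement = assign_partitions(["A", "B", "C", "D"], 8)
print(lost_partitions(placement, surviving={"C", "D"}))
```

With only two machines the same model never loses a partition when one machine disappears, which matches the two-machine case described in the reply.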

  • Modbus ip shared variable network failure

    I am using LabVIEW 8.6 with the DSC module to communicate with a Watlow system that contains five Watlow 96 controllers and an EM gateway. I have created shared variables for the process temperatures and setpoints of each of the five controllers, using the Watlow Modbus register numbers with a 400001 offset. I have also created shared variables for Updating, CommFail, UpdateNow, and UpdateRate, which were predefined. I get an error when starting the VI if the SV has been dragged and dropped into the block diagram. The message is: Error -1967353902 (The Modbus I/O server failed to receive any response from the Modbus slave device.) occurred at SV in vi. If I bind a variable in the VI to this same SV, the error does not occur, but the variable cycles between Good, Network Failure, No known value, and device failure, as shown under watched variables in the variable manager. Updating, CommFail, and UpdateRate all show a consistent Good in the quality column of the variable manager. UpdateNow has X in the value, type, timestamp, and quality columns. CommFail and Updating cycle between true and false randomly. I have tried third-party software, the SpecView 32 demo, to check whether communication with the Modbus system is working: I can create five Watlow controllers on my screen and direct them to the IP address along with a unit address, and the system works without faults. This leads me to believe the communication between the SV Engine and the IP address is not correct. HELP please.
    Robert Jensen
    UND EERC

    If your application can deal with it I would recommend staying clear of the 'Networked Published' option.
    When I started my Modbus development on cRIO....I left it enabled, and with ~100 shared variables on a 9074, the CPU was railing, and I saw a buffering behavior on the shared variables (which was not desirable in my application).
    In my application I am using the old Modbus library (as opposed to the new API) for cRIO-to-slave comms, the cRIO being the master.
    I am also using the IOserver making the cRIO a slave to an external SCADA - and it passes essentially the same data arrays as I use on the modbus library for my local HMI [Not an NI product].....Which is two full Modbus frame writes (@ 120 words each, and about 60 words more for ~300 words outbound from the cRIO).
    The IOserver slave was a recent addition and did not add much to the CPU load - although only 16 bytes is high speed, the balance of the total word package is at either 1 second or 3 seconds.
    So, in my experience, the 'Networked Published' option adds significant CPU loading (on entry-level cRIOs). YMMV.
    I am a huge fan of the shared variable engine (some at NI were pushing the CVT, TCE, etc.). However, most of my shared variables are not of the Networked Published variety; the exceptions are local module channels, which have remained network published for DSM (Distributed System Manager) use.

  • JMS adapter WebSphere MQ, network failure - stop

    Hello,
    After a network failure the JMS sender adapter (JMS --> XI) goes red, and doesn't automatically recover after a network failure. It is necessary to manually stop and start the adapter. This is bad design from SAP. You would like it to behave like the ftp adapter which automatically continues to poll for files after a network outage.
    What can be done to overcome this bad design?
    I'm on XI SP 18.
    My Idea is to use the new AAM described in
    SAP Note 766332
    1. com.sap.aii.af.service.administration.cpa: Channel-related interfaces and APIs
    2. com.sap.aii.af.service.administration.monitoring: Monitoring interfaces and APIs
    3. com.sap.aii.af.service.administration.i18n: Localization interfaces and APIs
    I would like to write a standard j2se application to connect using jndi to list the status of the channels, and then stop and start depending on status.
    Which jar files should be included?
    Example code?

    Hello,
    I have now built a workaround solution based on the AAM API.
    It is implemented as a J2EE stateless session bean and a J2EE client. The client reads a config file listing which channels to restart and calls the bean, which checks the status of those channels and restarts the ones that are in error or stopped. We schedule the program to run every 5 minutes and check the status.
    Check the channelstatus.jsp for hints on how to use the api.
    It's a pity SAP hasn't got this functionality builtin, it's clearly a design error.
    /Otto
    Edited by: Otto Frost on Dec 18, 2007 4:32 PM
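The logic of the workaround described above (check each configured channel, restart the ones in error or stopped, repeat every 5 minutes) is roughly the loop below. It is a language-neutral sketch in Python, not the AAM API: `get_status` and `restart` stand in for whatever the real bean calls, and the status strings "ERROR" and "STOPPED" are placeholders.

```python
import time


def restart_failed_channels(channels, get_status, restart):
    """Restart every channel whose status is ERROR or STOPPED.

    get_status and restart are caller-supplied callables (hypothetical
    stand-ins for the adapter administration calls).
    """
    restarted = []
    for channel in channels:
        if get_status(channel) in ("ERROR", "STOPPED"):
            restart(channel)
            restarted.append(channel)
    return restarted


def watchdog(channels, get_status, restart, interval_seconds=300, rounds=None):
    """Run the check every interval_seconds; rounds=None means forever."""
    done = 0
    while rounds is None or done < rounds:
        restart_failed_channels(channels, get_status, restart)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_seconds)
```

Keeping the status lookup and the restart call as parameters means the loop can be exercised with fakes before it is pointed at a live adapter engine.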

  • Oracle 10g CRS autorecovery from network failures - Solaris with IPMP

    Hi all,
    Just wondering if anyone has experience with a setup similar to mine. Let me first apologise for the lengthy introduction that follows >.<
    A quick run-down of my implementation: Sun SPARC Solaris 10, Oracle CRS, ASM and RAC database patched to version 10.2.0.4 respectively, no third-party cluster software used for a 2-node cluster. Additionally, the SAN storage is attached directly with fiber cable to both servers, and the CRS files (OCR, voting disks) are always visible to the servers, there is no switch/hub between the server and the storage. There is IPMP configured for both the public and interconnect network devices. When performing the usual failover tests for IPMP, both the OS logs and the CRS logs show a failure detected, and a failover to the surviving network interface (on both the public and the private network devices).
    For the private interconnect, when both of the network devices are disabled (by manually disconnecting the network cables), this results in the 2nd node rebooting, and the CRS process starting, but unable to synchronize with the 1st node (which is running fine the whole time). Further, when I look at the CRS logs, it is able to correctly identify all the OCR files and voting disks. When the network connectivity is restored, both the OS and CRS logs reflect this connection has been repaired. However, the CRS logs at this point still state that node 1 (which is running fine) is down, and the 2nd node attempts to join the cluster as the master node. When I manually run the 'crsctl stop crs' and 'crsctl start crs' commands, this results in a message stating that the node is going to be rebooted to ensure cluster integrity, and the 2nd node reboots, starts the CRS daemons again at startup, and joins the cluster normally.
    For the public network, when the 2nd node is manually disconnected, the VIP is seen to not failover, and any attempts to connect to this node via the VIP result in a timeout. When connectivity is restored, as expected the OS and CRS logs acknowledge the recovery, and the VIP for node 2 automatically fails over, but the listener goes down as well. Using the 'srvctl start listener' command brings it up again, and everything is fine. During this whole process, the database instance runs fine on both nodes.
    From the case studies above, I can see that the network failures are detected by the Oracle Clusterware, and a simple command run once this failure is repaired restores full functionality to the RAC database. However, is there anyway to automate this recovery, for the 2 cases stated above, so that there is no need for manual intervention by the DBAs? I was able to test case 2 (public network) with the Oracle document 805969.1 (VIP does not relocate back to the original node after public network problem is resolved), is there a similar workaround for the interconnect?
    Any and all pointers would be appreciated, and again, sorry for the lengthy post.
    Edited by: NS Selvam on 16-Dec-2009 20:36
    changed some minor typos

    hi
    I've given the shell script below; I just need to run it. I usually get output like this:
    [root@rac-1 Desktop]# sh iscsi-corntab.sh
    Logging in to [iface: default, target: iqn.2010-02-23.de.sayantan-chakraborty:storage.disk1.amiens.sys1.xyz, portal: 192.168.181.10,3260]
    Login to [iface: default, target: iqn.2010-02-23.de.sayantan-chakraborty:storage.disk1.amiens.sys1.xyz, portal: 192.168.181.10,3260]: successful
    The script contains:
    iscsiadm -m node -T iqn.2010-02-23.de.sayantan-chakraborty:storage.disk1.amiens.sys1.xyz -p 192.168.181.10 -l
    iscsiadm -m node -T iqn.2010-02-23.de.sayantan-chakraborty:storage.disk1.amiens.sys1.xyz -p 192.168.181.10 --op update -n node.startup -v automatic
    (cd /dev/disk/by-path; ls -l *sayantan-chakraborty* | awk '{FS=" "; print $9 " " $10 " " $11}')
    [root@rac-1 Desktop]# (cd /dev/disk/by-path; ls -l *sayantan-chakraborty* | awk '{FS=" "; print $9 " " $10 " " $11}')
    ip-192.168.181.10:3260-iscsi-iqn.2010-02-23.de.sayantan-chakraborty:storage.disk1.amiens.sys1.xyz-lun-1 -> ../../sdc
    Can you post the output of ls /dev/iscsi? You may get something like this:
    [root@rac-1 Desktop]# ls /dev/iscsi
    xyz
    [root@rac-1 Desktop]#
