Packet acknowledge failed after failover

Hello,
We're running MQ 3.5 SP1 EE and we are wondering how to deal with the following problem. We have a client that consists of 2 producers and 1 consumer. There is a single connection for the client. Each producer creates its own session with:
createSession(TRANSACTED_SESSION, Session.AUTO_ACKNOWLEDGE)
The consumer creates its session with:
createSession(TRANSACTED_SESSION, Session.DUPS_OK_ACKNOWLEDGE)
We have the following reconnect settings:
imqReconnectEnabled = true
imqReconnectAttempts = 1
imqAddressListIterations = -1
imqReconnectInterval = 30000
imqAddressListBehavior = RANDOM
Sometimes when we kill one of our cluster servers, the client threads begin throwing exceptions as follows. From the send() method of one of the producers:
com.sun.messaging.jms.JMSException: [C4000]: Packet acknowledge failed.
From the send() method of the other producer (a few seconds later):
com.sun.messaging.jms.JMSException: [C4001]: Write packet failed. - cause: java.net.SocketException: Socket is closed
Finally, an exception is caught by the consumer's exception listener:
JMSException caught: [C4002]: Read packet failed. - cause: java.net.SocketException: Socket is closed
What is the best way to recover from this? We were thinking of closing the connection, reopening, and starting all sessions again. Is there anything smarter we can do?
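For reference, a minimal sketch of the close-and-rebuild recovery we are considering, using only standard javax.jms calls; the class name, the rebuild() helper and the retry pause are placeholders rather than anything from the MQ API:

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.ExceptionListener;
import javax.jms.JMSException;
import javax.jms.Session;

// Sketch: on any connection failure, drop everything and rebuild the connection
// plus all three sessions from scratch.
public class ClientRecovery implements ExceptionListener {

    private final ConnectionFactory factory;   // e.g. a com.sun.messaging.ConnectionFactory
    private volatile Connection connection;

    public ClientRecovery(ConnectionFactory factory) {
        this.factory = factory;
    }

    // Called by the provider for failures the client sees asynchronously.
    public void onException(JMSException e) {
        rebuild();
    }

    // Producers would call this too when send() throws, then re-send the uncommitted work.
    public synchronized void rebuild() {
        try {
            if (connection != null) {
                connection.close();            // best effort; the socket is usually gone already
            }
        } catch (JMSException ignored) {
            // the old connection is unusable anyway
        }
        while (true) {
            try {
                connection = factory.createConnection();
                connection.setExceptionListener(this);
                Session producerSession1 = connection.createSession(true, Session.AUTO_ACKNOWLEDGE);
                Session producerSession2 = connection.createSession(true, Session.AUTO_ACKNOWLEDGE);
                Session consumerSession  = connection.createSession(true, Session.DUPS_OK_ACKNOWLEDGE);
                // ...recreate the two producers and the consumer on these sessions here...
                connection.start();
                return;
            } catch (JMSException retry) {
                try { Thread.sleep(30000); } catch (InterruptedException ie) { return; }
            }
        }
    }
}

Since the sessions are transacted, anything not yet committed on the old sessions is rolled back by the broker, so the producers would re-send the messages of the interrupted transaction once rebuild() returns.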
Kernel: 2.4.21-9.0.3.ELsmp
Dist: RedHat ES 3.0
Thanks,
Aaron

I know this is a long shot, but what was the outcome of this? How did you fix this issue? I am facing the same issue with MQ 4.3 (flaky network, occasional failure to recover, even with infinite recovery configured).
Once the 'Packet acknowledge failed' error occurs (see below), the consumer recovery thread gets stuck in a loop of retrying but never recovers, even though it says it has (it thinks it's connected, but no messages are ever delivered to it). Since the network is flaky, the consuming client is often disconnected, and it usually recovers okay (thanks to the imq reconnect attributes); but if the error below occurs, that client never reconnects to the broker:
2009-10-28 13:58:52,812 INFO [com.xxx.MessageListenerContainer] Setup of JMS message listener invoker failed - trying to recover: com.sun.messaging.jms.JMSException: [C4000]: Packet acknowledge failed. user=user, broker=xxx:1093(8095)
Here are the logs of the recovery that results in a consumer that never gets anything delivered to it...
2009-10-28 13:58:52,875 ERROR [STDERR] 28-Oct-2009 13:58:52 com.sun.messaging.jmq.jmsclient.ConnectionRecover logRecoverState
INFO: [I107]: Connection recover state: RECOVER_TRANSPORT_CONNECTED, broker: xxx1:1093(8095)
2009-10-28 13:58:52,875 ERROR [STDERR] 28-Oct-2009 13:58:52 com.sun.messaging.jmq.jmsclient.ConnectionRecover logRecoverState
INFO: [I107]: Connection recover state: RECOVER_STARTED, broker: xxx1:1093(8095)
2009-10-28 13:58:52,875 ERROR [STDERR] 28-Oct-2009 13:58:52 com.sun.messaging.jmq.jmsclient.ConnectionRecover logRecoverState
INFO: [I107]: Connection recover state: RECOVER_IN_PROCESS, broker: xxx1:1093(8095)
2009-10-28 13:58:52,875 ERROR [STDERR] 28-Oct-2009 13:58:52 com.sun.messaging.jmq.jmsclient.ConnectionRecover logRecoverState
INFO: [I107]: Connection recover state: RECOVER_SUCCEEDED, broker: xxx1:1093(8095)
2009-10-28 13:58:52,875 ERROR [STDERR] 28-Oct-2009 13:58:52 com.sun.messaging.jmq.jmsclient.ConnectionRecover logRecoverState
INFO: [I107]: Connection recover state: RECOVER_INACTIVE, broker: xxx1:1093(8095)
2009-10-28 13:58:53,000 INFO [com.xxx.MessageListenerContainer] Successfully refreshed JMS Connection
But ultimately, the client goes into a state where it doesn't recover. Here you can see 4 clients trying to recover from previous failures:
2009-10-28 14:02:39,796 ERROR [STDERR] 28-Oct-2009 14:02:39 com.sun.messaging.jmq.jmsclient.ExceptionHandler logCaughtException
WARNING: [I500]: Caught JVM Exception: java.net.SocketException: Socket closed
2009-10-28 14:02:42,500 ERROR [STDERR] 28-Oct-2009 14:02:42 com.sun.messaging.jmq.jmsclient.ExceptionHandler logCaughtException
WARNING: [I500]: Caught JVM Exception: java.net.SocketException: Socket closed
2009-10-28 14:02:42,500 ERROR [STDERR] 28-Oct-2009 14:02:42 com.sun.messaging.jmq.jmsclient.ExceptionHandler logCaughtException
WARNING: [I500]: Caught JVM Exception: java.net.SocketException: Socket closed
2009-10-28 14:02:42,500 ERROR [STDERR] 28-Oct-2009 14:02:42 com.sun.messaging.jmq.jmsclient.ExceptionHandler logCaughtException
WARNING: [I500]: Caught JVM Exception: java.net.SocketException: Socket closed
Each time the failure occurs, we get an additional exception. Rebooting the MQ broker allows a new client to be created and the messages are delivered, but the existing (previously failed) clients are still left trying to connect.
Sounds like we're using some stale connection perhaps? We're using Spring's DefaultMessageListenerContainer. Could it be that the container is caching an old connection and trying to re-use it rather than creating a new one? It's not clear to me (yet) whether the MQ client hides the failure from Spring's DefaultMessageListenerContainer and so doesn't give the Spring class the chance to recover itself.
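One thing we are going to try, in case it helps anyone else: telling the container not to cache the JMS connection at all, so every recovery cycle is forced to build a fresh one. A rough sketch (the destination name and wiring are placeholders, and the cache-level constant should be checked against your Spring version):

import javax.jms.ConnectionFactory;
import javax.jms.MessageListener;
import org.springframework.jms.listener.DefaultMessageListenerContainer;

// Sketch: force the listener container to create a new Connection/Session per
// receive attempt instead of re-using a possibly stale cached one.
public class ListenerSetup {
    public static DefaultMessageListenerContainer build(ConnectionFactory mqConnectionFactory,
                                                        MessageListener listener) {
        DefaultMessageListenerContainer container = new DefaultMessageListenerContainer();
        container.setConnectionFactory(mqConnectionFactory);
        container.setDestinationName("someQueue");   // placeholder destination
        container.setMessageListener(listener);
        container.setCacheLevel(DefaultMessageListenerContainer.CACHE_NONE);
        container.setRecoveryInterval(5000);         // ms between container recovery attempts
        container.afterPropertiesSet();
        container.start();
        return container;
    }
}

CACHE_NONE is the most expensive setting, so this is meant as a diagnostic rather than a fix: if the stuck recovery goes away, the cached connection was the stale one.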
Many thanks

Similar Messages

  • [C4000]: Packet acknowledge failed.

    My HA cluster takeover (to a second broker - JDBC persistent data [ORA9i]):
    [B1168]: Takeover lock has been acquired for failed broker xxxxxxxx
    takes a long time, and I receive the following exception:
    INFO: [I107]: Connection recover state: RECOVER_FAILED, broker: xxxxxxxx:7676(33443)
    May 29, 2009 11:02:59 AM com.sun.messaging.jmq.jmsclient.ConnectionRecover run
    WARNING: com.sun.messaging.jms.JMSException: [C4000]: Packet acknowledge failed. user=guest, broker=xxxxxxxx:7676(33443)
    com.sun.messaging.jms.JMSException: [C4000]: Packet acknowledge failed. user=guest, broker=xxxxxxxx:7676(33443)
         at com.sun.messaging.jmq.jmsclient.ProtocolHandler.writePacketWithAck(ProtocolHandler.java:735)
         at com.sun.messaging.jmq.jmsclient.ProtocolHandler.writePacketWithReply2(ProtocolHandler.java:497)
         at com.sun.messaging.jmq.jmsclient.ProtocolHandler.hello(ProtocolHandler.java:949)
         at com.sun.messaging.jmq.jmsclient.ConnectionImpl.hello(ConnectionImpl.java:557)
         at com.sun.messaging.jmq.jmsclient.ConnectionRecover.recover(ConnectionRecover.java:308)
         at com.sun.messaging.jmq.jmsclient.ConnectionRecover.run(ConnectionRecover.java:230)
         at java.lang.Thread.run(Thread.java:595)
    What does it mean?
    thanks asterix

    The client logging "Packet acknowledge failed" during reconnecting is expected if the broker hasn't completed the takeover - the client runtime will simply try to reconnect again
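    For completeness, here is a sketch of setting the client-runtime reconnect properties programmatically on the MQ connection factory so that the client keeps retrying until the takeover completes. The broker addresses and values below are examples only, not taken from this thread:

    import javax.jms.JMSException;
    import com.sun.messaging.ConnectionConfiguration;
    import com.sun.messaging.ConnectionFactory;

    // Sketch: configure the MQ client runtime to keep reconnecting across an HA takeover.
    public class ReconnectingFactory {
        public static ConnectionFactory create() throws JMSException {
            ConnectionFactory cf = new ConnectionFactory();
            // Both HA brokers; the client walks this list when the current broker dies.
            cf.setProperty(ConnectionConfiguration.imqAddressList,
                    "mq://brokerA:7676,mq://brokerB:7676");
            cf.setProperty(ConnectionConfiguration.imqReconnectEnabled, "true");
            cf.setProperty(ConnectionConfiguration.imqReconnectAttempts, "-1");     // unlimited per address
            cf.setProperty(ConnectionConfiguration.imqReconnectInterval, "5000");   // ms between attempts
            cf.setProperty(ConnectionConfiguration.imqAddressListIterations, "-1"); // keep cycling the list
            cf.setProperty(ConnectionConfiguration.imqAddressListBehavior, "PRIORITY");
            return cf;
        }
    }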

  • Packet acknowledge failed exception

    Hello. I'm making simple use of JMS and recently started getting an exception when I run my JUnit test. I'm getting the following exception (you'll have to forgive me for not giving the rest of the exception: the computer that is getting it is not connected to the internet, and I don't feel like retyping all the lines and/or copying them to CD to be able to copy and paste them).
    sun.messaging.jms.JMSException: [C4000]: Packet acknowledge failed. user-admin, broker=localhost:7676(34554)
    This exception started when I added a means for my parent program to send terminate requests to the child, so I'm assuming that this is because the child terminates before it acknowledges the terminate request. I can't find any documentation online to confirm whether or not that could be the case. Assuming that is the case, what is the best way to handle it? I need to acknowledge most messages, but would prefer not to throw an exception in the one instance where I intentionally terminate the child.
    edit: it seems that messages other than the terminate message can throw this exception. It's being thrown by the parent at seemingly random periods whenever it tries to write to the child.
    help is appreciated ;)
    Edited by: dsollen on Oct 26, 2009 7:48 AM

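    One way to avoid the [C4000] on an intentional shutdown is to let the terminate message be acknowledged normally and only then close the connection cleanly before the child exits, ignoring any JMSException thrown during that teardown. A rough sketch, assuming AUTO_ACKNOWLEDGE and a hypothetical "TERMINATE" text command (neither is confirmed by the post):

    import javax.jms.Connection;
    import javax.jms.JMSException;
    import javax.jms.Message;
    import javax.jms.MessageListener;
    import javax.jms.TextMessage;

    // Sketch: child-side listener that shuts down cleanly on a terminate request.
    public class ChildListener implements MessageListener {

        private final Connection connection;

        public ChildListener(Connection connection) {
            this.connection = connection;
        }

        public void onMessage(Message message) {
            try {
                if (message instanceof TextMessage
                        && "TERMINATE".equals(((TextMessage) message).getText())) {
                    // Returning from onMessage lets the session acknowledge the message first.
                    // JMS forbids closing a connection from its own listener thread, so close
                    // it from a separate thread and then exit.
                    new Thread(new Runnable() {
                        public void run() {
                            try {
                                connection.close();   // flushes pending acks to the broker
                            } catch (JMSException ignored) {
                                // we are shutting down anyway
                            }
                            System.exit(0);
                        }
                    }).start();
                    return;
                }
                // ... normal message handling ...
            } catch (JMSException e) {
                e.printStackTrace();
            }
        }
    }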

  • SUP failed over manually, voice service failed after FAILOVER, started accessing old voice vlan which was removed from config

    Hey guys, 
    I am pretty sure my subject is kind of confusing. Sorry about that. Here is what happened.
    1. 4510R with Supervisor V 1000BaseX: switched over to the standby Sup, then reseated the active SUP; once the reseat was complete, switched again to get the reseated SUP up and running as the active SUP.
    2. A simple maintenance which was supposed to cause no outage, and it did not cause any outage.
    3. However, what I did not notice was that even though the voice VLAN was configured to access 2353, the ports were accessing VLAN 453.
    4. The change was made 2 weeks prior to this maintenance: the voice VLANs were previously accessing 453 and they were all changed to access 2353. Configs were saved.
    5. However, after the maintenance, the running config showed that they were accessing 2353, but when checking the MAC addresses on the interfaces, they were seen on 453.
    6. The fix was to remove the config and re-add it; that fixed it.
    Has anyone else experienced this issue? What really happened there?
    software version: Version 15.0(2)SG5
    #sh module 
    Chassis Type : WS-C4510R
    Power consumed by backplane : 40 Watts
    Mod Ports          Card Type                                            Model             
    ---+-----+--------------------------------------+------------------+-----------
     1     2  Supervisor V 1000BaseX (GBIC)                 WS-X4516            
     2     2  Supervisor V 1000BaseX (GBIC)                  WS-X4516           
     3    48  10/100/1000BaseT (RJ45)V, Cisco/IEEE   WS-X4548-GB-RJ45V  
     5    48  10/100/1000BaseT (RJ45)V, Cisco/IEEE   WS-X4548-GB-RJ45V   
     6    48  10/100/1000BaseT (RJ45)V, Cisco/IEEE   WS-X4548-GB-RJ45V   
     7    48  10/100/1000BaseT (RJ45)V, Cisco/IEEE   WS-X4548-GB-RJ45V  
     8    48  10/100/1000BaseT (RJ45)V, Cisco/IEEE   WS-X4548-GB-RJ45V   
     9    48  10/100/1000BaseT (RJ45)V, Cisco/IEEE   WS-X4548-GB-RJ45V   

    Configs were saved many times prior to the maintenance. I did a "write mem".

  • Certs not working after failover

    We had to fail over our primary ACE to the secondary because the primary ACE crashed and we had to replace it. After we failed over, certain certs stopped working.
    To fix this problem I had to remove the cert from the ssl-proxy service and re-add it to get it working. Has anybody run into a problem like this? Why would this fail after failover to the secondary ACE?

    Oracle-User wrote:
    Hi All
    We are facing issues on FNDWRR.exe after we failover to DR apps servers. We are not able to retrieve concurrent job logs using web browser. Apache log file shows "Premature end of script headers" error message.
    [Sat May 25 19:51:21 2013] [error] [client 10.64.224.134] [ecid: 1369507879:10.166.3.16:3282:0:13,0] Premature end of script headers: /d01/app/ebs/gap/apps_st/comn/webapps/oacore/html/bin/FNDWRR.exe
    Any idea what can be causing this issue?
    Thanks in advance!
    Did AutoConfig complete successfully?
    Please relink FNDWRR.exe manually or via adadmin and check then.
    Thanks,
    Hussein

  • Server 2012 File Server Cluster Shadow Copies Disappear Some Time After Failover

    Hello,
    I've seen similar questions posted on here before however I have yet to find a solution that worked for us so I'm adding my process in hopes someone can point out where I went wrong.
    The problem: After failover, shadow copies are only available for a short time on the secondary server.  Before the task to create new shadow copies happens the shadow copies are deleted.  Failing back shows them missing on the primary server as
    well when this happens.
    We have a 2 node (hereafter server1 and server2) cluster with a quorum disk.  There are 8 disk resources which are mapped to the cluster via iScsi.  4 of these disks are setup as storage and the other 4 are currently set up as shadow copy volumes
    for their respective storage volume.
    Previously we weren't using separate shadow copy volumes and were seeing the same issue described in the topic title. I followed two other topics on here that seemed close and then set up the separate shadow copy volumes; however, it has yet to alleviate the issue. These are the two other topics:
    Topic 1: https://social.technet.microsoft.com/Forums/windowsserver/en-US/ba0d2568-53ac-4523-a49e-4e453d14627f/failover-cluster-server-file-server-role-is-clustered-shadow-copies-do-not-seem-to-travel-to?forum=winserverClustering
    Topic 2: https://social.technet.microsoft.com/Forums/windowsserver/en-US/c884c31b-a50e-4c9d-96f3-119e347a61e8/shadow-copies-missing-after-failover-on-2008-r2-cluster
    After reading both of those topics I did the following:
    1) Add the 4 new volumes to the cluster for shadow copies
    2) Made each storage volume dependent on its shadow copy volume in FCM
    3) Went to the currently active node directly and opened up "My Computer", I then went to the properties of each storage volume and set up shadow copies to go to the respective shadow copy volume drive letter with correct size for spacing, etc.
    4) I then went back to FCM, right-clicked on the corresponding storage volume, chose "Configure Shadow Copy", and set the schedule for 12:00 noon and 5:00 PM.
    5) I noticed that on the nodes the task was created and that the task would failover between the nodes and appeared correct.
    6) Everything appears to failover correctly, all volumes come up, drive letters are same, shadow copy storage settings are the same, and 4 scheduled tasks for shadow copy appear on the current node after failover.
    Thinking everything was set up according to best practice, I did some testing by changing file contents throughout the day, making sure that previous versions were created as scheduled on server1. I then rebooted Server1 to simulate failure. Server2 picked up the role within about 10 seconds and files were available. I checked and I could still see previous versions for the files after failover that were created on server1. Unfortunately that didn't last: the next day before noon I was going to make more changes to files to ensure that not only could we see the shadow copies that were created when Server1 owned the file server role but also that the copies created on Server2 would be seen on failback. I was disappointed to discover that the shadow copies were all gone and failing back didn't produce them either.
    Does anyone have any insight into this issue?  I must be missing a switch somewhere or perhaps this isn't even possible with our cluster type based on this: http://technet.microsoft.com/en-us/library/cc779378%28v=ws.10%29.aspx
    Now here's an interesting part: shadow copies on 1 of our 4 volumes have been retained from both nodes through the testing, but I can't figure out what makes it different. I do suspect that perhaps the "Disk #s" in Computer Management / Disk Management need to be the same between servers? For example, on server 1 the disk # for cluster volume 1 might be "Disk4" but on server 2 the same volume might be called "Disk7"; however, I think that operations like this and shadow copy are based on the disk GUID, so perhaps this shouldn't matter.
    Edit: I checked on the disk numbers, and I see no correlation between what I'm seeing in shadow copy and what is happening to the numbers. All other items, quotas, etc. fail over and work correctly despite these diffs:
    Disk Numbers on Server 1:
    Format: "shadow/storerelation volume = Disk Number"
    aHome storage1 =   16 
    aShared storage2 = 09
    sHome storage3 =   01
    sShared storage4 = 04
    aHome shadow1 =   10
    aShared shadow2 = 11
    sHome shadow3 =   02
    sShared shadow4 = 05
    Disk numbers on Server 2:
    aHome storage1 = 16 (SAME)
    aShared storage2 = 04 (DIFF)
    sHome storage3 = 05 (DIFF)
    sShared storage4 = 08 (DIFF)
    aHome shadow1 = 10 (SAME)
    aShared shadow2 = 11 (SAME)
    sHome shadow3 = 06 (DIFF)
    sShared shadow4 = 09 (DIFF)
    Thanks in advance for your assistance/guidance on this matter!

    Hello Alex,
    Thank you for your reply.  I will go through your questions in order as best I can, though I'm not the backup expert here.
    1) "Did you see any event ID when the VSS fail?
    please offer us more information about your environment, such as what type backup you are using the soft ware based or hard ware VSS device."
    I saw a number of events on inspection.  Interestingly enough, the event ID 60 issues did not occur on the drive where shadow copies did remain after the two reboots.  I'm putting my event notes in a code block to try to preserve formatting/readability.
     I've written down events from both server 1 and 2 in this code block, documenting the first reboot causing the role to move to server 2 and then the second reboot going back to server 1:
    JANUARY 2
    9:34:20 PM - Server 1 - Event ID: 1074 - INFO - Source: User 32 - Standard reboot request from explorer.exe (Initiated by me)
    9:34:21 PM - Server 1 - Event ID: 7036 - INFO - Source: Service Control Manager - "The Volume Shadow Copy service entered the running state."
    9:34:21 PM - Server 1 - Event ID: 60 - ERROR - Source: volsnap - "The description for Event ID 60 from source volsnap cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
    If the event originated on another computer, the display information had to be saved with the event.
    The following information was included with the event:
    \Device\HarddiskVolumeShadowCopy49
    F:
    T:
    The locale specific resource for the desired message is not present"
    9:34:21 PM - Server 1 - Event ID 60 - ERROR - Source: volsnap - "The description for Event ID 60 from source volsnap cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
    If the event originated on another computer, the display information had to be saved with the event.
    The following information was included with the event:
    \Device\HarddiskVolumeShadowCopy1
    H:
    V:
    The locale specific resource for the desired message is not present"
    ***The above event repeats with only the number changing, drive letters stay same, citing VolumeShadowCopy# numbers 6, 13, 18, 22, 27, 32, 38, 41, 45, 51,
    9:34:21 PM - Server 1 - Event ID: 60 - ERROR - Source: volsnap - "The description for Event ID 60 from source volsnap cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
    If the event originated on another computer, the display information had to be saved with the event.
    The following information was included with the event:
    \Device\HarddiskVolumeShadowCopy4
    E:
    S:
    The locale specific resource for the desired message is not present"
    ***The above event repeats with only the number changing, drive letters stay same, citing VolumeShadowCopy# numbers 5, 10, 19, 21, 25, 29, 37, 40, 46, 48, 48
    9:34:28 PM - Server 1 - Event ID: 7036 - INFO - Source: Service Control Manager - "The NetBackup Legacy Network Service service entered the stopped state."
    9:34:28 PM - Server 1 - Event ID: 7036 - INFO - Source: Service Control Manager - "The Volume Shadow Copy service entered the stopped state.""
    9:34:29 PM - Server 1 - Event ID: 7036 - INFO - Source: Service Control Manager - "The NetBackup Client Service service entered the stopped state."
    9:34:30 PM - Server 1 - Event ID: 7036 - INFO - Source: Service Control Manager - "The NetBackup Discovery Framework service entered the stopped state."
    10:44:07 PM - Server 2 - Event ID: 7036 - INFO - Source: Service Control Manager - "The Volume Shadow Copy service entered the running state."
    10:44:08 PM - Server 2 - Event ID: 7036 - INFO - Source: Service Control Manager - "The Microsoft Software Shadow Copy Provider service entered the running state."
    10:45:01 PM - Server 2 - Event ID: 48 - ERROR - Source: bxois - "Target failed to respond in time to a NOP request."
    10:45:01 PM - Server 2 - Event ID: 20 - ERROR - Source: bxois - "Connection to the target was lost. The initiator will attempt to retry the connection."
    10:45:01 PM - Server 2 - Event ID: 153 - WARN - Source: disk - "The IO operation at logical block address 0x146d2c580 for Disk 7 was retried."
    10:45:03 PM - Server 2 - Event ID: 34 - INFO - Source: bxois - "A connection to the target was lost, but Initiator successfully reconnected to the target. Dump data contains the target name."
    JANUARY 3
    At around 2:30 I rebooted Server 2, seeing that the shadow copies were missing after the previous failure. Here are the relevant events from the flip back to server 1.
    2:30:34 PM - Server 2 - Event ID: 60 - ERROR - Source: volsnap - "The description for Event ID 60 from source volsnap cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
    If the event originated on another computer, the display information had to be saved with the event.
    The following information was included with the event:
    \Device\HarddiskVolumeShadowCopy24
    F:
    T:
    The locale specific resource for the desired message is not present"
    2:30:34 PM - Server 2 - Event ID: 60 - ERROR - Source: volsnap - "The description for Event ID 60 from source volsnap cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
    If the event originated on another computer, the display information had to be saved with the event.
    The following information was included with the event:
    \Device\HarddiskVolumeShadowCopy23
    E:
    S:
    The locale specific resource for the desired message is not present"
    We are using Symantec NetBackup. The client agent is installed on both server1 and server2. We're backing them up based on the complete drive letter for each storage volume (this makes recovery easier). I believe this is what you would call "software-based VSS". We don't have the infrastructure/setup to do hardware-based snapshots. The drives reside on a Compellent SAN mapped to the cluster via iSCSI.
    2) "Confirm the following registry is exist:
    - HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\VSS\Settings"
    The key is there, however the DWORD value is not, would that mean that the
    default value is being used at this point?

  • Brconnect Error ORA-12637: Packet receive failed

    Hello,
    We have a problem with Brconnect (42).
    When I start the brconnect program in the DB13 transaction I get the following error message.
    The error comes when I start the brconnect program under the sapservice(SID) user.
    When I start the brconnect program under the (SID)ADM user, brconnect starts error free.
    After a server restart the brconnect program also runs fine for approximately 1 hour under the sapservice(SID) user.
    We are using SAP ECC 6.0 with Oracle 10g in a Windows 2003 x64 R2 SP2 environment.
    BR0301E SQL error -12637 at location BrInitOraCreate-2, SQL statement:
    'CONNECT system/************* AT PROF_CONN IN SYSOPER MODE'
    ORA-12637: Packet receive failed
    BR0303E Determination of Oracle version failed
    Do you have any idea what the problem is?
    Thank you a lot.
    Imre Bolyki

    Hi,
    To execute the BRTools you need to have enough permissions and the environment variables set for that particular user. By default SAPINST sets those variables and grants the required permissions to the <SID>ADM user for executing the BRTools. Other users may not have enough permissions to execute this, and in turn these users will not have enough access to work on the database.
    Regards,
    Varadharajan M

  • ORA-12637: Packet receive failed

    Hi All,
    I really need your help.
    I've installed Oracle Database 10g Release 2. The database works properly.
    Developer 6i, Developer 10g, SQL Developer, etc. work with this database without problems.
    But Forms 5 can't work. It takes about 3 minutes to connect to the 10g database. After that I can build data blocks.
    When I run the form, I see the following message:
    ORA-12637: Packet receive failed
    What should I do to resolve this problem?
    Thanks.

    Are you sure Developer 5 can't work with Database 10g? Which Oracle version is Developer 5? You can see it by entering SQL*Plus from the Developer 5 Home.
    According to Metalink Note:207303.1 - Client / Server / Interoperability Support Between Different Oracle Versions not even 8.0.6, the Forms 6i version, is certified with 10g, but there are patches, as I said before.
    I'm not aware of patches for Forms 5 to run with 10g.

  • Live Migration Failed After Yum Update on 2.2.2

    Hi,
    I live-migrated VMs from node2 to node1 (the master server), put node2 into maintenance mode, then reconfigured ntp.conf to sync with our new NTP server. While in maintenance mode, I also ran yum update, which updated the kernel from 2.6.18-128.2.1.4.37.el5xen to 2.6.18-128.2.1.4.44.el5xen. In the process, xen was also updated from 3.4.0-0.1.32.el5 to 3.4.0-0.1.39.el5.
    After rebooting, I put node2 back into active mode. But now I can't live-migrate the VMs back onto node2. From node1, ovs_operations.log:
    "2012-06-23 11:03:28" INFO=> migrate_vm: vm('/OVS/running_pool/1210_vm05') start...
    "2012-06-23 11:03:29" INFO=> xen_migrate_vm: migrate with ssl enabled failed, do failover(no ssl). vm('/var/ovs/mount/54DA5753709A48B3BFAEE65C2EAECCE0/running_pool/1210_vm05') -> tgt_srv('node2')
    "2012-06-23 11:03:30" ERROR=> xen_migrate_vm: failed. vm('/var/ovs/mount/54DA5753709A48B3BFAEE65C2EAECCE0/running_pool/1210_vm05') -> tgt_srv('node2') =><Exception: xen_migrate_vm: migrate without ssl failed either.>
    From node2 ovs_operations.log:
    "2012-06-23 11:03:30" INFO=> ha_join_dlm_domain: =>success
    "2012-06-23 11:03:30" ERROR=> ha_set_dlm_lock:failed. lock('6f1cb211-ecef-4ac6-af2b-091fc6fd5966') name('1210_vm05')=> <Exception: create lock('/dlm/ovm/f1cb211ecef4ac6af2b091fc6fd5966') failed. <OSError: [Errno 26] Text file busy: '/dlm/ovm/f1cb211ecef4ac6af2b091fc6fd5966'>
    Is this due to the different xen versions after the update? How can I live-migrate the VMs back to node2 so that I can take node1 into maintenance mode to reconfigure ntp.conf and run yum update?
    Thanks!

    The next thing to try is to work through the steps in [[Error loading web sites]]. Though if this is happening on all of your computers it might have something to do with your internet connection. Maybe resetting that will help.

  • Can I know what "Commit" means after failover to Azure?

    Can I know what "Commit" means after failover to Azure?
    I want to know about the "Commit" button for protected items.
    SETUP RECOVERY: [ Between an on-premises Hyper-V site and Azure ]
    After failover from the on-premises Hyper-V site to Azure, the protected item shows a "Commit" button.
    The "Commit" jobs include "Prerequisite check" and "Commit".
    Regards,
    Yoshihiro Kawabata

    In ASR, failover can be thought of as a two-phase activity:
    1) The actual failover, where you bring up the VM in Azure using the latest recovery point available.
    2) Committing the failover to that point.
    Now the question in your mind will be why we have these two phases. The reason is as follows.
    Let's say you have configured your VM to have 24 recovery points with hourly app-consistent snapshots. When you fail over, ASR automatically picks up the latest point in time that is available for failover (say 9:35 AM that day). Say you were not happy with that recovery point because of some consistency issue in the application data: you can use the Change Recovery Point button (gesture) in the ASR portal to choose a different recovery point (say an app-consistent snapshot from 9:00 AM that day) to perform the failover.
    Once you are satisfied with the snapshot that is failed over in Azure you can hit the Commit button. Once you hit the Commit button you will not be able to change your recovery point.
    Let me know if you have more questions.
    Regards,
    Anoop KV

  • Packet receive fails when trying to connect to database

    Dear experts,
    I always receive an error message, "Packet Receive Failed", whenever I try to connect to my 9i database. Once in a while I am able to connect without problems, but sometimes it is very annoying that I keep receiving this error and have to retry many times to connect, depending on my luck.
    Does anyone have any idea what could be wrong here?
    Many thanks in advance for your advice.

    Me too.
    I installed Oracle 10g for Windows on PC1
    and Developer/2000 Forms 5.0.6.8 + SQL*Net client 2.3.4.0.0 on PC2.
    When I try to connect to the database using SQL*Plus 3.3.4.0.0, the first connection is OK (but not as fast as in a normal situation).
    After disconnecting SQL*Plus and trying to make another connection immediately, the [ORA-12637 packet receive failed] error appears.
    After some time, it may succeed in connecting again.
    I have disabled all firewalls and the problem still happens.
    Can anyone help, please?

  • Unit test fails after upgrading to Kodo 4.0.0 from 4.0.0-EA4

    I have a group of 6 unit tests failing after upgrading to the new Kodo 4.0.0 (with BEA) from Kodo-4.0.0-EA4 (with Solarmetric). I'm getting exceptions like the one at the bottom of this email. It seems to be an interaction with the PostgreSQL driver, though I can't be sure. I haven't changed my JDO configuration or the related classes in months since I've been focusing on using the objects that have already been defined. The .jdo, .jdoquery, and .java code are below the exception, just in case there's something wrong in there. Does anyone have advice as to how I might debug this?
    Thanks,
    Mark
    Testsuite: edu.ucsc.whisper.test.integration.UserManagerQueryIntegrationTest
    Tests run: 15, Failures: 0, Errors: 6, Time elapsed: 23.308 sec
    Testcase:
    testGetAllUsersWithFirstName(edu.ucsc.whisper.test.integration.UserManagerQueryIntegrationTest):
    Caused an ERROR
    The column index is out of range: 2, number of columns: 1.
    <2|false|4.0.0> kodo.jdo.DataStoreException: The column index is out of range: 2, number of columns: 1.
         at kodo.jdbc.sql.DBDictionary.newStoreException(DBDictionary.java:4092)
         at kodo.jdbc.sql.SQLExceptions.getStore(SQLExceptions.java:82)
         at kodo.jdbc.sql.SQLExceptions.getStore(SQLExceptions.java:66)
         at kodo.jdbc.sql.SQLExceptions.getStore(SQLExceptions.java:46)
         at kodo.jdbc.kernel.SelectResultObjectProvider.handleCheckedException(SelectResultObjectProvider.java:176)
         at kodo.kernel.QueryImpl$PackingResultObjectProvider.handleCheckedException(QueryImpl.java:2460)
         at com.solarmetric.rop.EagerResultList.<init>(EagerResultList.java:32)
         at kodo.kernel.QueryImpl.toResult(QueryImpl.java:1445)
         at kodo.kernel.QueryImpl.execute(QueryImpl.java:1136)
         at kodo.kernel.QueryImpl.execute(QueryImpl.java:901)
         at kodo.kernel.QueryImpl.execute(QueryImpl.java:865)
         at kodo.kernel.DelegatingQuery.execute(DelegatingQuery.java:787)
         at kodo.jdo.QueryImpl.executeWithArray(QueryImpl.java:210)
         at kodo.jdo.QueryImpl.execute(QueryImpl.java:137)
         at edu.ucsc.whisper.core.dao.JdoUserDao.findAllUsersWithFirstName(JdoUserDao.java:232)
         at edu.ucsc.whisper.core.manager.DefaultUserManager.getAllUsersWithFirstName(DefaultUserManager.java:252)
    NestedThrowablesStackTrace:
    org.postgresql.util.PSQLException: The column index is out of range: 2, number of columns: 1.
         at org.postgresql.core.v3.SimpleParameterList.bind(SimpleParameterList.java:57)
         at org.postgresql.core.v3.SimpleParameterList.setLiteralParameter(SimpleParameterList.java:101)
         at org.postgresql.jdbc2.AbstractJdbc2Statement.bindLiteral(AbstractJdbc2Statement.java:2085)
         at org.postgresql.jdbc2.AbstractJdbc2Statement.setInt(AbstractJdbc2Statement.java:1133)
         at com.solarmetric.jdbc.DelegatingPreparedStatement.setInt(DelegatingPreparedStatement.java:390)
         at com.solarmetric.jdbc.PoolConnection$PoolPreparedStatement.setInt(PoolConnection.java:440)
         at com.solarmetric.jdbc.DelegatingPreparedStatement.setInt(DelegatingPreparedStatement.java:390)
         at com.solarmetric.jdbc.DelegatingPreparedStatement.setInt(DelegatingPreparedStatement.java:390)
         at com.solarmetric.jdbc.DelegatingPreparedStatement.setInt(DelegatingPreparedStatement.java:390)
         at com.solarmetric.jdbc.LoggingConnectionDecorator$LoggingConnection$LoggingPreparedStatement.setInt(LoggingConnectionDecorator.java:1257)
         at com.solarmetric.jdbc.DelegatingPreparedStatement.setInt(DelegatingPreparedStatement.java:390)
         at com.solarmetric.jdbc.DelegatingPreparedStatement.setInt(DelegatingPreparedStatement.java:390)
         at kodo.jdbc.sql.DBDictionary.setInt(DBDictionary.java:980)
         at kodo.jdbc.sql.DBDictionary.setUnknown(DBDictionary.java:1299)
         at kodo.jdbc.sql.SQLBuffer.setParameters(SQLBuffer.java:638)
         at kodo.jdbc.sql.SQLBuffer.prepareStatement(SQLBuffer.java:539)
         at kodo.jdbc.sql.SQLBuffer.prepareStatement(SQLBuffer.java:512)
         at kodo.jdbc.sql.SelectImpl.execute(SelectImpl.java:332)
         at kodo.jdbc.sql.SelectImpl.execute(SelectImpl.java:301)
         at kodo.jdbc.sql.Union$UnionSelect.execute(Union.java:642)
         at kodo.jdbc.sql.Union.execute(Union.java:326)
         at kodo.jdbc.sql.Union.execute(Union.java:313)
         at kodo.jdbc.kernel.SelectResultObjectProvider.open(SelectResultObjectProvider.java:98)
         at kodo.kernel.QueryImpl$PackingResultObjectProvider.open(QueryImpl.java:2405)
         at com.solarmetric.rop.EagerResultList.<init>(EagerResultList.java:22)
         at kodo.kernel.QueryImpl.toResult(QueryImpl.java:1445)
         at kodo.kernel.QueryImpl.execute(QueryImpl.java:1136)
         at kodo.kernel.QueryImpl.execute(QueryImpl.java:901)
         at kodo.kernel.QueryImpl.execute(QueryImpl.java:865)
         at kodo.kernel.DelegatingQuery.execute(DelegatingQuery.java:787)
         at kodo.jdo.QueryImpl.executeWithArray(QueryImpl.java:210)
         at kodo.jdo.QueryImpl.execute(QueryImpl.java:137)
         at edu.ucsc.whisper.core.dao.JdoUserDao.findAllUsersWithFirstName(JdoUserDao.java:232)
    --- DefaultUser.java -------------------------------------------------
    public class DefaultUser
        implements User
    {
        /** The account username. */
        private String username;

        /** The account password. */
        private String password;

        /** A flag indicating whether or not the account is enabled. */
        private boolean enabled;

        /** The authorities granted to this account. */
        private Set<Authority> authorities;

        /** Information about the user, including their name and text that describes them. */
        private UserInfo userInfo;

        /** The set of organizations where this user works. */
        private Set<Organization> organizations;
    }
    --- DefaultUser.jdo --------------------------------------------------
    <?xml version="1.0"?>
    <!DOCTYPE jdo PUBLIC
    "-//Sun Microsystems, Inc.//DTD Java Data Objects Metadata 2.0//EN"
    "http://java.sun.com/dtd/jdo_2_0.dtd">
    <jdo>
    <package name="edu.ucsc.whisper.core">
    <sequence name="user_id_seq"
    factory-class="native(Sequence=user_id_seq)"/>
    <class name="DefaultUser" detachable="true"
    table="whisper_user" identity-type="datastore">
    <datastore-identity sequence="user_id_seq" column="userId"/>
    <field name="username">
    <column name="username" length="80" jdbc-type="VARCHAR" />
    </field>
    <field name="password">
    <column name="password" length="40" jdbc-type="CHAR" />
    </field>
    <field name="enabled">
    <column name="enabled" />
    </field>
    <field name="userInfo" persistence-modifier="persistent"
    default-fetch-group="true" dependent="true">
    <extension vendor-name="jpox"
    key="implementation-classes"
    value="edu.ucsc.whisper.core.DefaultUserInfo" />
    <extension vendor-name="kodo"
    key="type"
    value="edu.ucsc.whisper.core.DefaultUserInfo" />
    </field>
    <field name="authorities" persistence-modifier="persistent"
    table="user_authorities"
    default-fetch-group="true">
    <collection
    element-type="edu.ucsc.whisper.core.DefaultAuthority" />
    <join column="userId" delete-action="cascade"/>
    <element column="authorityId" delete-action="cascade"/>
    </field>
    <field name="organizations" persistence-modifier="persistent"
    table="user_organizations" mapped-by="user"
    default-fetch-group="true" dependent="true">
    <collection
    element-type="edu.ucsc.whisper.core.DefaultOrganization"
    dependent-element="true"/>
    <join column="userId"/>
    <!--<element column="organizationId"/>-->
    </field>
    </class>
    </package>
    </jdo>
    --- DefaultUser.jdoquery ---------------------------------------------
    <?xml version="1.0"?>
    <!DOCTYPE jdo PUBLIC
    "-//Sun Microsystems, Inc.//DTD Java Data Objects Metadata 2.0//EN"
    "http://java.sun.com/dtd/jdo_2_0.dtd">
    <jdo>
    <package name="edu.ucsc.whisper.core">
    <class name="DefaultUser">
    <query name="UserByUsername"
    language="javax.jdo.query.JDOQL"><![CDATA[
    SELECT UNIQUE FROM edu.ucsc.whisper.core.DefaultUser
    WHERE username==searchName
    PARAMETERS java.lang.String searchName
    ]]></query>
    <query name="DisabledUsers"
    language="javax.jdo.query.JDOQL"><![CDATA[
    SELECT FROM edu.ucsc.whisper.core.DefaultUser WHERE
    enabled==false
    ]]></query>
    <query name="EnabledUsers"
    language="javax.jdo.query.JDOQL"><![CDATA[
    SELECT FROM edu.ucsc.whisper.core.DefaultUser WHERE
    enabled==true
    ]]></query>
    <query name="CountUsers"
    language="javax.jdo.query.JDOQL"><![CDATA[
    SELECT count( this ) FROM edu.ucsc.whisper.core.DefaultUser
    ]]></query>
    </class>
    </package>
    </jdo>

    I'm sorry, I have no idea. I suggest sending a test case that
    reproduces the problem to support.
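    A minimal standalone test case along those lines could just run the named queries from DefaultUser.jdoquery against the same PostgreSQL database; the properties resource name and the username value below are assumptions:

    import java.util.Properties;

    import javax.jdo.JDOHelper;
    import javax.jdo.PersistenceManager;
    import javax.jdo.PersistenceManagerFactory;
    import javax.jdo.Query;

    import edu.ucsc.whisper.core.DefaultUser;

    // Sketch: exercise the named JDOQL queries outside the full test suite.
    public class UserQueryRepro {
        public static void main(String[] args) throws Exception {
            // Load the same Kodo/JDO configuration the failing tests use (assumed resource name).
            Properties props = new Properties();
            props.load(UserQueryRepro.class.getResourceAsStream("/kodo.properties"));
            PersistenceManagerFactory pmf = JDOHelper.getPersistenceManagerFactory(props);
            PersistenceManager pm = pmf.getPersistenceManager();
            try {
                Query byName = pm.newNamedQuery(DefaultUser.class, "UserByUsername");
                System.out.println("UserByUsername: " + byName.execute("someUser"));

                Query count = pm.newNamedQuery(DefaultUser.class, "CountUsers");
                System.out.println("CountUsers: " + count.execute());
            } finally {
                pm.close();
                pmf.close();
            }
        }
    }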

  • I bought Creative Suite 5 a few years ago; my computer's hard drive failed, and after it was fixed I reinstalled my program and it now says my serial number is not valid.

    I bought Creative Suite 5 a few years ago; my computer's hard drive failed, and after it was fixed I reinstalled my program and it now says my serial number is not valid.
    I'm on the last day of a 30-day trial... it won't take my serial number, which is the same one that worked before.

    Contact support by web chat.
    Mylenium

  • iMac display failing after 1 year of average use.

    I have a mid 2011 27" iMac (the one with thunderbolt)  and the display is starting to fail after 1 year of average use (an hour of use every day on average).  The display is now much dimmer on the left half of the display (the GPU sits underneath that portion) and also flickers the whole display now and then to a lower brightness.
    I am about 45 days out of warranty and basically will have to cough up $500 to have it fixed by Apple.  My questions are...  Has anyone else ever had this happen to them?  The genius bar employee thought that it was something he has seen happen to displays when exposed to heat over long periods of time.  So basically playing starcraft 2 and doing video encoding killed my display if you believe that to be the cause.
    Are there any conditions in which Apple would recognize this as a manufacturing defect and fix it free of charge? Otherwise it looks like I'm writing a check for $500 next week.

    Today a dark streak showed up on my display right in the middle. It is shaped like a zebra stripe but is dark grey and won't go away.
    I too have a mid-2011 27" 3.4GHz i7 iMac with a 2GB graphics card. I'm about 50 days out of warranty.
    Seems like it's a perfect coincidence for Apple that 50 days after my warranty runs out, the most expensive thing to repair breaks.
    After doing some research it seems like there are a lot of people with this model experiencing this problem.
    Pretty sad that my 2005 iMac is holding up better than my brand new iMac.
    I've been saving up for it for like 4 years and I don't have the money to repair it.
    I tried everything. I knew none of the stuff I tried would work, but I still tried resetting the PRAM, repairing disk permissions, all that good stuff. I even have an air purifier/dehumidifier in my room, so I don't know what caused it.
    I didn't think I needed AppleCare because I've had my other iMac since 2005 without a hiccup.
    I've rendered out 3D animations using the CPU at 100% for days and it still runs fine.
    Haven't even used my new iMac for anything that would put stress on it.
    What I do notice is that it runs hot all the time, ever since I got it. Sometimes just doing basic stuff such as browsing Safari it can get so hot that I can't touch it. My GPU normally stays at around 70C and the hottest it's been was 80C, which is still safe. The aluminum is supposed to transfer heat better, but all it does is retain heat more and get super hot; the old plastic one I have is 10x cooler.
    What I did notice today though, when my screen problem happened, is that the GPU temperature is at 50C and won't go above it. I monitor the temperature with iStat and have never seen it at 50C or below unless I am just doing a fresh boot.
    What really concerns me though is how the screen went out so close to when my warranty expired.

  • How to restore standby from primary after failover.

    Hi, we have a DR setup, but the standby is fed archives through an application, not Data Guard.
    So for applying an application patch I can't switch over the database.
    I have to shut down the primary database and fail over to the standby database to apply the patch.
    Can you please tell me how I can restore my standby database to the actual primary after failover?
    Our actual activity will be in 2 weeks' time. We want to apply the patch on Saturday.
    The DR drill is on 8th Nov.

    Your earlier statement "Can you please tell how can I restore my standby database to actual primary after failover"
    means something different from "SO have to restore from cold backup of primary database".
    What you seem to want is a procedure to recreate the Standby from a Backup of the Primary (i.e. FROM Primary TO Standby).
    You haven't specified your Oracle version. DG has been available since 9i (it was even backported to 8i), but I will presume that you are using 9i.
    You should follow the same process as used for the first creation of the standby, except that you don't have to repeat some of the setup steps (listener, tnsnames, etc.)
    http://download.oracle.com/docs/cd/B10501_01/server.920/a96653/create_ps.htm#63563
    Hemant K Chitale
    http://hemantoracledba.blogspot.com
