"Service Cluster left the cluster" - lost all my data
My four storage-enabled cluster nodes lost all their cached data when all the services left the cluster in response to some issue(?). Is that the expected behavior? Is the correct procedure to transactionally store to disk so you can reload when this happens, or should this simply never happen? It seems like this should not happen. These four nodes are on the same server. At about 12:31 everything goes pear-shaped.
2011-01-14 12:31:16.904/50004.436 Oracle Coherence GE 3.6.0.0 <Error> (thread=Cluster, member=3): This senior Member(Id=3, Timestamp=2011-01-13 22:37:52.106, Address=192.168.3.20:8088, MachineId=27412, Location=machine:amd4,process:4428,member:Administrator, Role=CoherenceServer) appears to have been disconnected from other nodes due to a long period of inactivity and the seniority has been assumed by the Member(Id=9, Timestamp=2011-01-13 22:38:01.438, Address=192.168.3.20:8094, MachineId=27412, Location=machine:amd4,process:3904,member:Administrator, Role=CoherenceServer); stopping cluster service.
2011-01-14 12:31:16.905/50004.437 Oracle Coherence GE 3.6.0.0 <D5> (thread=Cluster, member=3): Service Cluster left the cluster
2011-01-14 12:31:16.906/50004.438 Oracle Coherence GE 3.6.0.0 <D5> (thread=DistributedCache:DistributedStatsCacheService, member=3): Service DistributedStatsCacheService left the cluster
2011-01-14 12:31:16.906/50004.438 Oracle Coherence GE 3.6.0.0 <D5> (thread=Proxy:ExtendTcpProxyService, member=3): Service ExtendTcpProxyService left the cluster
2011-01-14 12:31:16.907/50004.439 Oracle Coherence GE 3.6.0.0 <D5> (thread=DistributedCache:DistributedQuotesCacheService, member=3): Service DistributedQuotesCacheService left the cluster
2011-01-14 12:31:16.913/50004.445 Oracle Coherence GE 3.6.0.0 <D5> (thread=Invocation:Management, member=3): Service Management left the cluster
2011-01-14 12:31:16.913/50004.445 Oracle Coherence GE 3.6.0.0 <D5> (thread=DistributedCache:DistributedOrdersService, member=3): Service DistributedOrdersService left the cluster
2011-01-14 12:31:16.913/50004.445 Oracle Coherence GE 3.6.0.0 <D5> (thread=DistributedCache:DistributedCacheService, member=3): Service DistributedCacheService left the cluster
2011-01-14 12:31:16.914/50004.446 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: Channel(Id=214992652, Open=false)
2011-01-14 12:31:16.914/50004.446 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: Channel(Id=8305999, Open=false)
2011-01-14 12:31:16.915/50004.447 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: Channel(Id=1383343339, Open=false)
2011-01-14 12:31:16.915/50004.447 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: TcpConnection(Id=0x0000012D84061C15C0A803149CF3279B334BE6140AC76C47CA03670D76A96D22, Open=false, LocalAddress=192.168.3.20:9091, RemoteAddress=192.168.3.6:65480)
2011-01-14 12:31:16.915/50004.447 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: Channel(Id=1003858188, Open=false)
2011-01-14 12:31:16.915/50004.447 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: Channel(Id=1586910282, Open=false)
2011-01-14 12:31:16.915/50004.447 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: TcpConnection(Id=0x0000012D84060E5AC0A8031442EA3CC26AC425D55D93A6AFC5404E5A76A96D1E, Open=false, LocalAddress=192.168.3.20:9091, RemoteAddress=192.168.3.6:65472)
2011-01-14 12:31:16.915/50004.447 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor:TcpProcessor, member=3): Released: TcpConnection(Id=0x0000012D84061C15C0A803149CF3279B334BE6140AC76C47CA03670D76A96D22, Open=false, LocalAddress=192.168.3.20:9091, RemoteAddress=192.168.3.6:65480)
2011-01-14 12:31:16.915/50004.447 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: Channel(Id=160435953, Open=false)
2011-01-14 12:31:16.915/50004.447 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor:TcpProcessor, member=3): Released: TcpConnection(Id=0x0000012D84060E5AC0A8031442EA3CC26AC425D55D93A6AFC5404E5A76A96D1E, Open=false, LocalAddress=192.168.3.20:9091, RemoteAddress=192.168.3.6:65472)
2011-01-14 12:31:16.916/50004.448 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: Channel(Id=1635893341, Open=false)
2011-01-14 12:31:16.916/50004.448 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: TcpConnection(Id=0x0000012D84061203C0A8031455CD3A790F6009CA79AEC8BACC464D9976A96D20, Open=false, LocalAddress=192.168.3.20:9091, RemoteAddress=192.168.3.6:65478)
2011-01-14 12:31:16.916/50004.448 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor:TcpProcessor, member=3): Released: TcpConnection(Id=0x0000012D84061203C0A8031455CD3A790F6009CA79AEC8BACC464D9976A96D20, Open=false, LocalAddress=192.168.3.20:9091, RemoteAddress=192.168.3.6:65478)
2011-01-14 12:31:16.916/50004.448 Oracle Coherence GE 3.6.0.0 <D5> (thread=DistributedCache:DistributedExecutionsService, member=3): Service DistributedExecutionsService left the cluster
2011-01-14 12:31:16.919/50004.451 Oracle Coherence GE 3.6.0.0 <D5> (thread=DistributedCache:DistributedPositionsCacheService, member=3): Service DistributedPositionsCacheService left the cluster
and ...
2011-01-14 12:31:22.874/50006.273 Oracle Coherence GE 3.6.0.0 <Info> (thread=main, member=n/a): Restarting cluster
2011-01-14 12:31:22.924/50006.323 Oracle Coherence GE 3.6.0.0 <D4> (thread=main, member=n/a): TCMP bound to /192.168.3.20:8094 using SystemSocketProvider
2011-01-14 12:31:52.937/50036.336 Oracle Coherence GE 3.6.0.0 <Warning> (thread=Cluster, member=n/a): This Member(Id=0, Timestamp=2011-01-14 12:31:22.924, Address=192.168.3.20:8094, MachineId=27412, Location=machine:amd4,process:4136,member:Administrator, Role=CoherenceServer) has been attempting to join the cluster at address 225.0.0.1:54321 with TTL 4 for 30 seconds without success; this could indicate a mis-configured TTL value, or it may simply be the result of a busy cluster or active failover.
2011-01-14 12:31:52.950/50036.349 Oracle Coherence GE 3.6.0.0 <Warning> (thread=Cluster, member=n/a): Received a discovery message that indicates the presence of an existing cluster that does not respond to join requests; this is usually caused by a network layer failure:
Logs starting at 12:30 from the four nodes are here:
http://www.nmedia.net/~andrew/logs/1.log
http://www.nmedia.net/~andrew/logs/2.log
http://www.nmedia.net/~andrew/logs/3.log
http://www.nmedia.net/~andrew/logs/4.log
If someone could tell me whether this is a bug in the cluster re-join logic or something I screwed up, that would be great. Thanks!
Andrew
Hi Andrew
I had a quick look at your logs but cannot say for certain why your cluster died. I can say that losing data is a normal consequence of node loss though. If you have the backup count set to 1 then you can lose a single node without losing data. If you lose more than one node (on different machines, or the same machine if you only have one) over a very short space of time then you will almost certainly lose at least one partition and hence lose the data within that partition.
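To make the backup-count point above concrete, here is a small sketch in plain Java (no Coherence classes). It assumes a deliberately simplified placement in which partition p keeps its primary copy on member p mod N and its single backup on the next member; Coherence's real partition assignment is different, and the class name, method name, and the 4-member/257-partition figures are made up for illustration. It only counts how many partitions lose both copies when a given set of members disappears at the same moment.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: a toy model of backup count 1, NOT Coherence's real assignment logic.
class PartitionLossSketch {

    // Counts partitions whose primary AND single backup both sit on members in 'dead'.
    static int lostPartitions(int members, int partitions, Set<Integer> dead) {
        int lost = 0;
        for (int p = 0; p < partitions; p++) {
            int primary = p % members;             // assumed placement of the primary copy
            int backup = (primary + 1) % members;  // assumed placement of the one backup copy
            if (dead.contains(primary) && dead.contains(backup)) {
                lost++;
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        Set<Integer> oneDead = new HashSet<>();
        oneDead.add(0);

        Set<Integer> twoDead = new HashSet<>();
        twoDead.add(0);
        twoDead.add(1);

        // Hypothetical cluster: 4 storage members, 257 partitions.
        System.out.println("one member lost:  " + lostPartitions(4, 257, oneDead));
        System.out.println("two members lost: " + lostPartitions(4, 257, twoDead));
    }
}
```

In this toy model a single failure loses nothing, because every partition still has one surviving copy, while two simultaneous failures lose every partition whose two copies happened to live on exactly those members. If all storage members leave at once, as in the original post, no copy survives at all.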
Going back to your logs, it is difficult to determine the underlying cause without the whole set of logs. You have posted links to four logs, but from looking at them the cluster has about 16 nodes. I know from experience (as we had a cluster that was quite unstable for a while) that tracing these issues through the logs can be a bit awkward, but you soon get the hang of it :-)
For example in the log http://www.nmedia.net/~andrew/logs/1.log you have...
2011-01-14 12:31:16.807/49993.331 Oracle Coherence GE 3.6.0.0 <D5> (thread=Cluster, member=9): MemberLeft notification for Member(Id=3, Timestamp=2011-01-13 22:37:52.106, Address=192.168.3.20:8088, MachineId=27412, Location=machine:amd4,process:4428,member:Administrator, Role=CoherenceServer, PublisherSuccessRate=0.9975, ReceiverSuccessRate=0.9999, PauseRate=0.0, Threshold=93, Paused=false, Deferring=false, OutstandingPackets=0, DeferredPackets=0, ReadyPackets=0, LastIn=261ms, LastOut=277ms, LastSlow=n/a) received from Member(Id=22, Timestamp=2011-01-14 08:21:22.284, Address=192.168.3.121:8092, MachineId=27513, Location=machine:H1,process:3716,member:Howard, Role=Order_entry_window, PublisherSuccessRate=0.8326, ReceiverSuccessRate=1.0, PauseRate=0.0024, Threshold=1456, Paused=false, Deferring=false, OutstandingPackets=0, DeferredPackets=0, ReadyPackets=0, LastIn=0ms, LastOut=8ms, LastSlow=n/a)
...which is Member-9 receiving a message from Member-22 about the departure of Member-3. You would then need to look at the logs for Member-22 to see why it thought Member-3 had departed, and also look at the logs for Member-3 for that time to see what might be wrong with it.
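When there are sixteen or so logs to trace, it can help to pull the three member ids out of each MemberLeft line mechanically. The following is a rough sketch using only java.util.regex; the regular expression is derived solely from the single log line quoted above (shortened here as SAMPLE), so other Coherence versions or log formats may need a different pattern, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative helper, not part of Coherence: extracts who saw, who left, and who reported it.
class MemberLeftSketch {

    // Shortened copy of the MemberLeft line quoted in the thread (most Member fields omitted).
    static final String SAMPLE =
        "2011-01-14 12:31:16.807/49993.331 Oracle Coherence GE 3.6.0.0 <D5> (thread=Cluster, member=9): "
        + "MemberLeft notification for Member(Id=3, Timestamp=2011-01-13 22:37:52.106, Role=CoherenceServer) "
        + "received from Member(Id=22, Timestamp=2011-01-14 08:21:22.284, Role=Order_entry_window)";

    // Assumed line shape: "member=<observer>): MemberLeft notification for Member(Id=<departed>, ...)
    // received from Member(Id=<reporter>, ...)".
    private static final Pattern MEMBER_LEFT = Pattern.compile(
        "member=(\\d+)\\): MemberLeft notification for Member\\(Id=(\\d+),.*? received from Member\\(Id=(\\d+),");

    // Returns {observer, departed, reporter}, or null if the line is not a MemberLeft notification.
    static int[] parse(String line) {
        Matcher m = MEMBER_LEFT.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    public static void main(String[] args) {
        int[] ids = parse(SAMPLE);
        System.out.println("observer=" + ids[0] + " departed=" + ids[1] + " reporter=" + ids[2]);
    }
}
```

Running it over every node's log and grouping by reporter would show which member first announced each departure, which is the log to open next.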
The more worrying message would be this one...
2011-01-14 12:31:16.709/49993.233 Oracle Coherence GE 3.6.0.0 <Warning> (thread=PacketPublisher, member=9): Experienced a 19025 ms communication delay (probable remote GC) with Member(Id=21, Timestamp=2011-01-14 08:21:12.174, Address=192.168.3.121:8090, MachineId=27513, Location=machine:H1,process:4316,member:Howard, Role=OrderbookviewerViewer); 111 packets rescheduled, PauseRate=0.0014, Threshold=1696
...a 19 second delay is a long time and would suggest either very long GC pauses or a network problem. Do you have GC logs for these processes? Are all the servers connected to the same switch, or is the cluster distributed over more than one part of your network? Do you have too much on one machine? Are you overloading the NIC? Are you swapping? All of these can cause delays and/or loss of packets.
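To see how often delays like this occur and which members are involved, the warnings can be filtered out of the logs with a few lines of Java. This is only a sketch: the pattern is based on the one warning quoted above (shortened here as SAMPLE), the threshold is an arbitrary value chosen for illustration, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative helper, not part of Coherence: lists long communication-delay warnings.
class DelayWarningSketch {

    // Shortened copy of the warning quoted in the thread (most Member fields omitted).
    static final String SAMPLE =
        "2011-01-14 12:31:16.709/49993.233 Oracle Coherence GE 3.6.0.0 <Warning> (thread=PacketPublisher, member=9): "
        + "Experienced a 19025 ms communication delay (probable remote GC) with Member(Id=21, "
        + "Timestamp=2011-01-14 08:21:12.174, Role=OrderbookviewerViewer); 111 packets rescheduled";

    // Assumed line shape: "member=<local>): Experienced a <N> ms communication delay ... with Member(Id=<remote>, ...".
    private static final Pattern DELAY = Pattern.compile(
        "member=(\\d+)\\): Experienced a (\\d+) ms communication delay.*? with Member\\(Id=(\\d+),");

    // Returns one "local -> remote: N ms" entry per warning whose delay is at least thresholdMillis.
    static List<String> longDelays(List<String> lines, long thresholdMillis) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            Matcher m = DELAY.matcher(line);
            if (m.find() && Long.parseLong(m.group(2)) >= thresholdMillis) {
                result.add(m.group(1) + " -> " + m.group(3) + ": " + m.group(2) + " ms");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(SAMPLE, "Service Cluster left the cluster");
        // 5000 ms is an arbitrary cut-off for this example.
        for (String entry : longDelays(lines, 5000)) {
            System.out.println(entry);
        }
    }
}
```

Lining up the timestamps of these entries against the GC logs of the remote member is one way to tell a long GC pause from a network problem.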
We have had problems with storage-disabled nodes doing long GC pauses and causing storage nodes to drop out of the cluster. Our cluster was on 3.5.3-p8, whereas you are on 3.6.0.0, which is supposed to have better node death detection, so you might not have the same issues we had.
Sorry to not be more help,
JK