IP Communicator Delay

Hello,
In my Communications Manager I configured a trunk to a gatekeeper. When I dial from an IP Communicator through the gatekeeper there is a 15-second delay, but when I dial from any hardphone the call is established quickly.
In the device configuration the hardphone has a feature enabled by default called Enbloc Dialing; when I disabled it, the hardphone showed the same behavior as the IP Communicator. IP Communicator does not have this option. Is there any way to eliminate this delay on IP Communicator?
Thanks for your support
Jose Arango

Jose,
Do you still see the delay if you enter all the digits of the number before pressing the NewCall softkey and then press the Dial softkey? It sounds like there are overlapping route patterns configured in CUCM. You'll need to remove them, or lower the interdigit timeout in CUCM to shorten the delay.
The link below explains how to change the parameter in CUCM. It's a bit dated but still valid.
http://www.cisco.com/en/US/partner/products/sw/voicesw/ps556/products_tech_note09186a00800dab26.shtml
-Felipe
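To illustrate what "overlapping route patterns" means here, the following is a minimal Java sketch of the idea, not CUCM's actual digit-analysis algorithm. The class name, method names, and the two sample patterns are hypothetical; `X` stands for any single digit. When the dialed digits already match one pattern but could still grow into a longer one, the call manager has to wait for more digits until the interdigit timeout expires (I believe this is the T302 timer in CUCM, but check the linked tech note).

```java
import java.util.List;

class OverlapCheck {
    // True when 'dialed' matches the first dialed.length() positions of 'pattern'.
    // 'X' in a pattern stands for any single digit.
    static boolean isPrefix(String pattern, String dialed) {
        if (dialed.length() > pattern.length()) return false;
        for (int i = 0; i < dialed.length(); i++) {
            char p = pattern.charAt(i);
            if (p != 'X' && p != dialed.charAt(i)) return false;
        }
        return true;
    }

    // True when 'dialed' matches the whole pattern.
    static boolean matches(String pattern, String dialed) {
        return pattern.length() == dialed.length() && isPrefix(pattern, dialed);
    }

    // True when one pattern already matches but a longer pattern could still match,
    // which is the situation that forces a wait for the interdigit timeout.
    static boolean mustWaitForTimeout(List<String> patterns, String dialed) {
        boolean complete = false;
        boolean longerPossible = false;
        for (String p : patterns) {
            if (matches(p, dialed)) {
                complete = true;
            } else if (p.length() > dialed.length() && isPrefix(p, dialed)) {
                longerPossible = true;
            }
        }
        return complete && longerPossible;
    }

    public static void main(String[] args) {
        // Hypothetical patterns: a 4-digit extension range and an 8-digit range that overlaps it.
        System.out.println(mustWaitForTimeout(List.of("5XXX", "5XXXXXXX"), "5123")); // prints true
        System.out.println(mustWaitForTimeout(List.of("5XXX"), "5123"));             // prints false
    }
}
```

Sending all digits at once (en-bloc) sidesteps the wait because the call manager is told the number is complete, which would be consistent with the hardphone only showing the delay once Enbloc Dialing is disabled.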

Similar Messages

How to monitor member pause event, communication delay in a cache cluster

I want to create a small monitor for my cache cluster.
I am planning to send an alert on the following events:
1. member left --- I can capture it using a member listener
2. frequent communication delay -- no idea how to capture this
3. member pause -- no idea how to capture this
4. frequency of GC
I would appreciate any input on 2, 3 and 4.

Hi,
You can do this by configuring the local scheme used by your ReadWriteBackingMap to use the BINARY unit calculator. For example:
      <backing-map-scheme>
        <read-write-backing-map-scheme>
          <internal-cache-scheme>
            <local-scheme>
              <unit-calculator>BINARY</unit-calculator>
            </local-scheme>
          </internal-cache-scheme>
        </read-write-backing-map-scheme>
      </backing-map-scheme>
The BINARY unit calculator assigns a cached object a weight equal to the number of bytes of memory required to cache the object. This unit calculator can only be used for local caches that store Binary objects (as is the case with a backing map used by a distributed cache, as in your case).
Once you have configured the BINARY unit calculator, the "Units" attribute of the distributed cache MBean will report the total amount of memory (in bytes) consumed by the instance of the cache running on that particular cluster member.
See the following for additional details:
http://wiki.tangosol.com/display/COH32UG/local-scheme
http://www.tangosol.com/downloads/javadoc/320/com/tangosol/net/cache/BinaryMemoryCalculator.html
Regards,
Jason
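On item 4 of the question (frequency of GC), one option is to sample the standard JDK management beans inside each cluster member. This is a minimal sketch, assuming you can run a small probe in every JVM you want to watch; the class and method names are hypothetical, and it uses only `java.lang.management`, not any Coherence API. It reports on the local JVM only.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcFrequencyProbe {
    // Total number of collections so far, summed over all collectors in this JVM.
    // A collector reports -1 when the count is undefined; those are skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) total += count;
        }
        return total;
    }

    // Total accumulated collection time in milliseconds, summed over all collectors.
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) total += millis;
        }
        return total;
    }

    // Collections per minute between two samples taken elapsedMillis apart.
    static double perMinute(long before, long after, long elapsedMillis) {
        return (after - before) * 60_000.0 / elapsedMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        long countBefore = totalCollections();
        long start = System.currentTimeMillis();
        Thread.sleep(1000); // sampling interval; a real monitor would use a longer one
        long elapsed = Math.max(1, System.currentTimeMillis() - start);
        double rate = perMinute(countBefore, totalCollections(), elapsed);
        System.out.println("GC per minute: " + rate
                + ", total GC time so far (ms): " + totalCollectionMillis());
    }
}
```

An alert could then fire when the rate or the growth in total GC time crosses a threshold you choose; the threshold itself is something to tune per cluster.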

Serial Communication delay

I am experiencing a delay when I run my commands through LV7.
The following is my pc settings:
Baud rate:57600
bits: 8
Parity: 1
stop: 1
no flow control
However, when I run the exact same command in HyperTerminal I get no delay.
I am writing 15 bytes at a time. Is there anything I can try to fix this issue?

Hi Preitano,
I had not considered that implementation.
Let's walk through the code:
1. Write to the serial port (send command, etc).
2. Wait 200 ms
3. Either wait for timeout or have data at the serial buffer.
4. Read until serial buffer is empty.
I assume there are no delays inside the True case, simply a read serial port and an append to string.
And you use the value from "Bytes at Port" to set how many bytes to read. Seems okay.
Have you ever noticed loss of data, i.e. not all messages being received?
It is true that you are reading at 57600. I am used to much slower speeds, such as 9600.
I don't see why you would have noticeable delays compared with HyperTerminal. The only concern I would have is if some of the messages are lost due to not having a delay.
The thing to consider is that HyperTerminal (or other such software) will read the port continuously and never stop.
In LV, you read until -- something -- ... then stop reading to process the data (make a decision, etc.), and there is a bit of overhead because of it. It is this overhead that adds up with every iteration of the loop. Maybe someone can correct me, but I seem to remember that it is approx. 4 ms per action, so every iteration of your loop probably takes approx. 20 ms. That is not much, but it adds up quickly.
In terms of delays, what is the difference? Is it consistent? Does it slow down?
To trap the delay, you may need to be clever. "Highlight Execution" is not your friend in this case.
However, if for some reason the number of bytes reported by the buffer is not accurate, the timeout of the VISA Read may kick in and cause excessive delay. The only way to confirm this would be to compare the two values (the number from "Bytes at Port" and the actual bytes read).
Hope this helps..
Ray
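As a rough sanity check on where the time goes, the wire time for one 15-byte write can be worked out from the port settings. This is a small sketch, assuming 10 bits per byte on the wire (1 start + 8 data + 1 stop, no parity bit); the "Parity: 1" setting in the question is ambiguous, and a parity bit would make it 11 bits. The class and method names are hypothetical.

```java
class SerialTiming {
    // Milliseconds needed to clock 'bytes' bytes out at 'baud' bits per second,
    // where each byte occupies 'bitsPerFrame' bits on the wire.
    static double wireMillis(int bytes, int baud, int bitsPerFrame) {
        return bytes * bitsPerFrame * 1000.0 / baud;
    }

    public static void main(String[] args) {
        double noParity = wireMillis(15, 57600, 10);   // assumed 8N1 framing
        double withParity = wireMillis(15, 57600, 11); // if a parity bit is in use
        System.out.println("15 bytes, 10-bit frames: " + noParity + " ms");
        System.out.println("15 bytes, 11-bit frames: " + withParity + " ms");
    }
}
```

Either way the transfer itself takes under 3 ms, so the fixed 200 ms wait in step 2 and any VISA Read timeout are far larger contributors to a visible delay than the baud rate.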

"Service Cluster left the cluster" - lost all my data

My four storage-enabled cluster nodes lost all their cached data when all the services left the cluster in response to some issue(?). Is that the expected behavior? Is the correct procedure to transactionally store to disk so you can reload when this happens, or should this simply never happen? It seems like this should not happen. These four nodes are on the same server. At about 12:31 everything goes pear-shaped.
2011-01-14 12:31:16.904/50004.436 Oracle Coherence GE 3.6.0.0 <Error> (thread=Cluster, member=3): This senior Member(Id=3, Timestamp=2011-01-13 22:37:52.106, Address=192.168.3.20:8088, MachineId=27412, Location=machine:amd4,process:4428,member:Administrator, Role=CoherenceServer) appears to have been disconnected from other nodes due to a long period of inactivity and the seniority has been assumed by the Member(Id=9, Timestamp=2011-01-13 22:38:01.438, Address=192.168.3.20:8094, MachineId=27412, Location=machine:amd4,process:3904,member:Administrator, Role=CoherenceServer); stopping cluster service.
2011-01-14 12:31:16.905/50004.437 Oracle Coherence GE 3.6.0.0 <D5> (thread=Cluster, member=3): Service Cluster left the cluster
2011-01-14 12:31:16.906/50004.438 Oracle Coherence GE 3.6.0.0 <D5> (thread=DistributedCache:DistributedStatsCacheService, member=3): Service DistributedStatsCacheService left the cluster
2011-01-14 12:31:16.906/50004.438 Oracle Coherence GE 3.6.0.0 <D5> (thread=Proxy:ExtendTcpProxyService, member=3): Service ExtendTcpProxyService left the cluster
2011-01-14 12:31:16.907/50004.439 Oracle Coherence GE 3.6.0.0 <D5> (thread=DistributedCache:DistributedQuotesCacheService, member=3): Service DistributedQuotesCacheService left the cluster
2011-01-14 12:31:16.913/50004.445 Oracle Coherence GE 3.6.0.0 <D5> (thread=Invocation:Management, member=3): Service Management left the cluster
2011-01-14 12:31:16.913/50004.445 Oracle Coherence GE 3.6.0.0 <D5> (thread=DistributedCache:DistributedOrdersService, member=3): Service DistributedOrdersService left the cluster
2011-01-14 12:31:16.913/50004.445 Oracle Coherence GE 3.6.0.0 <D5> (thread=DistributedCache:DistributedCacheService, member=3): Service DistributedCacheService left the cluster
2011-01-14 12:31:16.914/50004.446 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: Channel(Id=214992652, Open=false)
2011-01-14 12:31:16.914/50004.446 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: Channel(Id=8305999, Open=false)
2011-01-14 12:31:16.915/50004.447 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: Channel(Id=1383343339, Open=false)
2011-01-14 12:31:16.915/50004.447 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: TcpConnection(Id=0x0000012D84061C15C0A803149CF3279B334BE6140AC76C47CA03670D76A96D22, Open=false, LocalAddress=192.168.3.20:9091, RemoteAddress=192.168.3.6:65480)
2011-01-14 12:31:16.915/50004.447 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: Channel(Id=1003858188, Open=false)
2011-01-14 12:31:16.915/50004.447 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: Channel(Id=1586910282, Open=false)
2011-01-14 12:31:16.915/50004.447 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: TcpConnection(Id=0x0000012D84060E5AC0A8031442EA3CC26AC425D55D93A6AFC5404E5A76A96D1E, Open=false, LocalAddress=192.168.3.20:9091, RemoteAddress=192.168.3.6:65472)
2011-01-14 12:31:16.915/50004.447 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor:TcpProcessor, member=3): Released: TcpConnection(Id=0x0000012D84061C15C0A803149CF3279B334BE6140AC76C47CA03670D76A96D22, Open=false, LocalAddress=192.168.3.20:9091, RemoteAddress=192.168.3.6:65480)
2011-01-14 12:31:16.915/50004.447 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: Channel(Id=160435953, Open=false)
2011-01-14 12:31:16.915/50004.447 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor:TcpProcessor, member=3): Released: TcpConnection(Id=0x0000012D84060E5AC0A8031442EA3CC26AC425D55D93A6AFC5404E5A76A96D1E, Open=false, LocalAddress=192.168.3.20:9091, RemoteAddress=192.168.3.6:65472)
2011-01-14 12:31:16.916/50004.448 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: Channel(Id=1635893341, Open=false)
2011-01-14 12:31:16.916/50004.448 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor, member=3): Closed: TcpConnection(Id=0x0000012D84061203C0A8031455CD3A790F6009CA79AEC8BACC464D9976A96D20, Open=false, LocalAddress=192.168.3.20:9091, RemoteAddress=192.168.3.6:65478)
2011-01-14 12:31:16.916/50004.448 Oracle Coherence GE 3.6.0.0 <D6> (thread=Proxy:ExtendTcpProxyService:TcpAcceptor:TcpProcessor, member=3): Released: TcpConnection(Id=0x0000012D84061203C0A8031455CD3A790F6009CA79AEC8BACC464D9976A96D20, Open=false, LocalAddress=192.168.3.20:9091, RemoteAddress=192.168.3.6:65478)
2011-01-14 12:31:16.916/50004.448 Oracle Coherence GE 3.6.0.0 <D5> (thread=DistributedCache:DistributedExecutionsService, member=3): Service DistributedExecutionsService left the cluster
2011-01-14 12:31:16.919/50004.451 Oracle Coherence GE 3.6.0.0 <D5> (thread=DistributedCache:DistributedPositionsCacheService, member=3): Service DistributedPositionsCacheService left the cluster
and ...
2011-01-14 12:31:22.874/50006.273 Oracle Coherence GE 3.6.0.0 <Info> (thread=main, member=n/a): Restarting cluster
2011-01-14 12:31:22.924/50006.323 Oracle Coherence GE 3.6.0.0 <D4> (thread=main, member=n/a): TCMP bound to /192.168.3.20:8094 using SystemSocketProvider
2011-01-14 12:31:52.937/50036.336 Oracle Coherence GE 3.6.0.0 <Warning> (thread=Cluster, member=n/a): This Member(Id=0, Timestamp=2011-01-14 12:31:22.924, Address=192.168.3.20:8094, MachineId=27412, Location=machine:amd4,process:4136,member:Administrator, Role=CoherenceServer) has been attempting to join the cluster at address 225.0.0.1:54321 with TTL 4 for 30 seconds without success; this could indicate a mis-configured TTL value, or it may simply be the result of a busy cluster or active failover.
2011-01-14 12:31:52.950/50036.349 Oracle Coherence GE 3.6.0.0 <Warning> (thread=Cluster, member=n/a): Received a discovery message that indicates the presence of an existing cluster that does not respond to join requests; this is usually caused by a network layer failure:
Logs starting at 12:30 from the four nodes are here:
http://www.nmedia.net/~andrew/logs/1.log
http://www.nmedia.net/~andrew/logs/2.log
http://www.nmedia.net/~andrew/logs/3.log
http://www.nmedia.net/~andrew/logs/4.log
If someone could tell me if this is a bug in the cluster re-join logic or something I screwed up that would be great. Thanks!
Andrew

Hi Andrew
I had a quick look at your logs but cannot say for certain why your cluster died. I can say that losing data is a normal consequence of node loss, though. If you have the backup count set to 1 then you can lose a single node without losing data. If you lose more than one node (on different machines, or on the same machine if you only have one) over a very short space of time then you will almost certainly lose at least one partition, and hence the data within that partition.
Going back to your logs, it is difficult to determine the underlying cause without the whole set of logs. You have posted links to four logs, but from looking at them the cluster has about 16 nodes. I know from experience (as we had a cluster that was quite unstable for a while) that tracing these issues through the logs can be a bit awkward, but you soon get the hang of it :-)
For example in the log http://www.nmedia.net/~andrew/logs/1.log you have...
2011-01-14 12:31:16.807/49993.331 Oracle Coherence GE 3.6.0.0 <D5> (thread=Cluster, member=9): MemberLeft notification for Member(Id=3, Timestamp=2011-01-13 22:37:52.106, Address=192.168.3.20:8088, MachineId=27412, Location=machine:amd4,process:4428,member:Administrator, Role=CoherenceServer, PublisherSuccessRate=0.9975, ReceiverSuccessRate=0.9999, PauseRate=0.0, Threshold=93, Paused=false, Deferring=false, OutstandingPackets=0, DeferredPackets=0, ReadyPackets=0, LastIn=261ms, LastOut=277ms, LastSlow=n/a) received from Member(Id=22, Timestamp=2011-01-14 08:21:22.284, Address=192.168.3.121:8092, MachineId=27513, Location=machine:H1,process:3716,member:Howard, Role=Order_entry_window, PublisherSuccessRate=0.8326, ReceiverSuccessRate=1.0, PauseRate=0.0024, Threshold=1456, Paused=false, Deferring=false, OutstandingPackets=0, DeferredPackets=0, ReadyPackets=0, LastIn=0ms, LastOut=8ms, LastSlow=n/a)
...which is Member-9 receiving a message from Member-22 about the departure of Member-3. So you would then need to look at the logs for Member-22 to see why it thought Member-3 had departed, and also look at the logs for Member-3 for that time to see what might be wrong with it.
The more worrying messages are ones like this...
2011-01-14 12:31:16.709/49993.233 Oracle Coherence GE 3.6.0.0 <Warning> (thread=PacketPublisher, member=9): Experienced a 19025 ms communication delay (probable remote GC) with Member(Id=21, Timestamp=2011-01-14 08:21:12.174, Address=192.168.3.121:8090, MachineId=27513, Location=machine:H1,process:4316,member:Howard, Role=OrderbookviewerViewer); 111 packets rescheduled, PauseRate=0.0014, Threshold=1696
...a 19-second delay is a long time and would suggest either very long GC pauses or a network problem. Do you have GC logs for these processes? Are all the servers connected to the same switch, or is the cluster distributed over more than one part of your network? Do you have too much on one machine? Are you overloading the NIC? Are you swapping? All of these can cause delays and/or loss of packets.
We have had problems with storage disabled nodes doing long GC pauses and causing storage nodes to drop out of the cluster. Our cluster was on 3.5.3-p8 whereas you are on 3.6.0.0 which is supposed to have better node death detection so you might not have the same issues we had.
Sorry to not be more help,
JK
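Since the communication-delay warning has a regular shape, one way to track these events is to scan the member logs for it. This is a minimal sketch, assuming the message format stays as in the 3.6.0.0 line quoted in this thread; the class and method names are hypothetical, and it uses only `java.util.regex`, not any Coherence API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DelayLogScanner {
    // Captures the delay in ms and the Id of the remote member from a warning line.
    private static final Pattern DELAY = Pattern.compile(
            "Experienced a (\\d+) ms communication delay .*?with Member\\(Id=(\\d+),");

    // Delay in milliseconds, or -1 when the line is not a communication-delay warning.
    static long delayMillis(String line) {
        Matcher m = DELAY.matcher(line);
        return m.find() ? Long.parseLong(m.group(1)) : -1;
    }

    // Id of the slow remote member, or -1 when the line is not a communication-delay warning.
    static int remoteMemberId(String line) {
        Matcher m = DELAY.matcher(line);
        return m.find() ? Integer.parseInt(m.group(2)) : -1;
    }

    public static void main(String[] args) {
        // Shortened copy of the warning quoted in the thread.
        String line = "2011-01-14 12:31:16.709/49993.233 Oracle Coherence GE 3.6.0.0 <Warning> "
                + "(thread=PacketPublisher, member=9): Experienced a 19025 ms communication delay "
                + "(probable remote GC) with Member(Id=21, Timestamp=2011-01-14 08:21:12.174, "
                + "Address=192.168.3.121:8090); 111 packets rescheduled";
        System.out.println(delayMillis(line));    // prints 19025
        System.out.println(remoteMemberId(line)); // prints 21
    }
}
```

Counting how often each remote member Id appears, and comparing the timestamps with that member's GC log, helps separate long GC pauses from network trouble.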

Is it a Limitation of Named Cache Storage- Fails for large volume ???

I debugged the code that loads data from the database into the cache, as mentioned in the posting "Pre-loading the Cache from Database during application start-up".
This code loads 869 rows from the database into a java.util.Map using the Hibernate loadAll() method. All is fine up to this point.
The next step is to putAll the entries into the cache, i.e. contactCache.putAll(buffer). This is where it hangs for a minute, and I see org.eclipse.jdi.TimeoutException followed by the exception stack trace below.
IN DEFAULT CACHE SERVER JVM
2009-10-30 10:53:44.076/1342.849 Oracle Coherence GE 3.5.2/463 <Warning> (thread=PacketPublisher, member=1): Experienced a 1390 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2009-10-30 10:31:54.697, Address=165.137.250.122:8089, MachineId=54906, Location=site:cable.comcast.com,machine:PACDCL-CJWWND1b,process:4856); 23 packets rescheduled, PauseRate=0.0010, Threshold=2080
2009-10-30 11:06:10.060/2088.833 Oracle Coherence GE 3.5.2/463 <Error> (thread=Cluster, member=1): Attempting recovery (due to soft timeout) of Guard{Daemon=DistributedCache}
2009-10-30 11:06:12.430/2091.203 Oracle Coherence GE 3.5.2/463 <Error> (thread=Cluster, member=1): Terminating guarded execution (due to hard timeout) of Guard{Daemon=DistributedCache}
2009-10-30 11:06:15.657/2094.430 Oracle Coherence GE 3.5.2/463 <Info> (thread=Main Thread, member=1): Restarting Service: DistributedCache
2009-10-30 11:06:15.954/2094.727 Oracle Coherence GE 3.5.2/463 <Error> (thread=Main Thread, member=1): Failed to restart services: java.lang.IllegalStateException: Failed to unregister: DistributedCache{Name=DistributedCache, State=(SERVICE_STARTED), LocalStorage=enabled, PartitionCount=257, BackupCount=1, AssignedPartitions=129, BackupPartitions=128}
Coherence <Error>: Halting this cluster node due to unrecoverable service failure
2009-10-30 11:06:16.671/2095.444 Oracle Coherence GE 3.5.2/463 <Error> (thread=Termination Thread, member=1): Full Thread Dump
Thread[Cluster|Member(Id=1, Timestamp=2009-10-30 10:31:31.621, Address=165.137.250.122:8088, MachineId=54906, Location=site:cable.comcast.com,machine:PACDCL-CJWWND1b,process:5380),5,Cluster]
     java.lang.Object.wait(Native Method)
     com.tangosol.coherence.component.util.Daemon.onWait(Daemon.CDB:18)
     com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onWait(Grid.CDB:9)
     com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:39)
     java.lang.Thread.run(Thread.java:595)
Thread[(Code Generation Thread 1),5,system]
Thread[(Signal Handler),5,system]
Thread[TcpRingListener,6,Cluster]
     java.net.PlainSocketImpl.socketAccept(Native Method)
     java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
     java.net.ServerSocket.implAccept(ServerSocket.java:450)
     java.net.ServerSocket.accept(ServerSocket.java:421)
     com.tangosol.coherence.component.net.socket.TcpSocketAccepter.accept(TcpSocketAccepter.CDB:18)
     com.tangosol.coherence.component.util.daemon.TcpRingListener.acceptConnection(TcpRingListener.CDB:10)
     com.tangosol.coherence.component.util.daemon.TcpRingListener.onNotify(TcpRingListener.CDB:9)
     com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
     java.lang.Thread.run(Thread.java:595)
Thread[PacketSpeaker,8,Cluster]
     java.lang.Object.wait(Native Method)
     com.tangosol.coherence.component.util.queue.ConcurrentQueue.waitForEntry(ConcurrentQueue.CDB:16)
     com.tangosol.coherence.component.util.queue.ConcurrentQueue.remove(ConcurrentQueue.CDB:7)
     com.tangosol.coherence.component.util.Queue.remove(Queue.CDB:1)
     com.tangosol.coherence.component.util.daemon.queueProcessor.packetProcessor.PacketSpeaker.onNotify(PacketSpeaker.CDB:62)
     com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
     java.lang.Thread.run(Thread.java:595)
Thread[PacketPublisher,6,Cluster]
     java.lang.Object.wait(Native Method)
     com.tangosol.coherence.component.util.Daemon.onWait(Daemon.CDB:18)
     com.tangosol.coherence.component.util.daemon.queueProcessor.packetProcessor.PacketPublisher.onWait(PacketPublisher.CDB:2)
     com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:39)
     java.lang.Thread.run(Thread.java:595)
Thread[(VM Periodic Task),10,system]
Thread[(Sensor Event Thread),5,system]
Thread[(Attach Listener),5,system]
Thread[(GC Main Thread),5,system]
Thread[(Code Optimization Thread 1),5,system]
Thread[Invocation:Management:EventDispatcher,5,Cluster]
     java.lang.Object.wait(Native Method)
     com.tangosol.coherence.component.util.Daemon.onWait(Daemon.CDB:18)
     com.tangosol.coherence.component.util.daemon.queueProcessor.Service$EventDispatcher.onWait(Service.CDB:7)
     com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:39)
     java.lang.Thread.run(Thread.java:595)
Thread[Main Thread,5,main]
     java.lang.Object.wait(Native Method)
     com.tangosol.net.DefaultCacheServer.main(DefaultCacheServer.java:79)
Thread[Logger@9265725 3.5.2/463,3,main]
     java.lang.Object.wait(Native Method)
     com.tangosol.coherence.component.util.Daemon.onWait(Daemon.CDB:18)
     com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:39)
     java.lang.Thread.run(Thread.java:595)
Thread[Invocation:Management,5,Cluster]
     java.lang.Object.wait(Native Method)
     com.tangosol.coherence.component.util.Daemon.onWait(Daemon.CDB:18)
     com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onWait(Grid.CDB:9)
     com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:39)
     java.lang.Thread.run(Thread.java:595)
Thread[Reference Handler,10,system]
     java.lang.ref.Reference.getPending(Native Method)
     java.lang.ref.Reference.access$000(Unknown Source)
     java.lang.ref.Reference$ReferenceHandler.run(Unknown Source)
Thread[PacketListenerN,8,Cluster]
     java.net.PlainDatagramSocketImpl.receive0(Native Method)
     java.net.PlainDatagramSocketImpl.receive(PlainDatagramSocketImpl.java:136)
     java.net.DatagramSocket.receive(DatagramSocket.java:712)
     com.tangosol.coherence.component.net.socket.UdpSocket.receive(UdpSocket.CDB:20)
     com.tangosol.coherence.component.net.UdpPacket.receive(UdpPacket.CDB:4)
     com.tangosol.coherence.component.util.daemon.queueProcessor.packetProcessor.PacketListener.onNotify(PacketListener.CDB:19)
     com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
     java.lang.Thread.run(Thread.java:595)
Thread[Finalizer,8,system]
     java.lang.Thread.run(Thread.java:595)
Thread[DistributedCache,5,Cluster]
     com.tangosol.util.Binary.<init>(Binary.java:87)
     com.tangosol.util.Binary.<init>(Binary.java:61)
     com.tangosol.io.AbstractByteArrayReadBuffer.toBinary(AbstractByteArrayReadBuffer.java:152)
     com.tangosol.io.pof.PofBufferReader.readBinary(PofBufferReader.java:3412)
     com.tangosol.io.pof.PofBufferReader.readAsObject(PofBufferReader.java:2854)
     com.tangosol.io.pof.PofBufferReader.readObject(PofBufferReader.java:2600)
     com.tangosol.io.pof.ConfigurablePofContext.deserialize(ConfigurablePofContext.java:348)
     com.tangosol.coherence.component.util.daemon.queueProcessor.Service.readObject(Service.CDB:4)
     com.tangosol.coherence.component.net.Message.readObject(Message.CDB:1)
     com.tangosol.coherence.component.net.message.requestMessage.distributedCacheRequest.MapRequest.read(MapRequest.CDB:24)
     com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onNotify(Grid.CDB:123)
     com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onNotify(DistributedCache.CDB:3)
     com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
     java.lang.Thread.run(Thread.java:595)
Thread[PacketReceiver,7,Cluster]
     java.lang.Object.wait(Native Method)
     com.tangosol.coherence.component.util.Daemon.onWait(Daemon.CDB:18)
     com.tangosol.coherence.component.util.daemon.queueProcessor.packetProcessor.PacketReceiver.onWait(PacketReceiver.CDB:2)
     com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:39)
     java.lang.Thread.run(Thread.java:595)
Thread[PacketListener1,8,Cluster]
     java.net.PlainDatagramSocketImpl.receive0(Native Method)
     java.net.PlainDatagramSocketImpl.receive(PlainDatagramSocketImpl.java:136)
     java.net.DatagramSocket.receive(DatagramSocket.java:712)
     com.tangosol.coherence.component.net.socket.UdpSocket.receive(UdpSocket.CDB:20)
     com.tangosol.coherence.component.net.UdpPacket.receive(UdpPacket.CDB:4)
     com.tangosol.coherence.component.util.daemon.queueProcessor.packetProcessor.PacketListener.onNotify(PacketListener.CDB:19)
     com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
     java.lang.Thread.run(Thread.java:595)
Thread[Termination Thread,5,Cluster]
     java.lang.Thread.dumpThreads(Native Method)
     java.lang.Thread.getAllStackTraces(Thread.java:1434)
     sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     java.lang.reflect.Method.invoke(Method.java:585)
     com.tangosol.net.GuardSupport.logStackTraces(GuardSupport.java:791)
     com.tangosol.coherence.component.net.Cluster.onServiceFailed(Cluster.CDB:5)
     com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid$Guard.terminate(Grid.CDB:17)
     com.tangosol.net.GuardSupport$2.run(GuardSupport.java:652)
     java.lang.Thread.run(Thread.java:595)
2009-10-30 11:06:20.958/2099.731 Oracle Coherence GE 3.5.2/463 <Info> (thread=Main Thread, member=1): Restarting Service: DistributedCache
2009-10-30 11:06:20.958/2099.731 Oracle Coherence GE 3.5.2/463 <Error> (thread=Main Thread, member=1): Failed to restart services: java.lang.IllegalStateException: Failed to unregister: DistributedCache{Name=DistributedCache, State=(SERVICE_STARTED), LocalStorage=enabled, PartitionCount=257, BackupCount=1, AssignedPartitions=129, BackupPartitions=128}
2009-10-30 11:07:17.682/2156.455 Oracle Coherence GE 3.5.2/463 <Info> (thread=Main Thread, member=1): Restarting Service: DistributedCache
2009-10-30 11:07:17.682/2156.455 Oracle Coherence GE 3.5.2/463 <Error> (thread=Main Thread, member=1): Failed to restart services: java.lang.IllegalStateException: Failed to unregister: DistributedCache{Name=DistributedCache, State=(SERVICE_STARTED), LocalStorage=enabled, PartitionCount=257, BackupCount=1, AssignedPartitions=129, BackupPartitions=128}
2009-10-30 11:07:17.682/2156.455 Oracle Coherence GE 3.5.2/463 <D5> (thread=Cluster, member=1): TcpRing: disconnected from member 2 due to a kill request
2009-10-30 11:07:17.682/2156.455 Oracle Coherence GE 3.5.2/463 <D5> (thread=Cluster, member=1): Member 2 left service Management with senior member 1
2009-10-30 11:07:17.682/2156.455 Oracle Coherence GE 3.5.2/463 <D5> (thread=Cluster, member=1): Member 2 left service DistributedCache with senior member 1
2009-10-30 11:07:17.682/2156.455 Oracle Coherence GE 3.5.2/463 <D5> (thread=Cluster, member=1): Member(Id=2, Timestamp=2009-10-30 11:07:17.682, Address=165.137.250.122:8089, MachineId=54906, Location=site:cable.comcast.com,machine:PACDCL-CJWWND1b,process:4856) left Cluster with senior member 1
2009-10-30 11:07:17.682/2156.455 Oracle Coherence GE 3.5.2/463 <D5> (thread=Cluster, member=1): Service guardian is 51795ms late, indicating that this JVM may be running slowly or experienced a long GC
2009-10-30 11:07:18.073/2156.846 Oracle Coherence GE 3.5.2/463 <Info> (thread=PacketListenerN, member=1): Scheduled senior member heartbeat is overdue; rejoining multicast group.
2009-10-30 11:07:22.696/2161.469 Oracle Coherence GE 3.5.2/463 <Error> (thread=Cluster, member=1): Attempting recovery (due to soft timeout 26277ms ago) of Guard{Daemon=TcpRingListener}
2009-10-30 11:07:22.696/2161.469 Oracle Coherence GE 3.5.2/463 <Info> (thread=Main Thread, member=1): Restarting Service: DistributedCache
2009-10-30 11:07:22.696/2161.469 Oracle Coherence GE 3.5.2/463 <Error> (thread=Main Thread, member=1): Failed to restart services: java.lang.IllegalStateException: Failed to unregister: DistributedCache{Name=DistributedCache, State=(SERVICE_STARTED), LocalStorage=enabled, PartitionCount=257, BackupCount=1, AssignedPartitions=129, BackupPartitions=128}
2009-10-30 11:07:26.835/2165.608 Oracle Coherence GE 3.5.2/463 <Info> (thread=PacketListenerN, member=1): Scheduled senior member heartbeat is overdue; rejoining multicast group.
2009-10-30 11:07:27.709/2166.482 Oracle Coherence GE 3.5.2/463 <Info> (thread=Main Thread, member=1): Restarting Service: DistributedCache
2009-10-30 11:07:27.709/2166.482 Oracle Coherence GE 3.5.2/463 <Error> (thread=Main Thread, member=1): Failed to restart services: java.lang.IllegalStateException: Failed to unregister: DistributedCache{Name=DistributedCache, State=(SERVICE_STARTED), LocalStorage=enabled, PartitionCount=257, BackupCount=1, AssignedPartitions=129, BackupPartitions=128}
2009-10-30 11:07:32.723/2171.496 Oracle Coherence GE 3.5.2/463 <Info> (thread=Main Thread, member=1): Restarting Service: DistributedCache
2009-10-30 11:07:32.723/2171.496 Oracle Coherence GE 3.5.2/463 <Error> (thread=Main Thread, member=1): Failed to restart services: java.lang.IllegalStateException: Failed to unregister: DistributedCache{Name=DistributedCache, State=(SERVICE_STARTED), LocalStorage=enabled, PartitionCount=257, BackupCount=1, AssignedPartitions=129, BackupPartitions=128}
2009-10-30 11:07:42.796/2181.569 Oracle Coherence GE 3.5.2/463 <Info> (thread=Main Thread, member=1): Restarting Service: DistributedCache
2009-10-30 11:07:42.843/2181.616 Oracle Coherence GE 3.5.2/463 <Error> (thread=Main Thread, member=1): Failed to restart services: java.lang.IllegalStateException: Failed to unregister: DistributedCache{Name=DistributedCache, State=(SERVICE_STARTED), LocalStorage=enabled, PartitionCount=257, BackupCount=1, AssignedPartitions=129, BackupPartitions=128}
2009-10-30 11:07:42.890/2181.663 Oracle Coherence GE 3.5.2/463 <D5> (thread=Cluster, member=1): Service guardian is 10089ms late, indicating that this JVM may be running slowly or experienced a long GC
2009-10-30 11:07:42.968/2181.741 Oracle Coherence GE 3.5.2/463 <Info> (thread=PacketListenerN, member=1): Scheduled senior member heartbeat is overdue; rejoining multicast group.
2009-10-30 11:07:47.857/2186.630 Oracle Coherence GE 3.5.2/463 <Info> (thread=Main Thread, member=1): Restarting Service: DistributedCache
2009-10-30 11:07:47.935/2186.708 Oracle Coherence GE 3.5.2/463 <Error> (thread=Main Thread, member=1): Failed to restart services: java.lang.IllegalStateException: Failed to unregister: DistributedCache{Name=DistributedCache, State=(SERVICE_STARTED), LocalStorage=enabled, PartitionCount=257, BackupCount=1, AssignedPartitions=129, BackupPartitions=128}
2009-10-30 11:07:50.527/2189.300 Oracle Coherence GE 3.5.2/463 <Info> (thread=PacketListenerN, member=1): Scheduled senior member heartbeat is overdue; rejoining multicast group.
2009-10-30 11:07:52.948/2191.721 Oracle Coherence GE 3.5.2/463 <Info> (thread=Main Thread, member=1): Restarting Service: DistributedCache
2009-10-30 11:07:52.948/2191.721 Oracle Coherence GE 3.5.2/463 <Error> (thread=Main Thread, member=1): Failed to restart services: java.lang.IllegalStateException: Failed to unregister: DistributedCache{Name=DistributedCache, State=(SERVICE_STARTED), LocalStorage=enabled, PartitionCount=257, BackupCount=1, AssignedPartitions=129, BackupPartitions=128}
- SQL Error: 1400, SQLState: 23000
- ORA-01400: cannot insert NULL into ("CTXOWNER"."CTX_TRM_TXTS"."CTX_TRM_TXT_ID")
- SQL Error: 1400, SQLState: 23000
- ORA-01400: cannot insert NULL into ("CTXOWNER"."CTX_TRM_TXTS"."CTX_TRM_TXT_ID")
Coherence <Error>: Halting this cluster node due to unrecoverable service failure
Now I do see it is complaining that it cannot insert null values. But I wonder how it was able to load from the database into the java.util.Map. It is just a matter of dumping from the Map into the Coherence cache, which is another Map.
IN CACHE FACTORY VM
Map (com.comcast.customer.contract.contract.hibernate.Term):
2009-10-30 11:06:46.076/2095.134 Oracle Coherence GE 3.5.2/463 <Warning> (thread=PacketPublisher, member=2): Timeout while delivering a packet; requesting the departure confirmation for Member(Id=3, Timestamp=2009-10-30 10:52:20.758, Address=165.137.250.122:8090, MachineId=54906, Location=site:cable.comcast.com,machine:PACDCL-CJWWND1b,process:2756)
by MemberSet(Size=1, BitSetCount=2
Member(Id=1, Timestamp=2009-10-30 10:31:31.621, Address=165.137.250.122:8088, MachineId=54906, Location=site:cable.comcast.com,machine:PACDCL-CJWWND1b,process:5380)
Map (com.comcast.customer.contract.contract.hibernate.Term):
Map (com.comcast.customer.contract.contract.hibernate.Term): 2009-10-30 11:06:46.887/2095.945 Oracle Coherence GE 3.5.2/463 <Error> (thread=PacketPublisher, member=2): This node appears to have become disconnected from the rest of the cluster containing 2 nodes. All departure confirmation requests went unanswered.
Stopping cluster service.
Map (com.comcast.customer.contract.contract.hibernate.Term): 2009-10-30 11:06:48.773/2097.831 Oracle Coherence GE 3.5.2/463 <D5> (thread=Cluster, member=2): Service Cluster left the cluster
2009-10-30 11:06:49.257/2098.315 Oracle Coherence GE 3.5.2/463 <D5> (thread=Invocation:Management, member=2): Service Management left the cluster
2009-10-30 11:06:49.257/2098.315 Oracle Coherence GE 3.5.2/463 <D5> (thread=DistributedCache, member=2): Service DistributedCache left the cluster
IN JUnit Test VM
Coherence <Error>: Halting this cluster node due to unrecoverable service failure
Please note that I am running the Default Cache Server VM, the Cache Factory VM, and the Eclipse JUnit Test VM on the same machine.
Please note that the same piece of code works fine when I load another object type that returns 154 rows.

Thanks for quick response.
> So using the local scheme, you place 869 objects into that cache, correct? Does that work?
I haven't tried with the local scheme. I did try the <read-write-backing-map> scheme, but as it was giving problems I reduced the size to 100 and changed to local-scheme.
If you would like me to try with local-scheme I will do so, but it will not prove anything, as we need the Hibernate cache store to do the writes.
> Can you explain what the remaining issue is? (What part is failing?)
There are several issues and I am really striving to make it work :)-
Here is the list:
- revert to the <read-write-backing-map> scheme so that I can pre-populate the cache from the database and subsequent reads and writes hit the cache instead of the database
- pre-populate the cache during application start-up. We use Spring 2.5 and Hibernate 3.2
- the queryContract(contract) method is similar to a search screen, i.e. it takes a sample contract object as an argument with some attributes populated. I am using the Filter API to return the list of Contract objects matching the search parameters of the sample contract, as follows:
Filter filter = new EqualsFilter(IdentityExtractor.INSTANCE, contract);
Set setEntries = contractCache.entrySet(filter);
The above code expects all the attributes of the sample contract object to be fully populated; if they are not, it throws a NullPointerException.
For example, if the date attribute is null, a NullPointerException is thrown at the following line:
writer.writeLong(2, this.date.getTimeInMillis());
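One way around the NullPointerException is to build the filter only from the attributes that are actually populated on the sample object, instead of comparing the whole object through IdentityExtractor. The sketch below shows the idea in plain Java with java.util.function.Predicate so that it runs without a cluster. The Contract class and its id and date fields are hypothetical stand-ins for the real entity. With Coherence the same shape would presumably be one EqualsFilter per non-null attribute combined in an AllFilter, and the serializer would still need to guard a null date before calling getTimeInMillis().

```java
import java.util.ArrayList;
import java.util.Calendar;
import java.util.List;
import java.util.function.Predicate;

class ExampleQuery {
    /** Hypothetical stand-in for the real Contract entity. */
    static class Contract {
        final String id;
        final Calendar date; // may be null on a sample object

        Contract(String id, Calendar date) {
            this.id = id;
            this.date = date;
        }
    }

    /** Builds a predicate from the populated attributes of the sample only. */
    static Predicate<Contract> fromSample(Contract sample) {
        Predicate<Contract> p = c -> true; // an empty sample matches everything
        if (sample.id != null) {
            p = p.and(c -> sample.id.equals(c.id));
        }
        if (sample.date != null) {
            p = p.and(c -> sample.date.equals(c.date));
        }
        return p;
    }

    /** Returns the contracts that match every populated attribute of the sample. */
    static List<Contract> query(List<Contract> all, Contract sample) {
        Predicate<Contract> p = fromSample(sample);
        List<Contract> out = new ArrayList<>();
        for (Contract c : all) {
            if (p.test(c)) {
                out.add(c);
            }
        }
        return out;
    }
}
```

Null attributes on the sample are treated as "don't care", which is usually what a search screen wants.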
I greatly appreciate the inventor of the Tangosol Coherence product responding to my queries on the forum. Hopefully with his help I will be able to resolve these issues :)-

Error Manual intervention is required to stop the members...

I'm seeing a couple errors when disconnected members try to rejoin.
I've read what's here about "validate polls":
http://coherence.oracle.com/display/COH35UG/Partitioned+Cache+Service+Log+Messages
I think the relevant stuff is this:
2010-12-15 09:04:25.193/17363.862 Oracle Coherence GE 3.6.1.0 <Error> (thread=Cluster, member=n/a): validatePolls: This service timed-out due to unanswered handshake request. Manual intervention is required to stop the members that have not responded to this Poll
PollId=2, active
InitTimeMillis=1292425166002
Service=Cluster (0)
RespondedMemberSet=[1,2,3,4,5,6,7,8,9,10,11,12,17,19,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,40,41,42]
LeftMemberSet=[20,21,39]
RemainingMemberSet=[18]
2010-12-15 09:04:25.197/17363.866 Oracle Coherence GE 3.6.1.0 <Error> (thread=qReader, member=n/a): Error while starting cluster: com.tangosol.net.RequestTimeoutException: Timeout during service start: ServiceInfo(Id=0, Name=Cluster, Type=Cluster
MemberSet=ServiceMemberSet...
I have a number of these apps running simultaneously across two machines. At about the same time, all the instances of the app running on one machine experienced the same issue. It looks like those members received cluster heartbeats which did not include the member, so they left the cluster. They all then tried to rejoin but did not get a timely handshake back from a few members (20,21,39), so they just gave up. Is this correct behavior? Is there something I can do so it continues to reconnect? At a minimum, can I somehow listen for this situation and do a System.exit(-1) so the apps' startup scripts know to restart them?
Thanks,
Andrew
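On the last question, one option that needs no Coherence-specific hook is to catch the failure at the call site that publishes to the cache and terminate with a non-zero status after a few consecutive failures, so the startup script can restart the process. In the full log the failure surfaces as a RequestTimeoutException thrown out of putAll; the sketch below treats any RuntimeException the same way. RestartGuard, its failure limit, and the injected exit hook are hypothetical names for illustration and are not part of the application or of Coherence.

```java
import java.util.function.IntConsumer;

class RestartGuard {
    private final int maxFailures;
    private final IntConsumer exit; // System::exit in production, a stub in tests
    private int consecutiveFailures = 0;

    RestartGuard(int maxFailures, IntConsumer exit) {
        this.maxFailures = maxFailures;
        this.exit = exit;
    }

    /** Runs one publish attempt and returns true if it succeeded. */
    boolean attempt(Runnable publish) {
        try {
            publish.run();
            consecutiveFailures = 0; // any success resets the counter
            return true;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= maxFailures) {
                exit.accept(-1); // let the startup script restart the process
            }
            return false;
        }
    }
}
```

In the application this might be used as `new RestartGuard(3, System::exit).attempt(() -> cache.putAll(batch))`, where `cache` and `batch` are whatever the publisher thread already holds.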
The (almost) full log including the bit posted above is:
2010-12-15 04:15:18.478/17.147 Oracle Coherence 3.6.1.0 <Info> (thread=qReader, member=n/a): Loaded operational configuration from "jar:file:/Z:/coherence/lib/coherence.jar!/tangosol-coherence.xml"
2010-12-15 04:15:18.498/17.167 Oracle Coherence 3.6.1.0 <Info> (thread=qReader, member=n/a): Loaded operational overrides from "jar:file:/Z:/coherence/lib/coherence.jar!/tangosol-coherence-override-dev.xml"
Oracle Coherence Version 3.6.1.0 Build 19636
Grid Edition: Development mode
Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
2010-12-15 04:15:19.122/17.791 Oracle Coherence GE 3.6.1.0 <Info> (thread=qReader, member=n/a): Loaded cache configuration from "file:/Z:/coherence/cache-config.xml"
2010-12-15 04:15:24.853/23.522 Oracle Coherence GE 3.6.1.0 <Info> (thread=Cluster, member=n/a): Failed to satisfy the variance: allowed=16, actual=52
2010-12-15 04:15:24.853/23.522 Oracle Coherence GE 3.6.1.0 <Info> (thread=Cluster, member=n/a): Increasing allowable variance to 20
2010-12-15 04:15:25.884/24.553 Oracle Coherence GE 3.6.1.0 <Info> (thread=Cluster, member=n/a): This Member(Id=16, Timestamp=2010-12-15 04:15:29.001, Address=192.168.3.6:8092, MachineId=27140, Location=machine:mothra,process:5888,member:six, Role=RediquoteclientRediQuoteClient, Edition=Grid Edition, Mode=Development, CpuCount=32, SocketCount=32) joined cluster "dev" with senior Member(Id=1, Timestamp=2010-12-14 20:01:47.154, Address=192.168.3.20:8088, MachineId=27412, Location=machine:amd4,process:4292,member:Administrator, Role=CoherenceServer, Edition=Grid Edition, Mode=Development, CpuCount=8, SocketCount=2)
2010-12-15 04:15:26.214/24.883 Oracle Coherence GE 3.6.1.0 <Info> (thread=qReader, member=n/a): Started cluster Name=dev
Group{Address=225.0.0.1, Port=54321, TTL=4}
MasterMemberSet
ThisMember=Member(Id=16, Timestamp=2010-12-15 04:15:29.001, Address=192.168.3.6:8092, MachineId=27140, Location=machine:mothra,process:5888,member:six, Role=RediquoteclientRediQuoteClient)
OldestMember=Member(Id=1, Timestamp=2010-12-14 20:01:47.154, Address=192.168.3.20:8088, MachineId=27412, Location=machine:amd4,process:4292,member:Administrator, Role=CoherenceServer)
ActualMemberSet=MemberSet(Size=15, BitSetCount=2
    Member(Id=1, Timestamp=2010-12-14 20:01:47.154, Address=192.168.3.20:8088, MachineId=27412, Location=machine:amd4,process:4292,member:Administrator, Role=CoherenceServer)
    Member(Id=2, Timestamp=2010-12-14 20:01:50.669, Address=192.168.3.20:8090, MachineId=27412, Location=machine:amd4,process:6048,member:Administrator, Role=CoherenceServer)
    Member(Id=3, Timestamp=2010-12-14 20:01:51.794, Address=192.168.3.20:8092, MachineId=27412, Location=machine:amd4,process:6980,member:Administrator, Role=CoherenceServer)
    Member(Id=4, Timestamp=2010-12-15 02:33:41.403, Address=192.168.3.5:8088, MachineId=27139, Location=machine:blackjack,process:12420,member:Administrator, Role=DJclientStartMe)
    Member(Id=5, Timestamp=2010-12-15 02:56:49.727, Address=192.168.3.5:8090, MachineId=27139, Location=machine:blackjack,process:7520,member:Administrator, Role=DJclientBboClient)
    Member(Id=7, Timestamp=2010-12-15 04:11:43.521, Address=192.168.3.5:8092, MachineId=27139, Location=machine:blackjack,process:7456,member:JD, Role=PE)
    Member(Id=8, Timestamp=2010-12-15 04:15:07.121, Address=192.168.3.7:8088, MachineId=27399, Location=machine:amd2,process:8036,member:ten, Role=RediquoteclientRediQuoteClient)
    Member(Id=9, Timestamp=2010-12-15 04:15:07.27, Address=192.168.3.7:8090, MachineId=27399, Location=machine:amd2,process:6328,member:twelve, Role=RediquoteclientRediQuoteClient)
    Member(Id=10, Timestamp=2010-12-15 04:15:07.295, Address=192.168.3.7:8092, MachineId=27399, Location=machine:amd2,process:2028,member:three, Role=RediquoteclientRediQuoteClient)
    Member(Id=11, Timestamp=2010-12-15 04:15:07.413, Address=192.168.3.7:8094, MachineId=27399, Location=machine:amd2,process:2744,member:two, Role=RediquoteclientRediQuoteClient)
    Member(Id=12, Timestamp=2010-12-15 04:15:11.511, Address=192.168.3.7:8096, MachineId=27399, Location=machine:amd2,process:2704,member:one, Role=RediquoteclientRediQuoteClient)
    Member(Id=13, Timestamp=2010-12-15 04:15:23.773, Address=192.168.3.6:8088, MachineId=27140, Location=machine:mothra,process:8424,member:five, Role=RediquoteclientRediQuoteClient)
    Member(Id=14, Timestamp=2010-12-15 04:15:24.957, Address=192.168.3.6:8090, MachineId=27140, Location=machine:mothra,process:8480,member:four, Role=RediquoteclientRediQuoteClient)
    Member(Id=15, Timestamp=2010-12-15 04:15:28.992, Address=192.168.3.6:8094, MachineId=27140, Location=machine:mothra,process:5152,member:seven, Role=RediquoteclientRediQuoteClient)
    Member(Id=16, Timestamp=2010-12-15 04:15:29.001, Address=192.168.3.6:8092, MachineId=27140, Location=machine:mothra,process:5888,member:six, Role=RediquoteclientRediQuoteClient)
RecycleMillis=1200000
RecycleSet=MemberSet(Size=0,BitSetCount=0
TcpRing{Connections=[15]}
IpMonitor{AddressListSize=3}
2010-12-15 07:12:32.467/10651.136 Oracle Coherence GE 3.6.1.0 <Warning> (thread=PacketPublisher, member=16): Experienced a 86107 ms communication delay (probable remote GC) with Member(Id=21, Timestamp=2010-12-15 05:56:59.648, Address=192.168.3.20:8096, MachineId=27412, Location=machine:amd4,process:3844,member:Administrator, Role=TvuMainTVU); 447 packets rescheduled, PauseRate=0.0189, Threshold=1878
2010-12-15 07:12:42.215/10660.884 Oracle Coherence GE 3.6.1.0 <Warning> (thread=PacketPublisher, member=16): Experienced a 49852 ms communication delay (probable remote GC) with Member(Id=20, Timestamp=2010-12-15 05:56:02.412, Address=192.168.3.20:8098, MachineId=27412, Location=machine:amd4,process:4704,member:Administrator, Role=StatsStatsLoader2); 266 packets rescheduled, PauseRate=0.0107, Threshold=1878
2010-12-15 07:16:32.750/10891.419 Oracle Coherence GE 3.6.1.0 <Warning> (thread=PacketPublisher, member=16): Experienced a 3372 ms communication delay (probable remote GC) with Member(Id=18, Timestamp=2010-12-15 05:40:07.904, Address=192.168.3.20:8094, MachineId=27412, Location=machine:amd4,process:5764,member:Administrator, Role=A1bookBinaryMainA1Multicast); 33 packets rescheduled, PauseRate=5.0E-4, Threshold=1612
2010-12-15 07:26:31.208/11489.878 Oracle Coherence GE 3.6.1.0 <Warning> (thread=PacketPublisher, member=16): Experienced a 4819 ms communication delay (probable remote GC) with Member(Id=18, Timestamp=2010-12-15 05:40:07.904, Address=192.168.3.20:8094, MachineId=27412, Location=machine:amd4,process:5764,member:Administrator, Role=A1bookBinaryMainA1Multicast); 41 packets rescheduled, PauseRate=0.0012, Threshold=1456
2010-12-15 07:47:21.567/12740.237 Oracle Coherence GE 3.6.1.0 <Warning> (thread=PacketPublisher, member=16): Experienced a 7115 ms communication delay (probable remote GC) with Member(Id=20, Timestamp=2010-12-15 05:56:02.412, Address=192.168.3.20:8098, MachineId=27412, Location=machine:amd4,process:4704,member:Administrator, Role=StatsStatsLoader2); 52 packets rescheduled, PauseRate=0.0085, Threshold=1612
2010-12-15 07:47:49.067/12767.736 Oracle Coherence GE 3.6.1.0 <Warning> (thread=PacketPublisher, member=16): Experienced a 8614 ms communication delay (probable remote GC) with Member(Id=21, Timestamp=2010-12-15 05:56:59.648, Address=192.168.3.20:8096, MachineId=27412, Location=machine:amd4,process:3844,member:Administrator, Role=TvuMainTVU); 60 packets rescheduled, PauseRate=0.0142, Threshold=1696
2010-12-15 07:56:32.988/13291.657 Oracle Coherence GE 3.6.1.0 <Warning> (thread=PacketPublisher, member=16): Experienced a 11513 ms communication delay (probable remote GC) with Member(Id=21, Timestamp=2010-12-15 05:56:59.648, Address=192.168.3.20:8096, MachineId=27412, Location=machine:amd4,process:3844,member:Administrator, Role=TvuMainTVU); 74 packets rescheduled, PauseRate=0.0147, Threshold=1612
2010-12-15 08:51:33.890/16592.559 Oracle Coherence GE 3.6.1.0 <Warning> (thread=PacketPublisher, member=16): Experienced a 33650 ms communication delay (probable remote GC) with Member(Id=20, Timestamp=2010-12-15 05:56:02.412, Address=192.168.3.20:8098, MachineId=27412, Location=machine:amd4,process:4704,member:Administrator, Role=StatsStatsLoader2); 184 packets rescheduled, PauseRate=0.0085, Threshold=1456
2010-12-15 08:52:22.802/16641.471 Oracle Coherence GE 3.6.1.0 <Warning> (thread=PacketPublisher, member=16): Experienced a 10437 ms communication delay (probable remote GC) with Member(Id=18, Timestamp=2010-12-15 05:40:07.904, Address=192.168.3.20:8094, MachineId=27412, Location=machine:amd4,process:5764,member:Administrator, Role=A1bookBinaryMainA1Multicast); 68 packets rescheduled, PauseRate=0.0016, Threshold=1129
2010-12-15 08:53:23.232/16701.901 Oracle Coherence GE 3.6.1.0 <Warning> (thread=PacketPublisher, member=16): Experienced a 5682 ms communication delay (probable remote GC) with Member(Id=18, Timestamp=2010-12-15 05:40:07.904, Address=192.168.3.20:8094, MachineId=27412, Location=machine:amd4,process:5764,member:Administrator, Role=A1bookBinaryMainA1Multicast); 45 packets rescheduled, PauseRate=0.0020, Threshold=1073
2010-12-15 08:53:50.802/16729.471 Oracle Coherence GE 3.6.1.0 <Warning> (thread=PacketPublisher, member=16): Experienced a 15112 ms communication delay (probable remote GC) with Member(Id=39, Timestamp=2010-12-15 07:57:53.415, Address=192.168.3.20:8102, MachineId=27412, Location=machine:amd4,process:4596,member:Administrator, Role=ex_viewer); 92 packets rescheduled, PauseRate=0.0044, Threshold=1785
2010-12-15 08:53:50.814/16729.483 Oracle Coherence GE 3.6.1.0 <Warning> (thread=PacketPublisher, member=16): Experienced a 1263 ms communication delay (probable remote GC) with Member(Id=21, Timestamp=2010-12-15 05:56:59.648, Address=192.168.3.20:8096, MachineId=27412, Location=machine:amd4,process:3844,member:Administrator, Role=TvuMainTVU); 23 packets rescheduled, PauseRate=0.0101, Threshold=1532
2010-12-15 08:54:36.660/16775.329 Oracle Coherence GE 3.6.1.0 <Warning> (thread=PacketPublisher, member=16): Experienced a 76110 ms communication delay (probable remote GC) with Member(Id=20, Timestamp=2010-12-15 05:56:02.412, Address=192.168.3.20:8098, MachineId=27412, Location=machine:amd4,process:4704,member:Administrator, Role=StatsStatsLoader2); 396 packets rescheduled, PauseRate=0.0155, Threshold=1384
2010-12-15 08:54:36.915/16775.584 Oracle Coherence GE 3.6.1.0 <Warning> (thread=PacketPublisher, member=16): Experienced a 5302 ms communication delay (probable remote GC) with Member(Id=21, Timestamp=2010-12-15 05:56:59.648, Address=192.168.3.20:8096, MachineId=27412, Location=machine:amd4,process:3844,member:Administrator, Role=TvuMainTVU); 43 packets rescheduled, PauseRate=0.0105, Threshold=1456
2010-12-15 08:55:28.984/16827.653 Oracle Coherence GE 3.6.1.0 <Warning> (thread=PacketPublisher, member=16): Experienced a 4293 ms communication delay (probable remote GC) with Member(Id=21, Timestamp=2010-12-15 05:56:59.648, Address=192.168.3.20:8096, MachineId=27412, Location=machine:amd4,process:3844,member:Administrator, Role=TvuMainTVU); 38 packets rescheduled, PauseRate=0.0109, Threshold=1384
2010-12-15 08:56:07.114/16865.783 Oracle Coherence GE 3.6.1.0 <Warning> (thread=PacketPublisher, member=16): Experienced a 3385 ms communication delay (probable remote GC) with Member(Id=18, Timestamp=2010-12-15 05:40:07.904, Address=192.168.3.20:8094, MachineId=27412, Location=machine:amd4,process:5764,member:Administrator, Role=A1bookBinaryMainA1Multicast); 33 packets rescheduled, PauseRate=0.0023, Threshold=1020
2010-12-15 08:56:55.091/16913.760 Oracle Coherence GE 3.6.1.0 <Warning> (thread=PacketPublisher, member=16): Experienced a 1398 ms communication delay (probable remote GC) with Member(Id=21, Timestamp=2010-12-15 05:56:59.648, Address=192.168.3.20:8096, MachineId=27412, Location=machine:amd4,process:3844,member:Administrator, Role=TvuMainTVU); 23 packets rescheduled, PauseRate=0.0109, Threshold=1315
2010-12-15 08:57:36.620/16955.289 Oracle Coherence GE 3.6.1.0 <Warning> (thread=PacketPublisher, member=16): Experienced a 9830 ms communication delay (probable remote GC) with Member(Id=18, Timestamp=2010-12-15 05:40:07.904, Address=192.168.3.20:8094, MachineId=27412, Location=machine:amd4,process:5764,member:Administrator, Role=A1bookBinaryMainA1Multicast); 66 packets rescheduled, PauseRate=0.0031, Threshold=969
2010-12-15 08:57:36.825/16955.494 Oracle Coherence GE 3.6.1.0 <Warning> (thread=PacketPublisher, member=16): Experienced a 127134 ms communication delay (probable remote GC) with Member(Id=39, Timestamp=2010-12-15 07:57:53.415, Address=192.168.3.20:8102, MachineId=27412, Location=machine:amd4,process:4596,member:Administrator, Role=ex_viewer); 650 packets rescheduled, PauseRate=0.0394, Threshold=1696
2010-12-15 08:59:22.500/17061.169 Oracle Coherence GE 3.6.1.0 <Error> (thread=Cluster, member=16): Received cluster heartbeat from the senior Member(Id=1, Timestamp=2010-12-14 20:01:47.154, Address=192.168.3.20:8088, MachineId=27412, Location=machine:amd4,process:4292,member:Administrator, Role=CoherenceServer) that does not contain this Member(Id=16, Timestamp=2010-12-15 04:15:29.001, Address=192.168.3.6:8092, MachineId=27140, Location=machine:mothra,process:5888,member:six, Role=RediquoteclientRediQuoteClient); stopping cluster service.
2010-12-15 08:59:22.504/17061.173 Oracle Coherence GE 3.6.1.0 <Error> (thread=Cluster, member=16): Full Thread Dump
Thread[Thread-3,5,main]
     java.lang.Object.wait(Native Method)
     rediquoteclient.QuoteServerLink.pause(QuoteServerLink.java:158)
     rediquoteclient.QuoteServerLink$1.run(QuoteServerLink.java:54)
     java.lang.Thread.run(Thread.java:619)
Thread[Cluster|Member(Id=16, Timestamp=2010-12-15 04:15:29.001, Address=192.168.3.6:8092, MachineId=27140, Location=machine:mothra,process:5888,member:six, Role=RediquoteclientRediQuoteClient),5,Cluster]
     java.lang.Thread.dumpThreads(Native Method)
     java.lang.Thread.getAllStackTraces(Thread.java:1487)
     com.tangosol.net.GuardSupport.logStackTraces(GuardSupport.java:810)
     com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.ClusterService$SeniorMemberHeartbeat.onReceived(ClusterService.CDB:33)
     com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onMessage(Grid.CDB:11)
     com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onNotify(Grid.CDB:33)
     com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.ClusterService.onNotify(ClusterService.CDB:3)
     com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
     java.lang.Thread.run(Thread.java:619)
Thread[Finalizer,8,system]
     java.lang.Object.wait(Native Method)
     java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
     java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
     java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
2010-12-15 08:59:23.722/17062.391 Oracle Coherence GE 3.6.1.0 <Info> (thread=qReader, member=16): Restarting NamedCache: quotes.REDI
2010-12-15 08:59:23.723/17062.392 Oracle Coherence GE 3.6.1.0 <Info> (thread=qReader, member=16): Restarting Service: DistributedQuotesCacheService
2010-12-15 08:59:23.723/17062.392 Oracle Coherence GE 3.6.1.0 <Info> (thread=qReader, member=n/a): Restarting cluster
2010-12-15 08:59:25.987/17064.656 Oracle Coherence GE 3.6.1.0 <Info> (thread=Cluster, member=n/a): This Member(Id=43, Timestamp=2010-12-15 08:59:44.971, Address=192.168.3.6:8092, MachineId=27140, Location=machine:mothra,process:5888,member:six, Role=RediquoteclientRediQuoteClient, Edition=Grid Edition, Mode=Development, CpuCount=32, SocketCount=32) joined cluster "dev" with senior Member(Id=1, Timestamp=2010-12-14 20:01:47.154, Address=192.168.3.20:8088, MachineId=27412, Location=machine:amd4,process:4292,member:Administrator, Role=CoherenceServer, Edition=Grid Edition, Mode=Development, CpuCount=8, SocketCount=2)
2010-12-15 09:03:59.364/17338.033 Oracle Coherence GE 3.6.1.0 <Error> (thread=Cluster, member=n/a): Detected soft timeout) of {WrapperGuardable Guard{Daemon=IpMonitor} Service=ClusterService{Name=Cluster, State=(SERVICE_STARTED, STATE_JOINED), Id=0, Version=3.6, OldestMemberId=1}}
2010-12-15 09:03:59.370/17338.039 Oracle Coherence GE 3.6.1.0 <Warning> (thread=Recovery Thread, member=n/a): Attempting recovery of Guard{Daemon=IpMonitor}
2010-12-15 09:04:25.193/17363.862 Oracle Coherence GE 3.6.1.0 <Error> (thread=Cluster, member=n/a): validatePolls: This service timed-out due to unanswered handshake request. Manual intervention is required to stop the members that have not responded to this Poll
PollId=2, active
InitTimeMillis=1292425166002
Service=Cluster (0)
RespondedMemberSet=[1,2,3,4,5,6,7,8,9,10,11,12,17,19,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,40,41,42]
LeftMemberSet=[20,21,39]
RemainingMemberSet=[18]
2010-12-15 09:04:25.197/17363.866 Oracle Coherence GE 3.6.1.0 <Error> (thread=qReader, member=n/a): Error while starting cluster: com.tangosol.net.RequestTimeoutException: Timeout during service start: ServiceInfo(Id=0, Name=Cluster, Type=Cluster
MemberSet=ServiceMemberSet(
    OldestMember=n/a
    ActualMemberSet=MemberSet(Size=33, BitSetCount=2
      Member(Id=1, Timestamp=2010-12-14 20:01:47.154, Address=192.168.3.20:8088, MachineId=27412, Location=machine:amd4,process:4292,member:Administrator, Role=CoherenceServer)
      Member(Id=2, Timestamp=2010-12-14 20:01:50.669, Address=192.168.3.20:8090, MachineId=27412, Location=machine:amd4,process:6048,member:Administrator, Role=CoherenceServer)
      Member(Id=3, Timestamp=2010-12-14 20:01:51.794, Address=192.168.3.20:8092, MachineId=27412, Location=machine:amd4,process:6980,member:Administrator, Role=CoherenceServer)
      Member(Id=5, Timestamp=2010-12-15 02:56:49.727, Address=192.168.3.5:8090, MachineId=27139, Location=machine:blackjack,process:7520,member:Administrator, Role=DJclientBboClient)
      Member(Id=7, Timestamp=2010-12-15 04:11:43.521, Address=192.168.3.5:8092, MachineId=27139, Location=machine:blackjack,process:7456,member:JD, Role=PE)
      Member(Id=8, Timestamp=2010-12-15 04:15:07.121, Address=192.168.3.7:8088, MachineId=27399, Location=machine:amd2,process:8036,member:ten, Role=RediquoteclientRediQuoteClient)
      Member(Id=9, Timestamp=2010-12-15 04:15:07.27, Address=192.168.3.7:8090, MachineId=27399, Location=machine:amd2,process:6328,member:twelve, Role=RediquoteclientRediQuoteClient)
      Member(Id=10, Timestamp=2010-12-15 04:15:07.295, Address=192.168.3.7:8092, MachineId=27399, Location=machine:amd2,process:2028,member:three, Role=RediquoteclientRediQuoteClient)
      Member(Id=11, Timestamp=2010-12-15 04:15:07.413, Address=192.168.3.7:8094, MachineId=27399, Location=machine:amd2,process:2744,member:two, Role=RediquoteclientRediQuoteClient)
      Member(Id=12, Timestamp=2010-12-15 04:15:11.511, Address=192.168.3.7:8096, MachineId=27399, Location=machine:amd2,process:2704,member:one, Role=RediquoteclientRediQuoteClient)
      Member(Id=17, Timestamp=2010-12-15 05:24:58.776, Address=192.168.3.5:8096, MachineId=27139, Location=machine:blackjack,process:9836,member:Administrator, Role=DJclientRestarter)
      Member(Id=18, Timestamp=2010-12-15 05:40:07.904, Address=192.168.3.20:8094, MachineId=27412, Location=machine:amd4,process:5764,member:Administrator, Role=A1bookBinaryMainA1Multicast)
      Member(Id=22, Timestamp=2010-12-15 06:00:07.414, Address=192.168.3.25:8088, MachineId=27417, Location=machine:amd6,process:316,member:Administrator, Role=OmsOms)
      Member(Id=23, Timestamp=2010-12-15 06:00:09.872, Address=192.168.3.7:8098, MachineId=27399, Location=machine:amd2,process:2984,member:user, Role=PEQuoteProcessor)
      Member(Id=24, Timestamp=2010-12-15 06:00:51.201, Address=192.168.3.24:8088, MachineId=27416, Location=machine:Amd5,process:3096,member:St, Role=OmsOms)
      Member(Id=25, Timestamp=2010-12-15 06:52:44.897, Address=192.168.3.32:8088, MachineId=27424, Location=machine:themysticx,process:3804,member:Tony, Role=OmsOms)
      Member(Id=26, Timestamp=2010-12-15 07:13:05.082, Address=192.168.3.26:8088, MachineId=27418, Location=machine:dabulls,process:3256,member:St, Role=BboBBOClientMain)
      Member(Id=27, Timestamp=2010-12-15 07:27:00.105, Address=192.168.3.120:8090, MachineId=27512, Location=machine:TEST-1234,process:2936,member:TEST, Role=EXV)
      Member(Id=28, Timestamp=2010-12-15 07:27:00.183, Address=192.168.3.120:8088, MachineId=27512, Location=machine:TEST-1234,process:3116,member:TEST, Role=oew)
      Member(Id=29, Timestamp=2010-12-15 07:27:00.251, Address=192.168.3.120:8092, MachineId=27512, Location=machine:TEST-1234,process:272,member:TEST, Role=OV)
      Member(Id=30, Timestamp=2010-12-15 07:51:41.145, Address=192.168.3.116:8088, MachineId=27508, Location=machine:BE,process:2144,member:user, Role=EXV)
      Member(Id=32, Timestamp=2010-12-15 07:34:03.385, Address=192.168.3.5:8098, MachineId=27139, Location=machine:blackjack,process:5836,member:WayneN, Role=PE)
      Member(Id=35, Timestamp=2010-12-15 08:22:49.77, Address=192.168.3.121:8092, MachineId=27513, Location=machine:Hd,process:4052,member:Hd, Role=oew)
      Member(Id=36, Timestamp=2010-12-15 07:51:41.529, Address=192.168.3.116:8090, MachineId=27508, Location=machine:BE,process:2292,member:user, Role=OV)
      Member(Id=37, Timestamp=2010-12-15 07:51:41.556, Address=192.168.3.116:8091, MachineId=27508, Location=machine:BE,process:2064,member:user, Role=oew)
      Member(Id=38, Timestamp=2010-12-15 07:51:41.863, Address=192.168.3.116:8094, MachineId=27508, Location=machine:BE,process:2108,member:user, Role=PE)
      Member(Id=40, Timestamp=2010-12-15 08:06:00.705, Address=192.168.3.5:8100, MachineId=27139, Location=machine:blackjack,process:5484,member:ChrisL, Role=PE)
      Member(Id=41, Timestamp=2010-12-15 08:22:59.722, Address=192.168.3.121:8094, MachineId=27513, Location=machine:Hd,process:2972,member:Hd, Role=PE)
      Member(Id=43, Timestamp=2010-12-15 08:59:44.971, Address=192.168.3.6:8092, MachineId=27140, Location=machine:mothra,process:5888,member:six, Role=RediquoteclientRediQuoteClient)
      Member(Id=44, Timestamp=2010-12-15 09:01:17.056, Address=192.168.3.121:8090, MachineId=27513, Location=machine:Hd,process:860,member:Hd)
      Member(Id=45, Timestamp=2010-12-15 09:01:35.962, Address=192.168.3.5:8094, MachineId=27139, Location=machine:blackjack,process:4852,member:Administrator)
      Member(Id=46, Timestamp=2010-12-15 09:01:41.282, Address=192.168.3.5:8088, MachineId=27139, Location=machine:blackjack,process:12420,member:Administrator)
      Member(Id=47, Timestamp=2010-12-15 09:02:44.877, Address=192.168.3.121:8088, MachineId=27513, Location=machine:Hd,process:2804,member:Hd)
    MemberId/ServiceVersion/ServiceJoined/MemberState
      1/3.6/Tue Dec 14 20:01:47 CST 2010/JOINED,
      2/3.6/Tue Dec 14 20:01:50 CST 2010/JOINED,
      3/3.6/Tue Dec 14 20:01:51 CST 2010/JOINED,
      5/3.6/Wed Dec 15 02:56:49 CST 2010/JOINED,
      7/3.6/Wed Dec 15 04:11:43 CST 2010/JOINED,
      8/3.6/Wed Dec 15 04:15:07 CST 2010/JOINED,
      9/3.6/Wed Dec 15 04:15:07 CST 2010/JOINED,
      10/3.6/Wed Dec 15 04:15:07 CST 2010/JOINED,
      11/3.6/Wed Dec 15 04:15:07 CST 2010/JOINED,
      12/3.6/Wed Dec 15 04:15:11 CST 2010/JOINED,
      17/3.6/Wed Dec 15 05:24:58 CST 2010/JOINED,
      18/3.6/Wed Dec 15 05:40:07 CST 2010/JOINING,
      22/3.6/Wed Dec 15 06:00:07 CST 2010/JOINED,
      23/3.6/Wed Dec 15 06:00:09 CST 2010/JOINED,
      24/3.6/Wed Dec 15 06:00:51 CST 2010/JOINED,
      25/3.6/Wed Dec 15 06:52:44 CST 2010/JOINED,
      26/3.6/Wed Dec 15 07:13:05 CST 2010/JOINED,
      27/3.6/Wed Dec 15 07:27:00 CST 2010/JOINED,
      28/3.6/Wed Dec 15 07:27:00 CST 2010/JOINED,
      29/3.6/Wed Dec 15 07:27:00 CST 2010/JOINED,
      30/3.6/Wed Dec 15 07:51:41 CST 2010/JOINED,
      32/3.6/Wed Dec 15 07:34:03 CST 2010/JOINED,
      35/3.6/Wed Dec 15 08:22:49 CST 2010/JOINED,
      36/3.6/Wed Dec 15 07:51:41 CST 2010/JOINED,
      37/3.6/Wed Dec 15 07:51:41 CST 2010/JOINED,
      38/3.6/Wed Dec 15 07:51:41 CST 2010/JOINED,
      40/3.6/Wed Dec 15 08:06:00 CST 2010/JOINED,
      41/3.6/Wed Dec 15 08:22:59 CST 2010/JOINED,
      43/3.6/Wed Dec 15 08:59:44 CST 2010/JOINED,
      44/3.6/Wed Dec 15 09:01:17 CST 2010/JOINED,
      45/3.6/Wed Dec 15 09:01:35 CST 2010/JOINED,
      46/3.6/Wed Dec 15 09:01:41 CST 2010/JOINED,
      47/3.6/Wed Dec 15 09:02:44 CST 2010/JOINED
     at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onStartupTimeout(Grid.CDB:6)
     at com.tangosol.coherence.component.util.daemon.queueProcessor.Service.start(Service.CDB:28)
     at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.start(Grid.CDB:6)
     at com.tangosol.coherence.component.net.Cluster.onStart(Cluster.CDB:636)
     at com.tangosol.coherence.component.net.Cluster.start(Cluster.CDB:11)
     at com.tangosol.coherence.component.util.SafeCluster.startCluster(SafeCluster.CDB:3)
     at com.tangosol.coherence.component.util.SafeCluster.restartCluster(SafeCluster.CDB:7)
     at com.tangosol.coherence.component.util.SafeCluster.ensureRunningCluster(SafeCluster.CDB:26)
     at com.tangosol.coherence.component.util.SafeService.restartService(SafeService.CDB:22)
     at com.tangosol.coherence.component.util.SafeService.ensureRunningService(SafeService.CDB:39)
     at com.tangosol.coherence.component.util.safeService.SafeCacheService.ensureRunningCacheService(SafeCacheService.CDB:3)
     at com.tangosol.coherence.component.util.SafeNamedCache$CacheAction.run(SafeNamedCache.CDB:3)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:337)
     at com.tangosol.coherence.component.util.SafeNamedCache.restartNamedCache(SafeNamedCache.CDB:8)
     at com.tangosol.coherence.component.util.SafeNamedCache.ensureRunningNamedCache(SafeNamedCache.CDB:33)
     at com.tangosol.coherence.component.util.SafeNamedCache.getRunningNamedCache(SafeNamedCache.CDB:1)
     at com.tangosol.coherence.component.util.SafeNamedCache.putAll(SafeNamedCache.CDB:1)
     at rediquoteclient.CoherencePublisher$1.run(CoherencePublisher.java:105)
     at java.lang.Thread.run(Thread.java:619)
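
The "communication delay" warnings in the log above all name members 18, 20, 21 and 39, which are the same members that appear in LeftMemberSet and RemainingMemberSet of the failed poll. A small parser can total the reported delay per member from such a log. This is a sketch that assumes the message layout seen in this 3.6.1 log ("Experienced a N ms communication delay ... with Member(Id=M,"); DelaySummary is a hypothetical helper, not a Coherence tool.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DelaySummary {
    // Captures the delay in milliseconds and the id of the remote member.
    private static final Pattern DELAY = Pattern.compile(
        "Experienced a (\\d+) ms communication delay.*?with Member\\(Id=(\\d+),");

    /** Returns the total reported delay in milliseconds, keyed by member id. */
    static Map<Integer, Long> totalDelayByMember(List<String> lines) {
        Map<Integer, Long> totals = new TreeMap<>();
        for (String line : lines) {
            Matcher m = DELAY.matcher(line);
            if (m.find()) {
                long ms = Long.parseLong(m.group(1));
                int member = Integer.parseInt(m.group(2));
                totals.merge(member, ms, Long::sum);
            }
        }
        return totals;
    }
}
```

Lines that do not match the pattern are ignored, so the whole log file can be passed in unchanged.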

Ashbaernon said: Firstly, the complete error message is "This VI is not executable. The full development version of Labview is required to fix the errors"
"VI is not executable" can mean the VI is missing some component, but it can also mean the VI or one of its subVIs has not been compiled to the same version as your RTE (Run-Time Engine). I've specifically seen this using the TestStand full-featured LabVIEW user interface, which was last compiled with LabVIEW 7.x (5 years ago), yet shipped with TS 4. Anyhow, if this is the cause, the solution is to:
Open the top level VI and any dynamically called VIs (i.e. VIs called by VI Server).
Holding down Ctrl and Shift, press the Run arrow to recompile all VIs in memory.
File >> Save All (or Ctrl-Shift-S)
Rebuild your EXE.
-Jason
Message Edited by LabBEAN on 02-14-2009 10:28 PM
Certified LabVIEW Architect

Coherence Cluster Errors- Need your help to solve

Hi,
We had this error recently in QA, and these are not new servers. They had been running for some time and were in good condition.
The error below happened suddenly and caused a server outage for some time.
After all the servers were restarted, the issue went away.
We are trying to understand the root cause to avoid this issue in future, and we need the expertise of this forum for that.
Brief summary of the issue:
1. We performed multicast testing on the Coherence cluster IP and port, and all the communication is good.
2. The issue started with the error "Unable to refresh sockets":
    Stopping cluster due to unhandled exception: com.tangosol.net.messaging.ConnectionException: Unable to refresh sockets: [UnicastUdpSocket{State=STATE_OPEN, address:port=1.1.1.85:8088}, MulticastUdpSocket{State=STATE_OPEN, address:port=239.3.1.17:35122, InterfaceAddress=10.137.3.85, TimeToLive=1}, TcpSocketAccepter{State=STATE_OPEN, ServerSocket=1.1.1.85:8088}]; last failed socket: MulticastUdpSocket{State=STATE_OPEN, address:port=239.3.1.17:35122, InterfaceAddress=10.137.3.85, TimeToLive=1}
        at com.tangosol.coherence.component.net.Cluster$SocketManager.refreshSockets(Cluster.CDB:91)
        at com.tangosol.coherence.component.net.Cluster$SocketManager$MulticastUdpSocket.onInterruptedIOException(Cluster.CDB:9)
        at com.tangosol.coherence.component.net.socket.UdpSocket.receive(UdpSocket.CDB:33)
        at com.tangosol.coherence.component.net.UdpPacket.receive(UdpPacket.CDB:4)
        at com.tangosol.coherence.component.util.daemon.queueProcessor.packetProcessor.PacketListener.onNotify(PacketListener.CDB:19)
        at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
        at java.lang.Thread.run(Thread.java:662)
    Caused by: java.net.SocketTimeoutException: Receive timed out
3. After that, I noticed a couple of errors like:
    Restarting Service: DistributedCache   validatePolls: This service timed-out due to unanswered handshake request. Manual intervention is required to stop the members that have not responded to this Poll
4. Errors like the following were logged continuously:   Rejecting connection to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/
5. After that, I noticed:
    Service DistributedCache: received ServiceConfigSync containing 272 entries
    2013-10-26 08:26:43,241 -0700 level=ERROR class="STDERR"
    2013-10-26 08:26:43.241/76.243 Oracle Coherence GE 3.5.1/461 <Error> (thread=main, member=1): Error while starting service "DistributedCache": com.tangosol.net.RequestTimeoutException: Timeout during service start: ServiceInfo(Id=2, Name=DistributedCache, Type=DistributedCache
      MemberSet=ServiceMemberSet(
        OldestMember=Member(Id=3, Timestamp=2013-10-26 05:16:47.128, Address=10.137.3.49:8088, MachineId=32817, Location=site:test.test.net,machine:test30b,process:3870)
        ActualMemberSet=MemberSet(Size=3, BitSetCount=2
          Member(Id=1, Timestamp=2013-10-26 08:26:12.289, Address=1.1.1.85:8088, MachineId=32853, Location=site:test.test.net,machine:test304,process:6207, Role=JavaLangThread)
          Member(Id=3, Timestamp=2013-10-26 05:16:47.128, Address=1.1.1.49:8088, MachineId=32817, Location=site:test.test.net,machine:test30b,process:3870)
          Member(Id=5, Timestamp=2013-10-26 08:26:29.871, Address=1.1.1.86:8088, MachineId=32854, Location=site:test.test.net,machine:test305,process:3988)
        MemberId/ServiceVersion/ServiceJoined/ServiceLeaving
          1/3.5/Sat Oct 26 08:26:13 PDT 2013/false,
          3/3.5/Sat Oct 26 05:16:47 PDT 2013/false,
          5/3.5/Sat Oct 26 08:26:30 PDT 2013/false
        at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onStartupTimeout(Grid.CDB:6)
        at com.tangosol.coherence.component.util.daemon.queueProcessor.Service.start(Service.CDB:28)
Your help is highly appreciated!
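For anyone triaging a similar log: the PacketPublisher lines in the detailed log have a regular shape (`Experienced a N ms communication delay (probable remote GC) with Member(Id=M, ...`), so they are easy to summarize per remote member. Below is a minimal Python sketch, not part of the original post. The function name `summarize_delays`, the `min_ms` cut-off, and the regular expression are assumptions based only on the Coherence 3.5.1 lines pasted in this thread; other Coherence versions may format the message differently.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the log pasted in this thread:
#   <date> <time>.<millis>/<uptime> Oracle Coherence ... Experienced a <N> ms
#   communication delay (probable remote GC) with Member(Id=<M>, ...
DELAY_RE = re.compile(
    r"(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}\.\d{3}/"
    r".*?Experienced a (?P<ms>\d+) ms communication delay"
    r".*?with Member\(Id=(?P<member>\d+)"
)


def summarize_delays(lines, min_ms=1000):
    """Group communication-delay warnings by remote member id.

    Returns {member_id: [(timestamp_to_the_minute, delay_ms), ...]},
    keeping only delays of at least min_ms milliseconds.
    """
    by_member = defaultdict(list)
    for line in lines:
        match = DELAY_RE.search(line)
        if match is None:
            continue  # wrapper lines and unrelated messages are skipped
        delay_ms = int(match.group("ms"))
        if delay_ms >= min_ms:
            member_id = int(match.group("member"))
            by_member[member_id].append((match.group("stamp"), delay_ms))
    return dict(by_member)


if __name__ == "__main__":
    # Lines copied from the log in this thread, cut off after the member id
    # for brevity (the trailing "..." is not part of the real log).
    sample = [
        "2013-10-26 00:15:13.279/2079180.072 Oracle Coherence GE 3.5.1/461 "
        "<Warning> (thread=PacketPublisher, member=4): Experienced a 2642 ms "
        "communication delay (probable remote GC) with Member(Id=3, ...",
        "2013-10-26 01:15:29.036/2082795.829 Oracle Coherence GE 3.5.1/461 "
        "<Warning> (thread=PacketPublisher, member=4): Experienced a 13068 ms "
        "communication delay (probable remote GC) with Member(Id=1, ...",
    ]
    for member_id, events in sorted(summarize_delays(sample).items()):
        for stamp, delay_ms in events:
            print(f"member {member_id}: {delay_ms} ms at {stamp}")
```

Running the full server log through something like this makes it easier to see whether the long delays line up with a particular member or time of day, which can then be compared against the GC logs on that member.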
Detailed Server Error Log:
2013-10-26 00:15:13,280 -0700 level=ERROR class="STDERR"
2013-10-26 00:15:13.279/2079180.072 Oracle Coherence GE 3.5.1/461 <Warning> (thread=PacketPublisher, member=4): Experienced a 2642 ms communication delay (probable remote GC) with Member(Id=3, Timestamp=2013-10-01 22:43:27.913, Address=1.1.1..49:8088, MachineId=32817, Location=site:test.test.net,machine:testabc30b,process:3870, Role=JavaLangThread); 34 packets rescheduled, PauseRate=0.0010, Threshold=222
2013-10-26 00:15:15,508 -0700 level=ERROR class="STDERR"
2013-10-26 00:15:15.508/2079182.301 Oracle Coherence GE 3.5.1/461 <Warning> (thread=PacketPublisher, member=4): Experienced a 4875 ms communication delay (probable remote GC) with Member(Id=1, Timestamp=2013-10-08 22:00:17.258, Address=1.1.1..86:8088, MachineId=32854, Location=site:test.test.net,machine:testabc305,process:3988, Role=JavaLangThread); 47 packets rescheduled, PauseRate=3.0E-4, Threshold=1438
2013-10-26 01:15:29,028 -0700 level=ERROR class="STDERR"
2013-10-26 01:15:29.018/2082795.811 Oracle Coherence GE 3.5.1/461 <Info> (thread=PacketListenerN, member=4): Scheduled senior member heartbeat is overdue; rejoining multicast group.
2013-10-26 01:15:29,036 -0700 level=ERROR class="STDERR"
2013-10-26 01:15:29.036/2082795.829 Oracle Coherence GE 3.5.1/461 <Warning> (thread=PacketPublisher, member=4): Experienced a 13068 ms communication delay (probable remote GC) with Member(Id=1, Timestamp=2013-10-08 22:00:17.258, Address=1.1.1..86:8088, MachineId=32854, Location=site:test.test.net,machine:testabc305,process:3988, Role=JavaLangThread); 86 packets rescheduled, PauseRate=4.0E-4, Threshold=1438
2013-10-26 01:15:29,037 -0700 level=ERROR class="STDERR"
2013-10-26 01:15:29.036/2082795.829 Oracle Coherence GE 3.5.1/461 <Warning> (thread=PacketPublisher, member=4): Experienced a 13069 ms communication delay (probable remote GC) with Member(Id=3, Timestamp=2013-10-01 22:43:27.913, Address=1.1.1..49:8088, MachineId=32817, Location=site:test.test.net,machine:testabc30b,process:3870, Role=JavaLangThread); 84 packets rescheduled, PauseRate=0.0010, Threshold=269
2013-10-26 01:31:44,494 -0700 level=INFO class="STDOUT"
WARN   getResponseBody, Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
2013-10-26 02:15:34,907 -0700 level=ERROR class="STDERR"
2013-10-26 02:15:34.906/2086401.699 Oracle Coherence GE 3.5.1/461 <Warning> (thread=PacketPublisher, member=4): Experienced a 6476 ms communication delay (probable remote GC) with Member(Id=3, Timestamp=2013-10-01 22:43:27.913, Address=1.1.1..49:8088, MachineId=32817, Location=site:test.test.net,machine:testabc30b,process:3870, Role=JavaLangThread); 24 packets rescheduled, PauseRate=0.0011, Threshold=313
2013-10-26 02:43:52,199 -0700 level=INFO class="STDOUT"
WARN   getResponseBody, Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
2013-10-26 03:00:55,493 -0700 level=INFO class="STDOUT"
WARN   getResponseBody, Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
2013-10-26 03:15:41,144 -0700 level=ERROR class="STDERR"
2013-10-26 03:15:41.144/2090007.937 Oracle Coherence GE 3.5.1/461 <D5> (thread=PacketPublisher, member=4): Experienced a 202 ms communication delay (probable remote GC) with Member(Id=1, Timestamp=2013-10-08 22:00:17.258, Address=1.1.1..86:8088, MachineId=32854, Location=site:test.test.net,machine:testabc305,process:3988, Role=JavaLangThread); 25 packets rescheduled, PauseRate=4.0E-4, Threshold=1509
2013-10-26 03:15:41,592 -0700 level=ERROR class="STDERR"
2013-10-26 03:15:41.592/2090008.385 Oracle Coherence GE 3.5.1/461 <D5> (thread=PacketPublisher, member=4): Experienced a 371 ms communication delay (probable remote GC) with Member(Id=3, Timestamp=2013-10-01 22:43:27.913, Address=1.1.1..49:8088, MachineId=32817, Location=site:test.test.net,machine:testabc30b,process:3870, Role=JavaLangThread); 41 packets rescheduled, PauseRate=0.0010, Threshold=290
2013-10-26 03:31:38,099 -0700 level=INFO class="STDOUT"
WARN   getResponseBody, Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
2013-10-26 04:15:47,869 -0700 level=ERROR class="STDERR"
2013-10-26 04:15:47.869/2093614.662 Oracle Coherence GE 3.5.1/461 <D5> (thread=PacketPublisher, member=4): Experienced a 850 ms communication delay (probable remote GC) with Member(Id=1, Timestamp=2013-10-08 22:00:17.258, Address=1.1.1..86:8088, MachineId=32854, Location=site:test.test.net,machine:testabc305,process:3988, Role=JavaLangThread); 52 packets rescheduled, PauseRate=4.0E-4, Threshold=1509
2013-10-26 04:16:00,192 -0700 level=ERROR class="STDERR"
2013-10-26 04:16:00.182/2093626.975 Oracle Coherence GE 3.5.1/461 <Info> (thread=PacketListenerN, member=4): Scheduled senior member heartbeat is overdue; rejoining multicast group.
2013-10-26 04:16:00,199 -0700 level=ERROR class="STDERR"
2013-10-26 04:16:00.199/2093626.992 Oracle Coherence GE 3.5.1/461 <Warning> (thread=PacketPublisher, member=4): Experienced a 13180 ms communication delay (probable remote GC) with Member(Id=3, Timestamp=2013-10-01 22:43:27.913, Address=1.1.1..49:8088, MachineId=32817, Location=site:test.test.net,machine:testabc30b,process:3870, Role=JavaLangThread); 126 packets rescheduled, PauseRate=0.0011, Threshold=424
2013-10-26 04:16:01,897 -0700 level=ERROR class="STDERR"
2013-10-26 04:16:01.897/2093628.690 Oracle Coherence GE 3.5.1/461 <Warning> (thread=PacketPublisher, member=4): Experienced a 1503 ms communication delay (probable remote GC) with Member(Id=1, Timestamp=2013-10-08 22:00:17.258, Address=1.1.1..86:8088, MachineId=32854, Location=site:test.test.net,machine:testabc305,process:3988, Role=JavaLangThread); 173 packets rescheduled, PauseRate=4.0E-4, Threshold=1509
2013-10-26 04:26:54,424 -0700 level=INFO class="STDOUT"
WARN   getResponseBody, Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
2013-10-26 04:51:52,096 -0700 level=INFO class="STDOUT"
WARN   getResponseBody, Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
2013-10-26 05:02:52,292 -0700 level=INFO class="STDOUT"
WARN   getResponseBody, Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
2013-10-26 05:16:06,076 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:06.075/2097232.868 Oracle Coherence GE 3.5.1/461 <Error> (thread=PacketListenerN, member=4):
Stopping cluster due to unhandled exception: com.tangosol.net.messaging.ConnectionException: Unable to refresh sockets: [UnicastUdpSocket{State=STATE_OPEN, address:port=1.1.1..85:8088}, MulticastUdpSocket{State=STATE_OPEN, address:port=239.3.1.17:35122, InterfaceAddress=1.1.1..85, TimeToLive=1}, TcpSocketAccepter{State=STATE_OPEN, ServerSocket=1.1.1..85:8088}]; last failed socket: MulticastUdpSocket{State=STATE_OPEN, address:port=239.3.1.17:35122, InterfaceAddress=1.1.1..85, TimeToLive=1}
    at com.tangosol.coherence.component.net.Cluster$SocketManager.refreshSockets(Cluster.CDB:91)
    at com.tangosol.coherence.component.net.Cluster$SocketManager$MulticastUdpSocket.onInterruptedIOException(Cluster.CDB:9)
    at com.tangosol.coherence.component.net.socket.UdpSocket.receive(UdpSocket.CDB:33)
    at com.tangosol.coherence.component.net.UdpPacket.receive(UdpPacket.CDB:4)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.packetProcessor.PacketListener.onNotify(PacketListener.CDB:19)
    at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.SocketTimeoutException: Receive timed out
    at java.net.PlainDatagramSocketImpl.receive0(Native Method)
    at java.net.PlainDatagramSocketImpl.receive(PlainDatagramSocketImpl.java:145)
    at java.net.DatagramSocket.receive(DatagramSocket.java:725)
    at com.tangosol.coherence.component.net.socket.UdpSocket.receive(UdpSocket.CDB:20)
    at com.tangosol.coherence.component.net.UdpPacket.receive(UdpPacket.CDB:4)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.packetProcessor.PacketListener.onNotify(PacketListener.CDB:19)
    at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
    at java.lang.Thread.run(Thread.java:662)
2013-10-26 05:16:06,080 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:06.080/2097232.873 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=4): Service Cluster left the cluster
2013-10-26 05:16:06,105 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:06.105/2097232.898 Oracle Coherence GE 3.5.1/461 <D5> (thread=Invocation:Management, member=4): Service Management left the cluster
2013-10-26 05:16:06,105 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:06.105/2097232.898 Oracle Coherence GE 3.5.1/461 <Info> (thread=http-0.0.0.0-8080-180, member=4): Restarting NamedCache: test234aaaapeu-cache
2013-10-26 05:16:06,105 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:06.105/2097232.898 Oracle Coherence GE 3.5.1/461 <Info> (thread=http-0.0.0.0-8080-180, member=4): Restarting Service: DistributedCache
2013-10-26 05:16:06,110 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:06.106/2097232.899 Oracle Coherence GE 3.5.1/461 <Error> (thread=DistributedCache, member=4):
validatePolls: This service timed-out due to unanswered handshake request. Manual intervention is required to stop the members that have not responded to this Poll
PollId=24209529, active
InitTimeMillis=1382789736843
Service=DistributedCache (2)
RespondedMemberSet=[]
LeftMemberSet=[]
RemainingMemberSet=[3]
Request=Message "LockRequest"
{test.test.net
FromMember=Member(Id=4, Timestamp=2013-10-24 15:16:09.067, Address=1.1.1..85:8088, MachineId=32853, Location=site:test.test.net,machine:testabc304,process:4000)
FromMessageId=38338332
Internal=false
MessagePartCount=1
PendingCount=0
MessageType=12
ToPollId=0
Poll=null
Packets
Service=DistributedCache{Name=DistributedCache, State=(SERVICE_STOPPED), Not initialized}
ToMemberSet=MemberSet(Size=1, BitSetCount=1
Member(Id=3, Timestamp=2013-10-01 22:43:27.913, Address=1.1.1..49:8088, MachineId=32817, Location=site:test.test.net,machine:testabc30b,process:3870, Role=JavaLangThread)
NotifySent=false
null
WaitTimeout=1382789776739, LeaseExpiration=9223372036854775807
2013-10-26 05:16:06,110 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:06.109/2097232.902 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=4): Service DistributedCache left the cluster
2013-10-26 05:16:06,117 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:06.117/2097232.910 Oracle Coherence GE 3.5.1/461 <Info> (thread=http-0.0.0.0-8080-180, member=n/a): Restarting cluster
2013-10-26 05:16:06,198 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:06.198/2097232.991 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=n/a): Service Cluster joined the cluster with senior service member n/a
2013-10-26 05:16:07,410 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:07.410/2097234.203 Oracle Coherence GE 3.5.1/461 <Info> (thread=Cluster, member=n/a): Created a new cluster "cluster:0x27CB" with Member(Id=1, Timestamp=2013-10-26 05:16:06.128, Address=1.1.1..85:8088, MachineId=32853, Location=site:test.test.net,machine:testabc304,process:4000, Edition=Grid Edition, Mode=Development, CpuCount=4, SocketCount=4) UID=0x0A89035500000141F4B15BF080551F98
2013-10-26 05:16:07,436 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:07.436/2097234.229 Oracle Coherence GE 3.5.1/461 <Info> (thread=http-0.0.0.0-8080-180, member=1): Restarting Service: Management
2013-10-26 05:16:07,450 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:07.450/2097234.243 Oracle Coherence GE 3.5.1/461 <D5> (thread=Invocation:Management, member=1): Service Management joined the cluster with senior service member 1
2013-10-26 05:16:07,474 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:07.474/2097234.267 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): Service DistributedCache joined the cluster with senior service member 1
2013-10-26 05:16:07,491 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:07.491/2097234.284 Oracle Coherence GE 3.5.1/461 <Info> (thread=http-0.0.0.0-8080-183, member=1): Restarting NamedCache: test234aaaaficustomer-cache
2013-10-26 05:16:07,514 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:07.514/2097234.307 Oracle Coherence GE 3.5.1/461 <Info> (thread=http-0.0.0.0-8080-38, member=1): Restarting NamedCache: test234aaaaaccount-no-export-cache
2013-10-26 05:16:07,529 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:07.529/2097234.322 Oracle Coherence GE 3.5.1/461 <Info> (thread=http-0.0.0.0-8080-136, member=1): Restarting NamedCache: test234aaaausrsum-cache
2013-10-26 05:16:07,546 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:07.545/2097234.338 Oracle Coherence GE 3.5.1/461 <Info> (thread=http-0.0.0.0-8080-9, member=1): Restarting NamedCache: test234aaaafi-v2-cache
2013-10-26 05:16:07,569 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:07.567/2097234.360 Oracle Coherence GE 3.5.1/461 <Info> (thread=http-0.0.0.0-8080-59, member=1): Restarting NamedCache: test234aaaaaccount-v2-cache
2013-10-26 05:16:07,748 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:07.748/2097234.541 Oracle Coherence GE 3.5.1/461 <Info> (thread=http-0.0.0.0-8080-28, member=1): Restarting NamedCache: test234aaaafi-cache
2013-10-26 05:16:07,816 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:07.816/2097234.609 Oracle Coherence GE 3.5.1/461 <Info> (thread=http-0.0.0.0-8080-133, member=1): Restarting NamedCache: test234aaaahistory-v2-cache
2013-10-26 05:16:09,154 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:09.154/2097235.947 Oracle Coherence GE 3.5.1/461 <Info> (thread=http-0.0.0.0-8080-134, member=1): Restarting NamedCache: test234aaaaaccount-cache
2013-10-26 05:16:09,169 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:09.169/2097235.962 Oracle Coherence GE 3.5.1/461 <Info> (thread=http-0.0.0.0-8080-134, member=1): Restarting NamedCache: test234aaaahistory-cache
2013-10-26 05:16:09,444 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:09.444/2097236.237 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=1): Member(Id=2, Timestamp=2013-10-26 05:16:09.259, Address=1.1.1..86:8088, MachineId=32854, Location=site:test.test.net,machine:testabc305,process:3988) joined Cluster with senior member 1
2013-10-26 05:16:09,539 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:09.539/2097236.332 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=1): Member 2 joined Service Management with senior member 1
2013-10-26 05:16:09,580 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:09.579/2097236.372 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=1): Member 2 joined Service DistributedCache with senior member 1
2013-10-26 05:16:09,599 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:09.599/2097236.392 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): Service DistributedCache: sending ServiceConfigSync containing 268 entries to Member 2
2013-10-26 05:16:09,681 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:09.681/2097236.474 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): 1> Transferring 128 out of 257 vulnerable partitions to member 2 requesting 128
2013-10-26 05:16:09,892 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:09.881/2097236.674 Oracle Coherence GE 3.5.1/461 <D4> (thread=DistributedCache, member=1): 1> Transferring 129 out of 129 partitions to a machine-safe backup 1 at member 2 (under 129)
2013-10-26 05:16:09,901 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:09.901/2097236.694 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): Transferring 388KB of backup[1] for PartitionSet{128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256} to member 2
2013-10-26 05:16:10,415 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:10.415/2097237.208 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=1): TcpRing: connecting to member 2 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..86,port=8088,localport=37005]}
2013-10-26 05:16:10,657 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:10.657/2097237.450 Oracle Coherence GE 3.5.1/461 <Warning> (thread=Cluster, member=1): Received panic from junior member Member(Id=2, Timestamp=2013-10-26 05:16:09.259, Address=1.1.1..86:8088, MachineId=32854, Location=site:test.test.net,machine:testabc305,process:3988) caused by Member(Id=3, Timestamp=2013-10-01 22:43:27.913, Address=1.1.1..49:8088, MachineId=32817, Location=site:test.test.net,machine:testabc30b,process:3870, Role=JavaLangThread)
2013-10-26 05:16:11,592 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:11.592/2097238.385 Oracle Coherence GE 3.5.1/461 <D4> (thread=TcpRingListener, member=1): Rejecting connection to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..49,port=32822,localport=8088]}
2013-10-26 05:16:13,568 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:13.568/2097240.361 Oracle Coherence GE 3.5.1/461 <Info> (thread=http-0.0.0.0-8080-52, member=1): Restarting NamedCache: test234aaaauserData-cache
2013-10-26 05:16:13,596 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:13.596/2097240.389 Oracle Coherence GE 3.5.1/461 <D4> (thread=TcpRingListener, member=1): Rejecting connection to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..49,port=32823,localport=8088]}
2013-10-26 05:16:14,937 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:14.937/2097241.730 Oracle Coherence GE 3.5.1/461 <Info> (thread=http-0.0.0.0-8080-52, member=1): Restarting NamedCache: test234aaaacheckimage-cache
2013-10-26 05:16:15,600 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:15.600/2097242.393 Oracle Coherence GE 3.5.1/461 <D4> (thread=TcpRingListener, member=1): Rejecting connection to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..49,port=32824,localport=8088]}
2013-10-26 05:16:17,602 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:17.602/2097244.395 Oracle Coherence GE 3.5.1/461 <D4> (thread=TcpRingListener, member=1): Rejecting connection to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..49,port=32825,localport=8088]}
2013-10-26 05:16:19,605 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:19.605/2097246.398 Oracle Coherence GE 3.5.1/461 <D4> (thread=TcpRingListener, member=1): Rejecting connection to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..49,port=32828,localport=8088]}
2013-10-26 05:16:21,609 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:21.609/2097248.402 Oracle Coherence GE 3.5.1/461 <D4> (thread=TcpRingListener, member=1): Rejecting connection to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..49,port=32829,localport=8088]}
2013-10-26 05:16:23,611 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:23.611/2097250.404 Oracle Coherence GE 3.5.1/461 <D4> (thread=TcpRingListener, member=1): Rejecting connection to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..49,port=32830,localport=8088]}
2013-10-26 05:16:25,616 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:25.616/2097252.409 Oracle Coherence GE 3.5.1/461 <D4> (thread=TcpRingListener, member=1): Rejecting connection to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..49,port=32831,localport=8088]}
2013-10-26 05:16:27,619 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:27.619/2097254.412 Oracle Coherence GE 3.5.1/461 <D4> (thread=TcpRingListener, member=1): Rejecting connection to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..49,port=32832,localport=8088]}
2013-10-26 05:16:29,621 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:29.621/2097256.414 Oracle Coherence GE 3.5.1/461 <D4> (thread=TcpRingListener, member=1): Rejecting connection to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..49,port=32833,localport=8088]}
2013-10-26 05:16:31,626 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:31.626/2097258.419 Oracle Coherence GE 3.5.1/461 <D4> (thread=TcpRingListener, member=1): Rejecting connection to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..49,port=32834,localport=8088]}
2013-10-26 05:16:33,631 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:33.631/2097260.424 Oracle Coherence GE 3.5.1/461 <D4> (thread=TcpRingListener, member=1): Rejecting connection to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..49,port=32835,localport=8088]}
2013-10-26 05:16:35,632 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:35.632/2097262.425 Oracle Coherence GE 3.5.1/461 <D4> (thread=TcpRingListener, member=1): Rejecting connection to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..49,port=32836,localport=8088]}
2013-10-26 05:16:37,636 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:37.635/2097264.428 Oracle Coherence GE 3.5.1/461 <D4> (thread=TcpRingListener, member=1): Rejecting connection to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..49,port=32837,localport=8088]}
2013-10-26 05:16:39,641 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:39.640/2097266.433 Oracle Coherence GE 3.5.1/461 <D4> (thread=TcpRingListener, member=1): Rejecting connection to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..49,port=32838,localport=8088]}
2013-10-26 05:16:41,643 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:41.643/2097268.436 Oracle Coherence GE 3.5.1/461 <D4> (thread=TcpRingListener, member=1): Rejecting connection to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..49,port=32841,localport=8088]}
2013-10-26 05:16:47,329 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:47.329/2097274.122 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=1): Member(Id=3, Timestamp=2013-10-26 05:16:47.128, Address=1.1.1..49:8088, MachineId=32817, Location=site:test.test.net,machine:testabc30b,process:3870) joined Cluster with senior member 1
2013-10-26 05:16:47,425 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:47.425/2097274.218 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=1): Member 3 joined Service Management with senior member 1
2013-10-26 05:16:47,477 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:47.476/2097274.269 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=1): Member 3 joined Service DistributedCache with senior member 1
2013-10-26 05:16:47,501 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:47.500/2097274.294 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): Service DistributedCache: sending ServiceConfigSync containing 270 entries to Member 3
2013-10-26 05:16:47,548 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:47.548/2097274.341 Oracle Coherence GE 3.5.1/461 <D5> (thread=TcpRingListener, member=1): TcpRing: connecting to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..49,port=32846,localport=8088]}
2013-10-26 05:16:48,454 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:48.453/2097275.246 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): 2> Transferring 43 out of 129 primary partitions to member 3 requesting 43
2013-10-26 05:16:48,709 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:48.709/2097275.502 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): 2> Transferring 39 out of 125 primary partitions to member 3 requesting 39
2013-10-26 05:16:48,885 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:48.884/2097275.677 Oracle Coherence GE 3.5.1/461 <D5> (thread=http-0.0.0.0-8080-210, member=1): Repeating QueryRequest due to the re-distribution of PartitionSet{132, 133, 134, 135, 136, 137, 138, 139, 140, 141}
2013-10-26 05:16:50,850 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:50.848/2097277.641 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): 2> Transferring 29 out of 115 primary partitions to member 3 requesting 29
2013-10-26 05:16:50,968 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:50.968/2097277.761 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): 2> Transferring 21 out of 107 primary partitions to member 3 requesting 21
2013-10-26 05:16:51,097 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:51.097/2097277.890 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): 2> Transferring 14 out of 100 primary partitions to member 3 requesting 14
2013-10-26 05:16:51,218 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:51.218/2097278.011 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): 2> Transferring 6 out of 92 primary partitions to member 3 requesting 6
2013-10-26 05:16:51,340 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:51.340/2097278.133 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): 2> Transferring 1 out of 87 primary partitions to member 3 requesting 1
2013-10-26 05:16:51,352 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:51.352/2097278.145 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): Transferring 540KB of backup[1] for PartitionSet{171, 172, 173, 174, 175, 176, 177} to member 3
2013-10-26 05:16:51,465 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:51.464/2097278.257 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): Transferring 575KB of backup[1] for PartitionSet{178, 179, 180, 181, 182, 183} to member 3
2013-10-26 05:16:51,569 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:51.569/2097278.362 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): Transferring 537KB of backup[1] for PartitionSet{184, 185, 186, 187} to member 3
2013-10-26 05:16:51,688 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:51.688/2097278.481 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): Transferring 553KB of backup[1] for PartitionSet{188, 189, 190, 191, 192, 193, 194, 195, 196} to member 3
2013-10-26 05:16:51,817 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:51.817/2097278.610 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): Transferring 526KB of backup[1] for PartitionSet{197, 198, 199, 200, 201, 202} to member 3
2013-10-26 05:16:51,928 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:51.928/2097278.721 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): Transferring 768KB of backup[1] for PartitionSet{203, 204, 205, 206, 207, 208, 209} to member 3
2013-10-26 05:16:52,040 -0700 level=ERROR class="STDERR"
2013-10-26 05:16:52.039/2097278.832 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): Transferring 198KB of backup[1] for PartitionSet{210, 211, 212, 213} to member 3
2013-10-26 05:19:06,157 -0700 level=ERROR class="STDERR"
2013-10-26 05:19:06.157/2097412.950 Oracle Coherence GE 3.5.1/461 <Info> (thread=http-0.0.0.0-8080-63, member=1): Restarting NamedCache: throttleData-cache
2013-10-26 05:22:15,094 -0700 level=ERROR class="STDERR"
2013-10-26 05:22:15.094/2097601.887 Oracle Coherence GE 3.5.1/461 <Info> (thread=http-0.0.0.0-8080-136, member=1): Restarting NamedCache: test234aaaadepositslipimage-cache
2013-10-26 05:22:17,183 -0700 level=INFO class="STDOUT"
WARN   getResponseBody, Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
2013-10-26 05:28:49,617 -0700 level=INFO class="STDOUT"
WARN   getResponseBody, Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
2013-10-26 05:29:39,729 -0700 level=INFO class="STDOUT"
WARN   getResponseBody, Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
2013-10-26 05:33:37,607 -0700 level=INFO class="STDOUT"
WARN   getResponseBody, Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
2013-10-26 05:39:33,872 -0700 level=INFO class="STDOUT"
WARN   getResponseBody, Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
2013-10-26 06:49:30,617 -0700 level=ERROR class="STDERR"
2013-10-26 06:49:30.617/2102837.410 Oracle Coherence GE 3.5.1/461 <Warning> (thread=PacketPublisher, member=1): Experienced a 6378 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2013-10-26 05:16:09.259, Address=1.1.1..86:8088, MachineId=32854, Location=site:test.test.net,machine:testabc305,process:3988); 56 packets rescheduled, PauseRate=0.0011, Threshold=1976
2013-10-26 07:39:18,855 -0700 level=ERROR class="STDERR"
2013-10-26 07:39:18.854/2105825.647 Oracle Coherence GE 3.5.1/461 <Warning> (thread=PacketPublisher, member=1): Experienced a 7318 ms communication delay (probable remote GC) with Member(Id=3, Timestamp=2013-10-26 05:16:47.128, Address=1.1.1..49:8088, MachineId=32817, Location=site:test.test.net,machine:testabc30b,process:3870); 68 packets rescheduled, PauseRate=8.0E-4, Threshold=497
2013-10-26 07:49:37,510 -0700 level=ERROR class="STDERR"
2013-10-26 07:49:37.510/2106444.303 Oracle Coherence GE 3.5.1/461 <Warning> (thread=PacketPublisher, member=1): Experienced a 6653 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2013-10-26 05:16:09.259, Address=1.1.1..86:8088, MachineId=32854, Location=site:test.test.net,machine:testabc305,process:3988); 69 packets rescheduled, PauseRate=0.0014, Threshold=1785
Copyright (c) 2000, 2009, Oracle and/or its affiliates. All rights reserved.
2013-10-26 08:26:11,291 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:11.291/44.293 Oracle Coherence GE 3.5.1/461 <Info> (thread=main, member=n/a): Loaded cache configuration from "file:/usr/local/whp-jboss-web-5/server/default/env/test234aaaacoherence-cache-config.xml"
2013-10-26 08:26:12,263 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:12.263/45.265 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=n/a): Service Cluster joined the cluster with senior service member n/a
2013-10-26 08:26:12,477 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:12.477/45.479 Oracle Coherence GE 3.5.1/461 <Info> (thread=Cluster, member=n/a): This Member(Id=1, Timestamp=2013-10-26 08:26:12.289, Address=1.1.1..85:8088, MachineId=32853, Location=site:test.test.net,machine:testabc304,process:6207, Role=JavaLangThread, Edition=Grid Edition, Mode=Development, CpuCount=4, SocketCount=4) joined cluster "cluster:0x27CB" with senior Member(Id=2, Timestamp=2013-10-26 05:16:09.259, Address=1.1.1..86:8088, MachineId=32854, Location=site:test.test.net,machine:testabc305,process:3988, Edition=Grid Edition, Mode=Development, CpuCount=4, SocketCount=4)
2013-10-26 08:26:12,501 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:12.501/45.503 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=n/a): Member(Id=3, Timestamp=2013-10-26 05:16:47.128, Address=1.1.1..49:8088, MachineId=32817, Location=site:test.test.net,machine:testabc30b,process:3870) joined Cluster with senior member 2
2013-10-26 08:26:12,507 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:12.506/45.508 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=n/a): Member 2 joined Service Management with senior member 2
2013-10-26 08:26:12,507 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:12.507/45.509 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=n/a): Member 2 joined Service DistributedCache with senior member 2
2013-10-26 08:26:12,520 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:12.520/45.522 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=n/a): Member 3 joined Service Management with senior member 2
2013-10-26 08:26:12,520 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:12.520/45.522 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=n/a): Member 3 joined Service DistributedCache with senior member 2
2013-10-26 08:26:12,639 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:12.639/45.641 Oracle Coherence GE 3.5.1/461 <D5> (thread=Invocation:Management, member=1): Service Management joined the cluster with senior service member 2
2013-10-26 08:26:12,700 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:12.700/45.702 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=1): TcpRing: connecting to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..49,port=8088,localport=52891]}
2013-10-26 08:26:13,191 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:13.190/46.193 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): Service DistributedCache joined the cluster with senior service member 2
2013-10-26 08:26:14,538 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:14.538/47.540 Oracle Coherence GE 3.5.1/461 <D5> (thread=TcpRingListener, member=1): TcpRing: connecting to member 2 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..86,port=40281,localport=8088]}
2013-10-26 08:26:29,695 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:29.694/62.696 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=1): TcpRing: disconnected from member 2 due to a kill request
2013-10-26 08:26:29,695 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:29.694/62.696 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=1): Member 2 left service Management with senior member 3
2013-10-26 08:26:29,695 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:29.694/62.696 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=1): Member 2 left service DistributedCache with senior member 3
2013-10-26 08:26:29,696 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:29.696/62.698 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=1): Member(Id=2, Timestamp=2013-10-26 08:26:29.694, Address=1.1.1..86:8088, MachineId=32854, Location=site:test.test.net,machine:testabc305,process:3988) left Cluster with senior member 3
2013-10-26 08:26:30,069 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:30.069/63.071 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=1): Member(Id=5, Timestamp=2013-10-26 08:26:29.871, Address=1.1.1..86:8088, MachineId=32854, Location=site:test.test.net,machine:testabc305,process:3988) joined Cluster with senior member 3
2013-10-26 08:26:30,271 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:30.271/63.273 Oracle Coherence GE 3.5.1/461 <D5> (thread=TcpRingListener, member=1): TcpRing: connecting to member 5 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/1.1.1..86,port=40285,localport=8088]}
2013-10-26 08:26:30,272 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:30.272/63.274 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=1): Member 5 joined Service Management with senior member 3
2013-10-26 08:26:30,443 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:30.443/63.445 Oracle Coherence GE 3.5.1/461 <D5> (thread=Cluster, member=1): Member 5 joined Service DistributedCache with senior member 3
2013-10-26 08:26:38,739 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:38.738/71.740 Oracle Coherence GE 3.5.1/461 <D5> (thread=DistributedCache, member=1): Service DistributedCache: received ServiceConfigSync containing 272 entries
2013-10-26 08:26:43,241 -0700 level=ERROR class="STDERR"
2013-10-26 08:26:43.241/76.243 Oracle Coherence GE 3.5.1/461 <Error> (thread=main, member=1): Error while starting service "DistributedCache": com.tangosol.net.RequestTimeoutException: Timeout during service start: ServiceInfo(Id=2, Name=DistributedCache, Type=DistributedCache
MemberSet=ServiceMemberSet(
OldestMember=Member(Id=3, Timestamp=2013-10-26 05:16:47.128, Address=1.1.1..49:8088, MachineId=32817, Location=site:test.test.net,machine:testabc30b,process:3870)
ActualMemberSet=MemberSet(Size=3, BitSetCount=2
Member(Id=1, Timestamp=2013-10-26 08:26:12.289, Address=1.1.1..85:8088, MachineId=32853, Location=site:test.test.net,machine:testabc304,process:6207, Role=JavaLangThread)
Member(Id=3, Timestamp=2013-10-26 05:16:47.128, Address=1.1.1..49:8088, MachineId=32817, Location=site:test.test.net,machine:testabc30b,process:3870)
Member(Id=5, Timestamp=2013-10-26 08:26:29.871, Address=1.1.1..86:8088, MachineId=32854, Location=site:test.test.net,machine:testabc305,process:3988)
MemberId/ServiceVersion/ServiceJoined/ServiceLeaving
1/3.5/Sat Oct 26 08:26:13 PDT 2013/false,
3/3.5/Sat Oct 26 05:16:47 PDT 2013/false,
5/3.5/Sat Oct 26 08:26:30 PDT 2013/false
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onStartupTimeout(Grid.CDB:6)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.Service.start(Service.CDB:28)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.start(Grid.CDB:38)
    at com.tangosol.coherence.component.util.SafeService.startService(SafeService.CDB:28)
    at com.tangosol.coherence.component.util.safeService.SafeCacheService.startService(SafeCacheService.CDB:5)
    at com.tangosol.coherence.component.util.SafeService.ensureRunningService(SafeService.CDB:27)
    at com.tangosol.coherence.component.util.SafeService.start(SafeService.CDB:14)
    at com.tangosol.net.DefaultConfigurableCacheFactory.ensureService(DefaultConfigurableCacheFactory.java:973)
    at com.tangosol.net.DefaultConfigurableCacheFactory.ensureCache(DefaultConfigurableCacheFactory.java:842)
    at com.tangosol.net.DefaultConfigurableCacheFactory.configureCache(DefaultConfigurableCacheFactory.java:1053)
    at com.tangosol.net.DefaultConfigurableCacheFactory.ensureCache(DefaultConfigurableCacheFactory.java:290)
    at com.tangosol.net.CacheFactory.getCache(CacheFactory.java:747)
    at com.tangosol.net.CacheFactory.getCache(CacheFactory.java:724

Hi
The common causes of communication delays and packet timeouts are excessive GC pauses, high CPU usage, and swapping.
Each of these occurrences may disrupt the Coherence packet processing threads, thus preventing the processing and acknowledgment of packets from other cluster members.
1. Check GC performance: review the process memory consumption and the GC logs.
2. Check CPU usage with vmstat or top.
3. Check swapping with vmstat.
See Oracle Support Doc ID 1110544.1.
Communication delays and packet timeouts can also be caused by network-related issues.
To check network performance, see:
Performing a Datagram Test for Network Performance - Coherence 3.5 User Guide - Oracle Coherence Knowledge Base
regards,
Leo_TA
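
When working through the checks above, it can help to pull the delay warnings out of the log first, so they can be lined up against GC logs or vmstat output. The following is a minimal sketch, assuming the log lines follow the PacketPublisher warning format quoted in this thread. The function name `parse_delay_warnings` and the 5000 ms cutoff are illustrative assumptions and are not part of Coherence.

```python
import re

# Pattern for the PacketPublisher warning as it appears in the logs in this thread.
DELAY_RE = re.compile(
    r"Experienced a (?P<delay_ms>\d+) ms communication delay "
    r"\(probable remote GC\) with Member\(Id=(?P<member_id>\d+),.*?\); "
    r"(?P<packets>\d+) packets rescheduled, "
    r"PauseRate=(?P<pause_rate>[0-9.Ee+-]+), Threshold=(?P<threshold>\d+)"
)


def parse_delay_warnings(lines):
    """Return one dict per communication-delay warning found in lines."""
    warnings = []
    for line in lines:
        match = DELAY_RE.search(line)
        if match is None:
            continue  # not a delay warning
        warnings.append({
            "delay_ms": int(match.group("delay_ms")),
            "member_id": int(match.group("member_id")),
            "packets": int(match.group("packets")),
            "pause_rate": float(match.group("pause_rate")),  # also accepts 8.0E-4
            "threshold": int(match.group("threshold")),
        })
    return warnings


# Two warnings copied from the log above, plus one unrelated line.
SAMPLE_LOG = [
    "2013-10-26 06:49:30.617/2102837.410 Oracle Coherence GE 3.5.1/461 <Warning> "
    "(thread=PacketPublisher, member=1): Experienced a 6378 ms communication delay "
    "(probable remote GC) with Member(Id=2, Timestamp=2013-10-26 05:16:09.259, "
    "Address=1.1.1..86:8088, MachineId=32854, "
    "Location=site:test.test.net,machine:testabc305,process:3988); "
    "56 packets rescheduled, PauseRate=0.0011, Threshold=1976",
    "2013-10-26 07:39:18.854/2105825.647 Oracle Coherence GE 3.5.1/461 <Warning> "
    "(thread=PacketPublisher, member=1): Experienced a 7318 ms communication delay "
    "(probable remote GC) with Member(Id=3, Timestamp=2013-10-26 05:16:47.128, "
    "Address=1.1.1..49:8088, MachineId=32817, "
    "Location=site:test.test.net,machine:testabc30b,process:3870); "
    "68 packets rescheduled, PauseRate=8.0E-4, Threshold=497",
    "WARN   getResponseBody, Going to buffer response body of large or unknown size.",
]

parsed = parse_delay_warnings(SAMPLE_LOG)

ALERT_MS = 5000  # hypothetical cutoff; pick one that suits your cluster
slow = [w for w in parsed if w["delay_ms"] >= ALERT_MS]
for w in slow:
    print(f"member {w['member_id']}: {w['delay_ms']} ms, pause rate {w['pause_rate']}")
```

Grouping the results by `member_id` shows whether the delays point at one remote member, which is what you would expect if that member has long GC pauses.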

Stopping cluster due to unhandled exception: java.lang.ArrayIndexOutOfBound

We had this problem in production, where one node in a 16-node cluster terminated with this error.
2013-04-12 11:39:00.533/1139.283 Oracle Coherence EE 3.6.1.4 <Warning> (thread=PacketPublisher, member=4): Experienced a 12316 ms communication delay (probable remote GC) with Member(Id=6, Timestamp=2013-04-12 11:20:08.733, Address=169.168.22.79:32120, MachineId=5967, Location=XXXX,machine:XXXXXXX,process:18088102,member:Container1u7, Role=XXXXXXXX); 114 packets rescheduled, PauseRate=0.0108, Threshold=1878
2013-04-12 11:47:35.704/2528.573 Oracle Coherence EE 3.6.1.4 <Error> (thread=PacketReceiver, member=1): Stopping cluster due to unhandled exception: java.lang.ArrayIndexOutOfBoundsException
     at com.tangosol.coherence.component.net.Packet.extract(Packet.CDB:30)
     at com.tangosol.coherence.component.util.daemon.queueProcessor.packetProcessor.PacketReceiver.onNotify(PacketReceiver.CDB:28)
     at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
     at java.lang.Thread.run(Thread.java:777)
After that, when the services that are configured to restart tried to do so, the restart failed with the following exception. Any idea what could be causing this error? We have WKA configured.
2013-04-12 11:47:35.951/2528.820 Oracle Coherence EE 3.6.1.4 <Error> (thread=DEFAULT_EDN-Thread-28, member=n/a): Error while starting cluster: (Wrapped) java.io.IOException: SystemSocketProvider unable find available port(s)
     at com.tangosol.util.Base.ensureRuntimeException(Base.java:293)
     at com.tangosol.util.Base.ensureRuntimeException(Base.java:269)
     at com.tangosol.coherence.component.net.Cluster.onStart(Cluster.CDB:232)
     at com.tangosol.coherence.component.net.Cluster.start(Cluster.CDB:11)
     at com.tangosol.coherence.component.util.SafeCluster.startCluster(SafeCluster.CDB:3)
     at com.tangosol.coherence.component.util.SafeCluster.restartCluster(SafeCluster.CDB:7)
     at com.tangosol.coherence.component.util.SafeCluster.ensureRunningCluster(SafeCluster.CDB:26)
     at com.tangosol.coherence.component.util.SafeService.restartService(SafeService.CDB:22)
     at com.tangosol.coherence.component.util.SafeService.ensureRunningService(SafeService.CDB:39)
     at com.tangosol.coherence.component.util.safeService.SafeCacheService.ensureRunningCacheService(SafeCacheService.CDB:3)
     at com.tangosol.coherence.component.util.SafeNamedCache$CacheAction.run(SafeNamedCache.CDB:3)
     at java.security.AccessController.doPrivileged(AccessController.java:252)
     at javax.security.auth.Subject.doAs(Subject.java:494)
     at com.tangosol.coherence.component.util.SafeNamedCache.restartNamedCache(SafeNamedCache.CDB:8)
     at com.tangosol.coherence.component.util.SafeNamedCache.ensureRunningNamedCache(SafeNamedCache.CDB:33)
     at com.tangosol.coherence.component.util.SafeNamedCache.getRunningNamedCache(SafeNamedCache.CDB:1)
     at com.tangosol.coherence.component.util.SafeNamedCache.lock(SafeNamedCache.CDB:1)
     at container.pool.BoundedThreadPool$PooledThread.run(BoundedThreadPool.java:591)
Caused by: java.io.IOException: SystemSocketProvider unable find available port(s)
     at com.tangosol.coherence.component.net.Cluster$SocketManager.bindListeners(Cluster.CDB:117)
     at com.tangosol.coherence.component.net.Cluster.onStart(Cluster.CDB:228)
     ... 20 more

Hello,
This is not a Coherence bug. It looks like the system is running out of memory.
Best regards,
-Dave

Other members not responding when a clustered member is not reachable

We have a product that uses Coherence 3.3 in a cluster of 5 members.
We are finding that if a cluster member is down or unresponsive for whatever reason (e.g. doing a garbage collection), the other members try to do some work to compensate, which leaves them not responding to requests. The nodes are configured in unicast mode, so each node is listed explicitly in the configuration.
Examples we've seen:
1) Member 3 does a stop-the-world GC and all other members pause. We see the following in the logs on all other members:
2009-12-31 00:20:18.291 Oracle Coherence EE 3.3/387 <Warning> (thread=PacketPublisher, member=6): Experienced a 1982 ms communication delay (probable remote GC) with Member(Id=1, Timestamp=2009-12-30 16:01:26.85, Address=172.29.4.16:8090, MachineId=46352, Location=process:27758@and-msg4); 30 packets rescheduled, PauseRate=0.0030, Threshold=901
In this instance we had a pause of 19 seconds.
Can we configure the nodes not to wait when communication with a member is not possible? If so, how?
2) We take a single node down during the day (off-peak hours). The other members start logging a lot of Coherence debug output, which I've been told not to worry about, but during this period the application is not processing requests, and this causes timeouts on the calling servers.
When a member is removed from the cluster, what are the other nodes doing (redistributing the cache?)? How can we minimise the fallout so that we handle this more gracefully?
I would be grateful for any help and any useful links to documentation. It may well be that our configuration is incorrect, as we leave most things at their default values. I could do with some pointers on where to start looking and which parameters to tweak, as there seem to be quite a few.
Regards
Fez
What can we do to minimise this?
Edited by: user12406699 on 04-Jan-2010 07:31

Hi there, thank you for the suggestions.
1) I understand that the pauses during GC will be helped by keeping them as short as possible. I have tinkered with the CMS parameters, and I think our old generation was becoming very fragmented and therefore hitting the point where a stop-the-world collection occurred. I have limited the number of threads and increased the heap space. After a couple of days we are seeing that the concurrent garbage collector is doing work more frequently but with no stop-the-world pauses. So I think issue 1 will be resolved by this.
2) This issue still exists. When we stop the Tomcat server for one of the nodes, we see a pause and lots of debug messages from Coherence. During this time the other nodes are no longer processing requests: we can see the number of running requests increase, and the response time increases as well across the nodes. Because of the amount of traffic we are receiving, our clients soon time out.
Going over the suggestions:
a) graceful shutdown of the cluster node
How? Currently we stop traffic being driven to the node and then stop Tomcat. Is there a way to also gracefully stop the node being a part of the cluster?
b) Set <timeout-milliseconds>
I don't quite understand the documentation for this setting
"Default value is 60000.
Note: For production use, the recommended value is the greater of 60000 and two times the maximum expected full GC duration."
So we are currently using the default. To me 60 seconds sounds very high for a production timeout, which makes me think that this setting is not what I had in mind. Is there anything in the configuration that may help?
Regards
Fez
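
The documentation note quoted above boils down to taking a maximum. A small sketch of that arithmetic follows. The function name `recommended_timeout_ms` is an illustrative assumption, and the 19-second figure is the pause reported earlier in this thread, used here as a stand-in for the longest expected full GC.

```python
def recommended_timeout_ms(max_full_gc_ms, floor_ms=60000):
    """Greater of the 60000 ms floor and twice the longest expected full GC."""
    return max(floor_ms, 2 * max_full_gc_ms)


# Assumption: the 19 s pause seen earlier is the worst full GC to plan for.
observed_pause_ms = 19000
timeout_for_observed = recommended_timeout_ms(observed_pause_ms)

# Hypothetical longer GC, to show when the recommendation rises above the floor.
timeout_for_long_gc = recommended_timeout_ms(45000)

print(timeout_for_observed, timeout_for_long_gc)
```

With a 19-second worst case, twice the pause is still below the floor, so the default of 60000 already satisfies the quoted recommendation.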

Java.lang.OutOfMemoryError: getNewTla in coherence production cluster!

Hi guys, we need some urgent help. JVMs in our production Coherence cluster randomly stop with an OutOfMemoryError, and we cannot find the root cause.
1) Version 3.7.1.4, running 4 servers, each with 40 JVMs, each JVM set to a 2GB heap, for a total of 320GB across the cluster. Each server also has an extend proxy running with a 3GB heap (no issue).
2) The cluster is configured using WKA by explicitly listing out all the server nodes in the config.
3) Our data storage is only ~30GB, details below.
Stats for cache 'CACHE0':
Number of cache entries: 14761116
Memory usage (mb): 26722.643
Average entry size (bytes): 1898
Stats for cache 'CACHE1':
Number of cache entries: 46047
Memory usage (mb): 51.911
Average entry size (bytes): 1182
Stats for cache 'CACHE2':
Number of cache entries: 4
Memory usage (mb): 0.154
Average entry size (bytes): 40448
Stats for cache 'CACHE3':
Number of cache entries: 69
Memory usage (mb): 0.705
Average entry size (bytes): 10707
Grand total: 26775.413 MB, Number of entries: 14807237
4) Random storage-node JVMs (not proxies) on each server just go down with the errors below. We cannot reproduce the issue; it happens at random. Out of 40 JVMs on each server, about 3-5 went down over the weekend, and the issue happens on all 4 servers.
ERROR Coherence - 2012-08-11 11:36:51.670/156864.993 Oracle Coherence GE 3.7.1.4 <Error> (thread=Cluster, member=17):
java.lang.OutOfMemoryError: getNewTla
at java.util.HashMap.newKeyIterator(HashMap.java:1024)
at java.util.HashMap$KeySet.iterator(HashMap.java:1062)
at java.util.HashSet.iterator(HashSet.java:153)
at sun.nio.ch.SelectorImpl.processDeregisterQueue(SelectorImpl.java:127)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:69)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
at com.tangosol.coherence.component.net.TcpRing.select(TcpRing.CDB:11)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.ClusterService.onWait(ClusterService.CDB:6)
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:39)
at java.lang.Thread.run(Thread.java:662)
ERROR Coherence - 2012-08-11 11:36:51.854/156865.177 Oracle Coherence GE 3.7.1.4 <Error> (thread=PacketListener1, member=17): Stopping cluster due to unhandled exception: java.lang.OutOfMemoryError: java/net/Inet4Address, size 24B
at java.net.PlainDatagramSocketImpl.receive0(Native Method)
at java.net.PlainDatagramSocketImpl.receive(PlainDatagramSocketImpl.java:145)
at java.net.DatagramSocket.receive(DatagramSocket.java:725)
at com.tangosol.coherence.component.net.socket.UdpSocket.receive(UdpSocket.CDB:22)
at com.tangosol.coherence.component.net.UdpPacket.receive(UdpPacket.CDB:1)
at com.tangosol.coherence.component.util.daemon.queueProcessor.packetProcessor.PacketListener.onNotify(PacketListener.CDB:20)
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
at java.lang.Thread.run(Thread.java:662)
Exception in thread "Main Thread" java.lang.OutOfMemoryError
5) Initially we thought it was caused by the small default packet speaker size when joining the nodes, since we are using WKA. But changing the config did not help:
<coherence>
  <cluster-config>
    <packet-speaker>
      <volume-threshold>
        <minimum-packets>10000</minimum-packets>
      </volume-threshold>
    </packet-speaker>
  </cluster-config>
</coherence>
We are out of ideas; any help will be greatly appreciated. Thanks
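
As a sanity check on the figures in this post, the sketch below redoes the capacity arithmetic from the stated server layout and cache statistics. It assumes one backup copy per entry, which the post does not state, and it ignores indexes and per-entry overhead, so treat the result as a rough lower bound rather than a measurement. All variable names are illustrative.

```python
# Layout as stated in the post.
servers = 4
jvms_per_server = 40
heap_gb_per_jvm = 2

storage_jvms = servers * jvms_per_server
total_heap_gb = storage_jvms * heap_gb_per_jvm

# Cache statistics as posted: name -> (entries, memory in MB).
caches = {
    "CACHE0": (14761116, 26722.643),
    "CACHE1": (46047, 51.911),
    "CACHE2": (4, 0.154),
    "CACHE3": (69, 0.705),
}

total_mb = sum(mem_mb for _, mem_mb in caches.values())

# Average entry size recomputed from the posted MB figure for the largest cache.
cache0_entries, cache0_mb = caches["CACHE0"]
cache0_avg_bytes = round(cache0_mb * 1024 * 1024 / cache0_entries)

# Assumption: one backup copy, so each entry is held twice across the cluster.
backup_copies = 1
per_jvm_primary_mb = total_mb / storage_jvms
per_jvm_with_backup_mb = per_jvm_primary_mb * (1 + backup_copies)

print(f"total heap: {total_heap_gb} GB across {storage_jvms} JVMs")
print(f"cached data: {total_mb:.3f} MB")
print(f"per JVM incl. backup: {per_jvm_with_backup_mb:.1f} MB of {heap_gb_per_jvm * 1024} MB")
```

Under these assumptions the cached data accounts for only a fraction of each 2GB heap, which is consistent with the poster's view that plain data volume does not explain the errors.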

I don't think the issue is with the code. I just noticed that as soon as I started up all the cache servers, one had already gone down, and no one is accessing the system.
This is extremely troublesome. I am loading the hprof file to look at the dump, per the suggestion above, though I'm not sure it will help pinpoint the root cause.
cacheserver:1 30578 [Logger@9218328 3.7.1.4] WARN Coherence - 2012-08-13 13:52:14.857/32.262 Oracle Coherence GE 3.7.1.4 <Warning> (thread=PacketPublisher, member=22): Experienced a 1230 ms communication delay (probable remote GC) with Member(Id=1, Timestamp=2012-08-09 16:02:24.413, Address=xxxxxx, MachineId=xxxxx, Location=site:,machine:xxxxx,process:27118,member:xxxxxx:cacheserver:1, Role=CoherenceServer); 25 packets rescheduled, PauseRate=0.042, Threshold=875
Exception in thread "Main Thread" java.lang.OutOfMemoryError
[WARN ][thread ] dispatchUncaughtException
Logger: java.lang.OutOfMemoryError
java.lang.OutOfMemoryError
Exception in thread "PacketListener1" java.lang.OutOfMemoryError
[WARN ][thread ] dispatchUncaughtException
[WARN ][thread ] dispatchUncaughtException
java.lang.OutOfMemoryError: getNewTla
at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:983)
at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:976)
at java.lang.Thread.dispatchUncaughtException(Thread.java:1874)
java/lang/OutOfMemoryError: getNewTla
--- End of stack trace
java/lang/OutOfMemoryError: getNewTla
--- End of stack trace
Exception in thread "Logger@9218328 3.7.1.4" java.lang.OutOfMemoryError
Exception in thread "PacketListener1P" java.lang.OutOfMemoryError

BW sales load issue

I have a problem with the following jobs, which are running too long: 2LIS_11_VAITM, 2LIS_11_V_ITM, and 2LIS_13_VDITM (shown in the screen snapshot below). They used to run for just a couple of minutes. The jobs are configured so that when the first job finishes there is a delay of 10 minutes before the second job runs, then another 10-minute delay before the third job runs. After these 3 jobs are complete, an activation job needs to run. This activation job began to fail because the first 3 jobs were not finished. For now, the activation job has been moved, but we need to understand why these jobs are running longer and fix it. We may be experiencing a problem with database performance in PDF, but I would not expect that to make them take that much longer.
This is the BW sales load issue:
The info-package group is scheduled at 00:05 and contains 3 info-packages: VAITM, V_ITM and VDITM, with a 10-minute gap between executions.
The first, VAITM, started at 00:05:46 and finished in BW at 00:09:02.
Then there is the 10-minute delay between jobs.
V_ITM started at 00:20:12 and for some reason finished at 01:00:51. We can see that communication between PDF and BWP took much longer than normal.
Then there is another 10-minute delay between jobs.
At 01:00:25 the activation job started and failed. It requires the info-packages to be in GREEN status, but V_ITM finished at 01:00:51.
It probably failed because V_ITM still had YELLOW status at that moment.
VDITM started at 01:12:02 and finally finished its load at 02:09:16, again with a long communication delay between the systems.
See below an example of the load on 01/14/2007, which shows normal communication between the systems.
Could you please suggest a solution for this?
Thanks
Vasu.
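
To make the timeline above easier to compare, the following sketch works out each load's duration and how far the activation job overlapped with V_ITM. The start and end times are the ones given in the post; the helper name `seconds_between` is an illustrative assumption, and all times are assumed to fall on the same day.

```python
from datetime import datetime

FMT = "%H:%M:%S"


def seconds_between(start, end):
    """Elapsed seconds between two same-day HH:MM:SS timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())


# Start and end times as reported in the post.
loads = {
    "VAITM": ("00:05:46", "00:09:02"),
    "V_ITM": ("00:20:12", "01:00:51"),
    "VDITM": ("01:12:02", "02:09:16"),
}

durations = {name: seconds_between(start, end) for name, (start, end) in loads.items()}

# The activation job started while V_ITM was still loading.
activation_start = "01:00:25"
overlap_seconds = seconds_between(activation_start, loads["V_ITM"][1])

for name, secs in durations.items():
    print(f"{name}: {secs // 60} min {secs % 60} s")
print(f"activation started {overlap_seconds} s before V_ITM finished")
```

The numbers show that VAITM still completes in a few minutes while V_ITM and VDITM each take the better part of an hour, and that the activation job missed V_ITM's completion by under half a minute.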

Hi,
For every job you define, you can also set a priority level. During your job's execution the job queue might be full of other background processing, so try setting the priority levels of the jobs.
cheers,
Pattan.

Strong connection

Hi all,
I'm having trouble joining a member to the cluster. There are already two members that talk to each other perfectly.
Can anyone shed light on this log entry?
FEINER: thread=Cluster, member=1: TcpRing: disconnected from member 2 due to transition to a strong connection
What is a strong connection?
Senior member log:
FEINER: thread=Cluster, member=1: Member(Id=3, Timestamp=2008-11-19 11:32:17.892, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2016) joined Cluster with senior member 1
19.11.2008 11:32:19 com.tangosol.coherence.component.util.logOutput.Jdk log
FEINER: thread=Cluster, member=1: TcpRing: connecting to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/192.168.0.198,port=8088,localport=1754]}
19.11.2008 11:32:19 com.tangosol.coherence.component.util.logOutput.Jdk log
FEINER: thread=Cluster, member=1: TcpRing: disconnected from member 2 due to transition to a strong connection
Member failing to join logs:
INFO: thread=Cluster, member=n/a: This Member(Id=3, Timestamp=2008-11-19 11:32:17.892, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2016, Edition=Grid Edition, Mode=Production, CpuCount=2, SocketCount=1) joined cluster "cluster:0x37EB" with senior Member(Id=1, Timestamp=2008-11-19 11:31:03.796, Address=192.168.0.163:8088, MachineId=26787, Location=site:my.local,machine:my34,process:5256, Role=CoherenceServer, Edition=Grid Edition, Mode=Production, CpuCount=4, SocketCount=2)
19.11.2008 11:32:18 com.tangosol.coherence.component.util.logOutput.Jdk log
INFO: thread=AWT-EventQueue-0, member=3: Loaded cache configuration from resource "file:/C:/workspace/Java/trunk/build/classes/coherence-cache-config.xml"
19.11.2008 11:32:48 com.tangosol.coherence.component.util.logOutput.Jdk log
SCHWERWIEGEND: thread=AWT-EventQueue-0, member=3: Error while starting service "DistributedCache": com.tangosol.net.RequestTimeoutException: Request timed out after 30017 millis
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.checkRequestTimeout(Grid.CDB:8)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.poll(Grid.CDB:52)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.poll(Grid.CDB:11)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.ClusterService.ensureService(ClusterService.CDB:15)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.start(Grid.CDB:23)
at com.tangosol.coherence.component.util.SafeService.startService(SafeService.CDB:28)
at com.tangosol.coherence.component.util.safeService.SafeCacheService.startService(SafeCacheService.CDB:5)
at com.tangosol.coherence.component.util.SafeService.ensureRunningService(SafeService.CDB:27)
at com.tangosol.coherence.component.util.SafeService.start(SafeService.CDB:14)
at com.tangosol.net.DefaultConfigurableCacheFactory.ensureService(DefaultConfigurableCacheFactory.java:841)
at com.tangosol.net.DefaultConfigurableCacheFactory.ensureCache(DefaultConfigurableCacheFactory.java:710)
at com.tangosol.net.DefaultConfigurableCacheFactory.configureCache(DefaultConfigurableCacheFactory.java:919)
at com.tangosol.net.DefaultConfigurableCacheFactory.ensureCache(DefaultConfigurableCacheFactory.java:277)
at com.tangosol.net.CacheFactory.getCache(CacheFactory.java:689)
at com.tangosol.net.CacheFactory.getCache(CacheFactory.java:667)
Thanks in advance,
-Ralf
Edited by: user989723 on Nov 19, 2008 12:33 PM
I changed my configuration as follows:
<multicast-listener>
  <address>235.1.2.3</address>
  <port>3091</port>
  <time-to-live>1</time-to-live>
  <join-timeout-milliseconds>30000</join-timeout-milliseconds>
</multicast-listener>
<packet-publisher>
  <packet-delivery>
    <timeout-milliseconds>60000</timeout-milliseconds>
  </packet-delivery>
</packet-publisher>
And:
<service id="3">
  <service-type>DistributedCache</service-type>
  <service-component>DistributedCache</service-component>
  <use-filters>
    <filter-name>gzip</filter-name>
  </use-filters>
  <init-params>
    <init-param id="2">
      <param-name>lease-granularity</param-name>
      <param-value>member</param-value>
    </init-param>
    <init-param id="3">
      <param-name>partition-count</param-name>
      <param-value>31</param-value>
    </init-param>
    <init-param id="5">
      <param-name>transfer-threshold</param-name>
      <param-value>1024</param-value>
    </init-param>
    <init-param id="6">
      <param-name>backup-count</param-name>
      <param-value>2</param-value>
    </init-param>
    <init-param id="8">
      <param-name>thread-count</param-name>
      <param-value>4</param-value>
    </init-param>
    <init-param id="14">
      <param-name>request-timeout</param-name>
      <param-value>60000</param-value>
    </init-param>
  </init-params>
</service>
After waiting a long time, the senior member log shows this:
19.11.2008 12:09:25 com.tangosol.coherence.component.util.logOutput.Jdk log
FEINER: thread=Cluster, member=1: TcpRing: disconnected from member 2 due to transition to a strong connection
19.11.2008 12:15:16 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=1: Member(Id=3, Timestamp=2008-11-19 12:09:24.322, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:304) has failed to respond to 17 packets; declaring this member as paused.
19.11.2008 12:15:28 com.tangosol.coherence.component.util.logOutput.Jdk log
WARNUNG: thread=PacketPublisher, member=1: Experienced a 11797 ms communication delay (probable remote GC) with Member(Id=3, Timestamp=2008-11-19 12:09:24.322, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:304); 75 packets rescheduled, PauseRate=0.0323, Threshold=1976
19.11.2008 12:15:29 com.tangosol.coherence.component.util.logOutput.Jdk log
FEINER: thread=Cluster, member=1: TcpRing: connecting to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/192.168.0.198,port=8088,localport=2802]}
19.11.2008 12:16:31 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=1: Member(Id=3, Timestamp=2008-11-19 12:09:24.322, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:304) has failed to respond to 17 packets; declaring this member as paused.
19.11.2008 12:16:38 com.tangosol.coherence.component.util.logOutput.Jdk log
WARNUNG: thread=PacketPublisher, member=1: Experienced a 6719 ms communication delay (probable remote GC) with Member(Id=3, Timestamp=2008-11-19 12:09:24.322, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:304); 50 packets rescheduled, PauseRate=0.0426, Threshold=1878
19.11.2008 12:17:12 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=1: Member(Id=3, Timestamp=2008-11-19 12:09:24.322, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:304) has failed to respond to 17 packets; declaring this member as paused.
19.11.2008 12:17:12 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=1: Experienced a 15 ms communication delay (probable remote GC) with Member(Id=3, Timestamp=2008-11-19 12:09:24.322, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:304); 17 packets rescheduled, PauseRate=0.0396, Threshold=1785
19.11.2008 12:19:04 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=1: Member(Id=3, Timestamp=2008-11-19 12:09:24.322, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:304) has failed to respond to 17 packets; declaring this member as paused.
19.11.2008 12:19:12 com.tangosol.coherence.component.util.logOutput.Jdk log
WARNUNG: thread=PacketPublisher, member=1: Experienced a 8547 ms communication delay (probable remote GC) with Member(Id=3, Timestamp=2008-11-19 12:09:24.322, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:304); 59 packets rescheduled, PauseRate=0.046, Threshold=1612
19.11.2008 12:19:50 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=1: Member(Id=3, Timestamp=2008-11-19 12:09:24.322, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:304) has failed to respond to 17 packets; declaring this member as paused.
19.11.2008 12:20:03 com.tangosol.coherence.component.util.logOutput.Jdk log
WARNUNG: thread=PacketPublisher, member=1: Experienced a 13469 ms communication delay (probable remote GC) with Member(Id=3, Timestamp=2008-11-19 12:09:24.322, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:304); 83 packets rescheduled, PauseRate=0.0634, Threshold=1456
19.11.2008 12:20:03 com.tangosol.coherence.component.util.logOutput.Jdk log
FEINER: thread=Cluster, member=1: TcpRing: connecting to member 3 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/192.168.0.198,port=8088,localport=2907]}
The JVMs (6u10) are started with args:
-server -verbose:gc -Xms512m -Xmx512m -XX:+DisableExplicitGC -XX:+UseParallelGC -Djava.library.path=lib/jTDS_1.2.2/SSO -Dtangosol.coherence.config=coherence-cache-config.xml -Dtangosol.coherence.override=tangosol-coherence-override.xml

Thanks Rob, so why is a disconnection forced for a strong connection? I think it's ok for a cluster to have a connection with a different machine ;-)
We've got two machines in the development environment:
1. Server: Intel QuadCore, 2 GB RAM, 2x1 GBit (Teaming), 2 JVMs: DefaultCacheServer, MyCacheServer for loading data from database and inserting into cache
2. Workstation: Intel DualCore, 4 GB RAM, 1 GBit, 1 JVM: Client working on data in cache
Despite having GBit network cards, both are running at 100 MBit/s. In production there are more workstations querying the cache.
I see GCs in the log of a JVM on the server when I start my desktop application:
20.11.2008 12:36:00 com.tangosol.coherence.component.util.logOutput.Jdk log
FEINER: thread=Cluster, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) joined Cluster with senior member 1
20.11.2008 12:36:00 com.tangosol.coherence.component.util.logOutput.Jdk log
FEINER: thread=Cluster, member=4: Member 2 joined Service Management with senior member 1
20.11.2008 12:36:01 com.tangosol.coherence.component.util.logOutput.Jdk log
FEINER: thread=Cluster, member=4: Member 2 joined Service DistributedCache with senior member 1
20.11.2008 12:36:01 com.tangosol.coherence.component.util.logOutput.Jdk log
FEINER: thread=DistributedCache, member=4: 2> Transferring 3 out of 11 primary partitions to member 2 requesting 3
20.11.2008 12:36:01 com.tangosol.coherence.component.util.logOutput.Jdk log
FEINER: thread=DistributedCache, member=4: 2) Transferring backup[1] for partition 12 from member 1 (over 3) to member 2 (under 13)
20.11.2008 12:36:01 com.tangosol.coherence.component.util.logOutput.Jdk log
FEINER: thread=DistributedCache, member=4: 2) Transferring backup[2] for partition 13 from member 3 (over 6) to member 2 (under 9)
20.11.2008 12:36:02 com.tangosol.coherence.component.util.logOutput.Jdk log
FEINER: thread=DistributedCache, member=4: 2) Transferring backup[2] for partition 14 from member 3 (over 4) to member 2 (under 6)
20.11.2008 12:36:02 com.tangosol.coherence.component.util.logOutput.Jdk log
FEINER: thread=DistributedCache, member=4: 2) Transferring backup[2] for partition 15 from member 3 (over 3) to member 2 (under 5)
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 0 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 304 packets rescheduled, PauseRate=0.0, Threshold=2775
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 16 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 276 packets rescheduled, PauseRate=0.0015, Threshold=2637
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 0 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 136 packets rescheduled, PauseRate=0.0015, Threshold=2506
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 16 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 546 packets rescheduled, PauseRate=0.0030, Threshold=2381
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 15 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 252 packets rescheduled, PauseRate=0.0044, Threshold=2262
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 0 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 129 packets rescheduled, PauseRate=0.0044, Threshold=2149
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 0 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 128 packets rescheduled, PauseRate=0.0044, Threshold=2042
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 0 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 126 packets rescheduled, PauseRate=0.0044, Threshold=1940
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 0 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 128 packets rescheduled, PauseRate=0.0044, Threshold=1843
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 63 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 410 packets rescheduled, PauseRate=0.0103, Threshold=1751
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 0 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 33 packets rescheduled, PauseRate=0.0103, Threshold=1664
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 0 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 97 packets rescheduled, PauseRate=0.0102, Threshold=1660
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 15 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 227 packets rescheduled, PauseRate=0.0116, Threshold=1577
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:10 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 0 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 33 packets rescheduled, PauseRate=0.0116, Threshold=1499
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 0 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 230 packets rescheduled, PauseRate=0.0115, Threshold=1496
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 15 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 260 packets rescheduled, PauseRate=0.0128, Threshold=1422
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 72 packets; declaring this member as paused.
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 0 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 185 packets rescheduled, PauseRate=0.0128, Threshold=1351
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 15 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 253 packets rescheduled, PauseRate=0.0141, Threshold=1284
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 0 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 33 packets rescheduled, PauseRate=0.0141, Threshold=1220
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 16 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 275 packets rescheduled, PauseRate=0.0155, Threshold=1159
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 15 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 214 packets rescheduled, PauseRate=0.0169, Threshold=1102
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 0 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 47 packets rescheduled, PauseRate=0.0168, Threshold=1047
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 0 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 186 packets rescheduled, PauseRate=0.0167, Threshold=946
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 16 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 24 packets rescheduled, PauseRate=0.0181, Threshold=899
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 16 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 40 packets rescheduled, PauseRate=0.0195, Threshold=855
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 12:36:11 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 0 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, Address=192.168.0.198:8088, MachineId=26822, Location=site:my.local,machine:mylap,process:2100); 29 packets rescheduled, PauseRate=0.0193, Threshold=853
And I see GCs on my workstation:
20.11.2008 12:36:06 ui.task.ConnectClusterTask enableCluster
FEIN: connected to cache
[GC 46688K->25091K(517056K), 0.0414128 secs]
[GC 68867K->48150K(517056K), 0.0357713 secs]
[GC 91926K->67998K(517056K), 0.0765421 secs]
[GC 111774K->89280K(517056K), 0.1152057 secs]
[GC 133056K->115866K(492736K), 0.1066485 secs]
[GC 135322K->124544K(504896K), 0.0755632 secs]
[GC 144000K->133172K(504896K), 0.0989548 secs]
[GC 152628K->141928K(504896K), 0.0867878 secs]
[GC 161384K->150792K(504896K), 0.0935929 secs]
[GC 170248K->159660K(504896K), 0.0803323 secs]
[GC 179116K->161604K(504896K), 0.0322206 secs]
[GC 181060K->161640K(504896K), 0.0158883 secs]
[GC 181096K->161632K(504896K), 0.0065547 secs]
[GC 181088K->161632K(504896K), 0.0064441 secs]
[GC 181088K->161632K(504896K), 0.0076786 secs]
[GC 181088K->161632K(504896K), 0.0068146 secs]
[GC 181088K->161632K(504896K), 0.0065274 secs]
[GC 181088K->161632K(504896K), 0.0066196 secs]
[GC 181088K->161632K(504896K), 0.0064584 secs]
[GC 181088K->161648K(504896K), 0.0080762 secs]
20.11.2008 12:36:15 ui.task.SearchTask doInBackground
FEIN: result.size() == 2
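To put a number on how much time the workstation spends in these collections, the -verbose:gc lines can be summed up. This is only a minimal sketch: it assumes the line format shown above ("[GC before->after(heap), secs]"), and the function name summarize_gc is made up for illustration.

```python
import re

# Matches -verbose:gc lines such as "[GC 46688K->25091K(517056K), 0.0414128 secs]".
# The optional "Full " prefix is an assumption; only plain "[GC ..." lines appear in the log above.
GC_LINE = re.compile(r"\[(?:Full )?GC (\d+)K->(\d+)K\((\d+)K\), ([\d.]+) secs\]")


def summarize_gc(lines):
    """Return (count, total_pause_secs, longest_pause_secs) for the GC lines found."""
    pauses = []
    for line in lines:
        match = GC_LINE.search(line)
        if match:
            pauses.append(float(match.group(4)))
    if not pauses:
        return 0, 0.0, 0.0
    return len(pauses), sum(pauses), max(pauses)


# First three GC lines from the workstation log, plus one non-GC line that is skipped.
sample = [
    "FEIN: connected to cache",
    "[GC 46688K->25091K(517056K), 0.0414128 secs]",
    "[GC 68867K->48150K(517056K), 0.0357713 secs]",
    "[GC 91926K->67998K(517056K), 0.0765421 secs]",
]

count, total, longest = summarize_gc(sample)
print(f"{count} collections, {total:.4f} s in total, longest pause {longest:.4f} s")
```

Run over the complete listing above, this kind of summary makes it easy to compare the individual pause lengths (none of the listed ones is longer than about 0.12 s) with the delays reported in the server log.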
I can see the "remote GC" error messages in the server log start when I query the cache (see timestamps: the query started at 12:36:06, the error messages started at 12:36:10). I am using two MapListeners without a filter etc. in my desktop application for acting on newly inserted keys.
My application works as follows:
1. DB query of a whole table
2. Server does calculations on data from 1.
3. Server inserts approx. 100,000 entries under one key in the cache (key "list" => list with entries)
4. The server inserts a key referring to a list entry from 3.
5. Client acts on that key and inserts an "answer" into cache
My opinion is that working on the large list causes many minor collections. Even if I query namedCache.keySet() for listing all keys in cache (in development there are only two keys) I see minor collections.
I also tried using -Xincgc instead of -XX:+UseParallelGC and setting the size of the young generation: -XX:NewSize=256m -XX:MaxNewSize=512m, which slightly reduced minor collections.
Thanks,
-Ralf
Edited by: Ralf B. on Nov 20, 2008 1:15 PM
That's interesting. Sizing up the young generation to 512 MB stops minor collections, but:
Server log shows "communication delay":
20.11.2008 13:13:40 com.tangosol.coherence.component.util.logOutput.Jdk log
AM FEINSTEN: thread=PacketPublisher, member=1: Member(Id=2, Timestamp=2008-11-20 13:12:44.311, Address=192.168.0.163:8089, MachineId=26787, Location=site:my.local,machine:myserv,process:5208, Role=CacheServer) has failed to respond to 17 packets; declaring this member as paused.
20.11.2008 13:13:54 com.tangosol.coherence.component.util.logOutput.Jdk log
WARNUNG: thread=PacketPublisher, member=1: Experienced a 13907 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2008-11-20 13:12:44.311, Address=192.168.0.163:8089, MachineId=26787, Location=site:my.local,machine:myserv,process:5208, Role=CacheServer); 85 packets rescheduled, PauseRate=0.1998, Threshold=3364
Server shows no GC at all while showing all keys in cache (as I said above, that was causing GCs too):
key=work_1 locked by=null
key=list locked by=null
No more GCs on my workstation.
JVM args used for server and client: -Xincgc -XX:+DisableExplicitGC -XX:NewSize=512m -XX:MaxNewSize=512m -Xms1024m -Xmx1024m
Edited by: Ralf B. on Nov 20, 2008 4:10 PM
I was able to get rid of GCs by sizing the young generation. But "communication delay" messages are logged when an application accesses the cache. I am curious about the "0 - 16 ms" delays, can this be due to network outages?
Thanks,
-Ralf
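For anyone who wants to separate the multi-second delays from the 0 - 16 ms ones, the "communication delay" lines can be pulled out of the log with a small script. This is a rough sketch, not something Coherence provides: the regular expression is derived only from the message format in the logs posted in this thread (other versions may word it differently), and the 1000 ms cut-off and the function name parse_delays are arbitrary choices for illustration.

```python
import re

# Pattern derived from the log lines quoted in this thread, e.g.
# "Experienced a 13907 ms communication delay (probable remote GC) with Member(Id=2, ...);
#  85 packets rescheduled, PauseRate=0.1998, Threshold=3364"
DELAY = re.compile(
    r"Experienced a (\d+) ms communication delay .*?Member\(Id=(\d+),.*?"
    r"(\d+) packets rescheduled, PauseRate=([\d.]+), Threshold=(\d+)"
)


def parse_delays(lines):
    """Extract one dict per 'communication delay' message; other lines are ignored."""
    events = []
    for line in lines:
        match = DELAY.search(line)
        if match:
            events.append({
                "delay_ms": int(match.group(1)),
                "remote_member": int(match.group(2)),
                "packets": int(match.group(3)),
                "pause_rate": float(match.group(4)),
                "threshold": int(match.group(5)),
            })
    return events


# Two delay messages copied from the logs above, plus one unrelated line.
sample_log = [
    "WARNUNG: thread=PacketPublisher, member=1: Experienced a 13907 ms communication delay "
    "(probable remote GC) with Member(Id=2, Timestamp=2008-11-20 13:12:44.311, "
    "Address=192.168.0.163:8089, MachineId=26787, "
    "Location=site:my.local,machine:myserv,process:5208, Role=CacheServer); "
    "85 packets rescheduled, PauseRate=0.1998, Threshold=3364",
    "AM FEINSTEN: thread=PacketPublisher, member=4: Experienced a 0 ms communication delay "
    "(probable remote GC) with Member(Id=2, Timestamp=2008-11-20 12:35:59.883, "
    "Address=192.168.0.198:8088, MachineId=26822, "
    "Location=site:my.local,machine:mylap,process:2100); "
    "304 packets rescheduled, PauseRate=0.0, Threshold=2775",
    "FEINER: thread=Cluster, member=4: Member 2 joined Service Management with senior member 1",
]

events = parse_delays(sample_log)
# Arbitrary cut-off (assumption): treat anything of one second or more as a "long" delay.
long_delays = [e for e in events if e["delay_ms"] >= 1000]
print(len(events), "delay messages,", len(long_delays), "of them >= 1000 ms")
```

The same list of dicts could feed an alert, for example when more than a handful of long delays against the same member show up within a few minutes.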

UNIQUE constraint vs checking before INSERT

I have a SQL Server table RealEstate with columns Id, Property, Property_Value. This table has about 5-10 million rows and may grow further in the future. I want to insert a row only if the combination of Id, Property, Property_Value does not already exist in this table.
Example Table -
1,Rooms,5
1,Bath,2
1,Address,New York
2,Rooms,2
2,Bath,1
2,Address,Miami
Inserting 2,Address,Miami should NOT be allowed. But 2,Price,2billion is okay. I am curious to know which is the "best" way to do this and why. The why part is most important to me.
1. Check if a row exists before you insert it.
2. Set a unique constraint on the combination of all 3 columns and let the database do the checking for you.
Is there any scenario where one would be better than the other?
Thanks.

Why?
Because the database engine does exactly what you want: it is designed to do this in a way that anticipates collisions between simultaneous inserts and allows only a single row for any given combination of values. If you choose to manage this at the application level, which is the alternative you propose, then EVERY application that attempts to insert rows must be designed to check both immediately before insertion and immediately afterwards (since these inserts can occur simultaneously and you must allow for communication delays between database and client). And since we know that programmers are not infallible (many other adjectives come to mind as well), there is a high probability that the duplicate-checking logic will fail. And do not forget that there are many ways of inserting data into the table: it is not just your front-end application that must use this logic, but also every other application that is used to manage data (such as SSMS, SSIS, bcp, etc.).
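As a small illustration of letting the database do the checking, here is a sketch using Python's built-in sqlite3 module. SQLite is only a stand-in so the example runs anywhere; it is not SQL Server, and the constraint name UQ_RealEstate and the helper insert_row are invented for the example. The idea carries over: declare one unique constraint across the three columns, attempt the insert, and handle the rejection.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the SQL Server table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE RealEstate (
        Id             INTEGER NOT NULL,
        Property       TEXT    NOT NULL,
        Property_Value TEXT    NOT NULL,
        CONSTRAINT UQ_RealEstate UNIQUE (Id, Property, Property_Value)
    )
    """
)


def insert_row(connection, row):
    """Attempt the insert; return True if stored, False if the constraint rejected a duplicate."""
    try:
        with connection:  # commits on success, rolls back on error
            connection.execute(
                "INSERT INTO RealEstate (Id, Property, Property_Value) VALUES (?, ?, ?)",
                row,
            )
        return True
    except sqlite3.IntegrityError:
        # Raised by the engine itself, no matter which client attempted the insert.
        return False


rows = [(2, "Address", "Miami"), (2, "Address", "Miami"), (2, "Price", "2billion")]
results = [insert_row(conn, row) for row in rows]
row_count = conn.execute("SELECT COUNT(*) FROM RealEstate").fetchone()[0]
print(results, row_count)
```

There is no separate "does it exist?" query here, so there is no window between the check and the insert in which another client could slip in the same row.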

Problem installing SSL certificate for CPS

I work at a medium-sized university, and we have used Contribute 3 with CPS1.11 for well over a year. Recently, however, the Contribute clients began having difficulty logging in to CPS. At first this was intermittent, but it is now constant. Adobe support suggested replacing the CPS self-signed SSL certificate with a genuine one, because apparently the self-signed certificate is causing communication delays and timeouts.
I have the certificate, and am trying to use keytool (see http://java.sun.com/j2se/1.4.2/docs/tooldocs/windows/keytool.html) to install it, but it is asking me for a keystore password, which I don't know. Apparently the standard defaults are "changeit" or "passphrase", but neither of these works.
As a test, I created a fresh install of CPS and attempted to list the keys in the keystore, but again was asked for a keystore password and the defaults did not work. Adobe support suggested I ask here. Does anybody have any experience installing a certificate for CPS?

Are you sure that the certificate needs to be installed to all users? Can you provide more details about the certificate and its purposes?

Coherence cluster reporting the Error while starting TcpProxyService

In my application we have around 70 nodes. We have a client which connects to the cluster as a TCP Extend client.
As per the Coherence log, a communication delay between two nodes is reported and then the cluster service is restarted, but it fails to start the TcpProxyService. As a result, the client connected via TCP Extend loses connectivity with the cluster.
Q1: What is the reason and how can this error be avoided?
Q2: How can the client be notified about the connection failure?
Coherence log snippet
990877871 ERROR [Logger@9229206 3.3/387] Coherence - 2008-12-17 13:28:13.565 Oracle Coherence GE 3.3/387 <Error> (thread=Cluster, member=69): This node appears to have partially lost the connectivity: it receives responses from Member(Id=13, Timestamp=2008-12-01 12:46:41.563, Address=147.114.89.101:8088, MachineId=47205, Location=process:32223@lonrs00531, Role=storage) which communicates with Member(Id=44, Timestamp=2008-12-17 13:28:12.927, Address=147.114.89.52:8089, MachineId=47156, Location=process:18354@lonrs00556, Role=storage), but is not responding directly to this member; that could mean that either requests are not coming out or responses are not coming in; stopping cluster service.
6990879595 INFO [Logger@9229206 3.3/387] Coherence - 2008-12-17 13:28:15.299 Oracle Coherence GE 3.3/387 <Info> (thread=main, member=n/a): Restarting cluster
6990881402 INFO [Logger@9229206 3.3/387] Coherence - 2008-12-17 13:28:17.120 Oracle Coherence GE 3.3/387 <Info> (thread=Cluster, member=n/a): Failed to satisfy the variance: allowed=16, actual=420
6990881402 INFO [Logger@9229206 3.3/387] Coherence - 2008-12-17 13:28:17.120 Oracle Coherence GE 3.3/387 <Info> (thread=Cluster, member=n/a): Increasing allowable variance to 66
6990883309 INFO [Logger@9229206 3.3/387] Coherence - 2008-12-17 13:28:19.004 Oracle Coherence GE 3.3/387 <Info> (thread=Cluster, member=n/a): This Member(Id=27, Timestamp=2008-12-17 13:28:17.141, Address=147.114.89.50:8096, MachineId=47154, Location=process:3763@lonrs00554, Role=storage, Edition=Grid Edition, Mode=Development, CpuCount=4, SocketCount=2) joined cluster "RIA_PROD" with senior Member(Id=2, Timestamp=2008-09-27 16:33:38.271, Address=147.114.89.55:8091, MachineId=47159, Location=process:28137@lonrs00559, Role=storage, Edition=Grid Edition, Mode=Development, CpuCount=4, SocketCount=2)
6990884308 INFO [Logger@9229206 3.3/387] Coherence - 2008-12-17 13:28:20.027 Oracle Coherence GE 3.3/387 <Info> (thread=main, member=27): Restarting Service: Management
6990885872 INFO [Logger@9229206 3.3/387] Coherence - 2008-12-17 13:28:21.589 Oracle Coherence GE 3.3/387 <Info> (thread=main, member=27): Restarting Service: DistributedCache
6990886228 INFO [Logger@9229206 3.3/387] Coherence - 2008-12-17 13:28:21.947 Oracle Coherence GE 3.3/387 <Info> (thread=main, member=27): Restarting Service: InvocationService
6990886353 INFO [Logger@9229206 3.3/387] Coherence - 2008-12-17 13:28:22.071 Oracle Coherence GE 3.3/387 <Info> (thread=main, member=27): Restarting Service: TcpProxyService
6990886488 ERROR [Logger@9229206 3.3/387] Coherence - 2008-12-17 13:28:22.207 Oracle Coherence GE 3.3/387 <Error> (thread=Proxy:TcpProxyService, member=27): Terminating ProxyService due to unhandled exception: com.tangosol.util.WrapperException
6990886490 ERROR [Logger@9229206 3.3/387] Coherence - 2008-12-17 13:28:22.207 Oracle Coherence GE 3.3/387 <Error> (thread=Proxy:TcpProxyService, member=27):
(Wrapped: error binding ServerSocket to 147.114.89.50:17061)
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind(Native Method)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
at com.tangosol.coherence.component.comm.connectionManager.acceptor.TcpAcceptor.configureSocket(TcpAcceptor.CDB:24)
at com.tangosol.coherence.component.comm.connectionManager.acceptor.TcpAcceptor.doStart(TcpAcceptor.CDB:27)
at com.tangosol.coherence.component.comm.ConnectionManager.start(ConnectionManager.CDB:3)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.ProxyService.onServiceStarted(ProxyService.CDB:20)
at com.tangosol.coherence.component.util.daemon.queueProcessor.Service$MemberConfigRequest$Poll.onCompletion(Service.CDB:17)
at com.tangosol.coherence.component.net.Poll.close(Poll.CDB:13)
at com.tangosol.coherence.component.net.Poll.onResponded(Poll.CDB:32)
at com.tangosol.coherence.component.net.Poll.onResponse(Poll.CDB:3)
at com.tangosol.coherence.component.util.daemon.queueProcessor.Service.onMessage(Service.CDB:24)
at com.tangosol.coherence.component.util.daemon.queueProcessor.Service.onNotify(Service.CDB:123)
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:35)
at java.lang.Thread.run(Thread.java:595)
6990886490 ERROR [Logger@9229206 3.3/387] Coherence - 2008-12-17 13:28:22.208 Oracle Coherence GE 3.3/387 <Error> (thread=main, member=27): Error while starting service "TcpProxyService": (Wrapped: Failed to start Service "TcpProxyService" (ServiceState=SERVICE_STOPPED)) (Wrapped: error binding ServerSocket to 147.114.89.50:17061) java.net.BindException: Address already in use
at sun.nio.ch.Net.bind(Native Method)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)

Hi user1696,
Adding <reusable>true</reusable> to the tcp-acceptor/local-address element (see http://wiki.tangosol.com/display/COH33UG/tcp-acceptor) should fix this.
Regards,
Dimitri
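
For illustration, here is a minimal plain-Java sketch of what the reusable setting is generally understood to map to at the socket level: enabling SO_REUSEADDR on the listening socket before it is bound. This is an assumption about the mechanism, not Coherence's actual code; the class name ReusableBind and its helper methods are hypothetical, and the sketch uses the loopback address with an ephemeral port rather than the address from the log.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical demo class; not part of Coherence.
final class ReusableBind {

    /** Opens a listening socket with SO_REUSEADDR enabled before bind(). */
    static ServerSocket bindReusable(InetAddress addr, int port) throws IOException {
        ServerSocket server = new ServerSocket(); // created unbound
        try {
            // The option must be set before bind() to have any effect.
            server.setReuseAddress(true);
            server.bind(new InetSocketAddress(addr, port));
            return server;
        } catch (IOException e) {
            server.close(); // do not leak the socket if bind fails
            throw e;
        }
    }

    /** Binds an ephemeral port, closes it, then rebinds the same port. */
    static boolean rebindSucceeds() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try {
            int port;
            try (ServerSocket first = bindReusable(loopback, 0)) {
                port = first.getLocalPort(); // port chosen by the OS
            }
            // Simulates a service restart on the same address and port.
            try (ServerSocket second = bindReusable(loopback, port)) {
                return second.getLocalPort() == port;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("rebind succeeded: " + rebindSucceeds());
    }
}
```

SO_REUSEADDR typically helps only when the earlier socket is closed but connections from it are still in a timeout state, which is the usual situation right after a service restart. If another live socket is still listening on the port, the bind can still fail with the same BindException on most platforms.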
