Other members not responding when a clustered member is not reachable

We have a product that uses Coherence 3.3, running as a cluster of 5 members.
We are finding that if, for whatever reason, a cluster member is down or unresponsive (e.g. doing a garbage collection), the other members then try to do some work to compensate, which leaves them not responding to requests. The nodes have been configured in unicast mode, so each node has been specified in the configuration.
Examples we've seen:
1) Member 3 does a stop-the-world GC and all other members pause. We see the following in the logs on all other members:
2009-12-31 00:20:18.291 Oracle Coherence EE 3.3/387 <Warning> (thread=PacketPublisher, member=6): Experienced a 1982 ms communication delay (probable remote GC) with Member(Id=1, Timestamp=2009-12-30 16:01:26.85, Address=172.29.4.16:8090, MachineId=46352, Location=process:27758@and-msg4); 30 packets rescheduled, PauseRate=0.0030, Threshold=901
In this instance we had a pause of 19 seconds.
Can we configure the nodes not to wait when communication with a member is not possible? If so, how?
2) Take a single node down during the day (off-peak hours). The other members start logging a lot of Coherence debug output, which I've been told not to worry about, but during this period the application is not processing requests, which causes timeouts on the calling servers.
When a member is removed from the cluster, what are the other nodes doing (redistributing the cache?)? How can we minimise the fallout so that we can handle this more gracefully?
I would be grateful for any help and any useful links to documentation. It may well be that our configuration is incorrect as we leave most things as the default values. I could do with some pointers on where to start looking and which parameters to tweak as there seem to be quite a few.
Regards
Fez
What can we do to minimise this?
Edited by: user12406699 on 04-Jan-2010 07:31

Hi there, thank you for the suggestions.
1) I understand that the pauses during GC will be helped by keeping them as short as possible. I have tinkered with the CMS parameters, and I think our old generation was becoming very fragmented and therefore hitting the point where a stop-the-world collection occurred. I have limited the number of threads and increased the heap space. After a couple of days we are seeing that the concurrent garbage collector is doing work more frequently but with no stop-the-world pauses. So I think issue 1 will be resolved by this.
2) This issue still exists. When we stop the Tomcat server for one of the nodes we see a pause and lots of debug messages from Coherence. During this time the other nodes are no longer processing requests: we can see the number of running requests increase, and the time to respond increases as well, across the nodes. Because of the amount of traffic we are receiving, our clients soon time out.
Going over the suggestions:
a) graceful shutdown of the cluster node
How? Currently we stop traffic being driven to the node and then stop Tomcat. Is there a way to also gracefully stop the node being a part of the cluster?
b) Set <timeout-milliseconds>
I don't quite understand the documentation for this setting
"Default value is 60000.
Note: For production use, the recommended value is the greater of 60000 and two times the maximum expected full GC duration."
So we are currently using the default. To me 60 seconds sounds very high for a production timeout, which makes me think that this setting is not what I had in mind. Is there anything else in the configuration that may help?
Regards
Fez
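
A note on the documentation quoted in b): the recommendation is simply the larger of 60000 ms and twice the longest full GC you expect. The sketch below works that arithmetic through in Java. The class and method names are made up for illustration and are not part of Coherence, and the 19-second and 45-second pauses are just sample inputs (19 seconds being the pause reported in the first post).

```java
// Illustrative only: PacketTimeoutRule is a hypothetical helper, not a Coherence class.
class PacketTimeoutRule {
    // Default value of <timeout-milliseconds>, as quoted from the documentation.
    static final long DEFAULT_TIMEOUT_MS = 60_000L;

    // "The greater of 60000 and two times the maximum expected full GC duration."
    static long recommendedTimeoutMs(long maxFullGcMs) {
        if (maxFullGcMs < 0) {
            throw new IllegalArgumentException("GC duration must not be negative");
        }
        return Math.max(DEFAULT_TIMEOUT_MS, 2 * maxFullGcMs);
    }

    public static void main(String[] args) {
        // 19 s pause: 2 * 19000 = 38000, which is below the default.
        System.out.println(recommendedTimeoutMs(19_000L));
        // 45 s pause (sample value): 2 * 45000 = 90000, which exceeds the default.
        System.out.println(recommendedTimeoutMs(45_000L));
    }
}
```

With the 19-second pause, twice the GC time is 38000 ms, so the 60000 default already satisfies the rule; only full GCs longer than 30 seconds would push the recommended value higher.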

Similar Messages

  • I am suddenly unable to open mail from the dock. dock shows mail is up and running, but stamp does not respond when clicked, and will not quit. help?

    I am suddenly unable to open mail from the dock. dock shows mail is up and running, but stamp icon does not respond when clicked, there is no mail window opened, and will not quit. when double clicked, get new mail is not an option and I get no response with other actions.
    I am retrieving mail on my phone with no problem. this just began happening, no updates or problems with anything else.
    help?

    command-option-esc keys and force quit Mail. Then relaunch.

  • Itunes keeps not responding when i try to log out of my account

    itunes keeps not responding when i try to log out of my account
    I have tried uninstalling the itunes program and re Installing it but it has not worked,
    Is there anyone that can help me?

    Have you tried leaving it to hang for 5 minutes?

  • The past 4 days now my iTunes Store is not responding, when I reload it all that comes up is a blank white page that says iTunes Store. Help please!!!! Cannot load my gift card to buy music and such!!!

    The past 4 days now my iTunes Store is not responding, when I reload it all that comes up is a blank white page that says iTunes Store. Help please!!!! Cannot load my gift card to buy music and such!!!

    I took your suggestion and SUCCESS!  I can now access the iTunes Store.  A simple fix, and thanks so much!!
    Below is the advice you forwarded:
    I found a solution to my problem.
    > start menu
    > accessories,
    > right click on the command prompt icon and choose "run as administrator".
    Once it opens, type in the following command...
    netsh winsock reset
    hit enter
    You should get a message that the winsock reset was successful and that you will need to reboot your computer.
    Reboot and when I reloaded itunes the store loaded fine.
    Thanks again, -Dean Stoneburner

  • My ipod touch is not responding when i plugged it  in even with a different cord and restarting my computer

    my ipod touch is not responding when i plugged it  in even with a different cord and restarting my computer and i am using my moms itunes that she has had for a long time and so i cant uninstall itunes.

    Try:
    - iOS: Not responding or does not turn on
    - Also try DFU mode after try recovery mode
    How to put iPod touch / iPhone into DFU mode « Karthik's scribblings
    - If not successful and you can't turn the iOS device fully off, let the battery fully drain. After charging for at least an hour try the above again.
    - Try on another computer
    - If still not successful that usually indicates a hardware problem and an appointment at the Genius Bar of an Apple store is in order.
    Apple Retail Store - Genius Bar       

  • The volume up and down controls on my wireless keyboard show a no entry sign and do not respond when used...please help?

    The volume up and down controls on my wireless keyboard show a no entry sign and do not respond when used...please help?

    If you want to get a little more "exotic" you can try remapping the function keys.  I did a little google searching and the hits that looked promising are,
    Mapping volume and eject keys to 3rd-party keyboard Other Hardware
    Spark
    Spark is a powerful, and easy Shortcuts manager. With Spark you can create Hot Keys to launch applications and documents, execute AppleScript, control iTunes, and more...
    You can also export and import your Hot Keys library, or save it in HTML format to print it.
    Spark is free, so use it without moderation!

  • Access 2010 on 64 Bit Windows 7 Access "Not Responding" when changing from forms view to design view and back

    I am running
    Windows 7 64 bit
    Access 2010 32 bit
    Developing an application with a split FE BE with both files local but continue to have the message "Not Responding" when switching from forms view to design view and back as well as if I try to connect to a subform or object on the sub form.

    I have seen this behavior when the form's RecordSource is a complex query such as a crosstab or a query with several nested queries. To test if this is your case, remove the RecordSource and see if the form starts acting normal again.
    Then again, if the form has several subforms they might be slowing up the loading time.
    Bill Mosca
    www.thatlldoit.com
    http://tech.groups.yahoo.com/group/MS_Access_Professionals

  • Lion Clients 10.7.4 show network accounts are unavailable and server is not responding when binding to Snow Leopard server 10.6.8

    Hello,
    I am running Snow Leopard Server 10.6.8 and my clients are Lion 10.7.4.  While testing I had no issues binding 10.7.4 to our 10.6.8 server's OD.  I created a 10.7.4 image to push to all of our machines and in the beginning of last week I was able to push the image and get the machines to bind with OD and apply preferences on these machines through workgroup manager.  Towards the end of the week though this stopped working.  Now any time I bind a 10.7.4 client to OD it allows me to perform an authenticated bind and the machine shows up in workgroup manager, but immediately after binding the client the status jelly next to the OD server in the directory list is red and says "This server is not responding".  If I reboot the client I get a notification that "Network accounts are unavailable" at the login screen.  My preferences from workgroup manager are also not applying, which is my main concern because without workgroup manager my mac server is somewhat pointless as we use it for very little else.
    I've since tried to bind a snow leopard machine (10.6.8) and this still is working with a green status jelly.  I've also built a lion machine from scratch, updated to the 10.7.4 combined update and am still getting the same issue where it shows the server is not responding when binding to OD.  I then applied the subsequent OS update after the 10.7.4 combined update but the problem still persists.
    Is anyone else having this issue?  Any help would help me keep my sanity.
    Thanks,
    Dane

    Have you had any luck finding a solution to this?  The only thing I have found was to unbind and then bind without authentication.  Any help with progress on your end would be appreciated!
    Nick.

  • Photoshop CS6 is not responding when leaving PS and coming back again

    A weird problem: My new installation of Photoshop CS6 (OS X Mavericks on a new MBP) is not responding when leaving PS and coming back again.
    Example: i open a new file in PS - then i click on the desktop - when i go back to PS again, it is not responding to any click on any window (except for the menu bar).
    I can solve the problem when i try going back several times or when i go back to PS via the icon in the dock.
    But it is still very annoying in a quick workflow.
    I would appreciate any idea what i can check more, cause i tried everything and i can’t find this issue on the web…
    Thanks…

    Problem solved… The troublemaker was the Trackball Works Driver for Kensington trackballs / Expert Mouses…
    I’m now using a generic driver for USB devices and everything is fine.

  • My computer had to be shut down this morning when I could move my mouse around but it would not respond when I double clicked on an item to select.  Now I cannot open a file in my finder window by double clicking on it.  Any suggestions?

    My computer had to be shut down this morning when I could move my mouse around but it would not respond when I double clicked on an item to select.  Now I cannot open a file in my finder window by double clicking on it.  Any suggestions?

    Did you reinstall CS3 after CC?
    For that matter, doing an in-place upgrade on the OS is always a gamble with Adobe programs. Reinstalling all the versions you need, in order, would probably solve your problem.
    And you shouldn't need to save as IDML after opening the .inx in CC.

  • I am having autocad not respond when I am making a pdf using autocad 2002 and adobe acrobat XI

    I am having autocad not respond when I am making a pdf using autocad 2002 and adobe acrobat XI
    We have adobe acrobat 8 on another system that seems to work fine. What else do we need to do to get this trial version to work properly?
    We are able to get everything set up, sized, etc., but when the actual process starts it states "autocad not responding" and it just spins and never makes the actual final pdf.
    Any help on this would be appreciated

    It's probably a compatibility problem between the updated PDF libraries and your hopelessly old ACAD that can't be resolved. In any case, I'd consider it one of those things of trying to attack the problem on the wrong end. All major CAD programs have native PDF export these days and even ACAD has had it since v2006 at least...
    Mylenium

  • HT4097 my ipad not responding when i connected to pc

    my ipad2 not responding when i connected to pc, like no device connection to cpu

    Do you mean your device is not recognised in iTunes for Windows? If so,
    Firstly, make sure that your device is not hidden (left hand pane). If it just reads device then toggle between SHOW and HIDE.
    Secondly, try all the other ports on your computer, even a number of times.
    Thirdly, if you have another computer try plugging your device into it without taking any action, give it a moment, remove it and try it back in your other computer again.
    Failing all that, see here - http://support.apple.com/kb/TS1538
    And failing all that put the device into Recovery mode. See here and note the paragraph 'If you restore from a different computer.... ' down near the bottom of the page -
    http://www.apple.com/support/ipad/assistant/itunes/

  • V8.2 "Not Responding" when updating XY Graph Properties

    LabVIEW 8.2 running on Windows XP crashes (slows to mouse clicks then gets reported as "Not Responding") when I try to format the plot properties (color, weight, etc) on a multi-plot XY Graph.

    ST1,
    Could you provide us with some more details so that we can better help you?
    How much data must be graphed to cause the slowdown (both data-points and plots)?
    Does this happen with any multi-plot XY chart or just in one VI?
    Is your XY chart inside a tab control or similar structure?
    Is your computer running low on memory when this happens?
    Regards,
    Simon H
    Applications Engineer
    National Instruments

  • I pad does not respond when trying to turn on

    I pad does not respond when trying to turn on

    Charge the battery.
    Try restoring your iPad > iTunes: Backing up, updating, and restoring your iPhone, iPad, or iPod touch software

  • Firefox does NOT respond when trying to clear history, cache etc

    Firefox does NOT respond when trying to clear history, cache etc
    windows says not responding and I can not clear anything

    Look at this one:
    http://kb.mozillazine.org/About:config_entries
    Go to PRIVACY and read about the following:
    privacy.item.cache
    privacy.item.cookies
    privacy.item.downloads
    privacy.item.formdata
    privacy.item.history
    In about:config, set all of the above to TRUE if they are FALSE.
    thank you
