[(partially?) SOLVED] HP dm1z (4000 model) freezes under heavy load

Alright, I've had this laptop for a few weeks now, and it's been pretty good. But I've noticed somewhat random freezes in Gnome 3, during which the system is totally unresponsive (pointer doesn't move, can't switch to a virtual console, no sign of I/O) and has to be powered off manually. As I explain below, I've been unable to reproduce this on demand, so I don't know whether it would also happen with a different desktop or without X. The laptop has an AMD E-450 APU, with Radeon HD 6320M graphics built in.
Magic SysRq sort of works. The last time it happened, I was able to reboot using SysRq; however, I could not restore the framebuffer console, so I was unable to get any data from it. I do know that Alt + PrintScreen + V works when the system is not in a crashy state.
The problem has occurred since I got it, and I've kept the system fairly up to date, so the issue has occurred with every kernel version that's been available from three weeks or so ago to now. The problem occurs when using the stock kernel as well as the K8-optimized linux-ck kernel.
I have read every single log in /var/log, and I have not noticed any messages that seem like they could be associated with the problem. A couple do pop out at me, however:
Jan 14 10:25:13 localhost logger: ACPI group/action undefined: thermal_zone / LNXTHERM:00
This has often shown up around the time of the freezes, but also shows up a lot when the system is operating normally. I've got sensord running, however, and the CPU is almost always in the range of 60-70°C, which as far as I can tell is safely below the critical threshold.
[ 292.154] (II) RADEON(0): radeon_dri2_schedule_flip:670 fevent[0x17eb6d0]
[ 292.175] (II) RADEON(0): radeon_dri2_flip_event_handler:1067 fevent[0x17eb6d0] width 1366 pitch 5632 (/4 1408)
Xorg.log gets spammed with this quite a bit, with it showing up anywhere from two or three times a second to once every 30 seconds. Twice so far, I've noticed (after examining Xorg.log.old) post-freeze that only part of the message was written to the log (e.g. just the timestamp, and a handful of characters from the start of the message, or just the first message of the pair).
Jan 14 10:49:17 localhost kernel: [ 6.170288] [Firmware Bug]: Invalid critical threshold (0)
This gets written on every boot up.
I've been unable to find anyone experiencing a similar issue. I am also unable to intentionally reproduce the problem. I've used stress-testing tools to hammer the CPU with at least 3-4 threads, as well as the memory and I/O, but running them for around an hour did not trigger a freeze.
Also, it seems to be much more likely to freeze after having just rebooted from a freeze. This suggests to me that it is a thermal issue, but nothing I've seen corroborates that.
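One way to test the thermal theory would be to log the temperature with a forced disk sync after every sample, so the last reading before a freeze survives the hard power-off. Below is a minimal Java sketch, not a tested tool. It assumes the kernel exposes a sensor at /sys/class/thermal/thermal_zone0/temp (in millidegrees Celsius); the zone index, the thermal.log file name, and the ThermalLogger class are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.time.Instant;

class ThermalLogger {
    /** Converts a sysfs reading in millidegrees Celsius (for example "65000") to degrees. */
    static double parseMilliCelsius(String raw) {
        return Long.parseLong(raw.trim()) / 1000.0;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length == 0) {
            System.err.println("usage: java ThermalLogger <number-of-samples>");
            return;
        }
        long samples = Long.parseLong(args[0]);
        Path sensor = Paths.get("/sys/class/thermal/thermal_zone0/temp"); // assumed sensor path
        Path out = Paths.get("thermal.log");                               // hypothetical log file
        try (FileChannel ch = FileChannel.open(out, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            for (long i = 0; i < samples; i++) {
                String raw = new String(Files.readAllBytes(sensor), StandardCharsets.US_ASCII);
                String line = Instant.now() + " " + parseMilliCelsius(raw) + " C\n";
                ch.write(ByteBuffer.wrap(line.getBytes(StandardCharsets.UTF_8)));
                ch.force(true); // push data and metadata to disk before the next sample
                Thread.sleep(1000); // one sample per second
            }
        }
    }
}
```

Run with a large sample count and left going in the background, the last line of thermal.log after a freeze would show roughly how hot the APU was in the final second.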
I've really got no idea how to troubleshoot this, so any help at all would be appreciated. And if you need any additional information, just ask. Thanks in advance.
In case it helps:
rc.conf
lsmod output
lspci output
Last edited by Guff (2012-07-17 16:36:01)

@dimath: thanks for the link. There was a link there to another bug report, which mentioned some kernel parameters to try out. Didn't help.
Since I was having issues getting my wireless to work properly (which I ultimately was not able to do, so I just got a tiny USB dongle for the time being), I did a lot of mucking around in the system, with many things that I did not understand. So I figured maybe I did something stupid that somehow brought this curse down upon me, and last week I did a total reinstall.
It didn't work.
I've tried KDE, and the problem persists, so it's not Gnome-specific for me. I'm trying the catalyst driver now, but of course it's fairly unusable with Gnome 3 at the moment, so I'm only testing it in KDE. It seems like it might be okay: I've been doing my best to trigger a crash/freeze, without success so far.
Then again, it took a long while for the issue to pop up once I started using KDE. I still haven't quite worked out how to trigger the damn thing.
It often shows up when building some larger packages, installing packages, and when launching Firefox (I often have an ungodly amount of tabs open, so it is fairly resource-intensive). As mentioned, however, it's not at all consistent.
dimath, have you tried another DE yet? And if the problem persisted, have you tried catalyst? Obviously, I don't expect you to use it with gnome given that it's buggy as hell, but I'd be interested to see if it helps for you at all.

Similar Messages

  • JRockit freezes/hangs under heavy load

    We have a problem with JRockit freezing/hanging when under heavy load.
    It does not crash, but it looks like it is spinning on the CPU. The CPU
    is a XEON with hyper threading, so it is viewed as two CPUs on our linux
    host.
    Any advice on options to apply for further debugging on the issue?
    Here is the scenario:
    a) suddenly, CPU goes up to 50% and stays there (looks like JRockit is
    spinning/busy looping on the one "cpu")
    b) while spinning at 50% CPU, the heap keeps growing until it hits max
    c) the application and JRockit stop responding (but JRockit does not crash)
    d) There is no information on stdout or stderr. Also no information in
    weblogic.log
    The setup is:
    * BEA WebLogic 5.1
    * SuSE Linux 9.0 (*not* enterprise linux)
    * Kernel: 2.4.21_166, SMP 4G
    * JRockit v26.4.0.63 (JDK 1.4.2_11)
    * XEON 3GHz with HT
    * JAVA_OPTIONS="-jrockit -Xms512m -Xmx512m -Djrockit.ctrlbreak.enableforce_crash=true"
    Has anyone else experienced this or things like it? I've found many
    issues about JRockit crashes (which actually gives some information to
    work with), but our process is not crashing. It just freezes/hangs and
    uses a lot of CPU.
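
    When a JVM spins without crashing, one generic way to find the culprit is to compare per-thread CPU time across a short window and print the stack of the thread that burned the most. The sketch below does that with the standard java.lang.management API. Note the assumptions: that API only exists from Java 5 onward, so it would not run as-is on the JDK 1.4.2 listed above (there, a thread dump via Ctrl-Break or kill -3 is the closest equivalent), and the BusyThreadFinder class and its spinner thread are hypothetical names used only for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

class BusyThreadFinder {
    /**
     * Samples per-thread CPU time twice, windowMillis apart, prints the stack of the
     * thread that used the most CPU in between, and returns its name (or null).
     */
    static String busiestThread(long windowMillis) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadCpuTimeSupported()) {
            return null; // this JVM cannot report per-thread CPU time
        }
        Map<Long, Long> before = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            before.put(id, mx.getThreadCpuTime(id)); // nanoseconds, or -1 if unavailable
        }
        try {
            Thread.sleep(windowMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
        long bestId = -1;
        long bestDelta = -1;
        for (long id : mx.getAllThreadIds()) {
            Long start = before.get(id);
            long now = mx.getThreadCpuTime(id);
            if (start == null || start < 0 || now < 0) {
                continue; // thread appeared or vanished mid-window, or timing is disabled
            }
            if (now - start > bestDelta) {
                bestDelta = now - start;
                bestId = id;
            }
        }
        ThreadInfo info = bestId < 0 ? null : mx.getThreadInfo(bestId, 20);
        if (info == null) {
            return null;
        }
        System.out.println(info.getThreadName() + " used " + bestDelta / 1_000_000 + " ms CPU");
        for (StackTraceElement frame : info.getStackTrace()) {
            System.out.println("    at " + frame);
        }
        return info.getThreadName();
    }

    public static void main(String[] args) {
        // A deliberately spinning thread stands in for the hung application thread.
        Thread spinner = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) { /* busy loop */ }
        }, "spinner");
        spinner.setDaemon(true);
        spinner.start();
        System.out.println("busiest: " + busiestThread(500));
        spinner.interrupt();
    }
}
```

    In a real server the same method could be called from a diagnostic servlet or a watchdog thread while the CPU is pinned; the printed stack then points at the loop that is spinning.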

    We have exactly the same problem on our Windows servers, also Dual core Xeon, running Windows Server 2003 SP 1, with 2Gb physical RAM. We have two Windows servers behind Cisco load balancers.
    When we get this problem, CPU and memory usage seems normal. The Throughput graph in Monitoring -> Performance flat lines at zero, the Queue length goes up by the second and the Garbage collection graph flat lines at whatever value it was at when the service appeared to hang.
    The Java memory settings are: -Xms512m -Xmx1024m
    Can anyone help, please? I have had to manually restart the services on both servers about 10 times so far this morning! This is unusual, but we do have an unusually heavy load this morning.
    Many thanks

  • How hot does your Helix get under heavy load?

    I've had my Helix for a few days now and have finally got around to installing some software on it. I have the i7 with a 256 GB SSD and 8 GB RAM. So far I've been very impressed with how 'snappy' it all runs and, besides a few annoyances with the touchscreen/pen not working correctly 100% of the time, I am very happy with the system.
    Anyway, I wanted to see how the Helix would run when pushed really hard, so I had Solidworks (3D modeling/CAD software) running and Netflix playing on an external monitor. The back of the tablet became super hot, hot enough that you would definitely not want to touch it (let alone try to hold it) or have it on your lap. I installed TP fan control and it read that the CPU got up to 86 degrees Celsius. Granted, this is not representative of normal operating conditions, but I'm concerned about how hot it got. Has anyone else noticed their Helix running really hot under heavy load? Is this normal?

    ryanpm wrote:
    I'm in the same boat as you on this. It can get very bad sometimes. Under light use, it never gets as cool as I feel it should. I go through the currently running applications and try to run as minimally as possible, but it still creates a fair amount of heat. My battery life is around 6 hours with both batteries too. I definitely don't get the 8-9 hours it's supposed to be getting.
    I think the 10 hrs thing was for the i5 model. It's still five times longer than my X201t with 4cell battery, so I'm happy with this. Shorter battery time is something you have to live with if you are going to use an i7 machine on the go.
    I also heard that the RTM version of the Helix actually IS the revised version of the initial batch (or the previewers' batch)... I heard the first machines were truly HOT.
    [Added] Also, I think long-term heat damage will be less of a concern in the case of the Helix. The industry already has many years of slate tablets with standard CPUs that had problems with heat (namely the Asus EP121, whose LCD turned yellow because of heat, and the Samsung Slate 7, which had serious throttling issues in its early days). If Lenovo HAS the intelligence to learn from history, which I want to believe they do, they should have dealt with that, considering this machine may actually be a second revision. And even if I don't believe Lenovo, I still trust in Yamato Lab.
    Kim
    W540( ), Helix(Sold), Tablet 2(Sold) Tablet 2( ), X120e(Sold), X201 Tablet, X41 Tablet(Sold), X41, X32*4, 701C*4

  • Command Execution fails only under heavy load and JPA cache store.

    Using...
    Coherence: 3.5.2
    Command Pattern: 2.5.0
    I have a servlet that simply creates a simple POJO, puts it in the cache, and then fires off a command that does additional formatting to the POJO.
    With the JDBC cachestore I have no problem. It works fine.
    With JPA I get the exception below under heavy load only. I.e., if I send one or two requests, everything works fine and my data is present in the database after the 30-second write-behind. Once I bump up to 200 users I get the exception.
    I figure I'm running out of memory on the cache because JPA entities take more resources?
    2010-01-21 10:50:01.051/159.235 Oracle Coherence GE 3.5.2/463 <Error> (thread=CommandExecutor:Thread-4, member=3): Failed to execute CommandExecutionRequest.Key{contextIdentifier=Identifier{TestEncrypt}, ticket=Ticket{1.1158}, managementStrategy=DISTRIBUTED} with CommandExecutor Identifier{TestEncrypt}
    2010-01-21 10:50:01.051/159.235 Oracle Coherence GE 3.5.2/463 <Info> (thread=CommandExecutor:Thread-4, member=3): (Wrapped: Failed request execution for ModelService service on Member(Id=3, Timestamp=2010-01-21 10:47:22.79, Address=xxxxxx:8088, MachineId=2616, Location=site:xxxxxx,machine:xxxxx,process:2560, Role=CoherenceServer)) java.lang.NullPointerException
    at com.tangosol.util.Base.ensureRuntimeException(Base.java:293)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.tagException(Grid.CDB:36)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onInvokeRequest(DistributedCache.CDB:80)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$InvokeRequest.run(DistributedCache.CDB:1)
    at com.tangosol.coherence.component.net.message.requestMessage.DistributedCacheKeyRequest.onReceived(DistributedCacheKeyRequest.CDB:12)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onMessage(Grid.CDB:9)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onNotify(Grid.CDB:136)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onNotify(DistributedCache.CDB:3)
    at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
    at java.lang.Thread.run(Thread.java:619)
    Caused by: java.lang.NullPointerException
    at ca.xxxxxx.coherence.command.EncryptProcessor.process(EncryptProcessor.java:41)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$Storage.invoke(DistributedCache.CDB:20)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onInvokeRequest(DistributedCache.CDB:50)
    ... 7 more
    Servlet
        protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
            PrintStream out = new PrintStream(response.getOutputStream());
            NamedCache logCache = CacheFactory.getCache("ca.xxxxxx.model.Log");
            Log log = new Log(idGenerator.generateIdentity(), "This is a log...");
            // Do some work here...
            log.setStampOut(new Date());
            logCache.put(log.getId(), log);
            CommandSubmitter commandSubmitter = DefaultCommandSubmitter.getInstance();
            commandSubmitter.submitCommand(contextIdentifier, new EncryptCommand(log));
            out.println("Log Id: " + log.getId() + " - Message: " + log.getMessage() + "++99");
            response.flushBuffer();
        }
    Command
        package ca.xxxxxx.coherence.command;

        import java.io.IOException;

        import com.oracle.coherence.patterns.command.Command;
        import com.oracle.coherence.patterns.command.ExecutionEnvironment;
        import com.tangosol.io.pof.PofReader;
        import com.tangosol.io.pof.PofWriter;
        import com.tangosol.io.pof.PortableObject;
        import com.tangosol.net.CacheFactory;
        import com.tangosol.net.NamedCache;
        import com.tangosol.util.InvocableMap;
        import com.tangosol.util.processor.AbstractProcessor;

        import ca.xxxxxx.coherence.util.identity.sequence.Sequence;
        import ca.xxxxxx.coherence.util.identity.sequence.SequenceBlock;
        import ca.xxxxxx.model.Log;

        @SuppressWarnings( { "unchecked" })
        public class EncryptCommand implements Command, PortableObject {

            private Log log;

            public EncryptCommand() {
            }

            public EncryptCommand(Log log) {
                this.log = log;
            }

            public void execute(ExecutionEnvironment executionEnvironment) {

                NamedCache cache = CacheFactory.getCache("ca.xxxxxx.model.Log");

                cache.invoke(log.getId(), new EncryptProcessor(log));

                //cache.put(log.getId(), log);
            }

            public void readExternal(PofReader reader) throws IOException {
                this.log = (Log) reader.readObject(0);
            }

            public void writeExternal(PofWriter writer) throws IOException {
                writer.writeObject(0, log);
            }

        /*
            public String toString() {
                return String.format("LoggingCommand{%s, id=%d, message=%s}", super
                        .toString(), log.getId(), log.getMessage());
            }
        */
        }
    Processor
        package ca.xxxxxx.coherence.command;

        import java.io.IOException;

        import ca.xxxxxx.coherence.util.identity.sequence.Sequence;
        import ca.xxxxxx.coherence.util.identity.sequence.SequenceBlock;
        import ca.xxxxxx.model.Log;

        import com.tangosol.io.pof.PofReader;
        import com.tangosol.io.pof.PofWriter;
        import com.tangosol.io.pof.PortableObject;
        import com.tangosol.util.InvocableMap;
        import com.tangosol.util.processor.AbstractProcessor;

        public class EncryptProcessor extends AbstractProcessor implements PortableObject{

            private static final long serialVersionUID = -6272835614833329999L;
            private Log log;

            public EncryptProcessor(){
                // deserialization constructor
            }

            public EncryptProcessor(Log log){
                this.log = log;
            }

            public Object process(InvocableMap.Entry entry)
            {
                Log log = (Log)entry.getValue();

                // Pretend we are encrypting. Proof that async processing is better.
                try
                {
                    Thread.sleep(50);
                }
                catch(Exception ex)
                {
                    System.out.println("Exception: " + ex.toString());
                }

                log.setMessage("!@#$%^&*()-_=+");

                entry.setValue(log);

                return log;
            }

            public void readExternal(PofReader reader) throws IOException {
                this.log = (Log) reader.readObject(0);
            }

            public void writeExternal(PofWriter writer) throws IOException {
                writer.writeObject(0, log);
            }
        }
    Edited by: user12249856 on Jan 21, 2010 8:12 AM
    Edited by: user12249856 on Jan 21, 2010 8:13 AM
    Edited by: user12249856 on Jan 21, 2010 8:14 AM
    Edited by: user12249856 on Jan 21, 2010 9:07 AM
    Edited by: user12249856 on Jan 21, 2010 10:15 AM

    Done; have a look at the original thread. The logs are below... I'm thinking that the key is no longer in the cache at the point the processor fires?
    I can also assure you that there are only two instances of Coherence running:
    1- The actual cache server node
    2- The client, set to localstorage = false
    I ensure the cache server starts and then I start the client.
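
    If the key really is gone by the time the processor fires, entry.getValue() would return null and the following log.setMessage(...) call would throw the NullPointerException seen in the first trace. A minimal sketch of a guard for that case follows. It is not Coherence code: the nested Entry interface is a hypothetical stand-in that mirrors only the isPresent/getValue/setValue methods of InvocableMap.Entry, and a plain String replaces the Log object, so that the example runs without a cluster.

```java
import java.util.HashMap;
import java.util.Map;

class NullSafeProcessorSketch {
    /** Hypothetical stand-in for the InvocableMap.Entry methods used here. */
    interface Entry {
        boolean isPresent();
        String getValue();
        void setValue(String value);
    }

    /** Map-backed entry, present only so the sketch runs without a cluster. */
    static Entry entryFor(final Map<String, String> backing, final String key) {
        return new Entry() {
            public boolean isPresent() { return backing.containsKey(key); }
            public String getValue() { return backing.get(key); }
            public void setValue(String value) { backing.put(key, value); }
        };
    }

    /** Masks the stored message; returns null instead of throwing when the key is gone. */
    static String process(Entry entry) {
        if (!entry.isPresent() || entry.getValue() == null) {
            // Nothing to update: the entry was never stored, was evicted, or was removed.
            return null;
        }
        String masked = "!@#$%^&*()-_=+";
        entry.setValue(masked);
        return masked;
    }

    public static void main(String[] args) {
        Map<String, String> cache = new HashMap<>();
        cache.put("log-1", "This is a log...");
        System.out.println(process(entryFor(cache, "log-1")));   // prints the masked message
        System.out.println(process(entryFor(cache, "log-404"))); // prints null, no exception
    }
}
```

    A guard like this only hides the symptom. If entries are missing because the service was restarted, as the log below suggests, the write-behind timeout is still the thing to chase.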
    Logs
    2010-01-21 13:36:12.596/0.234 Oracle Coherence 3.5.2/463 <Info> (thread=main, member=n/a): Loaded operational configuration from resource "jar:file:/xxxxxx/coherence/lib/coherence.jar!/tangosol-coherence.xml"
    2010-01-21 13:36:12.596/0.234 Oracle Coherence 3.5.2/463 <Info> (thread=main, member=n/a): Loaded operational overrides from resource "jar:file:/xxxxxx/coherence/lib/coherence.jar!/tangosol-coherence-override-dev.xml"
    2010-01-21 13:36:12.596/0.234 Oracle Coherence 3.5.2/463 <Info> (thread=main, member=n/a): Loaded operational overrides from resource "jar:file:/xxxxxx/coherence/lib/coherence-common-1.5.0.jar!/tangosol-coherence-override.xml"
    2010-01-21 13:36:12.596/0.234 Oracle Coherence 3.5.2/463 <D5> (thread=main, member=n/a): Optional configuration override "/custom-mbeans.xml" is not specified
    Oracle Coherence Version 3.5.2/463
    Grid Edition: Development mode
    Copyright (c) 2000, 2009, Oracle and/or its affiliates. All rights reserved.
    2010-01-21 13:36:12.784/0.422 Oracle Coherence GE 3.5.2/463 <Info> (thread=main, member=n/a): Loaded cache configuration from "file:/xxxxxx/coherence/bin/test-cache-config.xml"
    2010-01-21 13:36:12.799/0.437 Oracle Coherence GE 3.5.2/463 <Info> (thread=main, member=n/a): Loaded cache configuration from "jar:file:/xxxxxx/coherence/lib/coherence-commandpattern-2.5.0.jar!/coherence-commandpattern-pof-cache-config.xml"
    2010-01-21 13:36:12.799/0.437 Oracle Coherence GE 3.5.2/463 <Info> (thread=main, member=n/a): Loaded cache configuration from "jar:file:/xxxxxx/coherence/lib/coherence-common-1.5.0.jar!/coherence-common-cache-config.xml"
    2010-01-21 13:36:13.159/0.797 Oracle Coherence GE 3.5.2/463 <D5> (thread=Cluster, member=n/a): Service Cluster joined the cluster with senior service member n/a
    2010-01-21 13:36:16.409/4.047 Oracle Coherence GE 3.5.2/463 <Info> (thread=Cluster, member=n/a): Created a new cluster "cluster:0xD3FB" with Member(Id=1, Timestamp=2010-01-21 13:36:13.065, Address=xxxxxx:8088, MachineId=2626, Location=site:xxxxxx.net,machine:xxxxxx,process:40488, Role=CoherenceServer, Edition=Grid Edition, Mode=Development, CpuCount=4, SocketCount=2) UID=0x0A00004200000126522B95890A421F98
    2010-01-21 13:36:16.424/4.062 Oracle Coherence GE 3.5.2/463 <D5> (thread=Invocation:Management, member=1): Service Management joined the cluster with senior service member 1
    2010-01-21 13:36:16.627/4.265 Oracle Coherence GE 3.5.2/463 <D5> (thread=DistributedCache:DistributedCacheForSequenceGenerators, member=1): Service DistributedCacheForSequenceGenerators joined the cluster with senior service member 1
    2010-01-21 13:36:16.643/4.281 Oracle Coherence GE 3.5.2/463 <D5> (thread=DistributedCache:DistributedCacheForCommandPattern, member=1): Service DistributedCacheForCommandPattern joined the cluster with senior service member 1
    2010-01-21 13:36:16.659/4.297 Oracle Coherence GE 3.5.2/463 <Info> (thread=DistributedCache:DistributedCacheForCommandPattern, member=1): Loading POF configuration from resource "file:/xxxxxx/coherence/bin/test-pof-config.xml"
    2010-01-21 13:36:16.659/4.297 Oracle Coherence GE 3.5.2/463 <Info> (thread=DistributedCache:DistributedCacheForCommandPattern, member=1): Loading POF configuration from resource "jar:file:/xxxxxx/coherence/lib/coherence.jar!/coherence-pof-config.xml"
    2010-01-21 13:36:16.659/4.297 Oracle Coherence GE 3.5.2/463 <Info> (thread=DistributedCache:DistributedCacheForCommandPattern, member=1): Loading POF configuration from resource "jar:file:/xxxxxx/coherence/lib/coherence-common-1.5.0.jar!/coherence-common-pof-config.xml"
    2010-01-21 13:36:16.674/4.312 Oracle Coherence GE 3.5.2/463 <Info> (thread=DistributedCache:DistributedCacheForCommandPattern, member=1): Loading POF configuration from resource "jar:file:/xxxxxx/coherence/lib/coherence-commandpattern-2.5.0.jar!/coherence-commandpattern-pof-config.xml"
    2010-01-21 13:36:16.752/4.390 Oracle Coherence GE 3.5.2/463 <D5> (thread=DistributedCache:DistributedCacheForCommandPatternDistributedCommands, member=1): Service DistributedCacheForCommandPatternDistributedCommands joined the cluster with senior service member 1
    2010-01-21 13:36:16.752/4.390 Oracle Coherence GE 3.5.2/463 <D5> (thread=DistributedCache:SequencesService, member=1): Service SequencesService joined the cluster with senior service member 1
    2010-01-21 13:36:16.752/4.390 Oracle Coherence GE 3.5.2/463 <D5> (thread=DistributedCache:ModelService, member=1): Service ModelService joined the cluster with senior service member 1
    2010-01-21 13:36:16.768/4.406 Oracle Coherence GE 3.5.2/463 <Info> (thread=main, member=1): Started DefaultCacheServer...
    SafeCluster: Name=cluster:0xD3FB
    Group{Address=224.3.5.2, Port=35463, TTL=4}
    MasterMemberSet
    ThisMember=Member(Id=1, Timestamp=2010-01-21 13:36:13.065, Address=xxxxxx:8088, MachineId=2626, Location=site:xxxxxx.net,machine:xxxxxx,process:40488, Role=CoherenceServer)
    OldestMember=Member(Id=1, Timestamp=2010-01-21 13:36:13.065, Address=xxxxxx:8088, MachineId=2626, Location=site:xxxxxx.net,machine:xxxxxx,process:40488, Role=CoherenceServer)
    ActualMemberSet=MemberSet(Size=1, BitSetCount=2
    Member(Id=1, Timestamp=2010-01-21 13:36:13.065, Address=xxxxxx:8088, MachineId=2626, Location=site:xxxxxx.net,machine:xxxxxx,process:40488, Role=CoherenceServer)
    RecycleMillis=120000
    RecycleSet=MemberSet(Size=0, BitSetCount=0
    Services
    TcpRing{TcpSocketAccepter{State=STATE_OPEN, ServerSocket=xxxxxx:8088}, Connections=[]}
    ClusterService{Name=Cluster, State=(SERVICE_STARTED, STATE_JOINED), Id=0, Version=3.5, OldestMemberId=1}
    InvocationService{Name=Management, State=(SERVICE_STARTED), Id=1, Version=3.1, OldestMemberId=1}
    DistributedCache{Name=DistributedCacheForSequenceGenerators, State=(SERVICE_STARTED), LocalStorage=enabled, PartitionCount=257, BackupCount=1, AssignedPartitions=257, BackupPartitions=0}
    DistributedCache{Name=DistributedCacheForCommandPattern, State=(SERVICE_STARTED), LocalStorage=enabled, PartitionCount=257, BackupCount=1, AssignedPartitions=257, BackupPartitions=0}
    DistributedCache{Name=DistributedCacheForCommandPatternDistributedCommands, State=(SERVICE_STARTED), LocalStorage=enabled, PartitionCount=257, BackupCount=1, AssignedPartitions=257, BackupPartitions=0}
    DistributedCache{Name=SequencesService, State=(SERVICE_STARTED), LocalStorage=enabled, PartitionCount=257, BackupCount=1, AssignedPartitions=257, BackupPartitions=0}
    DistributedCache{Name=ModelService, State=(SERVICE_STARTED), LocalStorage=enabled, PartitionCount=257, BackupCount=1, AssignedPartitions=257, BackupPartitions=0}
    2010-01-21 13:36:47.534/35.172 Oracle Coherence GE 3.5.2/463 <D5> (thread=Cluster, member=1): Member(Id=2, Timestamp=2010-01-21 13:36:47.346, Address=xxxxxx:8088, MachineId=2648, Location=site:xxxxxx.net,machine:xxxxxx,process:1912, Role=MortbayStartMain) joined Cluster with senior member 1
    2010-01-21 13:36:47.580/35.218 Oracle Coherence GE 3.5.2/463 <D5> (thread=Cluster, member=1): Member 2 joined Service Management with senior member 1
    2010-01-21 13:36:47.768/35.406 Oracle Coherence GE 3.5.2/463 <D5> (thread=Cluster, member=1): Member 2 joined Service SequencesService with senior member 1
    2010-01-21 13:36:47.877/35.515 Oracle Coherence GE 3.5.2/463 <D5> (thread=DistributedCache:SequencesService, member=1): Service SequencesService: sending ServiceConfigSync containing 258 entries to Member 2
    2010-01-21 13:36:47.971/35.609 Oracle Coherence GE 3.5.2/463 <D5> (thread=Cluster, member=1): Member 2 joined Service DistributedCacheForCommandPattern with senior member 1
    2010-01-21 13:36:47.971/35.609 Oracle Coherence GE 3.5.2/463 <D5> (thread=DistributedCache:DistributedCacheForCommandPattern, member=1): Service DistributedCacheForCommandPattern: sending ServiceConfigSync containing 258 entries to Member 2
    2010-01-21 13:36:48.034/35.672 Oracle Coherence GE 3.5.2/463 <D5> (thread=DistributedCache:DistributedCacheForCommandPattern, member=1): Context Identifier{TestEncrypt} has been inserted into this member
    2010-01-21 13:36:48.034/35.672 Oracle Coherence GE 3.5.2/463 <D5> (thread=DistributedCache:DistributedCacheForCommandPattern, member=1): Creating CommandExecutor for Identifier{TestEncrypt}
    2010-01-21 13:36:48.065/35.703 Oracle Coherence GE 3.5.2/463 <D5> (thread=DistributedCache:DistributedCacheForCommandPattern, member=1): Created CommandExecutor for Identifier{TestEncrypt}
    2010-01-21 13:36:48.065/35.703 Oracle Coherence GE 3.5.2/463 <D5> (thread=DistributedCache:DistributedCacheForCommandPattern, member=1): Scheduling ContextExecutor for Identifier{TestEncrypt} to start
    2010-01-21 13:36:48.065/35.703 Oracle Coherence GE 3.5.2/463 <D5> (thread=CommandExecutor:Thread-2, member=1): Starting CommandExecutor for Identifier{TestEncrypt}
    2010-01-21 13:36:48.080/35.718 Oracle Coherence GE 3.5.2/463 <D5> (thread=CommandExecutor:Thread-2, member=1): CommandExecutor for Identifier{TestEncrypt} has been configured as DefaultContextConfiguration{managementStrategy=DISTRIBUTED}
    2010-01-21 13:36:48.080/35.718 Oracle Coherence GE 3.5.2/463 <D5> (thread=CommandExecutor:Thread-2, member=1): Recovering unexecuted commands for CommandExecutor Identifier{TestEncrypt}
    2010-01-21 13:36:48.127/35.765 Oracle Coherence GE 3.5.2/463 <D5> (thread=CommandExecutor:Thread-2, member=1): No commands to recover for CommandExecutor Identifier{TestEncrypt}
    2010-01-21 13:36:48.127/35.765 Oracle Coherence GE 3.5.2/463 <D5> (thread=CommandExecutor:Thread-2, member=1): Registering JMX management extensions for CommandExecutor Identifier{TestEncrypt}
    2010-01-21 13:36:48.127/35.765 Oracle Coherence GE 3.5.2/463 <D5> (thread=CommandExecutor:Thread-2, member=1): No commands to execute for CommandExecutor Identifier{TestEncrypt}. (waiting for commands to be submitted)
    2010-01-21 13:36:48.127/35.765 Oracle Coherence GE 3.5.2/463 <D5> (thread=CommandExecutor:Thread-2, member=1): Started CommandExecutor for Identifier{TestEncrypt}
    2010-01-21 13:36:48.424/36.062 Oracle Coherence GE 3.5.2/463 <D5> (thread=Cluster, member=1): TcpRing: connecting to member 2 using TcpSocket{State=STATE_OPEN, Socket=Socket[addr=/xxxxxx,port=8088,localport=3301]}
    2010-01-21 13:37:26.940/74.578 Oracle Coherence GE 3.5.2/463 <D5> (thread=Cluster, member=1): Member 2 joined Service ModelService with senior member 1
    2010-01-21 13:37:26.940/74.578 Oracle Coherence GE 3.5.2/463 <D5> (thread=DistributedCache:ModelService, member=1): Service ModelService: sending ServiceConfigSync containing 258 entries to Member 2
    2010-01-21 13:37:27.362/75.000 Oracle Coherence GE 3.5.2/463 <D5> (thread=Cluster, member=1): Member 2 joined Service DistributedCacheForCommandPatternDistributedCommands with senior member 1
    2010-01-21 13:37:27.377/75.015 Oracle Coherence GE 3.5.2/463 <D5> (thread=DistributedCache:DistributedCacheForCommandPatternDistributedCommands, member=1): Service DistributedCacheForCommandPatternDistributedCommands: sending ServiceConfigSync containing 259 entries to Member 2
    2010-01-21 13:38:30.721/138.359 Oracle Coherence GE 3.5.2/463 <Error> (thread=DistributedCache:ModelService, member=1): Attempting recovery (due to soft timeout) of Daemon{Thread="Thread[WriteBehindThread:CacheStoreWrapper(com.tangosol.coherence.jpa.JpaCacheStore),5,WriteBehindThread:CacheStoreWrapper(com.tangosol.coherence.jpa.JpaCacheStore)]", State=Running}
    2010-01-21 13:38:34.268/141.906 Oracle Coherence GE 3.5.2/463 <Error> (thread=DistributedCache:ModelService, member=1): Terminating guarded execution (due to hard timeout) of Daemon{Thread="Thread[WriteBehindThread:CacheStoreWrapper(com.tangosol.coherence.jpa.JpaCacheStore),5,WriteBehindThread:CacheStoreWrapper(com.tangosol.coherence.jpa.JpaCacheStore)]", State=Running}
    2010-01-21 13:38:34.268/141.906 Oracle Coherence GE 3.5.2/463 <Error> (thread=Termination Thread, member=1): Write-behind thread timed out; stopping the cache service
    2010-01-21 13:38:34.284/141.922 Oracle Coherence GE 3.5.2/463 <D5> (thread=DistributedCache:ModelService, member=1): Service ModelService left the cluster
    2010-01-21 13:38:34.487/142.125 Oracle Coherence GE 3.5.2/463 <Error> (thread=CommandExecutor:Thread-4, member=1): Failed to execute CommandExecutionRequest.Key{contextIdentifier=Identifier{TestEncrypt}, ticket=Ticket{1.1223}, managementStrategy=DISTRIBUTED} with CommandExecutor Identifier{TestEncrypt}
    2010-01-21 13:38:34.487/142.125 Oracle Coherence GE 3.5.2/463 <Info> (thread=CommandExecutor:Thread-4, member=1): java.lang.RuntimeException: Service has been terminated
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$BinaryMap.onMissingStorage(DistributedCache.CDB:9)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$BinaryMap.ensureRequestTarget(DistributedCache.CDB:34)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$BinaryMap.invoke(DistributedCache.CDB:20)
         at com.tangosol.util.ConverterCollections$ConverterInvocableMap.invoke(ConverterCollections.java:2110)
         at com.tangosol.util.ConverterCollections$ConverterNamedCache.invoke(ConverterCollections.java:2565)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$ViewMap.invoke(DistributedCache.CDB:11)
         at com.tangosol.coherence.component.util.SafeNamedCache.invoke(SafeNamedCache.CDB:1)
         at ca.xxxxxx.coherence.command.EncryptCommand.execute(EncryptCommand.java:92)
         at com.oracle.coherence.patterns.command.internal.CommandExecutor.execute(CommandExecutor.java:889)
         at com.oracle.coherence.patterns.command.internal.CommandExecutor$3.run(CommandExecutor.java:960)
         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:207)
         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
         at java.lang.Thread.run(Thread.java:619)
    2010-01-21 13:38:34.487/142.125 Oracle Coherence GE 3.5.2/463 <Info> (thread=CommandExecutor:Thread-4, member=1): Restarting Service: ModelService
    2010-01-21 13:38:34.502/142.140 Oracle Coherence GE 3.5.2/463 <D5> (thread=DistributedCache:ModelService, member=1): Service ModelService joined the cluster with senior service member 2
    2010-01-21 13:38:34.518/142.156 Oracle Coherence GE 3.5.2/463 <D5> (thread=DistributedCache:ModelService, member=1): Service ModelService: received ServiceConfigSync containing 259 entries
    2010-01-21 13:38:34.518/142.156 Oracle Coherence GE 3.5.2/463 <Info> (thread=CommandExecutor:Thread-4, member=1): Restarting NamedCache: ca.xxxxxx.model.Log
    2010-01-21 13:38:34.549/142.187 Oracle Coherence GE 3.5.2/463 <Warning> (thread=DistributedCache:ModelService, member=1): Assigned 257 orphaned primary partitions
    2010-01-21 13:38:34.549/142.187 Oracle Coherence GE 3.5.2/463 <D4> (thread=DistributedCache:ModelService, member=1): 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256,
    2010-01-21 13:38:34.643/142.281 Oracle Coherence GE 3.5.2/463 <Error> (thread=CommandExecutor:Thread-4, member=1): Failed to execute CommandExecutionRequest.Key{contextIdentifier=Identifier{TestEncrypt}, ticket=Ticket{1.1224}, managementStrategy=DISTRIBUTED} with CommandExecutor Identifier{TestEncrypt}
    2010-01-21 13:38:34.643/142.281 Oracle Coherence GE 3.5.2/463 <Info> (thread=CommandExecutor:Thread-4, member=1): (Wrapped: Failed request execution for ModelService service on Member(Id=1, Timestamp=2010-01-21 13:36:13.065, Address=xxxxxx:8088, MachineId=2626, Location=site:xxxxxx.net,machine:xxxxxx,process:40488, Role=CoherenceServer)) java.lang.NullPointerException
         at com.tangosol.util.Base.ensureRuntimeException(Base.java:293)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.tagException(Grid.CDB:36)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onInvokeRequest(DistributedCache.CDB:80)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$InvokeRequest.run(DistributedCache.CDB:1)
         at com.tangosol.coherence.component.net.message.requestMessage.DistributedCacheKeyRequest.onReceived(DistributedCacheKeyRequest.CDB:12)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onMessage(Grid.CDB:9)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onNotify(Grid.CDB:136)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onNotify(DistributedCache.CDB:3)
         at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
         at java.lang.Thread.run(Thread.java:619)
    Caused by: java.lang.NullPointerException
         at ca.xxxxxx.coherence.command.EncryptProcessor.process(EncryptProcessor.java:41)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$Storage.invoke(DistributedCache.CDB:20)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onInvokeRequest(DistributedCache.CDB:50)
         ... 7 more
    2010-01-21 13:38:34.705/142.343 Oracle Coherence GE 3.5.2/463 <Error> (thread=CommandExecutor:Thread-4, member=1): Failed to execute CommandExecutionRequest.Key{contextIdentifier=Identifier{TestEncrypt}, ticket=Ticket{1.1225}, managementStrategy=DISTRIBUTED} with CommandExecutor Identifier{TestEncrypt}
    2010-01-21 13:38:34.721/142.359 Oracle Coherence GE 3.5.2/463 <Info> (thread=CommandExecutor:Thread-4, member=1): (Wrapped: Failed request execution for ModelService service on Member(Id=1, Timestamp=2010-01-21 13:36:13.065, Address=xxxxxx:8088, MachineId=2626, Location=site:xxxxxx.net,machine:xxxxxx,process:40488, Role=CoherenceServer)) java.lang.NullPointerException
         at com.tangosol.util.Base.ensureRuntimeException(Base.java:293)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.tagException(Grid.CDB:36)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onInvokeRequest(DistributedCache.CDB:80)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$InvokeRequest.run(DistributedCache.CDB:1)
         at com.tangosol.coherence.component.net.message.requestMessage.DistributedCacheKeyRequest.onReceived(DistributedCacheKeyRequest.CDB:12)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onMessage(Grid.CDB:9)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onNotify(Grid.CDB:136)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onNotify(DistributedCache.CDB:3)
         at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
         at java.lang.Thread.run(Thread.java:619)
    Caused by: java.lang.NullPointerException
         at ca.xxxxxx.coherence.command.EncryptProcessor.process(EncryptProcessor.java:41)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$Storage.invoke(DistributedCache.CDB:20)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onInvokeRequest(DistributedCache.CDB:50)
         ... 7 more
    2010-01-21 13:38:34.784/142.422 Oracle Coherence GE 3.5.2/463 <Error> (thread=CommandExecutor:Thread-4, member=1): Failed to execute CommandExecutionRequest.Key{contextIdentifier=Identifier{TestEncrypt}, ticket=Ticket{1.1226}, managementStrategy=DISTRIBUTED} with CommandExecutor Identifier{TestEncrypt}
    2010-01-21 13:38:34.784/142.422 Oracle Coherence GE 3.5.2/463 <Info> (thread=CommandExecutor:Thread-4, member=1): (Wrapped: Failed request execution for ModelService service on Member(Id=1, Timestamp=2010-01-21 13:36:13.065, Address=xxxxxx:8088, MachineId=2626, Location=site:xxxxxx.net,machine:xxxxxx,process:40488, Role=CoherenceServer)) java.lang.NullPointerException
         at com.tangosol.util.Base.ensureRuntimeException(Base.java:293)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.tagException(Grid.CDB:36)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onInvokeRequest(DistributedCache.CDB:80)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$InvokeRequest.run(DistributedCache.CDB:1)
         at com.tangosol.coherence.component.net.message.requestMessage.DistributedCacheKeyRequest.onReceived(DistributedCacheKeyRequest.CDB:12)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onMessage(Grid.CDB:9)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onNotify(Grid.CDB:136)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onNotify(DistributedCache.CDB:3)
         at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
         at java.lang.Thread.run(Thread.java:619)
    Caused by: java.lang.NullPointerException
         at ca.xxxxxx.coherence.command.EncryptProcessor.process(EncryptProcessor.java:41)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$Storage.invoke(DistributedCache.CDB:20)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onInvokeRequest(DistributedCache.CDB:50)
         ... 7 more
    2010-01-21 13:38:34.846/142.484 Oracle Coherence GE 3.5.2/463 <Error> (thread=CommandExecutor:Thread-4, member=1): Failed to execute CommandExecutionRequest.Key{contextIdentifier=Identifier{TestEncrypt}, ticket=Ticket{1.1227}, managementStrategy=DISTRIBUTED} with CommandExecutor Identifier{TestEncrypt}
    2010-01-21 13:38:34.862/142.500 Oracle Coherence GE 3.5.2/463 <Info> (thread=CommandExecutor:Thread-4, member=1): (Wrapped: Failed request execution for ModelService service on Member(Id=1, Timestamp=2010-01-21 13:36:13.065, Address=xxxxxx:8088, MachineId=2626, Location=site:xxxxxx.net,machine:xxxxxx,process:40488, Role=CoherenceServer)) java.lang.NullPointerException
         at com.tangosol.util.Base.ensureRuntimeException(Base.java:293)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.tagException(Grid.CDB:36)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onInvokeRequest(DistributedCache.CDB:80)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$InvokeRequest.run(DistributedCache.CDB:1)
         at com.tangosol.coherence.component.net.message.requestMessage.DistributedCacheKeyRequest.onReceived(DistributedCacheKeyRequest.CDB:12)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onMessage(Grid.CDB:9)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onNotify(Grid.CDB:136)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onNotify(DistributedCache.CDB:3)
         at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
         at java.lang.Thread.run(Thread.java:619)
    Caused by: java.lang.NullPointerException
         at ca.xxxxxx.coherence.command.EncryptProcessor.process(EncryptProcessor.java:41)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$Storage.invoke(DistributedCache.CDB:20)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onInvokeRequest(DistributedCache.CDB:50)
         ... 7 more
    2010-01-21 13:38:34.924/142.562 Oracle Coherence GE 3.5.2/463 <Error> (thread=CommandExecutor:Thread-4, member=1): Failed to execute CommandExecutionRequest.Key{contextIdentifier=Identifier{TestEncrypt}, ticket=Ticket{1.1228}, managementStrategy=DISTRIBUTED} with CommandExecutor Identifier{TestEncrypt}
    2010-01-21 13:38:34.924/142.562 Oracle Coherence GE 3.5.2/463 <Info> (thread=CommandExecutor:Thread-4, member=1): (Wrapped: Failed request execution for ModelService service on Member(Id=1, Timestamp=2010-01-21 13:36:13.065, Address=xxxxxx:8088, MachineId=2626, Location=site:xxxxxx.net,machine:xxxxxx,process:40488, Role=CoherenceServer)) java.lang.NullPointerException
         at com.tangosol.util.Base.ensureRuntimeException(Base.java:293)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.tagException(Grid.CDB:36)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onInvokeRequest(DistributedCache.CDB:80)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$InvokeRequest.run(DistributedCache.CDB:1)
         at com.tangosol.coherence.component.net.message.requestMessage.DistributedCacheKeyRequest.onReceived(DistributedCacheKeyRequest.CDB:12)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onMessage(Grid.CDB:9)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onNotify(Grid.CDB:136)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onNotify(DistributedCache.CDB:3)
         at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
         at java.lang.Thread.run(Thread.java:619)
    Caused by: java.lang.NullPointerException
         at ca.xxxxxx.coherence.command.EncryptProcessor.process(EncryptProcessor.java:41)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$Storage.invoke(DistributedCache.CDB:20)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onInvokeRequest(DistributedCache.CDB:50)
         ... 7 more
    2010-01-21 13:38:35.002/142.640 Oracle Coherence GE 3.5.2/463 <Error> (thread=CommandExecutor:Thread-4, member=1): Failed to execute CommandExecutionRequest.Key{contextIdentifier=Identifier{TestEncrypt}, ticket=Ticket{1.1229}, managementStrategy=DISTRIBUTED} with CommandExecutor Identifier{TestEncrypt}
    2010-01-21 13:38:35.002/142.640 Oracle Coherence GE 3.5.2/463 <Info> (thread=CommandExecutor:Thread-4, member=1): (Wrapped: Failed request execution for ModelService service on Member(Id=1, Timestamp=2010-01-21 13:36:13.065, Address=xxxxxx:8088, MachineId=2626, Location=site:xxxxxx.net,machine:xxxxxx,process:40488, Role=CoherenceServer)) java.lang.NullPointerException
         at com.tangosol.util.Base.ensureRuntimeException(Base.java:293)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.tagException(Grid.CDB:36)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onInvokeRequest(DistributedCache.CDB:80)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$InvokeRequest.run(DistributedCache.CDB:1)
         at com.tangosol.coherence.component.net.message.requestMessage.DistributedCacheKeyRequest.onReceived(DistributedCacheKeyRequest.CDB:12)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onMessage(Grid.CDB:9)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onNotify(Grid.CDB:136)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onNotify(DistributedCache.CDB:3)
         at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
         at java.lang.Thread.run(Thread.java:619)
    Caused by: java.lang.NullPointerException
         at ca.xxxxxx.coherence.command.EncryptProcessor.process(EncryptProcessor.java:41)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$Storage.invoke(DistributedCache.CDB:20)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onInvokeRequest(DistributedCache.CDB:50)
         ... 7 more
    2010-01-21 13:38:35.065/142.703 Oracle Coherence GE 3.5.2/463 <Error> (thread=CommandExecutor:Thread-4, member=1): Failed to execute CommandExecutionRequest.Key{contextIdentifier=Identifier{TestEncrypt}, ticket=Ticket{1.1230}, managementStrategy=DISTRIBUTED} with CommandExecutor Identifier{TestEncrypt}
    2010-01-21 13:38:35.065/142.703 Oracle Coherence GE 3.5.2/463 <Info> (thread=CommandExecutor:Thread-4, member=1): (Wrapped:

  • WebLogic is not responding to client request under heavy load

    Under heavy load, our WLS 6.0 SP2 stops responding to client requests
    after some time, say, 1 hour. In the administration console, the request throughput
    is 0 while the number of waiting requests climbs to over 1,000. Even after the clients stop
    making requests to the server, the number of waiting requests does not drop. It seems that WLS
    is blocked on some threads and unable to serve the requests (see attached thread dump for
    details).
    Our configuration is as follows:
    OS: SunOS 5.7
    Database: MS SQL Server 7.0 running on NT4 Enterprise Edition.
    (JDBC Driver is the one bundled with WLS 6.0)
    We know that using 'DriverManager.getConnection()' can cause deadlock in a number
    of cases, but we have gone through all the code and all of our database connections
    are obtained through a DataSource, and each connection is closed properly in a finally
    block.
    Moreover, the application originally ran on WLS 4.51, where no problem was
    encountered.
    Does any expert know what the problem and solution are?
    Thanks!
    [session.log]

    It appears that some threads were possibly blocked at some jDriver calls
    initially (ExecuteThread 0,10,15,23 of the default thread pool). WLS JTA
    subsequently timed out and rolled back the transactions asynchronously. The
    first rollback attempt was blocked at the jDriver level (ExecuteThread 18 of
    the default thread pool). Each subsequent rollback retry blocks an
    additional execute thread (due to a JTA bug that is fixed in WLS 6.1, but
    not 6.0 SPs) - ExecuteThread 1-9,11-26 of the default thread pool.
    Eventually, the server ran out of execute threads and became unresponsive.
    It is unclear whether the initial blocking of threads by the jDriver is
    a jDriver issue or an application issue. Please report to BEA support at
    [email protected] for further assistance.
    Regards,
    Priscilla
    Gary Mok <[email protected]> wrote in message
    news:[email protected]..
    >
    Under heavy load, our WLS 6.0 SP2 stops responding to client requests
    after some time, say, 1 hour. In the administration console, the request throughput
    is 0 while the number of waiting requests climbs to over 1,000. Even after the clients stop
    making requests to the server, the number of waiting requests does not drop. It seems that WLS
    is blocked on some threads and unable to serve the requests (see attached thread dump for
    details).
    Our configuration is as follows:
    OS: SunOS 5.7
    Database: MS SQL Server 7.0 running on NT4 Enterprise Edition.
    (JDBC Driver is the one bundled with WLS 6.0)
    We know that using 'DriverManager.getConnection()' can cause deadlock in a number
    of cases, but we have gone through all the code and all of our database connections
    are obtained through a DataSource, and each connection is closed properly in a finally
    block.
    Moreover, the application originally ran on WLS 4.51, where no problem was
    encountered.
    Does any expert know what the problem and solution are?
    Thanks!
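
    The failure mode described in the reply above (every execute thread stuck in a blocking call, so new requests only queue up) can be reproduced in miniature with a plain java.util.concurrent pool. This is only a sketch under assumptions: the class name ExecuteThreadStarvation and the latch standing in for a blocked jDriver call are made up for illustration, and it says nothing about how WLS implements its own execute queue.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Sketch: a fixed-size pool stops serving new work once every worker is blocked. */
final class ExecuteThreadStarvation {
    private ExecuteThreadStarvation() {}

    /**
     * Blocks all poolSize workers on a latch (a stand-in for a stuck driver call),
     * then submits one more request.
     * Returns {servedWhileBlocked, servedAfterRelease}.
     */
    static boolean[] simulate(int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch stuckCall = new CountDownLatch(1);
        CountDownLatch allBlocked = new CountDownLatch(poolSize);
        try {
            for (int i = 0; i < poolSize; i++) {
                pool.submit(() -> {
                    allBlocked.countDown();
                    try {
                        stuckCall.await(); // worker is now unavailable
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            allBlocked.await(); // every worker is occupied from here on

            // This request can only wait in the queue: no thread is free to run it.
            Future<String> request = pool.submit(() -> "response");
            boolean servedWhileBlocked;
            try {
                request.get(200, TimeUnit.MILLISECONDS);
                servedWhileBlocked = true;
            } catch (TimeoutException e) {
                servedWhileBlocked = false;
            }

            // Once the stuck calls return, the queued request is served.
            stuckCall.countDown();
            boolean servedAfterRelease =
                "response".equals(request.get(5, TimeUnit.SECONDS));
            return new boolean[] {servedWhileBlocked, servedAfterRelease};
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            stuckCall.countDown();
            pool.shutdownNow();
        }
    }
}
```

    Calling ExecuteThreadStarvation.simulate(3) shows the queued request going unserved while the workers are blocked and completing once they are released, which matches the "throughput 0, waiting requests growing" picture from the console.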

  • Under heavy load, flush attribute gives ArrayIndexOutOfBoundsException

    Hi,
    Under heavy load, the flush attribute of jsp:include, when set to true, gives java.lang.ArrayIndexOutOfBoundsException.
    A similar bug, 6295722, was reported earlier, but no solution to it is available. Can you please help in resolving this issue? Is this a known issue?
    The following code generates the ArrayIndexOutOfBoundsException:
    <jsp:include page="<%=file%>" flush="true" />
    The exception generated is:
    java.lang.ArrayIndexOutOfBoundsException: -2147483648
         at java.text.SimpleDateFormat.subFormat(SimpleDateFormat.java:1049)
         at java.text.SimpleDateFormat.format(SimpleDateFormat.java:882)
         at java.text.SimpleDateFormat.format(SimpleDateFormat.java:852)
         at java.text.DateFormat.format(DateFormat.java:316)
         at org.apache.jserv.JServUtils.encodeCookie(JServUtils.java:217)
         at org.apache.jserv.JServConnection.sendHttpHeaders(JServConnection.java:703)
         at org.apache.jserv.JServConnection$JServOutputStream.write(JServConnection.java:1969)
         at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:202)
         at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:272)
         at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:276)
         at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:122)
         at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:212)
         at java.io.PrintWriter.flush(PrintWriter.java:276)
         at oracle.jsp.runtime.OracleJspWriter.flush(OracleJspWriter.java:554)
         at oa_html._Login._jspService(_Login.java:974)
         at oracle.jsp.runtime.HttpJsp.service(HttpJsp.java:119)
         at oracle.jsp.app.JspApplication.dispatchRequest(JspApplication.java:417)
         at oracle.jsp.JspServlet.doDispatch(JspServlet.java:267)
         at oracle.jsp.JspServlet.internalService(JspServlet.java:186)
         at oracle.jsp.JspServlet.service(JspServlet.java:156)
         at javax.servlet.http.HttpServlet.service(HttpServlet.java:588)
         at org.apache.jserv.JServConnection.processRequest(JServConnection.java:456)
         at org.apache.jserv.JServConnection.run(JServConnection.java:294)
         at java.lang.Thread.run(Thread.java:619)
    Thanks,
    Shruthi

    Format objects, like SimpleDateFormat, are generally not thread-safe. This almost looks like a case where one is being used concurrently and it's causing a problem. Take gimbal2's suggestion and contact support for your app server.
    Edit:
    Just did some quick searching and came up with a link to JServUtils.java source code (not sure what version you've got or where the "real" code lives within Apache):
    http://turbine.apache.org/turbine/turbine-2.2.0/xref/org/apache/jserv/JServUtils.html
    You can see on line 78 that a static SimpleDateFormat field exists. You can also see it being used in the encodeCookie method on lines 103 and 111 without synchronization.
    Edited by: kschneid on Dec 18, 2009 12:39 PM
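
    For anyone who controls code with the same shape (a static SimpleDateFormat shared by request threads), one common way around it is to give each thread its own formatter. The sketch below is not the JServ code: the class name CookieDateFormat and the date pattern are assumptions chosen for illustration, and ThreadLocal.withInitial needs Java 8 or later (on older JDKs, synchronizing on the shared formatter or creating one per call achieves the same thing).

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Sketch: one SimpleDateFormat per thread instead of a shared static instance. */
final class CookieDateFormat {
    // Hypothetical pattern for illustration; not copied from JServUtils.
    private static final String PATTERN = "EEE, dd-MMM-yyyy HH:mm:ss 'GMT'";

    // SimpleDateFormat mutates internal state while formatting, so each thread
    // gets its own instance and no two threads ever touch the same one.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
        ThreadLocal.withInitial(() -> {
            SimpleDateFormat f = new SimpleDateFormat(PATTERN, Locale.US);
            f.setTimeZone(TimeZone.getTimeZone("GMT"));
            return f;
        });

    private CookieDateFormat() {}

    /** Formats an instant given as milliseconds since the epoch. */
    static String format(long epochMillis) {
        return FORMAT.get().format(new Date(epochMillis));
    }
}
```

    Since the formatter in the stack trace lives inside the app server's own classes, this only helps if you can patch or replace that code; otherwise it is mainly useful for confirming the diagnosis with support.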

  • Oracle "IO Error" during SELECT query under heavy load

    We're experiencing a strange connection break during SELECT queries under heavy load.
    Platform Details: Solaris, Oracle 11G, JDK 1.6, 
    Application: Spring + Hibernate (C3p0 connection pooling)
    Exact error messages from a lengthy stack trace are mentioned below:
        2013/06/05 18:49:02 | Caused by: org.springframework.dao.DataAccessResourceFailureException: Hibernate operation: could not execute query; SQL [SQL Ommitted]; IO Error: No such file or directory;      nested exception is java.sql.SQLException: IO Error: No such file or directory 
        2013/06/05 18:49:02 | Caused by: java.sql.SQLException: IO Error: No such file or directory
        2013/06/05 18:49:02 |    at oracle.jdbc.driver.T4CPreparedStatement.fetch(T4CPreparedStatement.java:1091)
        2013/06/05 18:49:02 |    at oracle.jdbc.driver.OracleResultSetImpl.close_or_fetch_from_next(OracleResultSetImpl.java:369)
        2013/06/05 18:49:02 |    at oracle.jdbc.driver.OracleResultSetImpl.next(OracleResultSetImpl.java:273)
        2013/06/05 18:49:02 |    at com.mchange.v2.c3p0.impl.NewProxyResultSet.next(NewProxyResultSet.java:2706)
        2013/06/05 18:49:02 |    at org.hibernate.loader.Loader.doQuery(Loader.java:697)
        2013/06/05 18:49:02 | Caused by: java.net.SocketException: No such file or directory
        2013/06/05 18:49:02 |    at java.net.SocketInputStream.socketRead0(Native Method)
        2013/06/05 18:49:02 |    at java.net.SocketInputStream.read(SocketInputStream.java:129)
        2013/06/05 18:49:02 |    at oracle.net.ns.Packet.receive(Packet.java:282)
    We've started looking at TCP connection settings (Max. TCP connections allowed, Max File descriptors allowed for socket connections at system level). Anything we're missing?
    Why "IO Error: No such file or directory"? Any clue?

    user2951561 wrote:
    That's a better answer indeed.
    I can refine my question if it does not provide you enough information.
    The stack trace I displayed here states that the Oracle JDBC driver has found the connection to be closed, interrupted, etc.
    The application behaves perfectly under normal load but blows up as soon as we reach 3000 concurrent sessions. No firewall is breaking connections. The select query for which we observe this behavior is part of a larger workflow that also writes, updates, and deletes data in different tables; then we see the above stack trace for the select query.
    I am trying to explore possible options to investigate. One I mentioned is related to Solaris file descriptors. Could it be the database itself?
    Any possible course of action for investigation? Help is much appreciated.
    Oracle errors get reported with error code & message; like ORA-01555 Snapshot Too Old; which is not present in your post.
    You indicated that Connection Pooling is used.
    Is there some (artificial) limit within the application that falls off the cliff at 3000 sessions?
    Oracle does not know or care about the "flavor" of client connection. It treats jdbc the same as OCI or ODBC connections.
    Is OS limited to fixed number of open file handles?
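
    On the file-handle question: one way to watch descriptor usage from inside the JVM while the load test runs is the platform MXBean. This is a rough sketch that assumes a JVM on a Unix-like OS exposing com.sun.management.UnixOperatingSystemMXBean; the class name FdUsage is made up for illustration, and on JVMs without that bean the method just returns null.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Sketch: read the process's open and maximum file descriptor counts. */
final class FdUsage {
    private FdUsage() {}

    /** Returns {open, max} descriptor counts, or null if the JVM does not expose them. */
    static long[] snapshot() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null; // not available on this platform or JVM
    }
}
```

    Logging FdUsage.snapshot() periodically as the session count approaches 3000 would show whether the open count is running into the limit around the time the IO errors start.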

  • Javax.naming.NameNotFoundException of DataSource under heavy load

    WL 6.1 sp2, Solaris 2.8, JDK 1.3.1_02
    Getting the following error on production under heavy-load server, occurring
    infrequently (~1% of request) and irregularly:
    javax.naming.NameNotFoundException::Unable to resolve srPoolDS. Resolved:
    ""; Unresolved:"srPoolDS";
    srPoolDS is an oracle pool DataSource, and a BMP ejb is doing the lookup
    with a local new InitialContext(). Anyone experienced this before, and know
    of a solution?
    Gene

    You definitely bring up a good point. Here are my concerns:
    1) Even if I don't cache my JNDI lookups, I expect subsequent local lookups
    of an already-found object to not fail!
    2) If indeed JNDI lookup in 6.1/7.0 is now more expensive than it was in
    5.1, shouldn't the local caching be done by a WL proxy to Context? Why give
    the onus to the developer?
    Gene
    "Wenjin Zhang" <[email protected]> wrote in message
    news:3cf8f7b1$[email protected]..
    >
    Is it possible for you to cache the data source after one lookup and only refresh it
    after some system failure, since JNDI lookup is not a cheap process?
    "Gene Chuang" <[email protected]> wrote:
    WL 6.1 sp2, Solaris 2.8, JDK 1.3.1_02
    Getting the following error on production under heavy-load server, occurring
    infrequently (~1% of request) and irregularly:
    javax.naming.NameNotFoundException::Unable to resolve srPoolDS. Resolved:
    ""; Unresolved:"srPoolDS";
    srPoolDS is an oracle pool DataSource, and a BMP ejb is doing the lookup
    with a local new InitialContext(). Anyone experienced this before, and know
    of a solution?
    Gene
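
    The caching idea suggested in this thread (look the DataSource up once, reuse it, refresh only after a failure) could look roughly like the sketch below. The names CachedLookup and Resolver are hypothetical; the Resolver interface is just a seam so the sketch runs without a container, and in a server it would wrap Context.lookup. It does not explain why the lookup fails in the first place; it only reduces how often the lookup happens.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import javax.naming.NamingException;

/** Sketch: resolve a JNDI name once, reuse the result, allow an explicit refresh. */
final class CachedLookup {
    /** Hypothetical seam; in a server this would delegate to Context.lookup(name). */
    interface Resolver {
        Object resolve(String name) throws NamingException;
    }

    private final Resolver resolver;
    private final ConcurrentMap<String, Object> cache = new ConcurrentHashMap<>();

    CachedLookup(Resolver resolver) {
        this.resolver = resolver;
    }

    /** Returns the cached object, performing a real lookup only on a cache miss. */
    Object get(String name) throws NamingException {
        Object cached = cache.get(name);
        if (cached != null) {
            return cached;
        }
        Object resolved = resolver.resolve(name); // failures are not cached
        if (resolved == null) {
            return null;
        }
        Object raced = cache.putIfAbsent(name, resolved);
        return raced != null ? raced : resolved;
    }

    /** Drops a cached entry, e.g. after a connection failure, forcing a fresh lookup. */
    void invalidate(String name) {
        cache.remove(name);
    }

    /** Demo helper: how many real lookups happen for the given number of get() calls. */
    static int countRealLookups(int calls) {
        AtomicInteger realLookups = new AtomicInteger();
        CachedLookup lookups = new CachedLookup(name -> {
            realLookups.incrementAndGet();
            return new Object(); // stand-in for the DataSource
        });
        try {
            for (int i = 0; i < calls; i++) {
                lookups.get("srPoolDS");
            }
        } catch (NamingException e) {
            throw new IllegalStateException(e);
        }
        return realLookups.get();
    }
}
```

    With this shape, only the first request pays for the JNDI lookup, and invalidate gives the "refresh after some system failure" hook mentioned above.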

  • AdflibREADME.txt causing FileNotFound exceptions under heavy loads

    Top of the stack trace:
    [oracle.adf.library.rc.dependencies.LibDepsServiceStrategy] [tid: [ACTIVE].ExecuteThread: '13' for queue: 'weblogic.kernel.Default (self-tuning)'] [userId: edozer] [ecid: 4bc2e4dc1d398eeb:-3c344db6:133416f5ca2:-7ffd-0000000000007aeb,0] [APP: XXXXXXXXXX#V9.0.0] [[
    java.io.FileNotFoundException: file:<REDACTED FOR CONFIDENTIALITY>.jar!/adflibREADME.txt
    The error is originating from
         at oracle.adf.library.rc.dependencies.LibDepsServiceStrategy.getResources(LibDepsServiceStrategy.java:96)
         at oracle.adf.library.rc.dependencies.LibDepsServiceStrategy.<init>(LibDepsServiceStrategy.java:64)
    whose source reads
    URL url = getDepFileURL( dc.getJarURL() );
    r = new BufferedReader(
    new InputStreamReader( URLFileSystem.openInputStream(url), "UTF8" ) );
    oracle.adf.library.rc.dependencies.LibDepsServiceStrategy.getResources is trying to read adflibREADME.txt from one of our application's ADF Library Jars and failing to find it. THIS IS ONLY HAPPENING UNDER HEAVY LOAD.
    Anyone?
    Edited by: Hyangelo on Oct 28, 2011 6:27 AM
    Edited by: Hyangelo on Oct 28, 2011 7:40 AM

    Additional info:
    Decompiled "javatools-nodeps.jar" which contains the class JarIndex
    Based on the stack trace in the log file:
    JarIndex.getLOC(JarIndex.java:1193) threw FileNotFoundException
    private int[] getLOC(RandomAccessFile jar, byte[] localHeader, String entryName)
    throws IOException
    if (this._entryNamesPool == null)
    return getLOCFromHash(jar, localHeader, entryName);
    int[] offsetAndSizes = getOffsetAndSizes(entryName);
    int offset = offsetAndSizes[0];
    if (offset >= 0)
    jar.seek(offset);
    jar.read(localHeader, 0, 30);
    if ((localHeader[0] == 80) && (localHeader[1] == 75) && (localHeader[2] == 3) && (localHeader[3] == 4))
    int filenameLength = (localHeader[26] & 0xFF) + ((localHeader[27] & 0xFF) << 8);
    byte[] filenameBytes = new byte[filenameLength];
    jar.readFully(filenameBytes);
    String filename = new String(filenameBytes, "UTF8");
    if (ModelUtil.areDifferent(filename, entryName))
    throw new IOException("Mismatched entry names '" + filename + "' and '" + entryName + "' for jar file" + this._jarFileURL.toString());
    int extraFieldLength = (localHeader[28] & 0xFF) + ((localHeader[29] & 0xFF) << 8);
    jar.skipBytes(extraFieldLength);
    return offsetAndSizes;
    throw new IOException("Corrupt entry offset for URL " + this._jarFileURL.toString());
    throw new FileNotFoundException(getJarEntryString(entryName)); [According to decompiler, this is JarIndex.java:Line 1193]
    Based on the decompiled code and the stack trace, this is happening because the above code failed to validate the jar file it was trying to read.
    if ((localHeader[0] == 80) && (localHeader[1] == 75) && (localHeader[2] == 3) && (localHeader[3] == 4)) returned false
    So it threw the file not found exception. The file most probably exists but didn't have the expected header byte values when the method above was reading it.
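
    For reference, the header test in the decompiled code is the standard check for a ZIP local file header: the four signature bytes 'P' 'K' 3 4, followed later by little-endian 16-bit lengths at offsets 26 and 28. A small standalone sketch of just those checks is below; the class name LocalHeaderCheck and its method names are made up for illustration and are not part of JarIndex.

```java
/** Sketch of the local-file-header checks that appear in the decompiled getLOC. */
final class LocalHeaderCheck {
    /** Number of bytes the decompiled code reads with jar.read(localHeader, 0, 30). */
    static final int HEADER_LENGTH = 30;

    private LocalHeaderCheck() {}

    /** True when the buffer starts with the signature bytes 80, 75, 3, 4 ('P' 'K' 3 4). */
    static boolean hasLocalHeaderSignature(byte[] header) {
        return header.length >= 4
            && header[0] == 80 && header[1] == 75
            && header[2] == 3 && header[3] == 4;
    }

    /** Reads an unsigned little-endian 16-bit value at the given offset. */
    static int u16(byte[] header, int offset) {
        return (header[offset] & 0xFF) + ((header[offset + 1] & 0xFF) << 8);
    }

    /** Length of the entry name that follows the fixed header (bytes 26 and 27). */
    static int fileNameLength(byte[] header) {
        return u16(header, 26);
    }

    /** Length of the extra field that follows the entry name (bytes 28 and 29). */
    static int extraFieldLength(byte[] header) {
        return u16(header, 28);
    }
}
```

    Dumping the 30 bytes actually read at the failing offset and running them through a check like this would show whether the reader really saw something other than a local header at that position.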

  • Broken TCP stack in latest kernel when under heavy load

    I'm running an Arch box with a decent amount of HTTP traffic. After upgrading to the latest kernel I've seen that packets are being sent with the wrong source and destination addresses. This only happens under heavy load (100+ requests per second). tcpdump shows the following:
    18:52:58.512573 IP 0.0.0.0.80 > 0.0.0.0.4316: Flags [FP.], seq 0, ack 1, win 14400, length 0
    18:52:58.512600 IP 0.0.0.0.80 > 0.0.0.0.56546: Flags [FP.], seq 0, ack 1, win 14400, length 0
    18:52:58.512621 IP 0.0.0.0.80 > 0.0.0.0.4535: Flags [FP.], seq 0, ack 1, win 14600, length 0
    18:52:58.512641 IP 0.0.0.0.80 > 0.0.0.0.3528: Flags [FP.], seq 0, ack 1, win 14600, length 0
    18:52:58.512662 IP 0.0.0.0.80 > 0.0.0.0.4509: Flags [FP.], seq 0, ack 1, win 14400, length 0
    18:52:58.512682 IP 0.0.0.0.80 > 0.0.0.0.65040: Flags [FP.], seq 0, ack 1, win 14600, length 0
    18:52:58.512702 IP 0.0.0.0.80 > 0.0.0.0.2455: Flags [FP.], seq 0, ack 1, win 10240, length 0
    18:52:58.512722 IP 0.0.0.0.80 > 0.0.0.0.16545: Flags [FP.], seq 0:268, ack 1, win 15008, length 268
    18:52:58.519258 IP 0.0.0.0.80 > 0.0.0.0.29802: Flags [FP.], seq 0:268, ack 1, win 980, options [nop,nop,TS val 745514 ecr 1317559555], length 268
    18:52:58.565907 IP 0.0.0.0.80 > 0.0.0.0.32376: Flags [FP.], seq 0, ack 1, win 14400, length 0
    18:52:58.619241 IP 0.0.0.0.80 > 0.0.0.0.50493: Flags [FP.], seq 0:268, ack 1, win 11256, options [nop,nop,TS val 745544 ecr 9539361], length 268
    18:52:58.805927 IP 0.0.0.0.80 > 0.0.0.0.20852: Flags [FP.], seq 3025419976:3025420244, ack 3037671074, win 967, options [nop,nop,TS val 745600 ecr 6445640], length 268
    18:52:58.805953 IP 0.0.0.0.80 > 0.0.0.0.65025: Flags [FP.], seq 1663827778:1663828046, ack 2127675352, win 707, options [nop,nop,TS val 745600 ecr 457812708], length 268
    18:52:58.845918 IP 0.0.0.0.80 > 0.0.0.0.2217: Flags [FP.], seq 0:268, ack 1, win 707, options [nop,nop,TS val 745612 ecr 546643], length 268
    18:52:59.099245 IP 0.0.0.0.80 > 0.0.0.0.5112: Flags [FP.], seq 0:268, ack 1, win 15008, length 268
    18:52:59.152582 IP 0.0.0.0.80 > 0.0.0.0.1175: Flags [FP.], seq 0:268, ack 1, win 15008, length 268
    18:52:59.232612 IP 0.0.0.0.80 > 0.0.0.0.47217: Flags [FP.], seq 684621876:684622144, ack 3544859356, win 11256, length 268
    18:52:59.659258 IP 0.0.0.0.80 > 0.0.0.0.3098: Flags [FP.], seq 2105858244:2105858512, ack 3896053916, win 980, options [nop,nop,TS val 745856 ecr 52041], length 268
    18:52:59.659290 IP 0.0.0.0.80 > 0.0.0.0.3099: Flags [FP.], seq 18772067:18772335, ack 2568646283, win 980, options [nop,nop,TS val 745856 ecr 52041], length 268
    18:52:59.759244 IP 0.0.0.0.80 > 0.0.0.0.18780: Flags [FP.], seq 0:268, ack 1, win 707, options [nop,nop,TS val 745886 ecr 168876], length 268
    18:52:59.845907 IP 0.0.0.0.80 > 0.0.0.0.58449: Flags [FP.], seq 0, ack 1, win 980, options [nop,nop,TS val 745912 ecr 528058426], length 0
    18:52:59.925936 IP 0.0.0.0.80 > 0.0.0.0.65137: Flags [FP.], seq 0:268, ack 1, win 15008, length 268
    18:52:59.979497 IP 0.0.0.0.80 > 0.0.0.0.2920: Flags [FP.], seq 0:268, ack 1, win 980, options [nop,nop,TS val 745952 ecr 18879], length 268
    18:52:59.979527 IP 0.0.0.0.80 > 0.0.0.0.2922: Flags [FP.], seq 0:268, ack 1, win 980, options [nop,nop,TS val 745952 ecr 18879], length 268
    18:52:59.979553 IP 0.0.0.0.80 > 0.0.0.0.2940: Flags [FP.], seq 0:268, ack 1, win 980, options [nop,nop,TS val 745952 ecr 18879], length 268
    Source and destination ports are correctly set. Wireshark shows the correct HTML inside the packets that are returned to 0.0.0.0. The web server log also looks normal; the correct IP address is displayed and logged as a successful request.
    When dropping incoming traffic on port 80 on eth0, everything works as expected (when requesting the server on eth1, which otherwise fails).
    I'm running on "Linux srv 3.0-ARCH #1 SMP PREEMPT Wed Oct 19 12:14:48 UTC 2011 i686" which is the latest kernel in the repos. When booting the fallback image this problem does not exist, all packets are correctly addressed no matter how much load I put on the server.
    Does anyone else have this problem?
    Edit:
    Running lighttpd 1.4.29. No tweaked kernel/TCP parameters whatsoever.
    Last edited by nullvoid (2011-10-29 17:19:57)

    Did a full reinstall of Arch on another machine and the problem still persists. Tried with Apache and Nginx, same behaviour as with Lighttpd. Could anyone else using an Arch box under heavy load see if there's activity from 0.0.0.0?
    Hint:
    # tcpdump -n host 0.0.0.0
    I'll do a bug report upstream later today.

  • Session bleed under heavy load...any suggestions?

    hello.
    i'm working on an application that has user sensitive data and we are seeing session bleed under heavy load (i.e. users reporting seeing other users' data, error reports with missing session values, things along those lines).  the app itself is typical stuff; a user logs in, they see information specific to their user account and do things with it.  some of that information comes from the session.  this all seems to work fine under normal load (100 or fewer users), or with a few users testing, but fails under heavy load (1000+ concurrent users).  we cannot reproduce it locally, nor can we see it when we log into the system ourselves and click around during peak load times.
    here is some more detail.  as i mentioned, we are storing certain user information in the session.  we use an exclusive lock of the session scope to write that info, and a readonly lock of the session scope to read it (i am quadruple checking this now).  this app is running in a multi-instance clustered environment (all on the same server).  CF8 with IIS.  we are using j2ee session management, with sticky sessions and session replication on.  we were seeing the session bleed before the clustering was introduced however...
    one caveat is that a huge number of our users come from behind a proxy system, meaning they all have the same IP.  i did some searching on this, but could not find any definitive information that it would create a problem with session variables.
    i was wondering if anyone else had seen this kind of problem and/or had any suggestions in dealing with it?
    thanks.

    the jury is still out to a degree, but i think we've identified the culprit(s) of our session bleed for anyone interested.  it boiled down to two problems.
    1.  var scoping issues.  unfortunately this was a fairly old application, written before we strictly employed best practices on var scoping variables within functions in all our cfc's.  we've fixed the bad code and our session bleed problems seem to have stopped.  there is a great utility for checking code for var scoping problems available at: http://varscoper.riaforge.org/
    2.  a misunderstanding of how cflock with a timeout setting (and no throw on error) behaves.  aside from session bleed, it turned out we had another issue in there, which was expected session values missing altogether.  the crux of the problem is that we had set our cflock read/write timeouts to 30 seconds.  under the extremely heavy load, requests were routinely exceeding those timeout thresholds.  the locks were not set to throw on error, so when the timeout threshold was exceeded, the code within the lock ended up just being skipped.  this was leading to missing data in the session.  temporarily we've simply increased the timeout setting to a large number, which has fixed our problem.  eventually we'll set these locks to throw on error and handle the exception in a more graceful manner.
    hope this helps someone.
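    The second point (a timed lock that quietly skips its body when the timeout expires) is easy to reproduce outside ColdFusion. What follows is a Java analogy, not CFML: a rough sketch using ReentrantLock.tryLock, where the class, field and method names are all invented for illustration. writeOrSkip mimics a lock that does not throw on timeout, and writeOrThrow mimics one that does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative names only; this is an analogy to the cflock behaviour described above.
final class LockTimeoutDemo {
    static final ReentrantLock sessionLock = new ReentrantLock();
    static String sessionValue = null;

    // Waits up to timeoutMs for the lock; false on timeout or interruption.
    private static boolean acquire(long timeoutMs) {
        try {
            return sessionLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Lock that does not throw on timeout: the write is silently skipped.
    static boolean writeOrSkip(String value, long timeoutMs) {
        if (!acquire(timeoutMs)) {
            return false; // callers that ignore this never learn the write was lost
        }
        try {
            sessionValue = value;
            return true;
        } finally {
            sessionLock.unlock();
        }
    }

    // Lock that throws on timeout: the lost write becomes a visible failure.
    static void writeOrThrow(String value, long timeoutMs) {
        if (!acquire(timeoutMs)) {
            throw new IllegalStateException("session lock not acquired within " + timeoutMs + " ms");
        }
        try {
            sessionValue = value;
        } finally {
            sessionLock.unlock();
        }
    }

    // Runs the action while another thread holds the lock, to simulate heavy contention.
    static void runWhileLockHeldElsewhere(Runnable action) {
        CountDownLatch acquired = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            sessionLock.lock();
            try {
                acquired.countDown();
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                sessionLock.unlock();
            }
        });
        holder.start();
        try {
            acquired.await();
            action.run();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            release.countDown();
            try {
                holder.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        runWhileLockHeldElsewhere(() -> {
            System.out.println("write applied: " + writeOrSkip("alice", 50));
            try {
                writeOrThrow("alice", 50);
            } catch (IllegalStateException e) {
                System.out.println("failure surfaced: " + e.getMessage());
            }
        });
        System.out.println("session value afterwards: " + sessionValue);
    }
}
```

    In the skip variant the session value simply stays unset, which matches the "expected session values missing" symptom. Raising the timeout only makes the skip rarer; throwing and handling the error is what makes it visible.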

  • Decryption fails under heavy load condition

    I am using the XML Security APIs with a JCE security provider, and under heavy load conditions (1000 XML messages per second) an "Invalid PKCS #5 padding" exception is thrown. The XML Security processor is running on a dual-processor Windows XP machine using jdk1.5. Also, this exception is thrown only a few times if we keep the system running for more than 12 hours.
    I was wondering if anyone has tested the JCE APIs for encryption/decryption using RSA_AES-256 under heavy load conditions. I will appreciate any response.
    Thanks,
    Najeeb Andrabi

    I have not heard of a problem like this, and I have been using the JCE for a long time and have used it twice on very high volume websites without any problems.
    Sounds to me like you are assuming that some code is thread safe when it is not. Without seeing your code it is going to be difficult for anyone to make a comment.
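    One common form of that assumption in JCE code is sharing a single javax.crypto.Cipher instance between threads. A Cipher keeps internal state between init and doFinal, so concurrent use can corrupt a decryption and surface as a padding error. Since the poster's code was not shown, this is only a guess at the cause. The sketch below creates a fresh Cipher per call; the class name, the IV-prefixed message layout and the 128-bit AES/CBC setup are assumptions made for the example, not the poster's RSA_AES-256 configuration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Illustrative class; nothing here comes from the original poster's code.
final class PerCallCipher {
    private static final String TRANSFORMATION = "AES/CBC/PKCS5Padding";
    private static final SecureRandom RANDOM = new SecureRandom();

    // Encrypts with a Cipher local to this call; output is the 16-byte IV followed by the ciphertext.
    static byte[] encrypt(SecretKeySpec key, byte[] plain) throws GeneralSecurityException {
        byte[] iv = new byte[16];
        RANDOM.nextBytes(iv);
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        byte[] body = cipher.doFinal(plain);
        byte[] out = Arrays.copyOf(iv, iv.length + body.length);
        System.arraycopy(body, 0, out, iv.length, body.length);
        return out;
    }

    // Decrypts with a Cipher local to this call, reading the IV from the first 16 bytes.
    static byte[] decrypt(SecretKeySpec key, byte[] message) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(message, 0, 16));
        return cipher.doFinal(message, 16, message.length - 16);
    }

    // Runs round trips on several threads at once; true only if every message survives intact.
    static boolean roundTripsConcurrently(int threads, int perThread) {
        byte[] keyBytes = new byte[16];
        RANDOM.nextBytes(keyBytes);
        SecretKeySpec key = new SecretKeySpec(keyBytes, "AES");
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int id = t;
                results.add(pool.submit(() -> {
                    for (int i = 0; i < perThread; i++) {
                        byte[] plain = ("msg-" + id + "-" + i).getBytes(StandardCharsets.UTF_8);
                        if (!Arrays.equals(plain, decrypt(key, encrypt(key, plain)))) {
                            return false;
                        }
                    }
                    return true;
                }));
            }
            for (Future<Boolean> result : results) {
                if (!result.get()) {
                    return false;
                }
            }
            return true;
        } catch (Exception e) {
            return false; // any crypto or threading failure counts as a failed run
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("all round trips intact: " + roundTripsConcurrently(4, 500));
    }
}
```

    If creating a Cipher per message turns out to be too slow at 1000 messages per second, keeping one instance per thread (for example in a ThreadLocal) preserves the same isolation.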

  • Server hangs or freezes during heavy load

    During peak times of the day, especially during heavy load on the Calendar Server, the application seems to hang. The client side application will not respond on the user's desktop, and uni* commands on the server itself respond considerably slowly.
    There are two parameters in the server configuration file that are strongly believed to be a trigger of server hangs or freezes in large deployments and/or busy servers. Here is a description of the problem:
    Large deployments tend to be 3000+ users per node. This could be a single or multi-node environment.
    A lock manager fix was implemented in 4.0 to correct a problem that was found in 3.51 where the server would hang. At that time, the parameters called read/writelocktimeouts were introduced as a failover mechanism in case the database was not available, which would then trigger the client process to disconnect rather than hang the whole server.
    These timeouts effectively will terminate a process whose read or write exceeds the specified periods. The default of 20 seconds is quite a large amount of time; however, it is not totally unlikely that such a value could be met on a very busy system. If this is the case, and there is some relation between a process being terminated by one of these timeouts and subsequent system instability, then the "solution" would not be to extend the values of the timeouts but rather to exclude them. This way, it will ensure that no process is terminated this way and therefore the process would be allowed to continue until it had completed its job.
    The timeouts were not removed from the product, but under normal circumstances they probably won't be needed anymore anyhow. It seems that on a busy calendar server, setting the db timeout alarms may actually trigger the server to freeze. Below are some examples of errors that appear in the log files which show that the database is no longer accepting client requests:
    db_VISTA ERROR -920 -> cst_d_open: d_open
    db_SchedBaseOpen: unable to open database
    probable cause: unilckd is down or "/users/unison/tmp/unisonlckm"
    was removed
    uniengd: database lock timeout
    ITEM: "NA,NA" <0,0>
    CLIENT: "unises", "A.02.80"
    INET-NAME:
    INET-ADDR:
    CALL: "SessionsInfoGet"
    To make the fix:
    1.  Using your favorite editor, edit the /users/unison/misc/unison.ini file. In the following section you will see these two parameters:
        [ENG]
        writelocktimeout = 20
        readlocktimeout = 20
    2.  Place a "#" sign (or the appropriate comment symbol for your OS) in front of these two lines and save the file.
    3.  The server will now have to be restarted in order for the changes to take effect.

    This looks similar to what I'm seeing.
    DPM 2010, there's one backup set (for me a file server disk) where every time I try to run the initial replica on it, the server hangs and needs to be rebooted via iLO. It doesn't just die suddenly: first the data stream on the backup stops, then the OS becomes less responsive, but there is no resource issue. Trying to open Event Viewer will cause a few things to lock up, then over a few minutes the server is completely frozen, as if the disk drives have been locked.
    Suspecting McAfee, I added in all the exclusions; that didn't help, so I added the process exclusions, which are done by setting dpmra and csc to low risk, and that didn't help either. I could reproduce it just by kicking off a backup for this one file server's drive, so it's easy to test with.
    Tonight, I had some permissions in EPO to let me stop the scanning completely and disable the on-access scan and for the first time it worked!
    There is definitely an issue between DPM and McAfee beyond what is on MS's web page for AV checks.
    I don't have a workaround yet other than stopping the AV completely... Something to follow up on next week. For the moment I made some progress though.

  • How to redirect to a server busy page when under heavy load

    Hello,
    I have been doing extensive load testing of a web application using Weblogic 5.10 sp 06. I have found a point where under extremely heavy load the server just does not respond anymore. Fair enough.
    What I want to do is at a certain load level (before it stops responding) I want to redirect users to a "Server Busy - try again later" page. Is there a setting in weblogic that allows me to do this? Or do I need to have other monitoring software to take care of this?
    Thanks and Regards,
    Nick H

    Cameron Purdy <[email protected]> wrote:
    > Hi Nick,
    > Unfortunately, last I checked there was no such processing. Weblogic maintains a big (2^16) queue that it puts requests into and (if I remember correctly) it doesn't gracefully handle overflow. I believe the architecture should have been a smaller queue with the overflow condition being protocol-specific (such as HTTP doing a "too busy" error).
    It is possible to create and use your own execute queue in 6.1 and specify its length, so I expected this to happen when the queue length reaches this number, but it didn't.
    > Peace,
    > --
    > Cameron Purdy
    > Tangosol, Inc.
    > http://www.tangosol.com
    > Tangosol: How Weblogic applications are customized
    Dimitri
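    Since neither reply points to a built-in setting, one option is to shed load in the application itself: cap the number of requests in flight and answer everything above the cap with a "busy" response straight away. The following is a rough sketch in plain Java under that assumption. LoadShedder and its methods are invented names, not a WebLogic API, and in a real deployment the same check would sit in whatever front-line hook the server offers, sending an HTTP 503 or a redirect to the busy page.

```java
import java.util.concurrent.Semaphore;

// Hypothetical application-level guard; not a WebLogic feature.
final class LoadShedder {
    static final int OK = 200;
    static final int SERVICE_UNAVAILABLE = 503;

    private final Semaphore slots;

    LoadShedder(int maxInFlight) {
        this.slots = new Semaphore(maxInFlight);
    }

    // Runs the handler if a slot is free; otherwise returns 503 without queueing or blocking.
    int handle(Runnable handler) {
        if (!slots.tryAcquire()) {
            return SERVICE_UNAVAILABLE;
        }
        try {
            handler.run();
            return OK;
        } finally {
            slots.release();
        }
    }

    public static void main(String[] args) {
        LoadShedder shedder = new LoadShedder(1);
        // The inner call arrives while the outer one still holds the only slot.
        int outer = shedder.handle(() ->
                System.out.println("second request while busy: " + shedder.handle(() -> { })));
        System.out.println("first request: " + outer);
        System.out.println("request after load drops: " + shedder.handle(() -> { }));
    }
}
```

    The cap would be set a little below the load level where testing showed the server stops responding, so users get the busy page while the server can still produce it.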
              

  • System locks up under heavy load 890fx-gd70 T1055

    So I just upgraded from a 790fx-gd70 with an x4 620 to my new 890fx-gd70 with an x6 T1055. Same Mushkin RAM: 996657 - 4GB (2x2GB) DDR3 PC3-12800 7-7-7-20 Blackline. I never had any stability issues with my 790fx. Now I can't get the thing to run much at all. In order to get the system to boot up Windows 7 I had to manually set the RAM voltage up to 1.9, then it would let me install Win 7. When I run a stability test or put heavy load on the system it just locks up. Everything in the BIOS is set to auto except DRAM voltage. Help?
    Video card is an ATI 5770 and the power supply is a Tuniq Ripper PSU-RIP1000W-BK.
    I THINK it has to do with voltage for the CPU or NB but don't know where to begin.
    Also, the CPU doesn't get hotter than 35c; I'm using a 120mm Rifle CPU Cooler. Under no/low load it is stuck at a cool 17c. My case is an NZXT TEMPEST EVO, so I don't think heat is an issue.

    Put just 1 stick of RAM in DIMM slot #3 and it seems to be working fine under the stability test so far. Gonna run it for a while and see how that works out. Windows still won't load unless I manually set the RAM voltage to 1.8 or higher.
