Coherence warnings causing our production managed servers to go down / become non-functional

Dear Legends,
We are seeing a number of "Coherence Warning" messages. We found that one issue was that the servers in the cluster were starting in multicast mode even though the cluster had been set to unicast. After we raised an SR with Oracle, they suggested adding the WKA hosts to the startup scripts. We added them and monitored the servers closely through the weekend, but as before the servers become non-functional every 5-6 days.
This is on HOST1, which runs the Admin Server and Managed Server 1 (soa-server1):
-Dtangosol.coherence.wka1=soams1.com
-Dtangosol.coherence.wka2=soams2.com
-Dtangosol.coherence.localhost=soams1.com
This is on HOST2, which runs Managed Server 2 (soa-server2):
-Dtangosol.coherence.wka1=soams1.com
-Dtangosol.coherence.wka2=soams2.com
-Dtangosol.coherence.localhost=soams2.com
We looked closely at the logs, which show:
<Oct 26, 2013 2:51:50 AM EDT> <Warning> <Coherence> <BEA-000000> <2013-10-26 02:51:50.417/439530.132 Oracle Coherence GE 3.7.1.1 <Warning> (thread=PacketPublisher, member=1): Experienced a 13215 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2013-10-21 00:56:32.745, Address=10.2.0.35:8088, MachineId=14118, Location=site:,machine:soams2,process:15664, Role=WeblogicServer); 82 packets rescheduled, PauseRate=0.0, Threshold=1976>
<Oct 26, 2013 2:52:35 AM EDT> <Warning> <Coherence> <BEA-000000> <2013-10-26 02:52:35.298/439575.013 Oracle Coherence GE 3.7.1.1 <Warning> (thread=PacketPublisher, member=1): Experienced a 4094 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2013-10-21 00:56:32.745, Address=10.2.0.35:8088, MachineId=14118, Location=site:,machine:soams2,process:15664, Role=WeblogicServer); 37 packets rescheduled, PauseRate=0.0, Threshold=1878>
<Oct 26, 2013 2:54:04 AM EDT> <Warning> <Coherence> <BEA-000000> <2013-10-26 02:54:04.144/439663.859 Oracle Coherence GE 3.7.1.1 <Warning> (thread=PacketPublisher, member=1): Experienced a 5936 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2013-10-21 00:56:32.745, Address=10.2.0.35:8088, MachineId=14118, Location=site:,machine:soams2,process:15664, Role=WeblogicServer); 46 packets rescheduled, PauseRate=0.0, Threshold=1696>
<Oct 26, 2013 2:55:04 AM EDT> <Warning> <Coherence> <BEA-000000> <2013-10-26 02:55:04.396/439724.111 Oracle Coherence GE 3.7.1.1 <Warning> (thread=PacketPublisher, member=1): Experienced a 19188 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2013-10-21 00:56:32.745, Address=10.2.0.35:8088, MachineId=14118, Location=site:,machine:soams2,process:15664, Role=WeblogicServer); 112 packets rescheduled, PauseRate=0.0, Threshold=1612>
<Oct 26, 2013 2:56:55 AM EDT> <Warning> <Coherence> <BEA-000000> <2013-10-26 02:56:55.540/439835.255 Oracle Coherence GE 3.7.1.1 <Warning> (thread=PacketPublisher, member=1): Experienced a 32323 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2013-10-21 00:56:32.745, Address=10.2.0.35:8088, MachineId=14118, Location=site:,machine:soams2,process:15664, Role=WeblogicServer); 178 packets rescheduled, PauseRate=1.0E-4, Threshold=1532>
<Oct 26, 2013 2:58:09 AM EDT> <Warning> <Coherence> <BEA-000000> <2013-10-26 02:58:09.435/439909.150 Oracle Coherence GE 3.7.1.1 <Warning> (thread=PacketPublisher, member=1): Experienced a 33213 ms communication delay (probable remote GC) with Member(Id=2, Timestamp=2013-10-21 00:56:32.745, Address=10.2.0.35:8088, MachineId=14118, Location=site:,machine:soams2,process:15664, Role=WeblogicServer); 182 packets rescheduled, PauseRate=2.0E-4, Threshold=1456>
What could be the reason, even after adding the WKA settings?
1. In the startup logs I can still see "-Dtangosol.coherence.clusteraddress=227.7.7.9 -Dtangosol.coherence.clusterport=9778". Is this the issue?
2. Or do I need to add "-Dtangosol.coherence.localport.adjust=true -Dtangosol.coherence.localport=8089 -Dtangosol.coherence.wka1.port=8089 -Dtangosol.coherence.wka2.port=8089"?
Any kind of help would be much appreciated. Thanks in advance.
Regards,
Karthik
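
To see whether the delays cluster around particular times or one remote member, the warnings above can be pulled out of the server log and summarized. This is only a rough sketch in Python: the regular expression is written against the exact message format quoted above (Coherence GE 3.7.1.1), and the log file name in the usage comment is hypothetical.

```python
import re

# Written against the "communication delay" warnings quoted above.
DELAY_RE = re.compile(
    r"Experienced a (?P<delay_ms>\d+) ms communication delay .*?"
    r"with Member\(Id=(?P<member_id>\d+),.*?Address=(?P<address>[\d.]+:\d+)"
    r".*?; (?P<packets>\d+) packets rescheduled"
)


def parse_delays(lines):
    """Return one dict per communication-delay warning found in lines."""
    events = []
    for line in lines:
        match = DELAY_RE.search(line)
        if match:
            events.append({
                "delay_ms": int(match.group("delay_ms")),
                "member_id": int(match.group("member_id")),
                "address": match.group("address"),
                "packets": int(match.group("packets")),
            })
    return events


def summarize(events):
    """Group delays by remote member: count, worst delay and total delay."""
    summary = {}
    for event in events:
        entry = summary.setdefault(
            event["member_id"], {"count": 0, "max_ms": 0, "total_ms": 0}
        )
        entry["count"] += 1
        entry["max_ms"] = max(entry["max_ms"], event["delay_ms"])
        entry["total_ms"] += event["delay_ms"]
    return summary

# Example use (hypothetical file name):
#   with open("soa-server1.out") as handle:
#       print(summarize(parse_delays(handle)))
```

Since the message itself says "probable remote GC", comparing these timestamps with the GC log of the remote member (Member 2 here) would be a natural next step.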

Also
for HOST1
-Dtangosol.coherence.wka1=soams1.com
-Dtangosol.coherence.wka2=soams2.com
-Dtangosol.coherence.wka1.port=8089
-Dtangosol.coherence.wka2.port=8090
-Dtangosol.coherence.localhost=soams1.com
-Dtangosol.coherence.localport.adjust=true
-Dtangosol.coherence.localport=8089
for HOST2
-Dtangosol.coherence.wka1=soams1.com
-Dtangosol.coherence.wka2=soams2.com
-Dtangosol.coherence.wka1.port=8089
-Dtangosol.coherence.wka2.port=8090
-Dtangosol.coherence.localhost=soams2.com
-Dtangosol.coherence.localport.adjust=true
-Dtangosol.coherence.localport=8090
regards,
Leo_TA
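
One way to keep the two hosts consistent is to generate the flags from a single member list, so the WKA entries are identical everywhere and only the local host and port differ. A minimal sketch in Python, assuming the host names and ports from the lists above; `wka_flags` is a hypothetical helper, and nothing here establishes that these settings resolve the delay warnings.

```python
def wka_flags(wka_members, local_host):
    """Build the Coherence WKA startup flags for one host.

    wka_members is an ordered list of (host, port) pairs shared by all hosts;
    local_host must be one of those hosts.
    """
    ports = dict(wka_members)
    if local_host not in ports:
        raise ValueError("local host %r is not in the WKA list" % local_host)

    flags = []
    # The WKA host list is the same on every member.
    for index, (host, _port) in enumerate(wka_members, start=1):
        flags.append("-Dtangosol.coherence.wka%d=%s" % (index, host))
    for index, (_host, port) in enumerate(wka_members, start=1):
        flags.append("-Dtangosol.coherence.wka%d.port=%d" % (index, port))
    # Only these entries differ per host.
    flags.append("-Dtangosol.coherence.localhost=%s" % local_host)
    flags.append("-Dtangosol.coherence.localport.adjust=true")
    flags.append("-Dtangosol.coherence.localport=%d" % ports[local_host])
    return flags


MEMBERS = [("soams1.com", 8089), ("soams2.com", 8090)]
for host, _ in MEMBERS:
    print(host, " ".join(wka_flags(MEMBERS, host)))
```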

Similar Messages

  • Product returned from repair non-functional

    I received my touchpad back from being repaired. As soon as I opened the box I noticed there was a large black area on the screen that looks like water is trapped under the glass. I charged the touchpad for 24 hours and went to turn it on, only to find it completely unresponsive.
    This is after waiting three weeks to get it back from the repair depot. I would really like a touchpad that works, as for the two days it did work I found the product to be wonderful.
    My old SRO# S1-511043920199. Any help expediting this would be appreciated.
    Post relates to: HP TouchPad (WiFi)
    This question was solved.

    I have notified someone about this and hopefully they'll contact you soon.  I'm sorry that the device came back to you that way, that shouldn't happen.

  • Weblogic managed servers connecting to the servers in different cluster

              Hi All,
              We have had a weird problem going on for a while. We have a cluster configuration
              with an admin server and two managed servers. We have a similar configuration
              in DEV, TEST and PROD. The problem is that the managed servers in the DEV cluster
              are making connections to managed servers that are members of the PROD cluster for
              session replication. In the same way, TEST servers are trying to connect to PROD and
              DEV.
              Has anyone seen this kind of problem before? BEA seems to be clueless so far.
              Thanks in advance for your input.
              Udit
              

              Venkat,
              That's a good suggestion, but these things are too obvious to ignore. We have different
              multicast addresses in DEV and PROD, and the hosts are on different subnets. I do
              not know if the cluster name will make any difference though.
              Thanks for your input anyway,
              Udit
              "venkat" <[email protected]> wrote:
              >
              >Udit,
              > You can check the sub net, multicast address and the cluster name.
              >If the dev
              >and prod servers are in the same sub net with same multicast address,
              >then change
              >the multicast and try.
              >
              >Venkat
              >"venkat" <[email protected]> wrote:
              >>
              >>Udit,
              >>
              >>
              >>"Udit Singh" <[email protected]> wrote:
              >>>
              >>>Kumar,
              >>>Thanks for the reply.
              >>>The situation is that managed server in DEV try to replicate the session
              >>>to a
              >>>managed server in PROD and TEST and vice versa.
              >>>Let us say our dev managed servers are running on abc01 and abc02 and
              >>>prod managed
              >>>servers are running on xyz01 and xyz02. All the managed servers are
              >>running
              >>>on
              >>>port 7005.
              >>>If I do a netstat on abc01 or abc02 I can see established connections
              >>>between abc01/02 and xyz01/02.
              >>>Why is that happening? We are running 6.1SP2.
              >>>
              >>>Udit
              >>>
              >>>Kumar Allamraju <[email protected]> wrote:
              >>>>We do not restrict intercluster communication as of 61 SP3.
              >>>>Once we get the IP from the cookie, we can safely make a
              >>>>connection to the other clustered node. We were not checking
              >>>>if the server is part of the same cluster or not. This is
              >>>>already fixed in 7.x and 61 SP4(not yet released) If you are
              >>>>on 61 Sp2 or SP3 then you should contact support and
              >>>>reference CR # CR089798 to get a one off patch.
              >>>>
              >>>>Regardless, are you traversing from DEV to PROD cluster and
              >>>>vice-versa. If not then this problem shouldn't happen unless
              >>>>plugin is routing the request to wrong cluster.
              >>>>
              >>>>--
              >>>>Kumar
              >>>>
              >>>>Udit Singh wrote:
              >>>>> Hi All,
              >>>>> We have a weired problem going on for a while. We have a cluster
              >>configuration
              >>>>> with an admin server and two managed servers. We have the similar
              >>>configuration
              >>>>> in DEV, TEST and PROD. The problem is that the managed server members
              >>>>in DEV cluster
              >>>>> are making connections to managed servers which are member of PROD
              >>>>cluster for
              >>>>> session replication. The same way TEST servers are trying to connect
              >>>>to PROD and
              >>>>> DEV.
              >>>>> Has anyone seen this kind of problem before. BEA seems to be cluless
              >>>>so far.
              >>>>>
              >>>>> Thanks in adavnce for your input.
              >>>>> Udit
              >>>>
              >>>
              >>
              >
              

  • SCOM 2012 SP1 UR4 management servers grey state

    Hi,
    My SCOM environment is made up of the below :-
    SCOM 2012 SP1 UR4.
    3 SCOM Management Servers all on Windows 2008 R2 SP1.
    Shared SQL 2008 cluster with 2 Windows nodes also on same OS.
    Just recently all our SCOM management servers have been flipping in and out from grey to green state.  Gateways/agents all look ok as showing green.  Alerting from agents appears normal as can see lots of them in console.
    Have flushed the health state cache folder on all 3 SCOM MS's and still the same issue.
    Appreciate any help on this one.

    Event ID 7011: Was your server recently patched (for example, by automatic updates)?
    Is SCCM configured on your MS? If yes, disable it and check.
    Is the Windows Update service running? Stop it for one or two days and check whether the issue still appears.
    Reference threads:
    http://social.technet.microsoft.com/Forums/en-US/b86e5a3d-0c2e-4d5e-9d3d-905da91fc982/scom-2012-event-id-7011-service-control-manager-error-when-fep-definition-updates-apply?forum=configmanagersecurity
    http://stefanroth.net/2012/09/26/scom-2012-event-id-7011-service-control-manager-error/
    Solution also available in: http://technet.microsoft.com/en-us/library/cc756319(v=ws.10).aspx
    ===========================================
    For Event id 20026 - 
    1. Does your OperationsManager database have enough space? Check that first.
    What is your DB size?
    How much free space is left?
    2. Was there any recent change to the SCOM Action account password, or has the password expired? Try re-entering the SCOM Action account password under the Administration tab --> Run as Config --> Accounts --> SCOM Action account.
    The description would be - This is the user account under which all rules run by default on the agent.
    Right click and go to properties and re enter the account name and password there and check.
    Refer the below screen shot
    Check this article as well:
    http://social.technet.microsoft.com/Forums/systemcenter/en-US/102d443c-db0e-4bf2-b0d6-31d7f9328537/all-agents-greyed-out-multiple-event-errors-with-ids-20026-20028?forum=operationsmanagergeneral
    ============================================
    Event ID 11904: As far as I know, this appears when an incorrect alerting string display name is selected in a rule or monitor.
    Also, the description you pasted for Event ID 11904 mentions Microsoft.SystemCenter.HealthService.ActionAccountConfigured.Error, as highlighted below.
    I suggest re-entering the action account password and reporting the results.
    Also, is the HealthService on the MS running under the System account or a domain account?
    =================================================================
    Description : The Microsoft Operations Manager Expression filter Module failed to query the delivered item, item was dropped.
    Property Expression: Reachability/State
    Error : 0XC00EE22
    One or more workflows were affected by this. Workflow
    name: Microsoft.SystemCenter.HealthService.ActionAccountConfigured.Error
    Gautam.75801

  • Running managed servers from different WL versions under a domain

    Hi,
    Would it be possible to run managed servers (in non clustered way) from different versions of WebLogic under a single domain, with admin server of higher WebLogic version controlling the domain and the managed servers?
    Regards,
    Gobi

    Hi Gobi,
    We can make it run that way, but it may lead to many complications with patch sets.
    Some things will work in one environment and not in another, so this is not a recommended configuration.
    Regards,
    Kal

  • Hi, we need to create the test environment from our production for oracle AP Imaging. we have soa,ipm,ucm and capture managed servers in our weblogic. can anyone tell me what is the best way to clone the environment, can I just tar the weblogic file syste

    Hi, we need to create the test environment from our production for oracle AP Imaging. we have soa,ipm,ucm and capture managed servers in our weblogic..
    Can anyone tell me the best way to clone the application to a different environment? The test and production environments are on different physical servers.
    Can I just tar the weblogic file system and untar it to the new server and make the necessary changes?
    Can anyone share their experience and how-to with me?
    Thanks in advance.
    Katherine

    Hi Katherine,
    Yes and no. You need the WebLogic and SOA files as well as the database schemas (soa_infra, mds...).
    Please refer to the AMIS Blog: https://technology.amis.nl/2011/08/11/clone-your-oracle-fmw-soa-suite-11g/
    HTH
    Borys

  • The Managed servers are going down  frequently due to leasing renewal.

    We have 3 SOA managed servers in our production environment, each running on a different machine.
    The servers are going down frequently due to a lease renewal issue.
    Please see the errors below. SOA_MS3 was the Cluster Leader.
    From SOA_MS1 logs we see -
    <Jun 15, 2011 12:11:24 AM MDT> <Warning> <Cluster> <WL-000147> <Server "SOA_MS1" failed to renew lease in the leasing basis hosted by SOA_MS3.>
    <Jun 15, 2011 12:11:24 AM MDT> <Error> <Cluster> <WL-000150> <Server failed to get a connection to the leasing basis hosted by SOA_MS3 in the past 30 seconds for lease renewal. Server will shut itself down.>
    <Jun 15, 2011 12:11:24 AM MDT> <Critical> <Health> <WL-310006> <Critical Subsystem ServerMigration has failed. Setting server state to FAILED.
    Reason: ServerSOA_MS1 failed to renew lease in the leasing basis hosted by SOA_MS3>
    <Jun 15, 2011 12:11:24 AM MDT> <Critical> <WebLogicServer> <WL-000385> <Server health failed. Reason: health of critical service 'ServerMigration' failed>
    <Jun 15, 2011 12:11:24 AM MDT> <Notice> <WebLogicServer> <WL-000365> <Server state changed to FAILED>
    <Jun 15, 2011 12:11:24 AM MDT> <Error> <com.bea.weblogic.kernel> <BEA-000000> <cannot load libary 'stackdump': java.lang.UnsatisfiedLinkError: no stackdump in java.library.path
    >
    <Jun 15, 2011 12:11:24 AM MDT> <Error> <WebLogicServer> <WL-000383> <A critical service failed. The server will shut itself down>
    <Jun 15, 2011 12:11:24 AM MDT> <Notice> <WebLogicServer> <WL-000365> <Server state changed to FORCE_SHUTTING_DOWN>
    <Jun 15, 2011 12:11:24 AM MDT> <Notice> <Cluster> <WL-000163> <Stopping "async" replication service>
    From SOA_MS2 logs we see -
    <Jun 15, 2011 12:11:16 AM MDT> <Warning> <Cluster> <WL-000147> <Server "SOA_MS2" failed to renew lease in the leasing basis hosted by SOA_MS3.>
    <Jun 15, 2011 12:11:16 AM MDT> <Error> <Cluster> <WL-000150> <Server failed to get a connection to the leasing basis hosted by SOA_MS3 in the past 30 seconds for lease renewal. Server will shut itself down.>
    <Jun 15, 2011 12:11:16 AM MDT> <Critical> <Health> <WL-310006> <Critical Subsystem ServerMigration has failed. Setting server state to FAILED.
    Reason: ServerSOA_MS2 failed to renew lease in the leasing basis hosted by SOA_MS3>
    <Jun 15, 2011 12:11:16 AM MDT> <Critical> <WebLogicServer> <WL-000385> <Server health failed. Reason: health of critical service 'ServerMigration' failed>
    <Jun 15, 2011 12:11:16 AM MDT> <Notice> <WebLogicServer> <WL-000365> <Server state changed to FAILED>
    <Jun 15, 2011 12:11:16 AM MDT> <Error> <com.bea.weblogic.kernel> <BEA-000000> <cannot load libary 'stackdump': java.lang.UnsatisfiedLinkError: no stackdump in java.library.path
    >
    <Jun 15, 2011 12:11:16 AM MDT> <Error> <WebLogicServer> <WL-000383> <A critical service failed. The server will shut itself down>
    <Jun 15, 2011 12:11:16 AM MDT> <Notice> <WebLogicServer> <WL-000365> <Server state changed to FORCE_SHUTTING_DOWN>
    Also, in the SOA_MS3 logs we see the following error -
    ####<Jun 15, 2011 11:43:18 PM MDT> <Error> <Cluster> <soaprdi3> <SOA_MS3> <[ACTIVE] ExecuteThread: '15' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1308202998662> <WL-000168> <Failed to restart/migrate server "SOA_MS3" because of Failed to start the migratable server on one of the candidate machines
    Failed to start the migratable server on one of the candidate machines
    at weblogic.cluster.singleton.MigratableServerState.serverUnresponsive(MigratableServerState.java:95)
    at weblogic.cluster.singleton.MigratableServersMonitorImpl.timerExpired(MigratableServersMonitorImpl.java:164)
    at weblogic.timers.internal.TimerImpl.run(TimerImpl.java:273)
    at weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:528)
    at weblogic.work.ExecuteThread.execute(ExecuteThread.java:201)
    at weblogic.work.ExecuteThread.run(ExecuteThread.java:173)
    Thanks,

    Support has been working on this for many days without finding a cause. We escalated, but they still do not have much of a clue.
    So I wanted to put it to the SOA experts here.
    Any idea what causes this lease renewal issue, and what the potential resolution is?
    Thanks

  • 21406 events getting generated on all our production servers

    Hi,
    We are getting the events below on almost all our production servers (Exchange, Lync, etc.). Does anyone have any idea about this event?
    Log Name: Operations Manager
    Source: Health Service Modules
    Date: 2/10/2014 10:36:09 AM
    Event ID: 21406
    Task Category: None
    Level: Warning
    Keywords: Classic
    User: N/A
    Computer: XXXXX
    Description:
    The process started at 10:36:06 AM failed to create System.PropertyBagData. Errors found in output:
    C:\Program Files\System Center Operations Manager\Agent\Health Service State\Monitoring Host Temporary Files 3\655255\CollectLogicalDiskStatistics.vbs(94, 9) (null): Data error (cyclic redundancy check).
    Command executed: "C:\Windows\system32\cscript.exe" /nologo "CollectLogicalDiskStatistics.vbs" false
    Working Directory: C:\Program Files\System Center Operations Manager\Agent\Health Service State\Monitoring Host Temporary Files 3\655255\
    One or more workflows were affected by this.
    Workflow name: Windows.LogicalDrives.CollectMFTPercentInUse.Rule
    Instance name: C:
    Instance ID: {8D70B92A-C3C2-1421-2D3D-64CE73206D61}
    Management group: XXXXX
    Please assist me with any link or topic that already covers this issue.
    Thanks

    This error is Data error (cyclic redundancy check)
    Use the Windows Check Disk utility to check your hard drive for errors.
    Open a command prompt.
    At the command prompt, type chkdsk /f, and then press ENTER.
    Note If you receive a message similar to the following
    The type of the file system is NTFS. Cannot lock current drive.
    Chkdsk cannot run because the volume is in use by another process. Would you like to schedule this volume to be checked the next time the system restarts? (Y/N)
    press Y, press ENTER, and then restart your computer.
    When Chkdsk finishes, close the command prompt.

  • What is the Ideal Production Setup For One Admin and 4 Managed Servers

    Dear Experts
    I will be starting with production setup including one Admin server and 4 managed servers in one single domain.
    I am thinking of creating a single node environment(no clusters) as the machine has following configuration
    OS : Windows Server 2008 R2 Datacenter
    RAM : 48 GB
    System Type : 64 bit
    Processor : Intel(Xenon) 4 processors [email protected]
    Can you please let me know whether this configuration would suffice for the 4 managed servers if I assign Xmx and Xms as 4096 and Heap Space as 1024 to all the managed servers?
    It is very urgent, as I need to tell the Infrastructure team whether hardware procurement is required.
    We are looking at somewhere around 300 concurrent users(maximum load) and 100(minimum load) at a given point of time.
    Please reply ASAP.
    Thanks in advance
    Edited by: Abhinav Mittal on Apr 23, 2013 7:58 PM
    Edited by: Abhinav Mittal on Apr 23, 2013 8:03 PM

    Heap size must be calculated according to the applications that are been deployed on each JVM.
    With no deployments, you don't need more than 256 MB for the managed servers' heap and 512 MB for the Admin Server. The bigger your heap, the longer garbage collection will take, so avoid oversizing it if you can.
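
    As a rough sanity check on the numbers in the question, the memory the four managed servers would reserve can be added up as below. This is only a sketch: it assumes "Xmx and Xms as 4096" means 4096 MB per server and reads "Heap Space as 1024" as a second 1024 MB region per JVM (an interpretation, not something the question states), and it ignores the Admin Server, native JVM overhead and the operating system.

```python
def reserved_memory_mb(server_count, heap_mb, extra_mb):
    """Memory reserved by server_count JVMs, each with a heap plus one extra region."""
    return server_count * (heap_mb + extra_mb)


RAM_MB = 48 * 1024  # 48 GB machine from the question

managed_mb = reserved_memory_mb(4, 4096, 1024)
print("managed servers reserve %d MB of %d MB (%.0f%%)"
      % (managed_mb, RAM_MB, 100.0 * managed_mb / RAM_MB))
```

    Under those assumptions the four managed servers reserve 20480 MB, well under half of the 48 GB available, though whether 4096 MB per server is the right heap size still depends on the deployed applications.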
    Kinds,
    Gabriel Abelha

  • Node Manager not shutting down the Managed Servers

    I have WebLogic Server 8.1 SP4 installed on around 9 separate boxes - 3 servers form one WLS domain of one admin + two managed servers. The managed servers are started using Node Manager from the admin console.
    On one domain, start/shutdown using the Admin Console works just fine; however, on the other two domains the managed servers start up fine but shutdown fails, or takes over 15 minutes. I see the following error in the Node Manager logs:
    NMMessage: Command read failed 'Connection reset' on socket/ <ip_address>
    Any ideas on what I can do to fix this? The servers which will not shut down have EJBs installed on them. I don't think this is related, as shutting down the application results in the same problem as listed above.
    Thanks,
    Kevin.

    The name of the remote machine configured on the admin server must be resolvable on that same machine through /etc/hosts.
    Example:
    The Unix machine N1234 is configured on the admin server.
    On machine N1234, the file nodemanager.hosts must contain the IP of the admin server.
    On the admin server, /etc/hosts must contain:
    IP_N1234 N1234
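
    The /etc/hosts part of that check can be scripted. A small sketch in Python that parses hosts-file text and reports which required machine names have no entry; `missing_names` is a hypothetical helper, and the sample address is a placeholder rather than a value from this thread.

```python
def parse_hosts(text):
    """Map each host name (lower-cased) in hosts-file text to its first address."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        for name in names:
            mapping.setdefault(name.lower(), address)
    return mapping


def missing_names(hosts_text, required_names):
    """Return the required names that have no entry in the hosts-file text."""
    known = parse_hosts(hosts_text)
    return [name for name in required_names if name.lower() not in known]


sample = "127.0.0.1 localhost\n10.0.0.5 N1234  # placeholder address\n"
print(missing_names(sample, ["N1234", "N9999"]))
```

    On the admin server the text would come from reading /etc/hosts, and the required names would be the machine names configured in the console.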

  • Getting Error Oracle Coherence GE 3.6.0.4 Error (thread=ReplicatedCache in SOA managed servers

    Hi All,
    My SOA domain has one Admin Server and 4 managed servers (SOA). We are getting the following errors in the managed server logs:
    [2014-11-18T05:55:13.009-05:00] [domain_name01] [ERROR] [] [Coherence] [tid: Logger@9242415 3.6.0.4] [userId: <anonymous>] [ecid: 0000KazENYdAPPTGUAFCFs1KQ^kW000002,1:31105] [APP: soa-infra] 2014-11-18 05:55:13.009/55706.396 Oracle Coherence GE 3.6.0.4 <Error> (thread=ReplicatedCache:domain_name_domain_clusterCacheService, member=1): Detected soft timeout) of {WrapperGuardable Guard{Daemon=ReplicatedCache:domain_name_domain_clusterCacheService:EventDispatcher} Service=ReplicatedCache{Name=domain_name_domain_clusterCacheService, State=(SERVICE_STARTED), Id=2, Version=3.0, OldestMemberId=4}}
    [2014-11-18T05:55:13.018-05:00] [domain_name01] [ERROR] [] [Coherence] [tid: Logger@9242415 3.6.0.4] [userId: <anonymous>] [ecid: 0000KazENYdAPPTGUAFCFs1KQ^kW000002,1:31105] [APP: soa-infra] 2014-11-18 05:55:13.014/55706.401 Oracle Coherence GE 3.6.0.4 <Error> (thread=ReplicatedCache:domain_name_domain_clusterCacheService:EventDispatcher, member=1): An exception occurred while dispatching the following event:[[
    [2014-11-18T05:55:13.020-05:00] [domain_name01] [ERROR] [] [Coherence] [tid: Logger@9242415 3.6.0.4] [userId: <anonymous>] [ecid: 0000KazENYdAPPTGUAFCFs1KQ^kW000002,1:31105] [APP: soa-infra] 2014-11-18 05:55:13.014/55706.401 Oracle Coherence GE 3.6.0.4 <Error> (thread=ReplicatedCache:domain_name_domain_clusterCacheService:EventDispatcher, member=1): The following exception was caught by the event dispatcher:
    [2014-11-18T05:55:13.030-05:00] [domain_name01] [ERROR] [] [Coherence] [tid: Logger@9242415 3.6.0.4] [userId: <anonymous>] [ecid: 0000KazENYdAPPTGUAFCFs1KQ^kW000002,1:31105] [APP: soa-infra] 2014-11-18 05:55:13.014/55706.401 Oracle Coherence GE 3.6.0.4 <Error> (thread=ReplicatedCache:domain_name_domain_clusterCacheService:EventDispatcher, member=1): [[
            at oracle.integration.platform.blocks.deploy.CoherenceCompositeDeploymentCoordinatorImpl.storeError(CoherenceCompositeDeploymentCoordinatorImpl.java:672)
    [2014-11-18T05:55:13.034-05:00] [domain_name01] [ERROR] [] [Coherence] [tid: Logger@9242415 3.6.0.4] [userId: <anonymous>] [ecid: 0000KazENYdAPPTGUAFCFs1KQ^kW000002,1:31105] [APP: soa-infra] 2014-11-18 05:55:13.014/55706.401 Oracle Coherence GE 3.6.0.4 <Error> (thread=ReplicatedCache:domain_name_domain_clusterCacheService:EventDispatcher, member=1): (The service event thread has logged the exception and is continuing.)
    Can you please help us resolve this issue?
    Thanks,
    Srini..

    Hi Paul
    I have submitted the service request:
    - Problem Summary: Oracle Coherence GE 3.7.0.0 <Error> - Failed to apply delta
    - SR number: 3-4238261231
    When using two or more nodes and use PortableObject as the serialization mechanism the problem is constantly reproducible.

  • WebLogic 12c : Use wlst to list coherence managed servers

    Hi,
    I would like to list only coherence managed servers via wlst. I tried various options and some workarounds work but they only work in certain conditions.
    Just curious as to how other use wlst to list out coherence managed servers.

    Yeah, you call that a solution, I call that an annoying workaround. I'm familiar with that "solution". Using it simply means that we acknowledge that NodeManager can't use the server config information for the admin server, it has to re-specify all of it again manually. I would assume that if you specified a particular classpath for the admin server in the server config, you'll have to manually specify that here also. What's even worse is that if your server config referenced "$CLASSPATH" in the string, so it can reference the implicit classpath that WebLogic itself needs, you may not be able to use "$CLASSPATH" here (I'm not certain about this), and would instead have to manually specify the full entire classpath that the admin server needs. If you didn't have to specify a particular classpath value for the admin server, then you're probably ok, on at least this point.

  • Admin server is going down, but managed servers are fine.

    Hi,
    The Admin Server goes down some time after starting, but Node Manager and the managed servers keep running fine. I didn't find anything in admin.log; everything looks fine.
    WebLogic Server 8.1 SP4
    jrockit81sp4_142_05
    Please let me know what the problem with the Admin Server is.

    Way too little information to go on. What operating system? Are you looking at stdout and stderr for messages? Do any random files show up in the directory in which you started the adminserver (core files, hotspot error files, etc)? Does it occur if you switch to using Sun's JDK? Have you tried undeploying any applications (other than the console, of course) or startup classes from the admin server?

  • Unable to start managed servers from Admin Console "FAILED_NOT_RESTARTABLE"

    I recently installed WebLogic 10.3.5, JRockit, and ECM 11.1.1.5. I'm to the point where I am trying to get Nodemanager configured so I can stop/start managed servers through the admin console. I have the following listed in Environment > Servers:
    AdminConsole - running
    IBR_server1 - FAILED_NOT_RESTARTABLE
    managedServer1 - FAILED_NOT_RESTARTABLE
    UCM_server1 - FAILED_NOT_RESTARTABLE
    I followed the tutorial at http://blogs.oracle.com/jamesbayer/entry/weblogic_nodemanager_quick_sta to get to the point where the AdminServer starts when the server reboots, so I think Nodemanager is working there. If I view Machines > Monitoring, it says Nodemanager "Reachable." However, when I go to restart IBR_server1, managedServer1, or UCM_server1, I get the FAILED_NOT_RESTARTABLE status. I am using the WebLogic server's IP for the machine name/host because our DNS is still screwed up. Would that have any effect on this?
    I'm completely unsure what to do now. The Nodemanager log shows:
    INFO - <Loading domains file: E:\oracle\MIDDLE~1\WLSERV~1.3\common\nodemanager\nodemanager.domains>
    WARNING - <Domains file not found: E:\oracle\MIDDLE~1\WLSERV~1.3\common\nodemanager\nodemanager.domains>
    INFO - <Loading identity key store: FileName=E:/oracle/MIDDLE~1/WLSERV~1.3/server\lib\DemoIdentity.jks, Type=jks, PassPhraseUsed=true>
    WARNING - <Node manager configuration properties file 'E:\oracle\MIDDLE~1\WLSERV~1.3\common\nodemanager\nodemanager.properties' not found. Using default settings.>
    So I'm confused why it's saying the properties file and domains file are not found... I've checked and they exist at
    E:\oracle\middleware\wlserver_10.3\common\nodemanager\nodemanager.properties
    and
    E:\oracle\middleware\wlserver_10.3\common\nodemanager\nodemanager.domains
    The domains file contains
    base_domain=e\:\\oracle\\middleware\\user_projects\\domains\\base_domain
    Which is correct.
    Any idea what I'm missing here? I can provide more detail if needed. Thank you.
    EDIT: I should add that I can start the UCM and IBR managed servers using the start up scripts on the server, but that's not what I'm after. I want everything controlled through the AdminConsole and want to be sure that when the server reboots, all the managed servers come back up correctly.
    Edited by: user5824683 on Oct 5, 2011 5:04 PM
    Edited by: user5824683 on Oct 5, 2011 5:09 PM

    I did a bit of digging, and it seems I have an issue with -Xnohup... I've verified this argument exists in all of my managed servers' properties files, yet it still bombs when I try to restart from the WebLogic console. I should note that if I start fresh from the server, I can start all my managed servers using WLST nmStart().
    starting weblogic with Java version:
    java version "1.6.0_24"
    Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
    Oracle JRockit(R) (build R28.1.3-11-141760-1.6.0_24-20110301-1430-windows-x86_64, compiled mode)
    Starting WLS with line:
    E:\java\JROCKI~1.1\bin\java -jrockit -Xms256m -Xmx512m -Dweblogic.Name=IBR_server1 -Djava.security.policy=E:\oracle\MIDDLE~1\WLSERV~1.3\server\lib\weblogic.policy -Dweblogic.system.BootIdentityFile=E:\oracle\middleware\user_projects\domains\base_domain\servers\IBR_server1\data\nodemanager\boot.properties -Dweblogic.nodemanager.ServiceEnabled=true -Dweblogic.security.SSL.ignoreHostnameVerification=false -Dweblogic.ReverseDNSAllowed=false -Xnohup -Xverify:none -da -Dplatform.home=E:\oracle\MIDDLE~1\WLSERV~1.3 -Dwls.home=E:\oracle\MIDDLE~1\WLSERV~1.3\server -Dweblogic.home=E:\oracle\MIDDLE~1\WLSERV~1.3\server -Dcommon.components.home=E:\oracle\MIDDLE~1\ORACLE~1 -Djrf.version=11.1.1 -Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.Jdk14Logger -Ddomain.home=e:\oracle\MIDDLE~1\USER_P~1\domains\BASE_D~1 -Djrockit.optfile=E:\oracle\MIDDLE~1\ORACLE~1\modules\oracle.jrf_11.1.1\jrocket_optfile.txt -Doracle.server.config.dir=e:\oracle\MIDDLE~1\USER_P~1\domains\BASE_D~1\config\FMWCON~1\servers\IBR_server1 -Doracle.domain.config.dir=e:\oracle\MIDDLE~1\USER_P~1\domains\BASE_D~1\config\FMWCON~1 -Digf.arisidbeans.carmlloc=e:\oracle\MIDDLE~1\USER_P~1\domains\BASE_D~1\config\FMWCON~1\carml -Digf.arisidstack.home=e:\oracle\MIDDLE~1\USER_P~1\domains\BASE_D~1\config\FMWCON~1\arisidprovider -Doracle.security.jps.config=e:\oracle\MIDDLE~1\USER_P~1\domains\BASE_D~1\config\fmwconfig\jps-config.xml -Doracle.deployed.app.dir=e:\oracle\MIDDLE~1\USER_P~1\domains\BASE_D~1\servers\IBR_server1\tmp\_WL_user -Doracle.deployed.app.ext=\- -Dweblogic.alternateTypesDirectory=E:\oracle\MIDDLE~1\ORACLE~1\modules\oracle.ossoiap_11.1.1,E:\oracle\MIDDLE~1\ORACLE~1\modules\oracle.oamprovider_11.1.1 -Djava.protocol.handler.pkgs=oracle.mds.net.protocol -Dweblogic.jdbc.remoteEnabled=false -Ducm.oracle.home=E:\oracle\MIDDLE~1\ORACLE~2 -Dem.oracle.home=E:\oracle\middleware\oracle_common -Djava.awt.headless=true -Dweblogic.management.discover=false 
-Dweblogic.management.server=http://138.126.180.177:7001 -Dwlw.iterativeDev= -Dwlw.testConsole= -Dwlw.logErrorsToConsole= -Dweblogic.ext.dirs=e:\oracle\MIDDLE~1\patch_wls1035\profiles\default\sysext_manifest_classpath weblogic.Server
    Exception in thread "Main Thread" java.lang.NoClassDefFoundError: –Xnohup
    Caused by: java.lang.ClassNotFoundException: –Xnohup
         at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
         at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
         at java.lang.ClassLoader.loadClass(ClassLoader.java:305)
         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
         at java.lang.ClassLoader.loadClass(ClassLoader.java:246)
    Could not find the main class: –Xnohup. Program will exit.
    <Oct 6, 2011 12:38:13 PM> <FINEST> <NodeManager> <Waiting for the process to die: 4252>
    <Oct 6, 2011 12:38:13 PM> <INFO> <NodeManager> <Server failed during startup so will not be restarted>
    <Oct 6, 2011 12:38:13 PM> <FINEST> <NodeManager> <runMonitor returned, setting finished=true and notifying waiters>

  • Shutdown(block='false') only works via Admin Server, not managed servers

    Hello,
    We want to shut down all 10 managed servers in our domain gracefully, and in parallel. 5 of those managed servers are on the same OS/host as the AdminServer, the other 5 are on a separate OS/host (i.e. we have 5 x 2-node clusters).
    I'm writing scripts to gracefully shut down all of our servers upon an OS reboot command. On the OS with the AdminServer (and 5 managed servers), I can simply do this:
    connect(user, pwd, 'http://localhost:7001')
    shutdown('myserver1', 'Server', block='false')
    shutdown('myserver3', 'Server', block='false')
    shutdown('myserver5', 'Server', block='false')
    shutdown('myserver7', 'Server', block='false')
    shutdown('myserver9', 'Server', block='false')
    All 5 servers will be shutting down at the same time. If they each took 2 minutes to shut down, the whole shutdown process would take 2 minutes.
    However, on the OS running the other managed servers, I cannot assume that the AdminServer [on the other OS] will be online. Hence, I would like to gracefully shut down each server locally like this:
    connect(user, pwd, url='t3://localhost:8001')
    shutdown(block='false')
    disconnect()
    connect(user, pwd, url='t3://localhost:9001')
    shutdown(block='false')
    disconnect()
    connect(user, pwd, url='t3://localhost:10001')
    shutdown(block='false')
    disconnect()
    connect(user, pwd, url='t3://localhost:11001')
    shutdown(block='false')
    disconnect()
    connect(user, pwd, url='t3://localhost:12001')
    shutdown(block='false')
    disconnect()
    NOTE that the ports are different (i.e. I connect via WLST to the managed servers themselves rather than to the AdminServer).
    Unfortunately, in this scenario, the block='false' does not work. WLST waits until the managed server is shut down before proceeding to the next connect() command. So if each server took 2 minutes to shut down, the whole shutdown process now takes 10 minutes.
    We don't want to use the nmKill() command on the local Node Manager because we want a graceful shutdown.
    What options do we have to issue a graceful shutdown command either to a local Node Manager or to the managed servers themselves (since we can't expect the AdminServer to be online when our script runs)?
    Best regards,
    Michael
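
    One way to overlap the shutdowns, sketched here as an assumption rather than something confirmed in this thread, is to start one WLST process per managed server so that a blocking shutdown() in one process does not hold up the others. The script name shutdown_one.py, the port list, and the launcher parameter are hypothetical; weblogic.jar is assumed to be on the CLASSPATH already (for example after sourcing setWLSEnv).

```python
import subprocess

# Ports taken from the question; adjust them to your own domain.
MANAGED_SERVER_PORTS = [8001, 9001, 10001, 11001, 12001]


def build_wlst_command(script_path, port, java="java"):
    """Return the argv for one WLST process that targets a single managed server."""
    return [java, "weblogic.WLST", script_path, "t3://localhost:%d" % port]


def shutdown_all(script_path, ports, launcher=subprocess.Popen):
    """Start one WLST process per port, then wait for every process to exit.

    All processes are started before any of them is waited on, so the
    individual shutdowns run concurrently. Returns the exit codes in port order.
    """
    procs = [launcher(build_wlst_command(script_path, port)) for port in ports]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    # shutdown_one.py is a hypothetical WLST script (see the note below).
    print(shutdown_all("shutdown_one.py", MANAGED_SERVER_PORTS))
```

    The hypothetical shutdown_one.py would hold the same connect(user, pwd, url=...) and shutdown(block='false') pair shown in the question, reading the URL from sys.argv[1]. With this layout the total time should be roughly that of the slowest server, whether or not block='false' is honoured.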

    Hi Mike,
    WLST creates its temporary directory at /var/tmp/wlstTemp, which is shared by all users. Because the directory is not separated per user, one user's session can be blocked while another user is using it.
    One solution is to grant write access for all users on the directory where the WLST temporary directory is created (e.g. /var/tmp on Solaris; you can verify the default on your system by executing java utils.getProperty and searching for java.io.tmpdir).
    If for whatever reason you cannot grant these permissions on the temporary directory, you can create a directory elsewhere on the file system where every user has the correct permissions.
    Start the scripting tool with one of the following options to redirect the cache files to the specified directory. Depending on your environment, one or the other will apply.
    java -Djava.io.tmpdir=<path-to-tmpDir> weblogic.WLST
    or
    java -Dpython.cachedir=<path-to-tmpDir> weblogic.WLST
    This should let you shut down the servers without problems.
    Note: block='false' always works, so it should not be the problem.
    Regards,
    Kal
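
    As a minimal sketch of the suggestion above, the helper below builds the WLST start command with one of the two properties pointed at a directory of your choice. The function name wlst_command and its parameters are hypothetical; only the two -D options come from the reply.

```python
import tempfile

# The two system properties named in the reply above.
SUPPORTED_PROPERTIES = ("java.io.tmpdir", "python.cachedir")


def wlst_command(tmp_dir, prop="java.io.tmpdir", java="java"):
    """Return the argv that starts WLST with its cache files redirected to tmp_dir."""
    if prop not in SUPPORTED_PROPERTIES:
        raise ValueError("unsupported property: %s" % prop)
    return [java, "-D%s=%s" % (prop, tmp_dir), "weblogic.WLST"]


if __name__ == "__main__":
    # mkdtemp creates a fresh directory that only the current user can access.
    private_dir = tempfile.mkdtemp(prefix="wlstTemp-")
    print(" ".join(wlst_command(private_dir, prop="python.cachedir")))
```

    Giving each user (or each script run) its own directory avoids depending on the permissions of the shared /var/tmp/wlstTemp location.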
