Cluster service is requested to stop on all nodes when DNS is unavailable

Our six-node Coherence cluster had been running fine for a few days. When the DNS server became unavailable for a few minutes during a scheduled maintenance activity, every Coherence node was requested to stop its cluster service. The cluster services did not come back up until the DNS server was available again. Why would the cluster need a DNS server when it has already started and has been running fine for days?
Here’s the error message and thread dump from the logs:
2010-12-18 18:07:18.819/3464791.277 Oracle Coherence GE 3.6.0.3 <Error> (thread=IpMonitor, member=7): Detected hard timeout) of {WrapperGuardable Guard{Daemon=Cluster} Service=ClusterService{Name=Cluster, State=(SERVICE_STARTED, STATE_JOINED), Id=0, Version=3.6, OldestMemberId=5}}
2010-12-18 18:07:18.823/3464791.281 Oracle Coherence GE 3.6.0.3 <Error> (thread=Termination Thread, member=7): Full Thread Dump
Thread[Invocation:Management:EventDispatcher,5,Cluster]
java.lang.Object.wait(Native Method)
com.tangosol.coherence.component.util.Daemon.onWait(Daemon.CDB:18)
com.tangosol.coherence.component.util.daemon.queueProcessor.Service$EventDispatcher.onWait(Service.CDB:7)
com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:39)
java.lang.Thread.run(Thread.java:619)
Thread[Logger@9250962 3.6.0.3,3,main]
java.lang.Object.wait(Native Method)
com.tangosol.coherence.component.util.Daemon.onWait(Daemon.CDB:18)
com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:39)
java.lang.Thread.run(Thread.java:619)
Thread[Signal Dispatcher,9,system]
Thread[Finalizer,8,system]
java.lang.Object.wait(Native Method)
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
Thread[Invocation:Management,5,Cluster]
java.lang.Object.wait(Native Method)
com.tangosol.coherence.component.util.Daemon.onWait(Daemon.CDB:18)
com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onWait(Grid.CDB:6)
com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:39)
java.lang.Thread.run(Thread.java:619)
ThreadCluster
java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:850)
java.net.InetAddress.getAddressFromNameService(InetAddress.java:1201)
java.net.InetAddress.getAllByName0(InetAddress.java:1154)
java.net.InetAddress.getAllByName(InetAddress.java:1084)
java.net.InetAddress.getAllByName(InetAddress.java:1020)
java.net.InetAddress.getByName(InetAddress.java:970)
java.net.InetSocketAddress.<init>(InetSocketAddress.java:124)
com.tangosol.net.ConfigurableAddressProvider$AddressHolder.getAddress(ConfigurableAddressProvider.java:426)
com.tangosol.net.ConfigurableAddressProvider$1.next(ConfigurableAddressProvider.java:167)
java.util.AbstractCollection.contains(AbstractCollection.java:89)
com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.ClusterService.isWellKnown(ClusterService.CDB:5)
com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.ClusterService.compareImportance(ClusterService.CDB:7)
com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.ClusterService.getWitnessMemberSet(ClusterService.CDB:49)
com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.ClusterService.verifyMemberLeft(ClusterService.CDB:91)
com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.ClusterService.onNotifyTcmpTimeout(ClusterService.CDB:11)
com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.ClusterService$NotifyTcmpTimeout.onReceived(ClusterService.CDB:1)
com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onMessage(Grid.CDB:11)
com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onNotify(Grid.CDB:33)
com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.ClusterService.onNotify(ClusterService.CDB:3)
com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
java.lang.Thread.run(Thread.java:619)
Thread[main,5,main]
java.lang.Object.wait(Native Method)
com.tangosol.net.DefaultCacheServer.monitorServices(DefaultCacheServer.java:270)
com.tangosol.net.DefaultCacheServer.startAndMonitor(DefaultCacheServer.java:56)
com.tangosol.net.DefaultCacheServer.main(DefaultCacheServer.java:197)
Thread[PacketReceiver,7,Cluster]
java.lang.Object.wait(Native Method)
com.tangosol.coherence.component.util.Daemon.onWait(Daemon.CDB:18)
com.tangosol.coherence.component.util.daemon.queueProcessor.packetProcessor.PacketReceiver.onWait(PacketReceiver.CDB:2)
com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:39)
java.lang.Thread.run(Thread.java:619)
Thread[PacketSpeaker,8,Cluster]
java.lang.Object.wait(Native Method)
com.tangosol.coherence.component.util.queue.ConcurrentQueue.waitForEntry(ConcurrentQueue.CDB:16)
com.tangosol.coherence.component.util.queue.ConcurrentQueue.remove(ConcurrentQueue.CDB:7)
com.tangosol.coherence.component.util.Queue.remove(Queue.CDB:1)
com.tangosol.coherence.component.util.daemon.queueProcessor.packetProcessor.PacketSpeaker.onNotify(PacketSpeaker.CDB:21)
com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
java.lang.Thread.run(Thread.java:619)
Thread[Termination Thread,6,Cluster]
java.lang.Thread.dumpThreads(Native Method)
java.lang.Thread.getAllStackTraces(Thread.java:1487)
com.tangosol.net.GuardSupport.logStackTraces(GuardSupport.java:810)
com.tangosol.coherence.component.net.Cluster$DefaultFailurePolicy.onGuardableTerminate(Cluster.CDB:4)
com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid$WrapperGuardable.terminate(Grid.CDB:1)
com.tangosol.net.GuardSupport$Context$2.run(GuardSupport.java:677)
java.lang.Thread.run(Thread.java:619)
Thread[Reference Handler,10,system]
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:485)
java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
Thread[PacketPublisher,6,Cluster]
java.lang.Object.wait(Native Method)
com.tangosol.coherence.component.util.Daemon.onWait(Daemon.CDB:18)
com.tangosol.coherence.component.util.daemon.queueProcessor.packetProcessor.PacketPublisher.onWait(PacketPublisher.CDB:2)
com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:39)
java.lang.Thread.run(Thread.java:619)
Thread[DistributedCache,5,Cluster]
java.lang.Object.wait(Native Method)
com.tangosol.coherence.component.util.Daemon.onWait(Daemon.CDB:18)
com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onWait(Grid.CDB:6)
com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:39)
java.lang.Thread.run(Thread.java:619)
Thread[IpMonitor,6,Cluster]
java.lang.Object.wait(Native Method)
com.tangosol.coherence.component.util.Daemon.onWait(Daemon.CDB:18)
com.tangosol.coherence.component.util.daemon.IpMonitor.onWait(IpMonitor.CDB:4)
com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:39)
java.lang.Thread.run(Thread.java:619)
Thread[PacketListener1P,8,Cluster]
java.net.PlainDatagramSocketImpl.receive0(Native Method)
java.net.PlainDatagramSocketImpl.receive(PlainDatagramSocketImpl.java:136)
java.net.DatagramSocket.receive(DatagramSocket.java:725)
com.tangosol.coherence.component.net.socket.UdpSocket.receive(UdpSocket.CDB:22)
com.tangosol.coherence.component.net.UdpPacket.receive(UdpPacket.CDB:1)
com.tangosol.coherence.component.util.daemon.queueProcessor.packetProcessor.PacketListener.onNotify(PacketListener.CDB:20)
com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
java.lang.Thread.run(Thread.java:619)
Thread[PacketListener1,8,Cluster]
java.net.PlainDatagramSocketImpl.receive0(Native Method)
java.net.PlainDatagramSocketImpl.receive(PlainDatagramSocketImpl.java:136)
java.net.DatagramSocket.receive(DatagramSocket.java:725)
com.tangosol.coherence.component.net.socket.UdpSocket.receive(UdpSocket.CDB:22)
com.tangosol.coherence.component.net.UdpPacket.receive(UdpPacket.CDB:1)
com.tangosol.coherence.component.util.daemon.queueProcessor.packetProcessor.PacketListener.onNotify(PacketListener.CDB:20)
com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
java.lang.Thread.run(Thread.java:619)
2010-12-18 18:07:18.823/3464791.281 Oracle Coherence GE 3.6.0.3 <Warning> (thread=Termination Thread, member=7): Terminating Guard{Daemon=Cluster}
2010-12-18 18:07:18.823/3464791.281 Oracle Coherence GE 3.6.0.3 <Error> (thread=StopService, member=7): Requested to stop cluster service.
2010-12-18 18:07:18.826/3464791.284 Oracle Coherence GE 3.6.0.3 <D5> (thread=DistributedCache, member=7): Service DistributedCache left the cluster
2010-12-18 18:07:18.826/3464791.284 Oracle Coherence GE 3.6.0.3 <D5> (thread=Invocation:Management, member=7): Service Management left the cluster
2010-12-18 18:07:24.904/3464797.362 Oracle Coherence GE 3.6.0.3 <Error> (thread=main, member=7): Failed to restart services: com.tangosol.net.RequestTimeoutException: Timeout while waiting for cluster to stop.
2010-12-18 18:07:33.915/3464806.373 Oracle Coherence GE 3.6.0.3 <Error> (thread=main, member=7): Failed to restart services: com.tangosol.net.RequestTimeoutException: Timeout while waiting for cluster to stop.
2010-12-18 18:07:42.924/3464815.382 Oracle Coherence GE 3.6.0.3 <Error> (thread=main, member=7): Failed to restart services: com.tangosol.net.RequestTimeoutException: Timeout while waiting for cluster to stop.
2010-12-18 18:07:51.936/3464824.394 Oracle Coherence GE 3.6.0.3 <Error> (thread=main, member=7): Failed to restart services: com.tangosol.net.RequestTimeoutException: Timeout while waiting for cluster to stop.

The log file shows the well-known address list as IP addresses, although the addresses are configured by hostname in the override file.
Here's the log entry:
WellKnownAddressList(Size=2,
WKA{Address=165.X.X.XX7, Port=8088}
WKA{Address=165.X.X.XX8, Port=8088}
Here's the configuration from tangosol-coherence-override-prod.xml:
<well-known-addresses>
<socket-address id="1">
<address system-property="tangosol.coherence.wka">serverA</address>
<port system-property="tangosol.coherence.wka.port">8088</port>
</socket-address>
<socket-address id="2">
<address system-property="tangosol.coherence.wka">serverB</address>
<port system-property="tangosol.coherence.wka.port">8088</port>
</socket-address>
</well-known-addresses>
Thanks,
Ramesh
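
In the thread dump above, the Cluster thread is blocked in InetAddress.getByName, called from the InetSocketAddress constructor via ConfigurableAddressProvider$AddressHolder.getAddress and ClusterService.isWellKnown. That suggests the configured WKA hostnames are looked up again while the cluster is running, not only at startup. The following is a minimal sketch of the underlying JDK behavior only, not of Coherence internals; the class and method names are hypothetical, and the port 8088 and the name serverA are taken from the configuration above.

```java
import java.net.InetSocketAddress;

// Hypothetical illustration class; not part of Coherence.
class WkaResolutionSketch {

    // An IP literal is parsed locally, so no name-service (DNS) query is needed.
    public static InetSocketAddress fromLiteral(String ipLiteral, int port) {
        return new InetSocketAddress(ipLiteral, port);
    }

    // createUnresolved only records the host name; it never contacts a name service.
    public static InetSocketAddress deferred(String hostName, int port) {
        return InetSocketAddress.createUnresolved(hostName, port);
    }

    public static void main(String[] args) {
        System.out.println("literal unresolved?  " + fromLiteral("127.0.0.1", 8088).isUnresolved());
        System.out.println("deferred unresolved? " + deferred("serverA", 8088).isUnresolved());
        // By contrast, new InetSocketAddress("serverA", 8088) attempts to resolve
        // the name inside the constructor, which is the call the Cluster thread
        // is waiting in when the name service does not answer.
    }
}
```

If that reading of the dump is right, listing the well-known addresses as IP literals, or making the names resolvable without DNS (for example through the local hosts file), might avoid the blocking lookup. This is an untested assumption, not a confirmed fix.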

Similar Messages

  • The cluster service terminated, error 7024, cannot create a file when that file already exists

    I have a test 2-node Failover cluster using Server 2012 R2
    As of last night the cluster service on one of the 2 nodes is down with this error:
    The Cluster Service service terminated with the following service-specific error: 
    Cannot create a file when that file already exists.
    EventID 7024
    The Cluster service waits 60 sec, tries to start, and the same error occurs again. 
    Any idea where to look to identify which file this error is referring to, or how to go about identifying root cause and getting a solution?
    thank you.
    samb

    Hi Yeswanth
    Then you can try the "Add Counter" option. It creates a new file each time with the same name, and appends a counter to the end of the file name showing how many times the file has been created.
    You can also specify the format of the counter: once you select this option, fill in the Format and Step fields accordingly.
    Would this work for you?
    Regards
    Ashmi

  • Stop, start all nodes.

    To shut down or start the database instance on all the nodes in a clustered environment, I use
    srvctl stop/start database -d dbname
    Likewise, what is the best way to do this for ASM and the cluster?
    Thanks!

    "There is no command to start everything up from a single node, and no group command. Correct?"
    Yes, that's correct.

  • EAR file is not deployed on all nodes when using SDM/Visual admin

    Hi
    We have a high-availability portal landscape with multiple application servers. When we deploy our custom applications (.EAR) using either SDM or Visual Administrator, the file is always deployed only to the Central Instance, and the end user sometimes sees a blank screen because the load balancer routes the request to another node (not the Central Instance).
    Any helpful answer will be awarded with points !!
    Thanks
    Lakshmi

    Hi Lakshmi,
    Restarting the SAP System should synchronize the components among the application servers.
    You can also check whether any of the components need to be updated by going to the deployment overview under System Administration -> Support -> Support Desk -> Portal Runtime -> Deployment Overview.
    This link might be of help.
    http://help.sap.com/saphelp_nw70/helpdata/en/f7/71b842b714b211e10000000a155106/frameset.htm
    Regards,
    Abhishek

  • The Cluster service is shutting down because quorum was lost

    Hi, we recently experienced the above issue and after looking for explanations I haven't been able to find any satisfying answers when other people have posted this issue.
    Our problem is as follows:
    2 node 2008R2 cluster running SQL 2012
    Each node is a HP BL460c running in a HP C7000 Blade Chassis.
    We were updating the flexfabric cards on one of the chassis.  The other chassis had been patched the previous week with no problems. 
    During the update process the flexfabric cards, which hold the Ethernet and FC connections, reboot, so before work began all active cluster services had been failed over to the node in the chassis not being worked on. Despite this, the cluster service shut down on this one particular cluster. All other clusters running across these 2 chassis continued to run as expected.
    As other people have posted before we saw the following errors in the system log.
    1564: File share witness resource 'File Share Witness' failed to arbitrate for the file share
    1069: Cluster resource 'File Share Witness' in clustered service or application 'Cluster Group' failed.
    1172: The Cluster service is shutting down because quorum was lost. This could be due to the loss of network connectivity between some or all nodes in the cluster, or a failover of the witness disk.
    Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.
    However, we can't understand what could cause this to happen when the service is running on the node in the chassis not being updated, especially when the same update was performed the week before with no issues. How can both nodes lose connectivity to the File Share Witness at the same time?
    Cluster Validation tests run fine and don't highlight any issues.  The file share witness is accessible from both servers.
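
Event 1172 above is a statement about vote counting. The following is a minimal sketch of that arithmetic, assuming the cluster uses a node-and-file-share-majority style quorum in which each of the two nodes and the File Share Witness holds one vote. That mode is an assumption suggested by the witness events, not something stated in the post, and the class and method names are hypothetical.

```java
// Hypothetical illustration of majority-vote quorum arithmetic.
class QuorumSketch {

    // Quorum holds only when the reachable votes are a strict majority of all votes.
    public static boolean hasQuorum(int reachableVotes, int totalVotes) {
        return reachableVotes > totalVotes / 2;
    }

    public static void main(String[] args) {
        int totalVotes = 3; // assumed: node 1 + node 2 + File Share Witness

        // Surviving node can still reach the witness: 2 of 3 votes.
        System.out.println("node + witness:   " + hasQuorum(2, totalVotes));

        // Surviving node loses both its partner and the witness: 1 of 3 votes.
        System.out.println("node alone:       " + hasQuorum(1, totalVotes));
    }
}
```

Under that assumption, the surviving node does not need both nodes to lose the witness. With its partner unreachable during the card reboot, a single failed arbitration for the file share (event 1564) would be enough to drop it to one vote out of three.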

    Hi,
    Please confirm that you have installed the recommended hotfixes and updates for Windows Server 2008 R2 SP1 failover clusters, especially the following hotfixes.
    The network location profile changes from "Domain" to "Public" in Windows 7 or in Windows Server 2008 R2
    http://support.microsoft.com/kb/2524478/EN-US
    A hotfix is available that adds two new cluster control codes to help you determine which cluster node is blocking a GUM update in Windows Server 2008 R2 and Windows Server 2012
    http://support.microsoft.com/kb/2779069/EN-US
    Hope this helps.

  • Cluster services UNKNOWN state

    Hi,
    I have a two-node cluster database, and I have some questions.
    If cluster services go into the UNKNOWN state on the first node, will existing connections fail over to the second node?
    Will new connections still try to connect to the first node?

    user2017273 wrote:
    > Hi,
    > I am having two node cluster database. I have some doubt
    Quit doubting and TEST it for yourself. Also, actually reading the documentation will help.
    > If cluster services will go UNKNOWN state in first node existing connection will failover to second node?
    Maybe...
    > New connections will try to connect first node?
    If node x is down, any connection attempt should go to the remaining nodes.

  • Error in coherence-- stopping cluster service.

    I found the following error in one of my Coherence server log files. Can someone explain what it means?
    Coherence Logger@9272718 3.4.2/411 ERROR 2009-06-01 16:08:31.396/1217.130 Oracle Coherence GE 3.4.2/411 <Error> (thread=Cluster, member=3): Received cluster heartbeat from the senior Member(Id=7, Timestamp=2009-04-24 12:29:25.802, Address=xx.xxx.xx.xxx:8093, MachineId=55400, Location=machine:server72,process:11324, Role=WeblogicServer) that does not contain this Member(Id=3, Timestamp=2009-06-01 15:48:09.18, Address=xx.xxx.xxx.xx:8091, MachineId=47428, Location=site:ops.company.org,machine:cohserverbox1,process:14401, Role=CoherenceServer); stopping cluster service.
    Thanks Much

    Hi,
    This error essentially means what it says: the process received a cluster heartbeat that did not include the process as a member of the cluster. The process therefore stops its cluster service and will attempt to join the cluster again when appropriate. There are a few reasons that the senior member may not have included the process in its heartbeat. Based on the timestamps and roles, I would first want to confirm the intent to cluster these processes. If the intent is not to cluster these processes, I would adjust their configurations appropriately (e.g., use a distinct port) to form separate clusters. If the intent is to cluster these processes and the error (with the timestamp spread) reproduces, I would want to examine the network topology and look for reasons the members are being dropped from the cluster.
    Regards,
    Harv
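
The rule described in this reply can be sketched as a simple membership check. This is only an illustration of the decision as explained above, not Coherence's actual implementation; the class, enum and method names are hypothetical, and the member ids 3 and 7 come from the quoted log line.

```java
import java.util.Set;

// Hypothetical illustration of the heartbeat membership check.
class HeartbeatCheckSketch {

    enum Action { STAY_JOINED, STOP_AND_REJOIN }

    // A member that is missing from the senior member's view stops its
    // cluster service so that it can try to join again later.
    public static Action onHeartbeat(int localMemberId, Set<Integer> seniorMemberIds) {
        return seniorMemberIds.contains(localMemberId)
                ? Action.STAY_JOINED
                : Action.STOP_AND_REJOIN;
    }

    public static void main(String[] args) {
        // Senior member 7 does not list member 3, as in the logged error.
        System.out.println(onHeartbeat(3, Set.of(7)));
        // Senior member 7 lists member 3: nothing to do.
        System.out.println(onHeartbeat(3, Set.of(3, 7)));
    }
}
```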

  • Pre-check for cluster services setup was unsuccessful on all the nodes.

    hi
    When I run the fixup script I get the output below. If I run cluvfy again, I get another fixup script. What exactly should I do?
    [root@rac-1 grid1]# sh /tmp/CVU_11.2.0.1.0_grid1/runfixup.sh
    Response file being used is :/tmp/CVU_11.2.0.1.0_grid1/fixup.response
    Enable file being used is :/tmp/CVU_11.2.0.1.0_grid1/fixup.enable
    Log file location: /tmp/CVU_11.2.0.1.0_grid1/orarun.log
    uid=1100(grid1) gid=1000(oinstall) groups=1000(oinstall),1100(dba),1200(asmdba),1300(asmadmin),1202(asmoper)
    grid1     hard    nproc    16384
    Value of MAX PROCESSES HARDLIMIT in response file is not greater than value in/etc/security/limits.conf. Hence not changing it.
    grid1     hard    nofile   65536
    Value of FILE OPEN MAX HARDLIMIT in response file is not greater than value in /etc/security/limits.conf.Hence not changing it.
    uid=1100(grid1) gid=1000(oinstall) groups=1000(oinstall),1100(dba),1200(asmdba),1300(asmadmin),1202(asmoper)
    [root@rac-1 grid1]#
    Performing pre-checks for cluster services setup
    Checking node reachability...
    Check: Node reachability from node "rac-1"
      Destination Node                      Reachable?             
      rac-2                                 yes                    
      rac-1                                 yes                    
    Result: Node reachability check passed from node "rac-1"
    Checking user equivalence...
    Check: User equivalence for user "grid1"
      Node Name                             Comment                
      rac-2                                 passed                 
      rac-1                                 passed                 
    Result: User equivalence check passed for user "grid1"
    Checking node connectivity...
    Checking hosts config file...
      Node Name     Status                    Comment                
      rac-2         passed                                           
      rac-1         passed                                           
    Verification of the hosts config file successful
    Interface information for node "rac-2"
    Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU  
    eth0   192.168.1.3     192.168.1.0     0.0.0.0         192.168.1.1     00:1D:72:39:3A:E4 1500 
    virbr0 192.168.122.1   192.168.122.0   0.0.0.0         192.168.1.1     00:00:00:00:00:00 1500 
    eth1   192.168.181.20  192.168.181.0   0.0.0.0         192.168.1.1     00:00:00:00:00:00 1500 
    Interface information for node "rac-1"
    Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU  
    eth0   192.168.1.2     192.168.1.0     0.0.0.0         192.168.1.1     00:00:E8:F7:02:B0 1500 
    eth1   192.168.181.10  192.168.181.0   0.0.0.0         192.168.1.1     00:26:18:59:EE:49 1500 
    virbr0 192.168.122.1   192.168.122.0   0.0.0.0         192.168.1.1     00:00:00:00:00:00 1500 
    Check: Node connectivity of subnet "192.168.1.0"
      Source                          Destination                     Connected?     
      rac-2:eth0                      rac-1:eth0                      yes            
    Result: Node connectivity passed for subnet "192.168.1.0" with node(s) rac-2,rac-1
    Check: TCP connectivity of subnet "192.168.1.0"
      Source                          Destination                     Connected?     
      rac-1:192.168.1.2               rac-2:192.168.1.3               passed         
    Result: TCP connectivity check passed for subnet "192.168.1.0"
    Check: Node connectivity of subnet "192.168.122.0"
      Source                          Destination                     Connected?     
      rac-2:virbr0                    rac-1:virbr0                    yes            
    Result: Node connectivity passed for subnet "192.168.122.0" with node(s) rac-2,rac-1
    Check: TCP connectivity of subnet "192.168.122.0"
    Result: TCP connectivity check failed for subnet "192.168.122.0"
    Check: Node connectivity of subnet "192.168.181.0"
      Source                          Destination                     Connected?     
      rac-2:eth1                      rac-1:eth1                      yes            
    Result: Node connectivity passed for subnet "192.168.181.0" with node(s) rac-2,rac-1
    Check: TCP connectivity of subnet "192.168.181.0"
      Source                          Destination                     Connected?     
      rac-1:192.168.181.10            rac-2:192.168.181.20            passed         
    Result: TCP connectivity check passed for subnet "192.168.181.0"
    Interfaces found on subnet "192.168.1.0" that are likely candidates for VIP are:
    rac-2 eth0:192.168.1.3
    rac-1 eth0:192.168.1.2
    Interfaces found on subnet "192.168.122.0" that are likely candidates for a private interconnect are:
    rac-2 virbr0:192.168.122.1
    rac-1 virbr0:192.168.122.1
    Interfaces found on subnet "192.168.181.0" that are likely candidates for a private interconnect are:
    rac-2 eth1:192.168.181.20
    rac-1 eth1:192.168.181.10
    Result: Node connectivity check passed
    Check: Total memory
      Node Name     Available                 Required                  Comment  
      rac-2         1.96GB (2050416.0KB)      1.5GB (1572864.0KB)       passed   
      rac-1         1.96GB (2058984.0KB)      1.5GB (1572864.0KB)       passed   
    Result: Total memory check passed
    Check: Available memory
      Node Name     Available                 Required                  Comment  
      rac-2         1.7GB (1780600.0KB)       50MB (51200.0KB)          passed   
      rac-1         1.56GB (1636896.0KB)      50MB (51200.0KB)          passed   
    Result: Available memory check passed
    Check: Swap space
      Node Name     Available                 Required                  Comment  
      rac-2         4GB (4194296.0KB)         2.93GB (3075624.0KB)      passed   
      rac-1         4GB (4192956.0KB)         2.95GB (3088476.0KB)      passed   
    Result: Swap space check passed
    Check: Free disk space for "rac-2:/tmp"
      Path              Node Name     Mount point   Available     Required      Comment    
      /tmp              rac-2         /             24.03GB       1GB           passed     
    Result: Free disk space check passed for "rac-2:/tmp"
    Check: Free disk space for "rac-1:/tmp"
      Path              Node Name     Mount point   Available     Required      Comment    
      /tmp              rac-1         /             16.54GB       1GB           passed     
    Result: Free disk space check passed for "rac-1:/tmp"
    Check: User existence for "grid1"
      Node Name     Status                    Comment                
      rac-2         exists                    passed                 
      rac-1         exists                    passed                 
    Result: User existence check passed for "grid1"
    Check: Group existence for "oinstall"
      Node Name     Status                    Comment                
      rac-2         exists                    passed                 
      rac-1         exists                    passed                 
    Result: Group existence check passed for "oinstall"
    Check: Group existence for "dba"
      Node Name     Status                    Comment                
      rac-2         exists                    passed                 
      rac-1         exists                    passed                 
    Result: Group existence check passed for "dba"
    Check: Membership of user "grid1" in group "oinstall" [as Primary]
      Node Name         User Exists   Group Exists  User in Group  Primary       Comment    
      rac-2             yes           yes           yes           yes           passed     
      rac-1             yes           yes           yes           yes           passed     
    Result: Membership check for user "grid1" in group "oinstall" [as Primary] passed
    Check: Membership of user "grid1" in group "dba"
      Node Name         User Exists   Group Exists  User in Group  Comment        
      rac-2             yes           yes           no            failed         
      rac-1             yes           yes           yes           passed         
    Result: Membership check for user "grid1" in group "dba" failed
    Check: Run level
      Node Name     run level                 Required                  Comment  
      rac-2         5                         3,5                       passed   
      rac-1         5                         3,5                       passed   
    Result: Run level check passed
    Check: Hard limits for "maximum open file descriptors"
      Node Name         Type          Available     Required      Comment        
      rac-2             hard          65536         65536         passed         
      rac-1             hard          65536         65536         passed         
    Result: Hard limits check passed for "maximum open file descriptors"
    Check: Soft limits for "maximum open file descriptors"
      Node Name         Type          Available     Required      Comment        
      rac-2             soft          1024          1024          passed         
      rac-1             soft          65536         1024          passed         
    Result: Soft limits check passed for "maximum open file descriptors"
    Check: Hard limits for "maximum user processes"
      Node Name         Type          Available     Required      Comment        
      rac-2             hard          16384         16384         passed         
      rac-1             hard          16384         16384         passed         
    Result: Hard limits check passed for "maximum user processes"
    Check: Soft limits for "maximum user processes"
      Node Name         Type          Available     Required      Comment        
      rac-2             soft          2047          2047          passed         
      rac-1             soft          16384         2047          passed         
    Result: Soft limits check passed for "maximum user processes"
    Check: System architecture
      Node Name     Available                 Required                  Comment  
      rac-2         x86_64                    x86_64                    passed   
      rac-1         x86_64                    x86_64                    passed   
    Result: System architecture check passed
    Check: Kernel version
      Node Name     Available                 Required                  Comment  
      rac-2         2.6.18-92.el5             2.6.18                    passed   
      rac-1         2.6.18-164.el5            2.6.18                    passed   
    WARNING:
    PRVF-7524 : Kernel version is not consistent across all the nodes.
    Kernel version = "2.6.18-164.el5" found on nodes: rac-1.
    Kernel version = "2.6.18-92.el5" found on nodes: rac-2.
    Result: Kernel version check passed
    Check: Kernel parameter for "semmsl"
      Node Name     Configured                Required                  Comment  
      rac-2         250                       250                       passed   
      rac-1         250                       250                       passed   
    Result: Kernel parameter check passed for "semmsl"
    Check: Kernel parameter for "semmns"
      Node Name     Configured                Required                  Comment  
      rac-2         32000                     32000                     passed   
      rac-1         32000                     32000                     passed   
    Result: Kernel parameter check passed for "semmns"
    Check: Kernel parameter for "semopm"
      Node Name     Configured                Required                  Comment  
      rac-2         100                       100                       passed   
      rac-1         100                       100                       passed   
    Result: Kernel parameter check passed for "semopm"
    Check: Kernel parameter for "semmni"
      Node Name     Configured                Required                  Comment  
      rac-2         142                       128                       passed   
      rac-1         142                       128                       passed   
    Result: Kernel parameter check passed for "semmni"
    Check: Kernel parameter for "shmmax"
      Node Name     Configured                Required                  Comment  
      rac-2         1049812992                536870912                 passed   
      rac-1         4398046511104             536870912                 passed   
    Result: Kernel parameter check passed for "shmmax"
    Check: Kernel parameter for "shmmni"
      Node Name     Configured                Required                  Comment  
      rac-2         4096                      4096                      passed   
      rac-1         4096                      4096                      passed   
    Result: Kernel parameter check passed for "shmmni"
    Check: Kernel parameter for "shmall"
      Node Name     Configured                Required                  Comment  
      rac-2         3279547                   2097152                   passed   
      rac-1         1073741824                2097152                   passed   
    Result: Kernel parameter check passed for "shmall"
    Check: Kernel parameter for "file-max"
      Node Name     Configured                Required                  Comment  
      rac-2         6815744                   6815744                   passed   
      rac-1         6815744                   6815744                   passed   
    Result: Kernel parameter check passed for "file-max"
    Check: Kernel parameter for "ip_local_port_range"
      Node Name     Configured                Required                  Comment  
      rac-2         between 9000 & 65500      between 9000 & 65500      passed   
      rac-1         between 9000 & 65500      between 9000 & 65500      passed   
    Result: Kernel parameter check passed for "ip_local_port_range"
    Check: Kernel parameter for "rmem_default"
      Node Name     Configured                Required                  Comment  
      rac-2         262144                    262144                    passed   
      rac-1         4194304                   262144                    passed   
    Result: Kernel parameter check passed for "rmem_default"
    Check: Kernel parameter for "rmem_max"
      Node Name     Configured                Required                  Comment  
      rac-2         4194304                   4194304                   passed   
      rac-1         4194304                   4194304                   passed   
    Result: Kernel parameter check passed for "rmem_max"
    Check: Kernel parameter for "wmem_default"
      Node Name     Configured                Required                  Comment  
      rac-2         262144                    262144                    passed   
      rac-1         262144                    262144                    passed   
    Result: Kernel parameter check passed for "wmem_default"
    Check: Kernel parameter for "wmem_max"
      Node Name     Configured                Required                  Comment  
      rac-2         1048576                   1048576                   passed   
      rac-1         1048576                   1048576                   passed   
    Result: Kernel parameter check passed for "wmem_max"
    Check: Kernel parameter for "aio-max-nr"
      Node Name     Configured                Required                  Comment  
      rac-2         3145728                   1048576                   passed   
      rac-1         3145728                   1048576                   passed   
    Result: Kernel parameter check passed for "aio-max-nr"
    Check: Package existence for "ocfs2-tools-1.2.7"
      Node Name     Available                 Required                  Comment  
      rac-2         ocfs2-tools-1.2.7-1.el5   ocfs2-tools-1.2.7         passed   
      rac-1         ocfs2-tools-1.4.2-1.el5   ocfs2-tools-1.2.7         passed   
    Result: Package existence check passed for "ocfs2-tools-1.2.7"
    Check: Package existence for "make-3.81"
      Node Name     Available                 Required                  Comment  
      rac-2         make-3.81-3.el5           make-3.81                 passed   
      rac-1         make-3.81-3.el5           make-3.81                 passed   
    Result: Package existence check passed for "make-3.81"
    Check: Package existence for "binutils-2.17.50.0.6"
      Node Name     Available                 Required                  Comment  
      rac-2         binutils-2.17.50.0.6-6.el5  binutils-2.17.50.0.6      passed   
      rac-1         binutils-2.17.50.0.6-12.el5  binutils-2.17.50.0.6      passed   
    Result: Package existence check passed for "binutils-2.17.50.0.6"
    Check: Package existence for "gcc-4.1.2"
      Node Name     Available                 Required                  Comment  
      rac-2         gcc-4.1.2-42.el5          gcc-4.1.2                 passed   
      rac-1         gcc-4.1.2-46.el5          gcc-4.1.2                 passed   
    Result: Package existence check passed for "gcc-4.1.2"
    Check: Package existence for "libaio-0.3.106 (i386)"
      Node Name     Available                 Required                  Comment  
      rac-2         libaio-0.3.106-3.2 (i386)  libaio-0.3.106 (i386)     passed   
      rac-1         libaio-0.3.106-3.2 (i386)  libaio-0.3.106 (i386)     passed   
    Result: Package existence check passed for "libaio-0.3.106 (i386)"
    Check: Package existence for "libaio-0.3.106 (x86_64)"
      Node Name     Available                 Required                  Comment  
      rac-2         libaio-0.3.106-3.2 (x86_64)  libaio-0.3.106 (x86_64)   passed   
      rac-1         libaio-0.3.106-3.2 (x86_64)  libaio-0.3.106 (x86_64)   passed   
    Result: Package existence check passed for "libaio-0.3.106 (x86_64)"
    Check: Package existence for "glibc-2.5-24 (i686)"
      Node Name     Available                 Required                  Comment  
      rac-2         glibc-2.5-24 (i686)       glibc-2.5-24 (i686)       passed   
      rac-1         glibc-2.5-42 (i686)       glibc-2.5-24 (i686)       passed   
    Result: Package existence check passed for "glibc-2.5-24 (i686)"
    Check: Package existence for "glibc-2.5-24 (x86_64)"
      Node Name     Available                 Required                  Comment  
      rac-2         glibc-2.5-24 (x86_64)     glibc-2.5-24 (x86_64)     passed   
      rac-1         glibc-2.5-42 (x86_64)     glibc-2.5-24 (x86_64)     passed   
    Result: Package existence check passed for "glibc-2.5-24 (x86_64)"
    Check: Package existence for "compat-libstdc++-33-3.2.3 (i386)"
      Node Name     Available                 Required                  Comment  
      rac-2         compat-libstdc++-33-3.2.3-61 (i386)  compat-libstdc++-33-3.2.3 (i386)  passed   
      rac-1         compat-libstdc++-33-3.2.3-61 (i386)  compat-libstdc++-33-3.2.3 (i386)  passed   
    Result: Package existence check passed for "compat-libstdc++-33-3.2.3 (i386)"
    Check: Package existence for "compat-libstdc++-33-3.2.3 (x86_64)"
      Node Name     Available                 Required                  Comment  
      rac-2         compat-libstdc++-33-3.2.3-61 (x86_64)  compat-libstdc++-33-3.2.3 (x86_64)  passed   
      rac-1         compat-libstdc++-33-3.2.3-61 (x86_64)  compat-libstdc++-33-3.2.3 (x86_64)  passed   
    Result: Package existence check passed for "compat-libstdc++-33-3.2.3 (x86_64)"
    Check: Package existence for "elfutils-libelf-0.125 (x86_64)"
      Node Name     Available                 Required                  Comment  
      rac-2         elfutils-libelf-0.125-3.el5 (x86_64)  elfutils-libelf-0.125 (x86_64)  passed   
      rac-1         elfutils-libelf-0.137-3.el5 (x86_64)  elfutils-libelf-0.125 (x86_64)  passed   
    Result: Package existence check passed for "elfutils-libelf-0.125 (x86_64)"
    Check: Package existence for "elfutils-libelf-devel-0.125"
      Node Name     Available                 Required                  Comment  
      rac-2         elfutils-libelf-devel-0.125-3.el5  elfutils-libelf-devel-0.125  passed   
      rac-1         elfutils-libelf-devel-0.137-3.el5  elfutils-libelf-devel-0.125  passed   
    Result: Package existence check passed for "elfutils-libelf-devel-0.125"
    Check: Package existence for "glibc-common-2.5"
      Node Name     Available                 Required                  Comment  
      rac-2         glibc-common-2.5-24       glibc-common-2.5          passed   
      rac-1         glibc-common-2.5-42       glibc-common-2.5          passed   
    Result: Package existence check passed for "glibc-common-2.5"
    Check: Package existence for "glibc-devel-2.5 (i386)"
      Node Name     Available                 Required                  Comment  
      rac-2         glibc-devel-2.5-24 (i386)  glibc-devel-2.5 (i386)    passed   
      rac-1         glibc-devel-2.5-42 (i386)  glibc-devel-2.5 (i386)    passed   
    Result: Package existence check passed for "glibc-devel-2.5 (i386)"
    Check: Package existence for "glibc-devel-2.5 (x86_64)"
      Node Name     Available                 Required                  Comment  
      rac-2         glibc-devel-2.5-24 (x86_64)  glibc-devel-2.5 (x86_64)  passed   
      rac-1         glibc-devel-2.5-42 (x86_64)  glibc-devel-2.5 (x86_64)  passed   
    Result: Package existence check passed for "glibc-devel-2.5 (x86_64)"
    Check: Package existence for "glibc-headers-2.5"
      Node Name     Available                 Required                  Comment  
      rac-2         glibc-headers-2.5-24      glibc-headers-2.5         passed   
      rac-1         glibc-headers-2.5-42      glibc-headers-2.5         passed   
    Result: Package existence check passed for "glibc-headers-2.5"
    Check: Package existence for "gcc-c++-4.1.2"
      Node Name     Available                 Required                  Comment  
      rac-2         gcc-c++-4.1.2-42.el5      gcc-c++-4.1.2             passed   
      rac-1         gcc-c++-4.1.2-46.el5      gcc-c++-4.1.2             passed   
    Result: Package existence check passed for "gcc-c++-4.1.2"
    Check: Package existence for "libaio-devel-0.3.106 (i386)"
      Node Name     Available                 Required                  Comment  
      rac-2         libaio-devel-0.3.106-3.2 (i386)  libaio-devel-0.3.106 (i386)  passed   
      rac-1         libaio-devel-0.3.106-3.2 (i386)  libaio-devel-0.3.106 (i386)  passed   
    Result: Package existence check passed for "libaio-devel-0.3.106 (i386)"
    Check: Package existence for "libaio-devel-0.3.106 (x86_64)"
      Node Name     Available                 Required                  Comment  
      rac-2         libaio-devel-0.3.106-3.2 (x86_64)  libaio-devel-0.3.106 (x86_64)  passed   
      rac-1         libaio-devel-0.3.106-3.2 (x86_64)  libaio-devel-0.3.106 (x86_64)  passed   
    Result: Package existence check passed for "libaio-devel-0.3.106 (x86_64)"
    Check: Package existence for "libgcc-4.1.2 (i386)"
      Node Name     Available                 Required                  Comment  
      rac-2         libgcc-4.1.2-42.el5 (i386)  libgcc-4.1.2 (i386)       passed   
      rac-1         libgcc-4.1.2-46.el5 (i386)  libgcc-4.1.2 (i386)       passed   
    Result: Package existence check passed for "libgcc-4.1.2 (i386)"
    Check: Package existence for "libgcc-4.1.2 (x86_64)"
      Node Name     Available                 Required                  Comment  
      rac-2         libgcc-4.1.2-42.el5 (x86_64)  libgcc-4.1.2 (x86_64)     passed   
      rac-1         libgcc-4.1.2-46.el5 (x86_64)  libgcc-4.1.2 (x86_64)     passed   
    Result: Package existence check passed for "libgcc-4.1.2 (x86_64)"
    Check: Package existence for "libstdc++-4.1.2 (i386)"
      Node Name     Available                 Required                  Comment  
      rac-2         libstdc++-4.1.2-42.el5 (i386)  libstdc++-4.1.2 (i386)    passed   
      rac-1         libstdc++-4.1.2-46.el5 (i386)  libstdc++-4.1.2 (i386)    passed   
    Result: Package existence check passed for "libstdc++-4.1.2 (i386)"
    Check: Package existence for "libstdc++-4.1.2 (x86_64)"
      Node Name     Available                 Required                  Comment  
      rac-2         libstdc++-4.1.2-42.el5 (x86_64)  libstdc++-4.1.2 (x86_64)  passed   
      rac-1         libstdc++-4.1.2-46.el5 (x86_64)  libstdc++-4.1.2 (x86_64)  passed   
    Result: Package existence check passed for "libstdc++-4.1.2 (x86_64)"
    Check: Package existence for "libstdc++-devel-4.1.2 (x86_64)"
      Node Name     Available                 Required                  Comment  
      rac-2         libstdc++-devel-4.1.2-42.el5 (x86_64)  libstdc++-devel-4.1.2 (x86_64)  passed   
      rac-1         libstdc++-devel-4.1.2-46.el5 (x86_64)  libstdc++-devel-4.1.2 (x86_64)  passed   
    Result: Package existence check passed for "libstdc++-devel-4.1.2 (x86_64)"
    Check: Package existence for "sysstat-7.0.2"
      Node Name     Available                 Required                  Comment  
      rac-2         sysstat-7.0.2-1.el5       sysstat-7.0.2

    The oracle user and group should be the same on both nodes, but in your case they look different.
    Check your user and group again.
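    One way to make that comparison less error-prone is to capture the output of `id oracle` on each node and diff the fields. The following is only a minimal sketch in Python: the two sample strings are hypothetical (they are not taken from this thread), and the helper names `parse_id_output` and `mismatched_fields` are made up for illustration.

```python
import re

# Matches one "number(name)" pair such as 501(oinstall).
PAIR = re.compile(r"(\d+)\(([^)]*)\)")

def parse_id_output(line):
    """Parse one line of `id <user>` output into uid, gid and groups sets."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep and key in ("uid", "gid", "groups"):
            fields[key] = frozenset(PAIR.findall(value))
    return fields

def mismatched_fields(a, b):
    """Return the names of the fields that differ between two parsed outputs."""
    return [k for k in ("uid", "gid", "groups") if a.get(k) != b.get(k)]

# Hypothetical sample output, one string per node (not from the thread).
NODE1 = "uid=500(oracle) gid=501(oinstall) groups=501(oinstall),502(dba)"
NODE2 = "uid=501(oracle) gid=501(oinstall) groups=501(oinstall),503(dba)"

print(mismatched_fields(parse_id_output(NODE1), parse_id_output(NODE2)))
```

    An empty list means the numeric IDs and names match; any field listed needs to be aligned on both nodes before rerunning the check.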
    Babu

  • SAP Cluster service issue

    Here is the description of the PRD cluster scenario (Windows 2008 + Oracle).
    We have 2 nodes:
    1. host-erpn01 (has ASCS, Database instance, Enqueue and Dialog Instance installed)
    2. host-erp02 (has Central Instance, Dialog Instance and Enqueue installed)
    When we move the "SAP SID" service from one node to the other using the "failover cluster management tool", it fails, and we have to manually bring the "SAP SID cluster service" and "SAP SID cluster instance" online.
    Both the service and the instance came online after manual selection; however, after some time the SAP instances hosted on node 1 show a red cross in the MMC console of node 2 and give "cannot connect to sap service dcom interface error 800706BA".
    We copied sapstartsrv.exe from the working directory of the ASCS instance into the CI executable directory, replacing the file there.
    Now disp+work is stopped for the CI instance. Also, in the CI instance executable directory we can see five files with the name sapstartsrv, i.e.
    sapstartsrv.exe.new, sapstartsrv.exe.tmp, sapstartsrv.new, sapstartsrv.pdb and the actual sapstartsrv.exe file.
    Here is the sapstartsrv.log from the CI work directory on node 2.
    trc file: "sapstartsrv.log", trc level: 0, release: "701"
    pid        1968
    Mon Oct 11 15:55:33 2010
    SAP HA Trace: Build in SAP Microsoft Cluster library '701, patch 32, changelist 1046543' initialized
    Initializing SAPControl Webservice
    SapSSLInit failed => https support disabled
    Starting WebService Named Pipe thread
    Starting WebService thread
    Webservice named pipe thread started, listening on port \\.\pipe\sapcontrol_01
    Webservice thread started, listening on port 50113
    GCCIA\csrvadmin is starting SAP System at 2010/10/11 16:09:07
    SAP HA Trace: FindClusterResource: SAP resource not found [sapwinha.cpp, line 334]
    SAP HA Trace: SAP_HA_FindSAPInstance returns: SAP_HA_NOT_CLUSTERED [sapwinha.cpp, line 907]"
    or you can view other logs from the work directory dump at
    http://s000.tinyupload.com/index.php?file_id=45384422007535688902
    Now when we try to start the SAPSID_00 service manually, it gives the error "The SAPSID_00 service failed to start due to the following error: The system cannot find the path specified."
    Please advise.
    Regards
    Edited by: Tech GCCIA on Oct 11, 2010 3:27 PM
    Edited by: Tech GCCIA on Oct 11, 2010 3:28 PM

    Hi Sunil,
    On node 1 there is no listener.trc in the /oracle_home/network/trace folder. Here is the listener.log file in case it is helpful.
    TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 10:37:37
    Copyright (c) 1991, 2007, Oracle.  All rights reserved.
    System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
    Log messages written to D:\oracle\GCP\102\network\log\listener.log
    Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
    Trace level is currently 0
    Started with pid=3116
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=gccia-erpn01.gccia.com.sa)(PORT=1527)))
    Listener completed notification to CRS on start
    TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
    TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 11:59:37
    Copyright (c) 1991, 2007, Oracle.  All rights reserved.
    System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
    Log messages written to D:\oracle\GCP\102\network\log\listener.log
    Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
    Trace level is currently 0
    Started with pid=5036
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
    Listener completed notification to CRS on start
    TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
    10-OCT-2010 12:00:31 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=60592)) * establish * GCP * 0
    10-OCT-2010 12:00:31 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=60593)) * establish * GCP * 0
    10-OCT-2010 12:00:31 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=60594)) * establish * GCP * 0
    10-OCT-2010 12:00:31 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=60595)) * establish * GCP * 0
    10-OCT-2010 12:00:31 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=60596)) * establish * GCP * 0
    10-OCT-2010 13:01:19 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61336)) * establish * GCP * 0
    10-OCT-2010 13:01:37 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61340)) * establish * GCP * 0
    10-OCT-2010 13:01:37 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61341)) * establish * GCP * 0
    10-OCT-2010 13:01:37 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61342)) * establish * GCP * 0
    10-OCT-2010 13:01:37 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61343)) * establish * GCP * 0
    10-OCT-2010 13:01:37 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61344)) * establish * GCP * 0
    10-OCT-2010 13:08:27 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61485)) * establish * GCP * 0
    10-OCT-2010 13:08:42 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61489)) * establish * GCP * 0
    10-OCT-2010 13:08:42 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61490)) * establish * GCP * 0
    10-OCT-2010 13:08:42 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61491)) * establish * GCP * 0
    10-OCT-2010 13:08:42 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61492)) * establish * GCP * 0
    10-OCT-2010 13:08:42 * (CONNECT_DATA=(SID=GCP)(GLOBAL_NAME=GCP.WORLD)(CID=(PROGRAM=D:\oracle\OFS\SRV\fs\fssvr\bin\FsSurrogate.exe)(HOST=GCCIA-ERPN01)(USER=csrvadmin))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=61493)) * establish * GCP * 0
    TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 13:09:57
    Copyright (c) 1991, 2007, Oracle.  All rights reserved.
    System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
    Log messages written to D:\oracle\GCP\102\network\log\listener.log
    Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
    Trace level is currently 0
    Started with pid=2336
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
    Listener completed notification to CRS on start
    TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
    TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 13:14:34
    Copyright (c) 1991, 2007, Oracle.  All rights reserved.
    System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
    Log messages written to D:\oracle\GCP\102\network\log\listener.log
    Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
    Trace level is currently 0
    Started with pid=4948
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
    Listener completed notification to CRS on start
    TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
    TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 13:38:12
    Copyright (c) 1991, 2007, Oracle.  All rights reserved.
    System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
    Log messages written to D:\oracle\GCP\102\network\log\listener.log
    Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
    Trace level is currently 0
    Started with pid=2456
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
    Listener completed notification to CRS on start
    TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
    TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 14:03:35
    Copyright (c) 1991, 2007, Oracle.  All rights reserved.
    System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
    Log messages written to D:\oracle\GCP\102\network\log\listener.log
    Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
    Trace level is currently 0
    Started with pid=2756
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
    Listener completed notification to CRS on start
    TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
    TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 10-OCT-2010 14:10:42
    Copyright (c) 1991, 2007, Oracle.  All rights reserved.
    System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
    Log messages written to D:\oracle\GCP\102\network\log\listener.log
    Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
    Trace level is currently 0
    Started with pid=4812
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCP.WORLDipc)))
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\GCPipc)))
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
    Listener completed notification to CRS on start
    TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
    TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 11-OCT-2010 09:34:05
    Copyright (c) 1991, 2007, Oracle.  All rights reserved.
    System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
    Log messages written to D:\oracle\GCP\102\network\log\listener.log
    Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
    Trace level is currently 0
    Started with pid=1920
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
    Listener completed notification to CRS on start
    TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
    TNSLSNR for 64-bit Windows: Version 10.2.0.4.0 - Production on 11-OCT-2010 21:12:29
    Copyright (c) 1991, 2007, Oracle.  All rights reserved.
    System parameter file is D:\oracle\GCP\102\network\admin\listener.ora
    Log messages written to D:\oracle\GCP\102\network\log\listener.log
    Trace information written to D:\oracle\GCP\102\network\trace\listener.trc
    Trace level is currently 0
    Started with pid=1952
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.11.13)(PORT=1527)))
    Listener completed notification to CRS on start
    TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE

  • The cluster resource host subsystem (RHS) stopped unexpectedly

    I am getting the following error daily at night, and all my services get restarted. Please help.
    Running Windows 2008 R2 Enterprise.
    The cluster resource host subsystem (RHS) stopped unexpectedly. An attempt will be made to restart it. This is usually due to a problem in a resource DLL. Please determine which resource DLL is causing the issue and report the problem to the resource vendor. 

    Hi,
    this hotfix contains the latest clusres.dll. Can you try it?
    http://support.microsoft.com/kb/2854082
    and check this list:
    http://social.technet.microsoft.com/wiki/contents/articles/2008.list-of-cluster-hotfixes-for-windows-server-2008-r2.aspx
    http://OpsMgr.ru/

  • Why am I getting ExchangeWebServices Inbox Error: Error, ErrorServerBusy. The server cannot service this request right now. Try again later

    I recently switched my application that uses EWS from an on-premise Exchange Server to Exchange Online through Office365.
    The process worked just fine for several days, then I started getting the following errors;
    Error accessing [USERNAME] email account.; ExchangeWebServices Inbox Error: Error, ErrorServerBusy, The server cannot service this request right now. Try again later. --> 
    This has been happening for the past 14 hours now. 
    I contacted my Office365 support team and they acted like they had never heard of the Exchange Web Services API, so no help there.
    I can access the mailbox using the O365 web portal and I can access the mailbox account using the Outlook 2013 desktop client. The issue seems specific to EWS
    My program is a Windows service, written in VB.Net. It connects to EWS, goes to the user account inbox, iterates through the inbox extracting attachments from messages, then moves the messages to a saved folder below the inbox.
    I created the wrapper for EWS that I can reference in my project code using the following, run from an elevated VS2012 command prompt;
    wsdl.exe /namespace:ExchangeWebServices /out:EWS.cs https://outlook.office365.com/ews/services.wsdl /username:[email protected] /password:p@ssw0rd
    csc /out:EWS_E2K13_release /target:library EWS.cs
    I bind to EWS in my class, using the following code;
    Imports System.Net
    Imports ExchangeWebServices
    Public Class Exchange2013WebServiceClass
        Private ExchangeBinding As New ExchangeServiceBinding
        Public Sub New(ByVal userEmail As String, ByVal userPassword As String, ByVal URL As String)
            ExchangeBinding.Credentials = New NetworkCredential(userEmail, userPassword)
            ExchangeBinding.Url = URL
        End Sub
    The error that is logged gets triggered when my code makes a call to the following method;
        Public Function GetInboxMessageIDs() As ArrayOfRealItemsType
            Dim returnInboxMessageIds As ArrayOfRealItemsType = Nothing
            Dim errMsg As String = String.Empty
            'Create the request and specify the traversal type.
            Dim FindItemRequest As FindItemType
            FindItemRequest = New FindItemType
            FindItemRequest.Traversal = ItemQueryTraversalType.Shallow
            'Define which item properties are returned in the response.
            Dim ItemProperties As ItemResponseShapeType
            ItemProperties = New ItemResponseShapeType
            ItemProperties.BaseShape = DefaultShapeNamesType.IdOnly
            'Add properties shape to the request.
            FindItemRequest.ItemShape = ItemProperties
            'Identify which folders to search to find items.
            Dim FolderIDArray(0) As DistinguishedFolderIdType
            FolderIDArray(0) = New DistinguishedFolderIdType
            FolderIDArray(0).Id = DistinguishedFolderIdNameType.inbox
            'Add folders to the request.
            FindItemRequest.ParentFolderIds = FolderIDArray
            Try
                'Send the request and get the response.
                Dim FindItemResponse As FindItemResponseType
                FindItemResponse = ExchangeBinding.FindItem(FindItemRequest)
                'Get the response messages.
                Dim ResponseMessage As ResponseMessageType()
                ResponseMessage = FindItemResponse.ResponseMessages.Items
                Dim FindItemResponseMessage As FindItemResponseMessageType
                If ResponseMessage(0).ResponseClass = ResponseClassType.Success Then
                    FindItemResponseMessage = ResponseMessage(0)
                    returnInboxMessageIds = FindItemResponseMessage.RootFolder.Item
                Else
                    '' Server error
                    Dim responseClassStr As String = [Enum].GetName(GetType(ExchangeWebServices.ResponseClassType), ResponseMessage(0).ResponseClass).ToString
                    Dim responseCodeStr As String = [Enum].GetName(GetType(ExchangeWebServices.ResponseCodeType), ResponseMessage(0).ResponseCode).ToString
                    Dim messageTextStr As String = ResponseMessage(0).MessageText.ToString
                    Dim thisErrMsg As String = String.Format("ExchangeWebServices Inbox Error: {0}, {1}, {2}", responseClassStr, responseCodeStr, messageTextStr)
                    errMsg = If(errMsg.Equals(String.Empty), String.Empty, errMsg & "; ") & thisErrMsg
                End If
            Catch ex As Exception
                'errMsg = String.Join("; ", errMsg, ex.Message)
                errMsg = If(errMsg.Equals(String.Empty), String.Empty, errMsg & "; ") & ex.Message
            End Try
            If Not errMsg.Equals(String.Empty) Then
                returnInboxMessageIds = Nothing
                Throw New System.Exception(errMsg)
            End If
            Return returnInboxMessageIds
        End Function  
    Since the code worked just fine for several days and then suddenly stopped working with a server busy error, I have to think that this is some type of limit or throttling by EWS on the account. I process several thousand emails per day, in chunks of 300 at a time.
    But I have no idea how to check for any limits exceeded. I am nowhere close to my O365 mailbox size limit. Right now, there are over 4,000 messages in my inbox, and growing. 
    Thanks in advance for any ideas you can offer.
    Dave

    All the APIs (EWS, MAPI, ActiveSync, Remote PowerShell) are throttled on Office365 (based around what one particular user could reasonably do). If you haven't had a read of this already, I would recommend
    http://msdn.microsoft.com/en-us/library/office/jj945066(v=exchg.150).aspx
     You can't adjust or even find your current throttle usage so you have to try to design your code around living inside the default limits. If your using One Service Account to access multiple Mailboxes (or if that account is because used across multiple
    applications) that can cause problems. In this case using EWS Impersonation is good solution as described in
    http://blogs.msdn.com/b/exchangedev/archive/2012/04/19/more-throttling-changes-for-exchange-online.aspx (this basically means the Target Mailbox is charged instead of the Service Account).
     Looking at the code one thing I notice missing is your don't appear to be paging the results of FindItems, also have versioned your requests to Exchagne2013. eg ". When the value of the
    RequestServerVersion element indicates Exchange 2010 or an earlier version of Exchange, the server sends a failure response with error code
    ErrorServerBusy. If the value of the RequestServerVersion
    element indicates a version of Exchange starting with Exchange 2010 SP1
    or Exchange Online, and the client is using paging, EWS may return a
    partial result set instead of an error"
    To Page FindItems Correctly you should use the IndexedPageViewType class and page the Items at no more the 1000 at a time eg something like
    // Page through the folder from the beginning, at most 1000 items per request.
    IndexedPageViewType indexedPageView = new IndexedPageViewType();
    indexedPageView.BasePoint = IndexBasePointType.Beginning;
    indexedPageView.Offset = 0;
    indexedPageView.MaxEntriesReturned = 1000;
    indexedPageView.MaxEntriesReturnedSpecified = true;
    FindItemType findItemrequest = new FindItemType();
    findItemrequest.Item = indexedPageView;
    findItemrequest.ItemShape = new ItemResponseShapeType();
    findItemrequest.ItemShape.BaseShape = DefaultShapeNamesType.IdOnly;
    // Extra properties to return alongside the item ids.
    BasePathToElementType[] beAdditionproperties = new BasePathToElementType[3];
    PathToUnindexedFieldType SubjectField = new PathToUnindexedFieldType();
    SubjectField.FieldURI = UnindexedFieldURIType.itemSubject;
    beAdditionproperties[0] = SubjectField;
    PathToUnindexedFieldType RcvdTime = new PathToUnindexedFieldType();
    RcvdTime.FieldURI = UnindexedFieldURIType.itemDateTimeReceived;
    beAdditionproperties[1] = RcvdTime;
    PathToUnindexedFieldType ReadStatus = new PathToUnindexedFieldType();
    ReadStatus.FieldURI = UnindexedFieldURIType.messageIsRead;
    beAdditionproperties[2] = ReadStatus;
    findItemrequest.ItemShape.AdditionalProperties = beAdditionproperties;
    // Target the Inbox of the mailbox being read.
    DistinguishedFolderIdType[] faFolderIDArray = new DistinguishedFolderIdType[1];
    faFolderIDArray[0] = new DistinguishedFolderIdType();
    faFolderIDArray[0].Mailbox = new EmailAddressType();
    faFolderIDArray[0].Mailbox.EmailAddress = "[email protected]";
    faFolderIDArray[0].Id = DistinguishedFolderIdNameType.inbox;
    bool moreAvailible = false;
    findItemrequest.ParentFolderIds = faFolderIDArray;
    int loopCount = 0;
    do
    {
        FindItemResponseType frFindItemResponse = esb.FindItem(findItemrequest);
        if (frFindItemResponse.ResponseMessages.Items[0].ResponseClass == ResponseClassType.Success)
        {
            foreach (FindItemResponseMessageType firmtMessage in frFindItemResponse.ResponseMessages.Items)
            {
                Console.WriteLine("Number of Items retrieved : " + ((ArrayOfRealItemsType)firmtMessage.RootFolder.Item).Items.Length);
                // Keep looping until the server says this page includes the last item.
                moreAvailible = !firmtMessage.RootFolder.IncludesLastItemInRange;
                // Advance the offset by the number of items just returned.
                ((IndexedPageViewType)findItemrequest.Item).Offset += ((ArrayOfRealItemsType)firmtMessage.RootFolder.Item).Items.Length;
                Console.WriteLine("Offset : " + ((IndexedPageViewType)findItemrequest.Item).Offset);
                if (firmtMessage.RootFolder.TotalItemsInView > 0)
                {
                    foreach (ItemType miMailboxItem in ((ArrayOfRealItemsType)firmtMessage.RootFolder.Item).Items)
                    {
                        Console.WriteLine(miMailboxItem.Subject);
                    }
                }
            }
        }
        else
        {
            throw new Exception("error " + frFindItemResponse.ResponseMessages.Items[0].MessageText);
        }
    } while (moreAvailible);
    The support people should be able to help you as long as you can get past the first level. The EWS Managed API has a RequestId header that gets submitted with requests
    http://blogs.msdn.com/b/exchangedev/archive/2012/06/18/exchange-web-services-managed-api-1-2-1-now-released.aspx . In theory they should be able to take this and, from the logs, tell you more about why your request failed.
    Cheers
    Glen
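Since the throttle budget cannot be inspected, one common way to "live inside the limits" is to retry a throttled call after a growing delay. The following is only a minimal sketch of that idea, not EWS client code: `ServerBusy`, `call_with_backoff`, and the `back_off_ms` attribute are hypothetical names standing in for however your client surfaces an ErrorServerBusy response and any back-off hint that accompanies it.

```python
import time


class ServerBusy(Exception):
    """Hypothetical stand-in for an ErrorServerBusy response from the server."""

    def __init__(self, back_off_ms=None):
        super().__init__("server busy")
        # Optional server-provided hint, in milliseconds (assumed, may be None).
        self.back_off_ms = back_off_ms


def call_with_backoff(operation, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run operation(), retrying with exponential backoff while it raises ServerBusy."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ServerBusy as busy:
            if attempt == max_attempts - 1:
                raise  # give up after the last allowed attempt
            if busy.back_off_ms is not None:
                delay = busy.back_off_ms / 1000.0  # honour the server's hint
            else:
                delay = base_delay * (2 ** attempt)  # 1s, 2s, 4s, ...
            sleep(delay)


# Usage: a fake operation that is throttled twice before succeeding.
calls = []
delays = []


def flaky_find_item():
    calls.append(1)
    if len(calls) < 3:
        raise ServerBusy()
    return "ok"


result = call_with_backoff(flaky_find_item, sleep=delays.append)
print(result, delays)
```

The `sleep` parameter is injected only so the example can run without actually waiting; in real use you would leave it at the default.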

  • Cluster multi-block requests were consuming significant database time

    Hi,
    DB : 10.2.0.4 RAC ASM
    OS : AIX 5.2 64-bit
    We are facing serious performance issues, and CPU idle time has dropped to 20%. Based on the AWR report, the top 5 events show that the problem is on the cluster side. I have placed the first node's AWR report here for your suggestions.
    WORKLOAD REPOSITORY report for
    DB Name DB Id Instance Inst Num Release RAC Host
    PROD 1251728398 PROD1 1 10.2.0.4.0 YES msprod1
    Snap Id Snap Time Sessions Curs/Sess
    Begin Snap: 26177 26-Jul-11 14:29:02 142 37.7
    End Snap: 26178 26-Jul-11 15:29:11 159 49.1
    Elapsed: 60.15 (mins)
    DB Time: 915.85 (mins)
    Cache Sizes
    ~~~~~~~~~~~ Begin End
    Buffer Cache: 23,504M 23,504M Std Block Size: 8K
    Shared Pool Size: 27,584M 27,584M Log Buffer: 14,248K
    Load Profile
    ~~~~~~~~~~~~ Per Second Per Transaction
    Redo size: 28,126.82 2,675.18
    Logical reads: 526,807.26 50,105.44
    Block changes: 3,080.07 292.95
    Physical reads: 962.90 91.58
    Physical writes: 157.66 15.00
    User calls: 1,392.75 132.47
    Parses: 246.05 23.40
    Hard parses: 11.03 1.05
    Sorts: 42.07 4.00
    Logons: 0.68 0.07
    Executes: 930.74 88.52
    Transactions: 10.51
    % Blocks changed per Read: 0.58 Recursive Call %: 32.31
    Rollback per transaction %: 9.68 Rows per Sort: 4276.06
    Instance Efficiency Percentages (Target 100%)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Buffer Nowait %: 99.87 Redo NoWait %: 100.00
    Buffer Hit %: 99.84 In-memory Sort %: 99.99
    Library Hit %: 98.25 Soft Parse %: 95.52
    Execute to Parse %: 73.56 Latch Hit %: 99.51
    Parse CPU to Parse Elapsd %: 9.22 % Non-Parse CPU: 99.94
    Shared Pool Statistics Begin End
    Memory Usage %: 68.11 71.55
    % SQL with executions>1: 94.54 92.31
    % Memory for SQL w/exec>1: 98.79 98.74
    Top 5 Timed Events                                         Avg %Total
    ~~~~~~~~~~~~~~~~~~                                        wait   Call
    Event                                 Waits    Time (s)   (ms)   Time Wait Class
    CPU time                                         18,798          34.2
    gc cr multi block request        46,184,663      18,075      0   32.9 Cluster
    gc buffer busy                    2,468,308       6,897      3   12.6 Cluster
    gc current block 2-way            1,826,433       4,422      2    8.0 Cluster
    db file sequential read             142,632         366      3    0.7 User I/O
    RAC Statistics DB/Inst: PROD/PROD1 Snaps: 26177-26178
    Begin End
    Number of Instances: 2 2
    Global Cache Load Profile
    ~~~~~~~~~~~~~~~~~~~~~~~~~ Per Second Per Transaction
    Global Cache blocks received: 14,112.50 1,342.26
    Global Cache blocks served: 619.72 58.94
    GCS/GES messages received: 2,099.38 199.68
    GCS/GES messages sent: 23,341.11 2,220.01
    DBWR Fusion writes: 3.43 0.33
    Estd Interconnect traffic (KB) 122,826.57
    Global Cache Efficiency Percentages (Target local+remote 100%)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Buffer access - local cache %: 97.16
    Buffer access - remote cache %: 2.68
    Buffer access - disk %: 0.16
    Global Cache and Enqueue Services - Workload Characteristics
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Avg global enqueue get time (ms): 0.6
    Avg global cache cr block receive time (ms): 2.8
    Avg global cache current block receive time (ms): 3.0
    Avg global cache cr block build time (ms): 0.0
    Avg global cache cr block send time (ms): 0.0
    Global cache log flushes for cr blocks served %: 11.3
    Avg global cache cr block flush time (ms): 1.7
    Avg global cache current block pin time (ms): 0.0
    Avg global cache current block send time (ms): 0.0
    Global cache log flushes for current blocks served %: 0.0
    Avg global cache current block flush time (ms): 4.1
    Global Cache and Enqueue Services - Messaging Statistics
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Avg message sent queue time (ms): 0.1
    Avg message sent queue time on ksxp (ms): 2.4
    Avg message received queue time (ms): 0.0
    Avg GCS message process time (ms): 0.0
    Avg GES message process time (ms): 0.0
    % of direct sent messages: 6.27
    % of indirect sent messages: 93.48
    % of flow controlled messages: 0.25
    Time Model Statistics DB/Inst: PROD/PROD1 Snaps: 26177-26178
    -> Total time in database user-calls (DB Time): 54951s
    -> Statistics including the word "background" measure background process
    time, and so do not contribute to the DB time statistic
    -> Ordered by % or DB time desc, Statistic name
    Statistic Name Time (s) % of DB Time
    sql execute elapsed time 54,618.2 99.4
    DB CPU 18,798.1 34.2
    parse time elapsed 494.3 .9
    hard parse elapsed time 397.4 .7
    PL/SQL execution elapsed time 38.6 .1
    hard parse (sharing criteria) elapsed time 27.3 .0
    sequence load elapsed time 5.0 .0
    failed parse elapsed time 3.3 .0
    PL/SQL compilation elapsed time 2.1 .0
    inbound PL/SQL rpc elapsed time 1.2 .0
    repeated bind elapsed time 0.8 .0
    connection management call elapsed time 0.6 .0
    hard parse (bind mismatch) elapsed time 0.3 .0
    DB time 54,951.0 N/A
    background elapsed time 1,027.9 N/A
    background cpu time 518.1 N/A
    Wait Class DB/Inst: PROD/PROD1 Snaps: 26177-26178
    -> s - second
    -> cs - centisecond - 100th of a second
    -> ms - millisecond - 1000th of a second
    -> us - microsecond - 1000000th of a second
    -> ordered by wait time desc, waits desc
                                                       Avg
                                   %Time Total Wait   wait     Waits
    Wait Class              Waits  -outs   Time (s)   (ms)      /txn
    Cluster            50,666,311     .0     30,236      1   1,335.4
    User I/O              419,542     .0        811      2      11.1
    Network             4,824,383     .0        242      0     127.2
    Other                 797,753   88.5        208      0      21.0
    Concurrency           212,350     .1        121      1       5.6
    Commit                 16,215     .0         53      3       0.4
    System I/O             60,831     .0         29      0       1.6
    Application             6,069     .0          6      1       0.2
    Configuration             763   97.0          0      0       0.0
    Second node top 5 events are as below,
    Top 5 Timed Events                                         Avg %Total
    ~~~~~~~~~~~~~~~~~~                                        wait   Call
    Event                                 Waits    Time (s)   (ms)   Time Wait Class
    CPU time                                         25,959          42.2
    db file sequential read           2,288,168       5,587      2    9.1 User I/O
    gc current block 2-way              822,985       2,232      3    3.6 Cluster
    read by other session               345,338       1,166      3    1.9 User I/O
    gc cr multi block request           991,270         831      1    1.4 Cluster
    Each node has 95 GB of RAM; the SGA is 51 GB and the PGA is 14 GB.
    Any inputs from your side would be greatly appreciated.
    Thanks,
    Sunand
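As a reading aid for the node 1 report above, the headline ratios can be re-derived from the raw figures. This is just arithmetic on the numbers already posted (DB time, elapsed time, and per-event wait seconds); the variable names are illustrative and nothing here is an Oracle API.

```python
# Figures copied from the node 1 AWR report above.
elapsed_min = 60.15          # Elapsed
db_time_min = 915.85         # DB Time
db_time_s = 54951.0          # "Total time in database user-calls (DB Time)"

top_event_seconds = {
    "CPU time": 18798,
    "gc cr multi block request": 18075,
    "gc buffer busy": 6897,
    "gc current block 2-way": 4422,
    "db file sequential read": 366,
}
cluster_wait_class_s = 30236         # Wait Class "Cluster", Total Wait Time (s)
gc_cr_multi_block_waits = 46184663   # number of waits for that event

# Average number of sessions active in the database during the snapshot.
avg_active_sessions = db_time_min / elapsed_min

# Each event's share of DB time, as a percentage.
pct_of_db_time = {
    name: round(100.0 * secs / db_time_s, 1)
    for name, secs in top_event_seconds.items()
}
cluster_pct = round(100.0 * cluster_wait_class_s / db_time_s, 1)

# Average wait per "gc cr multi block request", in milliseconds.
avg_gc_cr_multi_ms = 1000.0 * top_event_seconds["gc cr multi block request"] / gc_cr_multi_block_waits

print(round(avg_active_sessions, 1), pct_of_db_time, cluster_pct, round(avg_gc_cr_multi_ms, 2))
```

The individual multi-block waits are short (well under 1 ms, which is why the report rounds the average to 0); it is the sheer number of them, about 46 million in the hour, that adds up to roughly a third of DB time.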

    Hi Forstmann,
    Thanks for your update.
    I have also collected an ADDM report; an extract of the Node 1 report is below.
    FINDING 1: 40% impact (22193 seconds)
    Cluster multi-block requests were consuming significant database time.
    RECOMMENDATION 1: SQL Tuning, 6% benefit (3313 seconds)
    ACTION: Run SQL Tuning Advisor on the SQL statement with SQL_ID
    "59qd3x0jg40h1". Look for an alternative plan that does not use
    object scans.
    SYMPTOMS THAT LED TO THE FINDING:
    SYMPTOM: Inter-instance messaging was consuming significant database
    time on this instance. (55% impact [30269 seconds])
    SYMPTOM: Wait class "Cluster" was consuming significant database
    time. (55% impact [30271 seconds])
    FINDING 3: 13% impact (7008 seconds)
    Read and write contention on database blocks was consuming significant
    database time.
    NO RECOMMENDATIONS AVAILABLE
    SYMPTOMS THAT LED TO THE FINDING:
    SYMPTOM: Inter-instance messaging was consuming significant database
    time on this instance. (55% impact [30269 seconds])
    SYMPTOM: Wait class "Cluster" was consuming significant database
    time. (55% impact [30271 seconds])
    Any help from your side , please?
    Thanks,
    Sunand

  • Service Ticket request failed

    Hey,
    Has anyone seen this "alert" coming from the domain controllers?
    Service ticket request failed
    I want to false positive it out because I've investigated.
    But I'd rather go to the server guys with a fix ...

    Yes your understanding is correct. The recommended approach is to tune out all the unneeded raw events at the reporting device itself.
    This will save both the network and MARS from unnecessary traffic. You can find more details about this error at the following:
    http://support.microsoft.com/kb/824905
    http://technet.microsoft.com/en-us/library/bb742435.aspx
    Regards
    Farrukh

  • Configure the ADMIN and CLUSTER service connections to be SSL

    Can you configure the ADMIN and CLUSTER service connections to be SSL
    rather than tcp?
    I was wondering about the present or future ability to secure other
    connection services with SSL. Can you now or are there future plans
    to configure the ADMIN and CLUSTER service connections to be SSL
    rather than tcp? I suppose I should add the PORTMAPPER to that list.
    My primary interest is for an SSLCLUSTER service in the case where
    two brokers are connected over a non-trusted network. It may
    not be too difficult to secure all the services the same way, but
    perhaps that is on the TODO list.
    A related question is if there are plans to add SSL with client
    authentication as a stronger authentication mechanism than 'simple'
    username and password. I believe you could get the username from
    the client certificate's DN and continue to use the same LDAP user
    repository for access control. I think this is similar to the way
    that BEA's Weblogic server does it.
    Finally should it be possible to deploy the HTTP tunnel servlet to
    a webserver (such as iPlanet Web Server) configured to do SSL with
    client authentication as a work-around to get stronger authentication
    with the current release of the product? Or am I perhaps missing some
    obvious and important detail? :) I guess I would like to know it's been
    done already or is at least possible before I try and do it myself.

    3 scenarios involving SSL are:
    1: JMS client <------- SSL -------> iMQ broker
    2: iMQ admin <------- SSL -------> iMQ broker
    3: iMQ broker <------- SSL -------> iMQ broker (i.e clusters)
    (1) is currently supported in iMQ 2.0
    (2) and (3) are not supported in iMQ 2.0. There are no concrete plans yet to support
    them in the near future, but we'll definitely consider doing it if we
    hear a lot of demand for it.
    ]A related question is if there are plans to add SSL with client
    ]authentication as a stronger authentication mechanism than 'simple'
    ]username and password. I believe you could get the username from
    ]the client certificate's DN and continue to use the same LDAP user
    ]repository for access control. I think this is similar to the way
    ]that BEA's Weblogic server does it.
    This is on our todo list, but due to other more pressing issues we
    have not been able to address it. We will continue to keep it
    on our potential list of new features.
    Sorry if I sound pretty wishy-washy in my responses above, but the fact
    is that the things you mentioned above had to take a backseat
    to other more critical features. That and the usual time/resource
    constraints caused them not to be implemented.
    ]Finally should it be possible to deploy the HTTP tunnel servlet to
    ]a webserver (such as iPlanet Web Server) configured to do SSL with
    ]client authentication as a work-around to get stronger authentication
    ]with the current release of the product? Or am I perhaps missing some
    ]obvious and important detail? :) I guess I would like to know it's been
    ]done already or is at least possible before I try and do it myself.
    Yes, this should be possible (although I don't believe we've tried it here).
    The client authentication here is really only between the JMS client and the
    web server (not between the tunnel servlet and the iMQ broker) and should
    be similar in setup to any other java application talking to iPlanet Web
    Server.

  • Why virtual interfaces added to ManagementOS not visible to Cluster service?

    Hello All,
    I'm starting this new thread since the previous one was answered by our friend Udo. My problem, in short, is the following. A diagram will be enough to explain what I'm trying to achieve. I've set up this lab to learn Hyper-V clustering with 2 nodes, running Hyper-V Server 2012. Both nodes have 3x physical NICs; 1 in each node is dedicated to managing the node. The other two are used to create a NIC team. On top of that NIC team, a virtual switch is created with -AllowManagementOS $False. Next I created the following virtual interfaces, added them to the host partition, and plugged them into the virtual switch created on top of the teamed interface. These virtual interfaces should serve the various networks required.
    For SAN I'm running a Linux VM which hosts an iSCSI target server, and the clustering service has no problem with that. All tests pass OK.
    The problem is that the virtual interfaces added to the hosts do not appear as available networks to the cluster service; instead it only shows the management NIC as the available network to leverage.
    This is making it difficult to understand how to set up a cluster of 2x Hyper-V Server nodes. Can someone help please?
    Regards,
    Shahzad.

    Shahzad,
    I've read this thread a couple of times and I don't think I'm clear on the exact question you're asking.
    When the clustering service goes out to look for "Networks", what it does is scan the IP addresses on each node. Every time it finds an IP in a unique subnet, that subnet is listed as a network. It can't see virtual switches and doesn't care about virtual vs. teamed vs. physical adapters or anything like that. It's just looking at IP addresses. This is why I'm confused when you say, "it won't show virtual interfaces available as networks". "Networks" in this context are IP subnets. I'm not aware of any context where a singular interface would be treated like a network.
    If you've got virtual adapters attached to the management operating system and have assigned IPs to them, the cluster should have discovered those networks. If you have multiple adapters on the same node using IPs in the same subnet, that network will only appear once and the cluster service will only use one adapter from that subnet on that node. The one it picked will be visible on the "Network Connections" tab at the bottom of Failover Cluster Manager when you're on the Networks section.
    Eric Siron Altaro Hyper-V Blog
    I am an independent blog contributor, not an Altaro employee. I am solely responsible for the content of my posts.
    "Every relationship you have is in worse shape than you think."
    Hello Eric and friends, 
    Eric, much appreciated about your interest about the issue and yes I agree with you when you said... "When the clustering service goes out to look for "Networks",
    what it does is scan the IP addresses on each node. Every time it finds an IP in a unique subnet, that subnet is listed as a network. It can't see virtual switches and doesn't care about virtual vs. teamed vs. physical adapters or anything like that. It's
    just looking at IP addresses. This is why I'm confused when you say, "it won't show virtual interfaces available as networks". "Networks" in this context are IP subnets. I'm not aware of any context where a singular interface would be treated
    like a network."
    By networks I meant to say subnets. Let me explain what I've configured so far:
    Node 1 and Node 2 are installed with 3x NICs each. All 3 NICs per node are plugged into the same switch.
    Node1:  131.107.0.50/24
    Node2:  131.107.0.150/24
    A Core Domain controller VM running on Node 1:   131.107.0.200/24
    A JUMPBOX (WS 2012 R2 Std.) VM running on Node 1: 131.107.0.100/24
    A Linux SAN VM running on Node 2: 10.1.1.100/8
    I planned to configure the following networks:
    (1) Cluster traffic:     10.0.0.50/24      (IP given to virtual interface for Cluster traffic in Node1)
        Cluster traffic:     10.0.0.150/24     (IP given to virtual interface for Cluster traffic in Node2)
    (2) SAN traffic:         10.1.1.50/8       (IP given to virtual interface for SAN traffic in Node1)
        SAN traffic:         10.1.1.150/8      (IP given to virtual interface for SAN traffic in Node2)
    Note: The cluster service has no problem accessing the SAN VM (10.1.1.100) over this network; it validates the SAN settings and comes back OK. This is an indication that the virtual interface is working fine.
    (3) Migration traffic:   172.168.0.50/8    (IP given to virtual interface for Migration traffic in Node1)
        Migration traffic:   172.168.0.150/8   (IP given to virtual interface for Migration traffic in Node2)
    All these networks (virtual interfaces) are made available through two virtual switches which are configured EXACTLY identically on both Node1/Node2.
    Now, after finishing the cluster validation steps (which all come back OK), when the Create Cluster wizard starts it only shows one network, i.e. the network of the physical Layer 2 switch, 131.107.0.0/24.
    I wonder why it won't show the IPs of the other networks (10.0.0.0/8, 10.1.1.0/8 and 172.168.0.0/8).
    Regards,
    Shahzad
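Eric's description (one cluster network per unique IP subnet) can be applied to the addresses listed above to see which subnets they actually fall into. The sketch below uses only Python's standard `ipaddress` module; the labels and the `planned` dictionary are illustrative, and it does not claim to reproduce how the cluster service itself enumerates networks.

```python
import ipaddress
from collections import defaultdict

# Addresses exactly as listed in the post above, grouped by intended purpose.
planned = {
    "Management": ["131.107.0.50/24", "131.107.0.150/24"],
    "Cluster": ["10.0.0.50/24", "10.0.0.150/24"],
    "SAN": ["10.1.1.50/8", "10.1.1.150/8"],
    "Migration": ["172.168.0.50/8", "172.168.0.150/8"],
}

# Group every address under the subnet its prefix length implies.
subnets = defaultdict(list)
for label, addresses in planned.items():
    for address in addresses:
        interface = ipaddress.ip_interface(address)
        subnets[interface.network].append((label, str(interface.ip)))

# Find pairs of subnets whose address ranges overlap.
ordered = sorted(subnets)
overlaps = [
    (a, b)
    for i, a in enumerate(ordered)
    for b in ordered[i + 1:]
    if a.overlaps(b)
]

for network in ordered:
    print(network, subnets[network])
for a, b in overlaps:
    print("overlap:", a, "contains or intersects", b)
```

With a /8 prefix, 10.1.1.50 belongs to 10.0.0.0/8 and 172.168.0.50 to 172.0.0.0/8, so the SAN subnet as configured contains the 10.0.0.0/24 cluster subnet. Whether that overlap is what hides the networks in the wizard is not established in this thread, but it is worth checking the prefix lengths before looking elsewhere.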
