Entire cluster hanging (8 appserver, 3 machines)

          Yesterday around 3:40 PM, all eight of our appservers crashed for some reason.
          The appservers are distributed across 3 physical SPARC servers. We have 3 on
          two boxes, and 2 on the third box. The admin server is on the third box as well.
          These boxes are running Solaris 8 and Weblogic 6.0. We are using JDK 1.3.1.01.
          The admin server was brought down and restarted. About 10 minutes after the admin
          server went down, the managed servers started running out of threads. All the
          execute threads were being used, and you could not connect to any of the appservers.
          They had to be manually killed off and restarted.
          Has anyone seen anything like this?
          I will submit more info if requested...
          

Bad application code or bad environments (database is down or slow) will do
          that.
          Do you have any other information? Thread dumps? Error messages? Anything?
          Mike
          "David Stovall" <[email protected]> wrote in message
          news:3c3f8c6f$[email protected]..
          >
          > Yesterday around 3:40 PM, all eight of our appservers, crashed for some reason.
          > The appservers are distributed across 3 physical SPARC servers. We have 3 on
          > two boxes, and 2 on the third box. The admin server is on the third box as well.
          > These boxes are running Solaris 8 and Weblogic 6.0. We are using JDK 1.3.1.01.
          >
          > The admin server was brought down and restarted. About 10 minutes after the admin
          > server went down, the managed servers started running out of threads. All the
          > execute threads were being used, and you could not connect to any of the appservers.
          > They had to be manually killed off and restarted.
          >
          > Has anyone seen anything like this?
          >
          > I will submit more info if requested...
          

Similar Messages

  • JDBC Connection pools and clusters (is max connection for entire cluster?)

    Hi,
    Quick question.
    When using JDBC connection pools in WAS 6.40 (SP13) in a clustered environment, is the max connections setting the number that
    a) each application server can use, or
    b) the entire cluster can use?
    I believe it is a), but I'd like it confirmed by someone else.

    Hi Dagfinn,
    your assumption is correct. Therefore, in a cluster environment you'd need to make sure the DB can open (number of nodes) x (max connections) connections.
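
    As a rough illustration of the sizing rule in the reply above, assuming every node runs its own pool with the same maximum (the function name and the example figures are hypothetical, not taken from the thread):

```python
def required_db_connections(nodes: int, max_connections_per_node: int) -> int:
    """Upper bound on the connections the database must be able to accept.

    Assumes each node has its own pool with the same configured maximum.
    """
    if nodes < 0 or max_connections_per_node < 0:
        raise ValueError("counts must be non-negative")
    return nodes * max_connections_per_node


# Hypothetical example: 4 nodes, each pool capped at 25 connections.
print(required_db_connections(4, 25))  # prints 100
```

    If the nodes are configured with different maximums, the bound is the sum of the per-node maximums instead.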

  • Does Dreamweaver enable us to change a link's destination from throughout an ENTIRE cluster of Pages?

    I have a link to an external site that appears throughout an entire cluster of pages within my website. Is there a Dreamweaver CS5 command that would let me modify that link on all of those pages simultaneously? That would save me considerable time compared with making the change manually, page by page. Perhaps a command like this exists?
    "Change all occurrences of the following address from ___  to ____ throughout the entire site"
    Any suggestions, please?

    Dreamweaver allows for three different methods of doing this.
    As suggested, one may use server-side code to call a text page that you just change in one place.
    Also, as suggested, one may use Dreamweaver's Find and Replace tool to replace all instances of a particular link. If you are looking to do this in source code, you have to be very careful about how you specify what to search for and exactly what you wish the replacement code to be.
    Last, I have started using templates for stuff that has reoccurring information throughout a website. Things like navigation, footers, headers and so on are very useful for automating a change in reoccurring content. Where you have a page link that changes in navigation, if you have a template that handles that in all pages, you change it once and Dreamweaver will ask you if you want to update all files based on that template. Answer "Yes," and all page changes will be made.
    Pages may be linked to only one template and one should never nest templates.
    So, to fully answer your question, yes. There are three methods at your disposal.
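
    For comparison only, the same site-wide replacement can be sketched outside Dreamweaver with a short script. This is not Dreamweaver's Find and Replace tool; the function name, the `*.html` pattern, and the UTF-8 encoding are assumptions made for illustration:

```python
from pathlib import Path


def replace_link(site_root: str, old_url: str, new_url: str, pattern: str = "*.html") -> int:
    """Replace old_url with new_url in every matching file under site_root.

    Returns the number of files that were actually changed.
    """
    changed = 0
    for path in Path(site_root).rglob(pattern):
        text = path.read_text(encoding="utf-8")
        if old_url in text:
            # Plain string replacement: every occurrence of old_url is rewritten.
            path.write_text(text.replace(old_url, new_url), encoding="utf-8")
            changed += 1
    return changed
```

    As with the source-code search described above, the search string has to be specific enough that it matches only the link you intend to change, and it is worth running this on a copy of the site first.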

  • Add cluster nodes from multiple machines to WebLogic domain in OEM 10.2.0.5

    Hello,
    I want to monitor a WebLogic domain in Oracle Enterprise Manager 10.2.0.5 with the following layout:
    - Admin server on machine 1
    - managed server, cluster node a on machine 2
    - managed server, cluster node b on machine 3
    How can I do this?
    When I go to "Add Weblogic Domain", I can enter the admin adress (machine 1) and tick the box to say that there is an agent running on another host (where I specify machine 2). However I do not see a possibility to discover managed servers from machine 3.
    Does anyone know how to do this?
    Thanks,
    Nadja

    LSNRCTL> status
    Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
    STATUS of the LISTENER
    Alias LISTENER
    Version TNSLSNR for Linux: Version 11.1.0.6.0 - Production
    Start Date 28-JAN-2010 00:36:10
    Uptime 0 days 17 hr. 11 min. 52 sec
    Trace Level off
    Security ON: Local OS Authentication
    SNMP OFF
    Listener Parameter File /oracle/app/oracle/product/11.1.0/db/network/admin/listener.ora
    Listener Log File /oracle/app/oracle/diag/tnslsnr/corp1052/listener/alert/log.xml
    Listening Endpoints Summary...
    (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=corp1052)(PORT=1521)))
    Services Summary...
    Service "+ASM" has 1 instance(s).
    Instance "+ASM2", status READY, has 1 handler(s) for this service...
    Service "+ASM_XPT" has 1 instance(s).
    Instance "+ASM2", status READY, has 1 handler(s) for this service...
    Service "dex.example.com" has 2 instance(s).
    Instance "dex1", status READY, has 1 handler(s) for this service...
    Instance "dex2", status READY, has 2 handler(s) for this service...
    Service "dexXDB.example.com" has 2 instance(s).
    Instance "dex1", status READY, has 1 handler(s) for this service...
    Instance "dex2", status READY, has 1 handler(s) for this service...
    Service "dex_XPT.example.com" has 2 instance(s).
    Instance "dex1", status READY, has 1 handler(s) for this service...
    Instance "dex2", status READY, has 2 handler(s) for this service...
    The command completed successfully
    The output of SQLPlus:
    [oracle@dbhost: db]$ bin/sqlplus dex@DEX
    SQL*Plus: Release 11.1.0.6.0 - Production on Thu Jan 28 18:40:11 2010
    Copyright (c) 1982, 2007, Oracle. All rights reserved.
    Enter password:
    Connected to:
    Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - 64bit Production
    With the Partitioning, Real Application Clusters, OLAP, Data Mining
    and Real Application Testing options

  • Aperture (and entire iMac) hangs brushing adjustments while zoomed in

    This problem is almost 100% repeatable on my iMac running Aperture 3.0.3. I've been working on brushing in various details on a photo (mainly blurs, sharpening and some level adjustments) and in some cases I've needed to zoom in tightly (200% or more) to get to the detail that I need to adjust. Almost every time I do this, the system eventually hangs after trying to brush in the adjustments. It never seems to hang while brushing in adjustments when not zoomed in.
    Not only does Aperture hang, but the entire computer hangs. I can't force quit Aperture or anything else. The Dock becomes unresponsive. I do not get the spinning beach ball. All I get is an active mouse cursor. Some background tasks continue to run, but display tasks seem to stop. I suspect that what's hanging here is the video display system, but I haven't seen anything in the Console to confirm that. The only way out is to power down the system and restart it.
    Is anyone else seeing this? I don't know if this is related to OS X 10.6.4 or Aperture since this is the first time that I've been zooming in at this level for any adjustments.

    Wow - interesting. I haven't seen all of the issues that some of the others in that thread have seen, but my issue is obviously related. It sounds like Apple may have installed a new video driver in the 10.6.4 upgrade which certainly could cause these kinds of issues. Let's hope they get them fixed!

  • Entire library deleted on time machine

    My entire library on my Time Machine was erased after I got a message saying Time Machine needs to back up in a different location. I followed the prompts, and it deleted my entire library, not just old backups. Can I get it back?

    Time Machine will only delete its own sparsebundle.
    What do you mean it deleted your entire library? It will only have done that if it was part of the TM backup or you loaded it in the wrong place, ie inside the TM sparsebundle.
    Now that it is deleted.. stop using the disk immediately.. if it has done a backup you are probably already too late.
    The only way to recover is to physically remove the drive from the TC.
    See http://www.ifixit.com/Device/Apple_Time_Capsule for instructions
    Plug the drive into your computer using a USB hard disk shell..
    And you can run a disk recovery software on the computer.. disk warrior or data rescue 3 are the best known Mac ones.. they will cost you around $100.. and you will need another USB disk to recover the files to unless the Mac has a large partition you can use.
    Otherwise it has to go to data recovery people and spend your money.. up to $1000 for recovery but be warned.. if TM already completed a full backup you might be wasting your money.. there are no guarantees in recovery.
    BTW .. this is why we tell people never store files on the TC.

  • Install SQL Server 2012 SP1 on a Windows Server 2012 R2 Failover Cluster - hangs at "Running discovery on remote machine" on VMWare VSphere 5.5 Update 1

    Hi,
    I'm trying to install SQL Server 2012 SP1 on the first node of a Windows Server 2012 R2 failover cluster.
    The install hangs whilst displaying the "Please wait while Microsoft SQL Server 2012 Service Pack 1 Setup processes the current operation." message.
    The detail.txt log file shows as follows:
    (01) 2014-07-17 15:36:35 Slp: -- PidInfoProvider : Use cached PID
    (01) 2014-07-17 15:36:35 Slp: -- PidInfoProvider : NormalizePid is normalizing input pid
    (01) 2014-07-17 15:36:35 Slp: -- PidInfoProvider : NormalizePid found a pid containing dashes, assuming pid is normalized, output pid
    (01) 2014-07-17 15:36:35 Slp: -- PidInfoProvider : Use cached PID
    (01) 2014-07-17 15:36:35 Slp: Completed Action: FinalCalculateSettings, returned True
    (01) 2014-07-17 15:36:35 Slp: Completed Action: ExecuteBootstrapAfterExtensionsLoaded, returned True
    (01) 2014-07-17 15:36:35 Slp: ----------------------------------------------------------------------
    (01) 2014-07-17 15:36:35 Slp: Running Action: RunRemoteDiscoveryAction
    (01) 2014-07-17 15:36:36 Slp: Running discovery on local machine
    (01) 2014-07-17 15:36:36 Slp: Discovery on local machine is complete
    (01) 2014-07-17 15:36:36 Slp: Running discovery on remote machine: XXX-XXX-01
    After about 4 hours and 10 minutes, the step seems to time out and move on; however, it doesn't seem to have discovered what it needs to, and the setup subsequently fails.

    Hi,
    Sorry, the information you provided did not help. Can you post the content of both the summary file and the detail.txt file to a shared location for analysis?
    Can you also download the service pack again and try once more?

  • Unable to create cluster, hangs on forming cluster

     
    Hi all,
    I am trying to create a 2-node cluster on two x64 Windows Server 2008 Enterprise edition servers. I am running the setup from the failover cluster MMC and it seems to run OK right up to the point where the snap-in says creating cluster. Then it seems to hang on "forming cluster" and a message pops up saying "The operation is taking longer than expected". A counter comes up and when it hits 2 minutes the wizard cancels and another message comes up: "Unable to successfully cleanup".
    The validation runs successfully before I start trying to create the cluster. The hardware involved is an HP EVA 6000 and two Dell 2950s.
    I have included the report generated by the create cluster wizard below and the error from the event log on one of the machines (the error is the same on both machines).
    Is there anything I can do to give me a better indication of what is happening, so I can resolve this issue or does anyone have any suggestions for me?
    Thanks in advance.
    Anthony
    Create Cluster Log
    ==================
    Beginning to configure the cluster <cluster>.
    Initializing Cluster <cluster>.
    Validating cluster state on node <Node1>
    Searching the domain for computer object 'cluster'.
    Creating a new computer object for 'cluster' in the domain.
    Configuring computer object 'cluster' as cluster name object.
    Validating installation of the Network FT Driver on node <Node1>
    Validating installation of the Cluster Disk Driver on node <Node1>
    Configuring Cluster Service on node <Node1>
    Validating installation of the Network FT Driver on node <Node2>
    Validating installation of the Cluster Disk Driver on node <Node2>
    Configuring Cluster Service on node <Node2>
    Waiting for notification that Cluster service on node <Node2>
    Forming cluster '<cluster>'.
    Unable to successfully cleanup.
    To troubleshoot cluster creation problems, run the Validate a Configuration wizard on the servers you want to cluster.
    Event Log
    =========
    Log Name:      System
    Source:        Microsoft-Windows-FailoverClustering
    Date:          29/08/2008 19:43:14
    Event ID:      1570
    Task Category: None
    Level:         Critical
    Keywords:     
    User:          SYSTEM
    Computer:      <NODE 2>
    Description:
    Node 'NODE2' failed to establish a communication session while joining the cluster. This was due to an authentication failure. Please verify that the nodes are running compatible versions of the cluster service software.
    Event Xml:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
      <System>
        <Provider Name="Microsoft-Windows-FailoverClustering" Guid="{baf908ea-3421-4ca9-9b84-6689b8c6f85f}" />
        <EventID>1570</EventID>
        <Version>0</Version>
        <Level>1</Level>
        <Task>0</Task>
        <Opcode>0</Opcode>
        <Keywords>0x8000000000000000</Keywords>
        <TimeCreated SystemTime="2008-08-29T18:43:14.294Z" />
        <EventRecordID>4481</EventRecordID>
        <Correlation />
        <Execution ProcessID="2412" ThreadID="3416" />
        <Channel>System</Channel>
        <Computer>NODE2</Computer>
        <Security UserID="S-1-5-18" />
      </System>
      <EventData>
        <Data Name="NodeName">node2</Data>
      </EventData>
    </Event>
    ====
    I have also since tried creating the cluster with the firewall and no success.
    I have tried creating the cluster from the other node and this did not work either.
    I tried creating a cluster with just  a single node and this did create a cluster. I could not join the other node and the network name resource did not come online either. The below is from the event logs.
    Log Name:      System
    Source:        Microsoft-Windows-FailoverClustering
    Date:          01/09/2008 12:42:44
    Event ID:      1207
    Task Category: Network Name Resource
    Level:         Error
    Keywords:     
    User:          SYSTEM
    Computer:      Node1.Domain
    Description:
    Cluster network name resource 'Cluster Name' cannot be brought online. The computer object associated with the resource could not be updated in domain 'Domain' for the following reason:
    Unable to obtain the Primary Cluster Name Identity token.
    The text for the associated error code is: An attempt has been made to operate on an impersonation token by a thread that is not currently impersonating a client.
    The cluster identity 'CLUSTER$' may lack permissions required to update the object. Please work with your domain administrator to ensure that the cluster identity can update computer objects in the domain.
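
    When comparing events like these across both nodes, the "Event Xml" block from the post above can be read programmatically. A minimal sketch, assuming the XML has been saved as text; the XML is abbreviated from the post, and the function name and output keys are hypothetical:

```python
import xml.etree.ElementTree as ET

# Namespace taken from the xmlns attribute of the <Event> element in the post.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Abbreviated copy of the event XML quoted in the post.
EVENT_XML = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-FailoverClustering" />
    <EventID>1570</EventID>
    <Level>1</Level>
    <TimeCreated SystemTime="2008-08-29T18:43:14.294Z" />
    <Computer>NODE2</Computer>
  </System>
  <EventData>
    <Data Name="NodeName">node2</Data>
  </EventData>
</Event>"""


def summarize_event(xml_text: str) -> dict:
    """Pull the fields most useful for triage out of one event record."""
    root = ET.fromstring(xml_text)
    system = root.find("e:System", NS)
    # Collect every named <Data> value under <EventData>.
    data = {d.get("Name"): d.text for d in root.findall("e:EventData/e:Data", NS)}
    return {
        "event_id": int(system.findtext("e:EventID", namespaces=NS)),
        "level": int(system.findtext("e:Level", namespaces=NS)),
        "computer": system.findtext("e:Computer", namespaces=NS),
        "time": system.find("e:TimeCreated", NS).get("SystemTime"),
        "data": data,
    }


print(summarize_event(EVENT_XML))
```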

    I am having the exact same issue... but these are on freshly created virtual machines... no group policy or anything...
    I am 100% unable to create a Virtual Windows server 2012 failover cluster using two virtual fiber channel adapters to connect to the shared storage.
    I've tried using GUI and powershell, I've tried adding all available storage, or not adding it, I've tried renaming the server and changing all the IP addresses....
    To reproduce:
    1. Create two identical Server 2012 virtual machines
    (My Config: 4 CPU's, 4gb-8gb dynamic memory, 40gb HDD, two network cards (one for private, one for mgmt), two fiber cards to connect one to each vsan.)
    2. Update both VM's to current windows updates
    3. Add Failover Clustering role, Reboot, and try to create cluster.
    Cluster passed all validation tests perfectly, but then it gets to "forming cluster" and times out =/
    Any assistance would be greatly appreciated.

  • Cluster with 2 linux machines and ODI console - some questions

    Hello,
    I need to set up a domain with ODI plugins (console etc.) in a clustered environment. The OS is Oracle Linux 6.3.
    I've read this documentation: http://docs.oracle.com/cd/E13222_01/wls/docs81/adminguide/createdomain.html#CreateClusteredDomain and I have some additional questions:
    - I know I need to install WebLogic on 2 machines. Should I install Oracle Data Integrator on 2 machines as well?
    - Creating domains, starting domains etc.: I assume I should do this on the first server (for example, via ssh)? Or will I need to log in via the cluster IP address?
    - Multicast address: this is not entirely clear to me. Should this IP already exist in my environment, and should I configure my network interfaces somehow? Or do I simply need to provide any IP from 224.0.0.0 to 239.255.255.255 and this will work?
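
    The range quoted in the last question, 224.0.0.0 to 239.255.255.255, is the IPv4 multicast block (224.0.0.0/4). A small sketch for checking whether a candidate address falls inside it, using Python's standard `ipaddress` module; the function name and the sample addresses are hypothetical, and the check says nothing about whether a given network will actually route the traffic:

```python
import ipaddress


def is_multicast_address(addr: str) -> bool:
    """Return True if addr is inside the multicast range (224.0.0.0/4 for IPv4)."""
    return ipaddress.ip_address(addr).is_multicast


# Sample addresses chosen only for illustration.
for candidate in ("224.0.0.0", "239.192.0.1", "239.255.255.255", "192.168.1.10"):
    print(candidate, is_multicast_address(candidate))
```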

    MukeshNegi wrote:
    > Which version of weblogic you are using ?
    WebLogic 11g, 64-bit.
    MukeshNegi wrote:
    > if you are using shared filesystem between your machines then you don't need to install again on second server, Simply register ORACLE_HOME for ODI and oracle_common with oraInventory on second server.
    What do you mean by "shared file system"? Let's say I have 2 separate physical machines, and they exist on the same LAN. I assume ORACLE_HOME is the WebLogic home directory, but what is "oracle_common"? Can you describe all of this in more detail?
    MukeshNegi wrote:
    > Simply go to $ODI_ORACLE_HOME/common/bin on server1
    > run config.sh and select following from domain template
    > - Oracle Enterprise Manager Plug-in for ODI
    > - Oracle Enterprise Manager Plug-in
    > - Oracle Data Integrator Console
    > - Oracle Data Integrator Agent
    > - Oracle JRF
    Should I do the same on the second server machine? If not, how will WebLogic know about the other physical machine in my network and that it should be available to join my cluster? There is no domain and no admin server set up on the second server; should I not create them? There are a lot of tutorials describing how to set up a cluster via config.sh or the Enterprise Manager console, but:
    - They describe how to add a managed server to my cluster, but I need to know about physical machines. So, should I create a managed server on the second machine somehow? What about my domains: should they be re-created the same way on the second server? I can't find any information about this; there are only Enterprise Manager screenshots showing how to create a managed server on the same physical machine and how to join managed servers into one cluster. None of this tells me what I should do to complete my scenario.
    - Cluster IP address: I still don't understand this. The end user should be able to access the ODI console via the cluster address, am I right? So, is any system network configuration required? How is this IP address created?
    - I have set up all of this (ODI/WebLogic/domain) on a single machine, and I have a second server with only the operating system installed (the same Oracle Linux). What's the simplest way to join this second physical machine and make all of this work as a clustered environment? Is there a step-by-step instruction or tutorial describing ALL the steps that should be done?
    Sorry for the basic questions; I'm really a newbie with this and I hope you are patient enough to answer all of this ;)

  • Oracle Cluster Hang after failing a network

    Dear,
    I have a cluster database on two Linux 4.0 servers; the Oracle version is 10.2. There are two network adapters installed on each machine (private + public).
    The problem is that when I unplug the network cable from the server (to test failover), the client fails over to the next node, but after some minutes the server hangs with no response, so I have to turn it off and restart it.
    I would appreciate your advice.
    Talal


  • Setting up cluster env on different machines

    Cross posting this from
              "weblogic.developer.interest.general"
              http://forums.bea.com/bea/thread.jspa?threadID=600017824&tstart=0
              I am trying to configure a WLS cluster which would have 2 managed servers and an admin server. Each server would run on a different Linux machine. From the docs it is clear that I need to install the same version of WLS on each of the machines. What is not clear is whether, after I configure the clustered domain using the config.sh script on the admin server machine, I need to copy any of the domain files over to each of the managed servers. Where would the files related to each of the managed servers reside (viz. logs, appserver staging directories etc.)? Or would I need to configure the NodeManager, which would allow the admin server to start each of the managed servers remotely?
              Thanks
              Ramdas

    Hi,
              Run the admin server on the first node. Then on the other node edit <bea_home>/<weblogic_home>/server/bin/startWLS.sh and set the following vars
              @rem WLS_USER - admin username for server startup
              @rem WLS_PW - cleartext password for server startup
              @rem ADMIN_URL - if this variable is set, the server started will be a
              @rem managed server, and will look to the url specified (i.e.
              @rem http://localhost:7001) as the admin server.
              The server will run as a managed server and will try to connect to the admin server. If the connection is established correctly, the managed server will retrieve the configuration from the admin server, create a local copy of it, and keep local logs. The local configuration will be used to start the server even if the admin server is down, but you need to enable independence mode for the managed servers.

  • Leopard applications all hang while Time Machine prepares for backup

    I'm sure others have posted about this but I can't find a usable solution. I have the most recent version of Leopard, a 1TB external FireWire drive dedicated to Time Machine storage and about 400GB of backups on it. Every day at some point, Time Machine starts a backup, pauses while preparing, and the entire Mac grinds to a halt. New processes won't launch, open applications hang, Finder becomes unresponsive, menus and dock freeze in whatever position they were in at the point of launch, and I can't cancel the backup process.
    The only solution is to turn off the backup drive while it's running and wait for the Mac to wake up again (which it does in under a minute). I have no virus protection running. I have erased the drive and started fresh at least once to no avail.
    Since I am effectively breaking a live connection every time I turn the drive off without unmounting it first (an obvious no-no), the data has great potential to be getting more and more screwed up. What can I do?

    Message logs:
    October 12, 2009, 1:43:25 AM
    Volume at path /Volumes/Racer BU does not appear to be the correct backup volume for this computer. (Cookies do not match)
    Backup failed with error: 18
    October 12, 2009, 11:32:32 AM
    Starting standard backup
    Backing up to: /Volumes/Racer BU/Backups.backupdb
    Event store UUIDs don't match for volume: Racer X
    Node requires deep traversal:/ reason:kFSEDBEventFlagMustScanSubDirs|kFSEDBEventFlagReasonEventDBUntrustable|
    No pre-backup thinning needed: 984.2 MB requested (including padding), 443.72 GB available
    Copied 21948 files (368.1 MB) from volume Racer X.
    No pre-backup thinning needed: 548.2 MB requested (including padding), 443.35 GB available
    Copied 961 files (965 KB) from volume Racer X.
    Starting post-backup thinning
    Deleted backup /Volumes/Racer BU/Backups.backupdb/Eric’s Mac Pro/2009-10-10-215828: 443.35 GB now available
    Deleted backup /Volumes/Racer BU/Backups.backupdb/Eric’s Mac Pro/2009-10-10-205855: 443.36 GB now available
    Deleted backup /Volumes/Racer BU/Backups.backupdb/Eric’s Mac Pro/2009-10-10-195856: 443.40 GB now available
    Deleted backup /Volumes/Racer BU/Backups.backupdb/Eric’s Mac Pro/2009-10-10-185908: 443.46 GB now available
    Deleted backup /Volumes/Racer BU/Backups.backupdb/Eric’s Mac Pro/2009-10-10-175907: 443.51 GB now available
    Post-back up thinning complete: 5 expired backups removed
    Backup completed successfully.
    October 12, 2009, 12:18:32 PM
    Starting standard backup
    Backing up to: /Volumes/Racer BU/Backups.backupdb
    No pre-backup thinning needed: 954.4 MB requested (including padding), 443.51 GB available
    Unable to rebuild path cache for source item. Partial source path:
    Copied 16717 files (339.6 MB) from volume Racer X.
    No pre-backup thinning needed: 569.5 MB requested (including padding), 443.17 GB available
    Copied 634 files (18.7 MB) from volume Racer X.
    Starting post-backup thinning
    Deleted backup /Volumes/Racer BU/Backups.backupdb/Eric’s Mac Pro/2009-10-10-225832: 443.24 GB now available
    Deleted backup /Volumes/Racer BU/Backups.backupdb/Eric’s Mac Pro/2009-10-10-165828: 443.25 GB now available
    Deleted backup /Volumes/Racer BU/Backups.backupdb/Eric’s Mac Pro/2009-10-10-155855: 443.30 GB now available
    Deleted backup /Volumes/Racer BU/Backups.backupdb/Eric’s Mac Pro/2009-10-10-145857: 443.36 GB now available
    Deleted backup /Volumes/Racer BU/Backups.backupdb/Eric’s Mac Pro/2009-10-10-135852: 443.38 GB now available
    Post-back up thinning complete: 5 expired backups removed
    Backup completed successfully.
    October 12, 2009, 4:46:51 PM
    Starting standard backup
    Volume at path /Volumes/Racer BU does not appear to be the correct backup volume for this computer. (Cookies do not match)
    Backup failed with error: 18
    etc...
    The long periods between backups are when it is "preparing for backup" and nothing is happening. Sometimes this is while I'm not using it and the machine is asleep.
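
    To put numbers on those periods, the gaps between the timestamps in the log excerpt above can be computed directly. A minimal sketch, assuming the timestamp format shown in the post and an English locale for month names; the function name is hypothetical, and the gaps are measured between entry start times, not between a backup finishing and the next one starting:

```python
from datetime import datetime, timedelta

# Format of the timestamps in the log excerpt, e.g. "October 12, 2009, 1:43:25 AM".
FMT = "%B %d, %Y, %I:%M:%S %p"

# The four entry timestamps copied from the post.
STAMPS = [
    "October 12, 2009, 1:43:25 AM",
    "October 12, 2009, 11:32:32 AM",
    "October 12, 2009, 12:18:32 PM",
    "October 12, 2009, 4:46:51 PM",
]


def gaps_between(stamps: list[str]) -> list[timedelta]:
    """Return the time elapsed between each pair of consecutive log entries."""
    times = [datetime.strptime(s, FMT) for s in stamps]
    return [later - earlier for earlier, later in zip(times, times[1:])]


for gap in gaps_between(STAMPS):
    print(gap)
```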

  • Airtunes hangs when time machine starts

    Hi,
    I have one Time Capsule with one AirPort Express. On Snow Leopard (10.6.2), I've configured Time Machine to back up to the Time Capsule. As soon as Time Machine starts (every hour), AirTunes hangs. I guess this is due to some QoS misconfiguration, but I haven't found anything about it. Does anyone have any idea or fix to limit those hangs?
    Regards
    Marc

    I'm having a very similar problem and it is detrimental.
    Here is the scenario: I have a 1TB Time Capsule as my primary router. I am able to perform backups on all but one of my machines. My iMac backs up via 1GB ethernet to the TC, my MacBookAir (60GB,SS) backs up via wireless N (270mb) and my other MacBookAir (80GB,IDE) also backs up via wireless N without any problems.
    I used to use a 500mb external drive to backup my iMac but outgrew it, so I started backing up to the TC instead. No problems at all.
    I also used to back up my Intel MacMini running Mac OS X Server 10.5.5 to a similar external hard-drive but haven't in a while.
    I decided it finally made sense to backup ALL my Mac's to the same 1TB TimeCapsule but it's just not working out!
    On the MacMini: I turn on TimeMachine, the TimeCapsule has been chosen as a backup location, it prepares the backup for a moment, mounts the backup drive and then appears to begin backing up... No...! First I get a message that the TimeCapsule closed the connection, next I lose Internet connection across my entire network, both wired and wireless, and then the TimeCapsule light turns amber, flashes and essentially does a complete crash / reset and then comes back to life a few moments later. I can back up any other computer except the MacMini to the TimeCapsule without "crashing" it and it makes no sense! Every computer is completely up to date with all the latest patches and the TimeCapsule is running 7.3.2
    I sure hope this gets resolved, seems like other people are pretty disgusted with the performance of the TimeCapsule with problems that have no resolution.

  • How do I restore an entire boot disk from time machine?

    This used to be an option on the older system installers "Restore from time machine backup." I don't see this with the ML installer.

    See How do I restore my entire system?

  • Hyper-V cluster Backup causes virtual machine reboots for common Cluster Shared Volumes members.

    I am having a problem where my VMs are rebooting while other VMs that share the same CSV are being backed up. I have provided all the information that I have gathered to this point below. If I have missed anything, please let me know.
    My HyperV Cluster configuration:
    5 Node Cluster running 2008R2 Core DataCenter w/SP1. All updates as released by WSUS that will install on a Core installation
    Each Node has 8 NICs configured as follows:
     NIC1 - Management/Campus access (26.x VLAN)
     NIC2 - iSCSI dedicated (22.x VLAN)
     NIC3 - Live Migration (28.x VLAN)
     NIC4 - Heartbeat (20.x VLAN)
     NIC5 - VSwitch (26.x VLAN)
     NIC6 - VSwitch (18.x VLAN)
     NIC7 - VSwitch (27.x VLAN)
     NIC8 - VSwitch (22.x VLAN)
    The following additional hotfixes were installed on MS guidance (either during the build or when troubleshooting a stability issue in Jan 2013):
     KB2531907 - Was installed during original building of cluster
     KB2705759 - Installed during troubleshooting in early Jan2013
     KB2684681 - Installed during troubleshooting in early Jan2013
     KB2685891 - Installed during troubleshooting in early Jan2013
     KB2639032 - Installed during troubleshooting in early Jan2013
    Original cluster build was two hosts with quorum drive. Initial two hosts were HST1 and HST5
    Next host added was HST3, then HST6 and finally HST2.
    NOTE: HST4 hardware was used in different project and HST6 will eventually become HST4
    Validation of the cluster comes back with warnings for the following things:
     Updates inconsistent across hosts
      I have tried to manually install "missing" updates and they were not applicable
      Most likely cause is different build times for each machine in cluster
       HST1 and HST5 are both the same level because they were built at same time
       HST3 was not rebuilt from scratch due to time constraints and it actually goes back to Pre-SP1 and has a larger list of updates that others are lacking and hence the inconsistency
       HST6 was built from scratch but has more updates missing than 1 or 5 (10 missing instead of 7)
       HST2 was most recently built and it has the most missing updates (15)
     Storage - List Potential Cluster Disks
      It says there are Persistent Reservations on all 14 of my CSV volumes and thinks they are from another cluster.
      They are removed from the validation set for this reason. These iSCSI volumes/disks were all created new for
      this cluster and have never been a part of any other cluster.
     When I run the Cluster Validation wizard, I get a slew of Event ID 5120 from FailoverClustering. Wording of error:
      Cluster Shared Volume 'Volume12' ('Cluster Disk 13') is no longer available on this node because of
      'STATUS_MEDIA_WRITE_PROTECTED(c00000a2)'. All I/O will temporarily be queued until a path to the
      volume is reestablished.
     Under Storage and Cluster Shared Volumes in Failover Cluster Manager, all disks show online and there is no negative effect from the errors.
    Cluster Shared Volumes
     We have 14 CSVs that are all iSCSI attached to all 5 hosts. They are housed on an HP P4500G2 (LeftHand) SAN.
     I have limited the number of VMs to no more than 7 per CSV as per best practices documentation from HP/Lefthand
     VMs in each CSV are spread out amongst all 5 hosts (as you would expect)
    Backup software we use is BackupChain from BackupChain.com.
    Problem we are having:
     When backup kicks off for a VM, all VMs on same CSV reboot without warning. This normally happens within seconds of the backup starting
    What we have done to troubleshoot this:
     We have tried rebalancing our backups
      Originally, I had backup jobs scheduled to kick off on Friday or Saturday evening after 9pm
      2 or 3 hosts would be backing up VMs (Serially; one VM per host at a time) each night.
      I changed my backup schedule so that of my 90 VMs, only one per CSV is backing up at the same time
       I mapped out my Hosts and CSVs and scheduled my backups to run on week nights where each night, there
       is only one VM backed up per CSV. All VMs can be backed up over 5 nights (there are some VMs that don't
       get backed up). I also staggered the start times for each Host so that only one Host would be starting
       in the same timeframe. There was some overlap for Hosts that had backups that ran longer than 1 hour.
      Testing this new schedule did not fix my problem. It only made it clearer. As each backup timeframe
      started, whichever CSV held the first VM to start would have all of its VMs reboot and come back up.
     I then thought maybe I was still overloading the network, so I decided to disable all of the scheduled backups
     and run them manually. Kicking off a backup on a single VM will, in most cases, cause the reboot of common
     CSV members.
     Ok, maybe there is something wrong with my backup software.
      Downloaded a Demo of Veeam and installed it onto my cluster.
      Did a test backup of one VM and I had no problems.
      Did a test backup of a second VM and I had the same problem. All VMs on the same CSV rebooted.
     Ok, it is not my backup software. Apparently it is VSS. I have looked through various websites. The best troubleshooting
     site I have found for VSS in one place is on BackupChain.com (http://backupchain.com/hyper-v-backup/Troubleshooting.html)
     I have tested almost every process on their list and I will lay out the results below:
      1. I have rebooted HST6 and problems still persist
      2. When I run VSSADMIN delete shadows /all, I have no shadows to delete on any of my 5 nodes
       When I run VSSADMIN list writers, I have no error messages on any writers on any node...
      3. When I check the listed registry key, I only have the built-in MS VSS writer listed (I am using software VSS)
      4. When I run the VSSADMIN Resize ShadowStorage command, there is no shadow storage on any node
      5. I have completed the registration and service cycling on HST6 as laid out here and most of the stuff "errors"
       Only a few of the DLL's actually register.
      6. HyperV Integration Services were reconciled when I worked with MS in early January and I have no indication of
       further issue here.
      7. I did not complete the step to delete the Subscriptions because, again, I have no error messages when I list writers
      8. I removed the Veeam software that I had installed to test (it hadn't added any VSS Writer anyway though)
      9. I can't realistically uninstall my HyperV and test VSS
      10. Already have latest SPs and Updates
      11. This is part of step 5 so I already did this. This seems to be a rehash of various other strategies
     I have used the VSS Troubleshooter that is part of BackupChain (Ctrl-T) and I get the following error:
      ERROR: Selected writer 'Microsoft Hyper-V VSS Writer' is in failed state!
      - Status: 8 (VSS_WS_FAILED_AT_PREPARE_SNAPSHOT)
      - Writer Failure code: 0x800423f0 (<Unknown error code>)
      - Writer ID: {66841cd4-6ded-4f4b-8f17-fd23f8ddc3de}
      - Instance ID: {d55b6934-1c8d-46ab-a43f-4f997f18dc71}
      VSS snapshot creation failed with result: 8000FFFF
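
For readers who want to reproduce the rebalanced schedule described in the troubleshooting notes above (at most one VM per CSV backed up on any given night, spread over five week nights), here is a minimal Python sketch. The VM names, volume names, and the `build_schedule` function are made up for illustration; the per-host staggering of start times mentioned in the post is not modelled.

```python
from collections import defaultdict

NIGHTS = ["Mon", "Tue", "Wed", "Thu", "Fri"]


def build_schedule(vm_to_csv, nights=NIGHTS):
    """Assign each VM a night so that no two VMs on one CSV share a night.

    Returns (schedule, unscheduled): schedule maps night -> list of VM names,
    unscheduled lists VMs that did not fit into the available nights.
    """
    # Group VMs by the CSV they live on (sorted for a stable result).
    by_csv = defaultdict(list)
    for vm, csv in sorted(vm_to_csv.items()):
        by_csv[csv].append(vm)

    schedule = {night: [] for night in nights}
    unscheduled = []
    for csv, vms in by_csv.items():
        # The n-th VM of every CSV goes to the n-th night.
        for slot, vm in enumerate(vms):
            if slot < len(nights):
                schedule[nights[slot]].append(vm)
            else:
                unscheduled.append(vm)
    return schedule, unscheduled


if __name__ == "__main__":
    # Made-up inventory: 3 VMs on Volume1, 6 on Volume2 (illustration only).
    inventory = {f"vm-a{i}": "Volume1" for i in range(3)}
    inventory.update({f"vm-b{i}": "Volume2" for i in range(6)})
    schedule, unscheduled = build_schedule(inventory)
    for night, vms in schedule.items():
        print(night, vms)
    print("not scheduled:", unscheduled)
```

With five nights, any CSV holding more than five VMs leaves some VMs unscheduled, which matches the remark in the post that some VMs do not get backed up.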
    VSS errors in event viewer. Below are representative errors I have received from various Nodes of my cluster:
    I have various of the below spread out over all hosts except for HST6
    Source: VolSnap, Event ID 10, The shadow copy of volume took too long to install
    Source: VolSnap, Event ID 16, The shadow copies of volume x were aborted because volume y, which contains shadow copy storage for this shadow copy, was force dismounted.
    Source: VolSnap, Event ID 27, The shadow copies of volume x were aborted during detection because a critical control file could not be opened.
    I only have one instance of each of these and both of the below are from HST3
    Source: VSS, Event ID 12293, Volume Shadow Copy Service error: Error calling a routine on a Shadow Copy Provider {b5946137-7b9f-4925-af80-51abd60b20d5}. Routine details RevertToSnapshot [hr = 0x80042302, A Volume Shadow Copy Service component encountered an unexpected error.
    Source: VSS, Event ID 8193, Volume Shadow Copy Service error: Unexpected error calling routine GetOverlappedResult.  hr = 0x80070057, The parameter is incorrect.
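
As a reading aid for the hex codes quoted in this post, here is a minimal Python sketch that splits an HRESULT into its severity, facility, and code fields and looks up a symbolic name. The name table is a small hand-built subset assumed from the Windows SDK headers (vsserror.h, winerror.h), not something taken from the original post, and `decode_hresult`, `KNOWN_HRESULTS`, and `FACILITIES` are hypothetical names used only for illustration.

```python
# Hand-built subset of HRESULT names (assumed from vsserror.h / winerror.h).
KNOWN_HRESULTS = {
    0x800423F0: "VSS_E_WRITERERROR_INCONSISTENTSNAPSHOT",
    0x80042302: "VSS_E_UNEXPECTED",
    0x8000FFFF: "E_UNEXPECTED",
    0x80070057: "E_INVALIDARG",
}

# Only the facilities that occur in the codes above.
FACILITIES = {0: "FACILITY_NULL", 4: "FACILITY_ITF", 7: "FACILITY_WIN32"}


def decode_hresult(value):
    """Split a 32-bit HRESULT into its standard bit fields."""
    value &= 0xFFFFFFFF
    facility = (value >> 16) & 0x7FF  # bits 16-26
    return {
        "hex": f"0x{value:08X}",
        "failed": bool(value >> 31),  # bit 31 is the severity bit
        "facility": FACILITIES.get(facility, str(facility)),
        "code": value & 0xFFFF,       # bits 0-15
        "name": KNOWN_HRESULTS.get(value, "<not in table>"),
    }


if __name__ == "__main__":
    for hr in (0x800423F0, 0x8000FFFF, 0x80042302, 0x80070057):
        print(decode_hresult(hr))
```

Under these assumptions, the writer failure code that the troubleshooter reported as an unknown error code resolves to an "inconsistent snapshot" writer error. Treat that as a pointer for further reading rather than a diagnosis.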
    So, basically, everything I have tried has resulted in no success towards solving this problem.
    I would appreciate any assistance that can be provided.
    Thanks,
    Charles J. Palmer
    Wright Flood

    Tim,
    Thanks for the reply. I ran the first two commands and got this:
    Name                                  Role  Metric
    Cluster Network 1                        3   10000
    Cluster Network 2 - HeartBeat            1    1300
    Cluster Network 3 - iSCSI                0   10100
    Cluster Network 4 - LiveMigration        1    1200
    When you look at the properties of each network, this is how I have it configured:
    Cluster Network 1 - Allow cluster network communications on this network and Allow clients to connect through this network (26.x subnet)
    Cluster Network 2 - Allow cluster network communications on this network. New network added while working with Microsoft support last month. (28.x subnet)
    Cluster Network 3 - Do not allow cluster network communications on this network. (22.x subnet)
    Cluster Network 4 - Allow cluster network communications on this network. Existing but not configured to be used by VMs for Live Migration until MS corrected. (20.x subnet)
    Should I modify my metrics further, or are the current values sufficient?
    I worked with an MS support rep because my cluster (once I added the 5th host) stopped being able to live migrate VMs and I had VMs jumping between hosts on startup. It was a mess for a couple of days. They had me add the Heartbeat network as part of the solution to my problem. There doesn't seem to be anywhere to configure a network specifically for CSV, so I assume it would use (based on my metrics above) Cluster Network 4 and then Cluster Network 2 for CSV communications, and would fall back to Cluster Network 1 if both 2 and 4 were down/inaccessible.
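
The fallback order reasoned out above can be checked with a short Python sketch. It assumes, as the paragraph does, that CSV traffic prefers the cluster-enabled network with the lowest metric, and that a Role of 0 means the network is excluded from cluster use. The values are copied from the command output earlier in this reply; `ClusterNetwork` and `csv_preference_order` are hypothetical names for illustration.

```python
from typing import NamedTuple


class ClusterNetwork(NamedTuple):
    name: str
    role: int    # 0 = not used by the cluster, 1 = cluster only, 3 = cluster and client
    metric: int


# Values as posted in the reply above.
NETWORKS = [
    ClusterNetwork("Cluster Network 1", 3, 10000),
    ClusterNetwork("Cluster Network 2 - HeartBeat", 1, 1300),
    ClusterNetwork("Cluster Network 3 - iSCSI", 0, 10100),
    ClusterNetwork("Cluster Network 4 - LiveMigration", 1, 1200),
]


def csv_preference_order(networks):
    """Return cluster-enabled networks sorted by ascending metric."""
    usable = [n for n in networks if n.role != 0]
    return sorted(usable, key=lambda n: n.metric)


if __name__ == "__main__":
    for rank, net in enumerate(csv_preference_order(NETWORKS), start=1):
        print(rank, net.name, net.metric)
```

Under those assumptions the order comes out as Cluster Network 4, then 2, then 1, with the iSCSI network left out entirely.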
    As to the iSCSI getting a second NIC, I would love to but management wants separation of our VMs by subnet and role and hence why I need the 4 VSwitch NICs. I would have to look at adding an additional quad port NIC to my servers and I would be having to
    use half height cards for 2 of my 5 servers for that to work.
    But, on that note, it doesn't appear to actually be a bandwidth issue. I can run a backup for a single VM and get nothing on the network card (It caused the reboots before any real data has even started to pass apparently) and still the problem occurs.
    As to BackupChain, I have been working with the vendor and they are telling me the issue is with VSS. They also say they support CSV; this page (http://backupchain.com/Hyper-V-Backup-Software.html)
    states that they support CSVs. Their tech support has been very helpful but unfortunately, nothing has fixed the problem.
    What is annoying is that not every backup causes a problem. I have a daily backup of one of our machines that runs fine without initiating any additional reboots. But almost every other backup job will trigger the VMs on the common CSV to reboot.
    I understood about the updates but I had to "prove" it to the MS tech I was on the phone with and hence I brought it up. I understand on the storage as well. Why give a warning for something that is working though... I think that is just a poor indicator
    that it doesn't explain that in the report.
    At a loss for what else I can do,
    Charles J. Palmer
