Failover cluster not cleanly shutting down service

I've got a two-node 2008 R2 failover cluster. I have a single service being managed by it that I configured just as a generic service. The failover works perfectly when the service is stopped, or when one of the machines goes down, and the immediate failback I have configured works perfectly in both scenarios as well.
However, there's an issue when I take the networking down on the preferred owner of the service. As far as I can tell (this is the first time I've tried failover clustering, so I'm learning), when I take the networking down, the cluster service shuts down, and in turn shuts down the service I've told it to manage. At this point, when the services aren't running, the service fails over to the secondary as intended. The problem shows up when I turn the networking back on. The service tries and fails to start on the primary (as many times as I've configured it to try), and then eventually gives up and goes back to the secondary.
The reason for this, from examining the service's logs, is that the required port is already in use. I checked some more, and sure enough, when I take the networking offline the service gets shut down, but the executable is still running. This is repeatable every time. When I just stop the service, though, the executables go away. So it's something to do specifically with how the managed service gets shut down *when it's shut down due to the cluster service stopping*. For some reason it's not cleaning up that associated executable.
Any ideas as to why this is happening and how to fix/work around it would be extremely welcome.  Thank you!

Try generating the cluster log using cluster log /g /copy:<path to a local folder>. You might need to bump up the log verbosity using cluster /prop ClusterLogLevel=5 (you can check the current level using cluster /prop).
You can also look at the SCM diagnostic channel in Event Viewer. Start eventvwr and wait for the clock icon on Application and Services Logs to go away. Once the clock icon is gone, select that entry and check Show Analytic and Debug Logs in the menu.
Now expand to the SCM provider, located at
Application and Services Logs\Microsoft\Service Control Manager Performance Diagnostic Provider\Diagnostic
or Microsoft-Windows-Services/Diagnostic.
Enable the log, run the repro, then disable the log. After that you should see events from the SCM showing your service's state transitions.
The terminate parameters do not seem to be configurable. I can think of two ways of fixing the issue:
- Write your own cluster resource DLL, where you can implement your own policies. This would be a place to start: http://blogs.msdn.com/b/clustering/archive/2010/08/24/10053405.aspx
- This option assumes you cannot change the source code of the service to kill orphaned child processes on startup, so you have to clean up by some other means. Create another service and make your service dependent on this new service. This new service must be much faster in responding to SCM commands. When this service starts, use PSAPI to enumerate all processes running on the machine and kill the orphaned child processes. You should probably be able to achieve something similar using a GenScript resource plus a VBScript that does the cleanup.
Regards, Vladimir Petter, Microsoft Corporation
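
The cleanup step in the second option above could look roughly like the following. This is only a sketch, not the approach from the reply: instead of PSAPI it shells out to the standard Windows tasklist and taskkill tools, and it assumes the orphaned executables can be identified by image name alone. The image name worker.exe and all function names are hypothetical.

```python
import csv
import io
import subprocess


def parse_tasklist_csv(text):
    """Parse the output of `tasklist /FO CSV /NH` into (image_name, pid) pairs."""
    rows = csv.reader(io.StringIO(text))
    return [(row[0], int(row[1])) for row in rows
            if len(row) >= 2 and row[1].isdigit()]


def find_orphans(processes, image_names):
    """Return the PIDs whose image name matches one of image_names (case-insensitive)."""
    wanted = {name.lower() for name in image_names}
    return [pid for name, pid in processes if name.lower() in wanted]


def kill_orphans(image_names):
    """Windows only: list running processes and force-kill the matching ones."""
    listing = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    pids = find_orphans(parse_tasklist_csv(listing), image_names)
    for pid in pids:
        # /F forces termination; a process that already exited is not treated as an error.
        subprocess.run(["taskkill", "/PID", str(pid), "/F"], check=False)
    return pids
```

Calling kill_orphans(["worker.exe"]) when the helper service starts (or from a script run before the clustered service comes online) would clear leftover processes so that the port is free again. Matching by image name is a simplification; if other legitimate processes share the name, a stricter check is needed.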

Similar Messages

  • LabVIEW 2010 SP1 may not cleanly shut down

    Occasionally LabVIEW 2010 SP1 (32-bit, running on Windows 7) will appear to "hang" when the "Getting Started" window is closed.  The window disappears but the shortcut icon still appears in the Windows toolbar in the bottom of the screen.  Right-clicking and selecting "Close Program" does not work and Windows seems to think any VIs that have been recently closed are still open.  The program is still in the list in Task Manager and it can successfully be shut down all the way from there.
    This is more of an annoyance than an operational issue (I think) - is there something I am doing (or failing to do) that causes this problem? 
    Thanks in advance.

    Hello Unplugnow,
    Deleting the .ini file will not cause any issues with reloading LabVIEW; it will simply restore LabVIEW's default settings, which will ensure that there are no data corruption issues. I have tried this procedure on my side with no issue, although I would recommend copying the .ini file to another directory in case you need to recover this information later.
    On another note, does this issue arise when you are opening a particular VI, or does it occur when you are simply opening and closing LabVIEW?
    Best,
    Blayne Kettlewell

  • The system has rebooted without cleanly shutting down first. How to clear this error. Occurs frequently.

    Log Name:      System
    Source:        Microsoft-Windows-Kernel-Power
    Date:          31-12-2014 14:35:18
    Event ID:      41
    Task Category: (63)
    Level:         Critical
    Keywords:      (2)
    User:          SYSTEM
    Computer:      BLISS-ORACLE
    Description:
    The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.
    Event Xml:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
      <System>
        <Provider Name="Microsoft-Windows-Kernel-Power" Guid="{331C3B3A-2005-44C2-AC5E-77220C37D6B4}" />
        <EventID>41</EventID>
        <Version>2</Version>
        <Level>1</Level>
        <Task>63</Task>
        <Opcode>0</Opcode>
        <Keywords>0x8000000000000002</Keywords>
        <TimeCreated SystemTime="2014-12-31T09:05:18.479605300Z" />
        <EventRecordID>9113</EventRecordID>
        <Correlation />
        <Execution ProcessID="4" ThreadID="8" />
        <Channel>System</Channel>
        <Computer>BLISS-ORACLE</Computer>
        <Security UserID="S-1-5-18" />
      </System>
      <EventData>
        <Data Name="BugcheckCode">0</Data>
        <Data Name="BugcheckParameter1">0x0</Data>
        <Data Name="BugcheckParameter2">0x0</Data>
        <Data Name="BugcheckParameter3">0x0</Data>
        <Data Name="BugcheckParameter4">0x0</Data>
        <Data Name="SleepInProgress">false</Data>
        <Data Name="PowerButtonTimestamp">0</Data>
      </EventData>
    </Event>
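
When several of these events need to be compared, the EventData fields can be pulled out of the Event XML programmatically. The sketch below uses only Python's standard library; the function names are hypothetical. It relies on the usual reading of Event ID 41, namely that a BugcheckCode of 0 means no stop error was recorded before the reboot, which is worth confirming against Microsoft's documentation for this event.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <Event> element in the exported XML.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def event_data(xml_text):
    """Return the <EventData> name/value pairs of one exported event as a dict."""
    root = ET.fromstring(xml_text)
    return {d.get("Name"): (d.text or "")
            for d in root.findall("e:EventData/e:Data", NS)}


def recorded_bugcheck(xml_text):
    """True if the event carries a non-zero BugcheckCode (a stop error was logged)."""
    return int(event_data(xml_text).get("BugcheckCode", "0")) != 0
```

For the event shown above, recorded_bugcheck would return False, since BugcheckCode is 0.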

    Hi,
    I agree with Mr X. This should be a BSOD issue. Did you follow Mr X's suggestions, and did they solve the issue?
    In addition, to troubleshoot this kind of kernel crash we need to analyze the crash dump file to narrow down the root cause. Please refer to the following articles and check whether they help:
    Crash dump analysis using the Windows debuggers (WinDbg)
    How to read the small memory dump file that is created by Windows if a crash occurs
    By the way, debugging the crash dump file here in the forum may not be effective. If this issue is urgent for you, please contact Microsoft Customer Service and Support (CSS) by telephone so that a dedicated Support Professional can assist with your request.
    To obtain the phone numbers for a specific technology request, please refer to the web site listed below:
    http://support.microsoft.com/default.aspx?scid=fh;EN-US;OfferProPhone#faq607
    Hope this helps.
    Best regards,
    Justin Gu

  • Windows 2008 Failover Cluster - Cannot add a generic service

    Trying to add a generic service in a failover cluster.
    Selecting the Services and Applications option opens the wizard, which then displays the error "An error was encountered while loading the list of services. QueryServiceConfig failed. The system cannot find the file specified"
    The cluster validation wizard completes successfully. Permissions do not appear to be an issue, as this account can seemingly do everything else, so I am at a loss to understand why this API is failing when it tries to query the server for services information.
    Having searched the Internet, the only thing I have found was someone posting a similar issue in the Greek-language TechNet forum (if I recall correctly), and their comment was that they rebuilt their cluster.
    This is a Windows 2008 (SP2) x64 two-node cluster running a non-Microsoft database. We need to add a non-Microsoft enterprise backup solution, and this is their documented method (adding it as a generic service). Both bits of software are from big vendors.
    We run Symantec AV, but I have tried with it disabled, so I don't think it has anything to do with that. Something is stopping the API from reporting back, but I can't find what.
    I would really appreciate some help before we have to log a chargeable call with Microsoft support.
    Thank you

    Hi,
    Have you tried the suggestion? I want to see if the information provided was helpful. Your feedback is very useful for further research. Please feel free to let me know if you have additional questions.
    Best regards,
    Vincent Hu

  • Kernel-Power Event ID 41: The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.

    Hello, we are currently seeing this issue with a couple of our Lenovo T420s laptops with a solid-state drive, roughly 10 or so. The reboots happen randomly and do not create a dump file. We have contacted Lenovo and they are not sure why it's happening. Since this crash I set the system to write a minidump, but this did not work; my next step will be to disable Automatic restart on System Failure to see if it brings anything up. I am also looking at using Procmon to dump to a file.
    If anyone has any other ideas, please let me know.
    The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.
    + System
      - Provider
       [ Name]  Microsoft-Windows-Kernel-Power
       [ Guid]  {331C3B3A-2005-44C2-AC5E-77220C37D6B4}
       EventID 41
       Version 2
       Level 1
       Task 63
       Opcode 0
       Keywords 0x8000000000000002
      - TimeCreated
       [ SystemTime]  2012-02-01T00:02:48.677610900Z
       EventRecordID 8270
       Correlation
      - Execution
       [ ProcessID]  4
       [ ThreadID]  8
       Channel System
       Computer LR8K6TLC.cntr.thrivent.corp
      - Security
       [ UserID]  S-1-5-18
    - EventData
      BugcheckCode 0
      BugcheckParameter1 0x0
      BugcheckParameter2 0x0
      BugcheckParameter3 0x0
      BugcheckParameter4 0x0
      SleepInProgress false
      PowerButtonTimestamp 129725280988099400
    Event 89, Kernel-Power
    ACPI thermal zone ACPI\ThermalZone\THM0 has been enumerated.            
    _PSV = 0K            
    _TC1 = 0            
    _TC2 = 0            
    _TSP = 0ms            
    _AC0 = 0K            
    _AC1 = 0K            
    _AC2 = 0K            
    _AC3 = 0K            
    _AC4 = 0K            
    _AC5 = 0K            
    _AC6 = 0K            
    _AC7 = 0K            
    _AC8 = 0K            
    _AC9 = 0K            
    _CRT = 371K            
    _HOT = 0K            
    _PSL - see event data.
    ---- Details
    + System
      - Provider
       [ Name]  Microsoft-Windows-Kernel-Power
       [ Guid]  {331C3B3A-2005-44C2-AC5E-77220C37D6B4}
       EventID 89
       Version 0
       Level 4
       Task 86
       Opcode 0
       Keywords 0x8000000000000020
      - TimeCreated
       [ SystemTime]  2012-02-01T00:02:49.270411900Z
       EventRecordID 8271
       Correlation
      - Execution
       [ ProcessID]  4
       [ ThreadID]  68
       Channel System
       Computer LR8K6TLC.cntr.thrivent.corp
      - Security
       [ UserID]  S-1-5-18
    - EventData
      ThermalZoneDeviceInstanceLength 21
      ThermalZoneDeviceInstance ACPI\ThermalZone\THM0
      AffinityCount 1
      _PSV 0
      _TC1 0
      _TC2 0
      _TSP 0
      _AC0 0
      _AC1 0
      _AC2 0
      _AC3 0
      _AC4 0
      _AC5 0
      _AC6 0
      _AC7 0
      _AC8 0
      _AC9 0
      _CRT 371
      _HOT 0
      _PSL 0000000000000000
    Thank you.

    We have tested and checked both the BIOS and the firmware of the SSDs.
    The BIOS was up to date, with no issues.
    The firmware was up to date as well.
    Users are still experiencing random reboots. I tried to capture the issue with Procmon, but the PC shuts down completely (it goes to a black screen with no power at all, even with "Automatically restart" turned off under Startup and Recovery), so there are no .dmp files as yet. I am unable to configure ProcDump because I don't know where the issue is or what is causing it.
    I am going to replace one of the PCs with a new one with different hardware to see if this resolves the issue. If anyone has any ideas on how to capture what may be happening, that would be great.
    Thank you.

  • My iTunes will not stay shut down. It automatically restarts even without an iPod plugged in. How do I fix this??

    My iTunes, which is the latest version, will not stay shut down. No matter how many times I exit the software, it automatically starts back up. And NO, there are NO iPods plugged in. It's very annoying and is slowing down our desktop immensly. I've tried to find a setting to change it but I'm not a computer/iTunes expert. Can anyone help??

    try holding the home button and off button at the same time for about 10 seconds, it should restart

  • Sometimes my Mac does not shut down

    Sometimes my Mac does not shut down.
    It shows a white screen for a long time and will not shut down.

    Launch the Console application in any of the following ways:
    ☞ Enter the first few letters of its name into a Spotlight search. Select it in the results (it should be at the top.)
    ☞ In the Finder, press the key combination shift-command-U. The application is in the folder that opens.
    ☞ If you’re running Mac OS X 10.7 or later, open LaunchPad. Click Utilities, then Console in the page that opens.
    Step 1
    Select "system.log" from the file list. Enter "BOOT_TIME" (without the quotes) in the search box. Note the times of the log messages referring to boot times. Now clear the search box and scroll back in the log to the time of the most recent boot when you had the problem. Post the messages logged during the time when you had the problem – the text, please, not a screenshot. For example, if the problem is a slow startup taking three minutes, post the messages timestamped within three minutes after the boot time. If the problem is a crash or a shutdown hang, post the messages from before the boot time, when the system was about to crash or was failing to shut down.
    Edit out excessive repeats and personal information, if any.
    If the log doesn't go back far enough in time, scroll down in the Console file list to /private/var/log/system.log.0.bz2. Search the archived log, and if necessary the older ones below them, for the same information.
    Step 2
    Do the same with kernel.log.
    Step 3
    Still in Console, look under System Diagnostic Reports for crash or panic logs, and post the most recent one, if any. For privacy’s sake, I suggest you edit out the “Anonymous UUID,” a long string of letters, numbers, and dashes in the header of the report, if present (it may not be.) Please do not post shutdownStall or hang logs – they're very long and not helpful.

  • MacBook Pro (OS X 10.9.1) calendar continues to "connect to server" and will not allow shut down or restart. Force quit worked. How can I make this calendar "behave"?

    MacBook Pro (OS X 10.9.1) calendar continues to "connect to server" and will not allow shut down or restart. Force quit worked. How can I make this calendar usable? The problem began after I updated to Maverick.

    babowa, it seems like it is using Fuse & NTFS, so I don't think it's the classic WD + 10.9 mess, but extra WD tools & drivers can still break things, MrTran.
    MrTran, if you must use unsupported disk formats on your Mac you must also consider actually paying the developers that made the trial software.
    It's probably a good idea to follow the developers' removal instructions, reboot & then install one tool at a time.
    MacFuse, FuseOSX and NTFS-3G are all likely to conflict if you run older versions, so you need to be sure you are using the latest version. I can't remember which one depends on the other, so you will need to read the manuals.
    When the disk is readable copy the data to another disk. You could probably do this from a Linux distro or Windows if OS X won't do it.
    If you insist on only using the trial versions you will need to reinstall Mac OS, copy data off this disk & reformat it.
    Is there any good reason for not using the Mac HFS extended format?

  • Safari does not allow shut down why.??

    Safari does not allow shut down. Why? I get a tab that says to quit Safari and I do, but it still does not allow the computer to shut down. Why?

    You can force quit Safari if you really want to shut it down. If the problem remains, try deleting Safari and downloading it again.

  • Ff 33 does not fully shut down

    My FF v 33.0 stays open in processes under task manager.
    All my other browsers do fully shut down.
    I have reloaded FF on my virus/malware free computer but FF will not completely shut down.
    Very annoying.
    Please help

    Hello,
    Try opening Firefox in Safe Mode (see the article "Troubleshoot Firefox issues using Safe Mode") and see if the browser closes properly afterwards. Safe Mode is a troubleshooting mode that turns off some settings and disables most add-ons (extensions and themes). Since Firefox has trouble closing, make sure that it is not running in your Task Manager before opening it in Safe Mode.
    If Firefox is not running, you can start Firefox in Safe Mode as follows:
    * On Windows: Hold the Shift key when you open the Firefox desktop or Start menu shortcut.
    When the Firefox Safe Mode window appears, select "Start in Safe Mode".
    To exit Firefox Safe Mode, just close Firefox and wait a few seconds.
    If the issue is not present in Firefox Safe Mode, your problem is probably caused by an extension, and you need to figure out which one. Please follow the "Troubleshoot extensions, themes and hardware acceleration issues to solve common Firefox problems" article to find the cause.
    When you figure out what's causing your issues, please let us know. It might help others with the same problem.

  • Why does notes keep shutting down when I scroll up or down and when I try to add a note?

    Notes keeps shutting down when I try to scroll up or down the list and even when I try to add a new note.

    Please refer to the following link for a possible solution.
    http://helpx.adobe.com/creative-cloud/kb/ame-premiere-crash-launch-export.html

  • Start up and shut down services

    Hi team,
    Could you tell me how to enable the startup and shutdown services in a Unix environment?
    Regards,
    Rajesh

    What services do you mean? Do you mean how to run a program?
    Unix systems don't really have services like Windows. If you want something to start automatically when you reboot the server, you usually place a script under /etc/init.d or a similar location (it depends on which Unix you are running).
    http://www.ContractOracle.com

  • Database system was not properly shut down

    Dear Mac community,
    I have just repaired a friend's 800 MHz 17" Flat Panel Display iMac. Somehow the hard drive had become badly corrupted, and the system was frequently shutting itself down and freezing with kernel panics as a result. I managed to recover my friend's data using Data Rescue II and, after using Disk Warrior, managed to do an Erase & Install of Mac OS X (10.4.4).
    The iMac appears to be fine now except for one worrying thing that I have noticed in the console each time after restarting:
    Mac OS X Version 10.4.4 (Build 8G32)
    2006-02-02 16:20:47 +0000
    2006-02-02 16:20:50.962 SystemUIServer[201] lang is:en
    LOG: database system shutdown was interrupted at 2006-02-02 16:19:46 GMT
    LOG: checkpoint record is at 0/87904C
    LOG: redo record is at 0/87904C; undo record is at 0/0; shutdown FALSE
    LOG: next transaction id: 550; next oid: 82855
    LOG: database system was not properly shut down; automatic recovery in progress
    LOG: ReadRecord: record with zero length at 0/87908C
    LOG: redo is not required
    LOG: database system is ready
    Does anyone have an idea what this might be and how I can resolve the issue??
    The system's performance doesn't appear to be affected in any way; however, it looks and sounds ominous.
    One other small thing I have noticed is that the iMac won't automatically put itself into sleep mode despite it being set to in the Energy Saver System Preferences. There are no external hardware devices attached such as a printer, firewire device, USB hub etc. I have reset the PRAM and NVRAM and this appears to have had no effect.
    Any advice / suggestions greatly appreciated!!
    Many thanks,
    Justin

    Marking as answered as I managed to resolve the issue

  • To shut down, or not to shut down?

    I read somewhere that it is better not to shut down your MBP at night because it is programmed to do maintenance from 3AM-5AM. Is this true?

    lissermarie,
    There are also other benefits to not shutting down. Being able to build and keep a larger cache of "Inactive" RAM is one of them, leading to much faster application launch times in many cases.
    More importantly, perhaps, is that there is no compelling reason to shut down, over simply putting your computer to sleep. We have had several portable Macs, and none of them have ever actually been shut down (just restarted, mostly when running updates, etc.). In one case that I know of, we had an iBook that had an "uptime" of over 160 days. That's almost 6 months with no shutdown or restart!!!!
    Scott

  • Satellite A100 does not "really" shut down

    Hey everybody!
    I have a Satellite A100 here and the problem is ... yeah, it does not really shut down. I can choose "Shut Down" in Windows and Windows shuts down, the monitor turns black and the notebook does not work anymore, but it does not turn off itself. The blue lights are still glowing. To switch off the system I have to push the on-off-switch for a few seconds.
    The problem occurred three days ago. I don't think it's a virus (someone on the net assumed that), because the system has no connection to the internet and hasn't seen any USB stick or the like for at least half a year. The user told me he changed the energy options from "Toshiba Power Saver" to "Desktop". Of course I changed it back, but this didn't solve the problem. I also didn't find anything helpful in the options. Is there anyone out there who knows the problem or knows what to do? :-)
    Greetings,
    coincidence

    Unfortunately, I can't. I choose "Shut down" in safe-mode and the operating system does a simple reboot. The notebook does not turn off at all.
    Thanks so far. :) Merry Christmas!
