Authentication freezing occasionally in OD

We're having problems with our two Xserve G4s which are both running OS X Tiger Server 10.4.6. Call the first server "filesrvr" and the other "mailsrvr" (which in addition to mail also provides web and file sharing services). Both are running behind a firewall with mailsrvr on a DMZ visible to the Internet. The filesrvr is the DNS and OD Master, and the mailsrvr a DNS slave and OD replica. The mailsrvr is operating in a beta capacity at the moment while we address these problems.
As far as I can tell, the DNS and basic aspects of OD are working correctly: both systems have proper forward and reverse local DNS names, and the hostname is being set correctly on both systems. We had problems with a prior incarnation of the servers because DNS and OD were set up in a funny way and in the wrong order: this led to all sorts of lookupd-related crashes, but the servers have been wiped clean since and carefully rebuilt from scratch. We haven't seen anything that looks like the prior problem and both servers are much more stable now.
That being said, there still appear to be two problems, both on the mail server side. Any suggestions would help:
1) Every few days, OD-related authentication freezes up for a period of time. Sometimes it comes back on its own within fifteen or twenty minutes; at other times we must do a hard restart to clear the problem. The affected services are: Tiger Server Admin (port 311), SMTP, POP3, IMAP, SSH, AFP, SMB, and FTP. Console logins are also rejected during these periods. I'll supply details for what I think is happening here with log listings.
2) Several times a day (say two to three times), the Tiger Server Admin (servermgrd on port 311) becomes non-responsive. It doesn't matter whether you run the Server Admin client application locally on the Tiger Server or from another machine. It also doesn't seem to matter whether you already have an active session running. It just seems to cut in and out on its own. The good news is that it usually comes back up on its own within ten minutes. I don't find anything in the logs related to servermgrd during these outages. I also don't see anything 'funny' about the instance of servermgrd running at the time.
I'm not sure if these two problems are related. The authentication issue seems more serious since it affects the availability of mail service to our users, which needs to be up and running at all times.
I'll supply details below, but the bottom line is that I suspect memberd and/or lookupd (or something they both depend on -- like the NetInfo database) is not working quite right. I've been reading up on postings related to these two daemons and wonder what would be an appropriate course of attack.
In <http://discussions.apple.com/thread.jspa?threadID=492829>, one recommendation is to run "/sbin/fsck -y" in safe mode looking for any problems, and then to manually rebuild the NetInfo database. Obviously memberd and lookupd depend on their caches and the databases they use being in working order, so the suggestion is to rule out file system issues first and then rebuild the NetInfo database.
In <http://discussions.apple.com/thread.jspa?threadID=132009> the recommended course of action is to flush the memberd cache using "sudo memberd -r" from time to time (elsewhere it is suggested to do this once an hour via a crontab entry, and also to issue a "/usr/sbin/lookupd -flushcache"). The idea here is to make sure the caches are kept fairly fresh.
I'm a bit torn about how to attack this problem -- it seems a no-brainer to:
1) Run Disk Utility "Verify Disk" and "Verify Permissions" checks on filesrvr and mailsrvr.
2) Run "/sbin/fsck -y" on filesrvr and mailsrvr until no problems are reported.
3) Insert the following line into the root user's crontab to clear out the memberd and lookupd caches once an hour:
@hourly /usr/sbin/memberd -r; /usr/sbin/lookupd -flushcache
I'm a little hesitant to rebuild the NetInfo database right now. There are a number of special accounts defined there that will simply be a pain to rebuild. My thinking is that I'll make the three changes above and stop just short of rebuilding NetInfo. If authentication fails again on mailsrvr in a similar way, then I'll run the Disk Utility and fsck fixes again but also go one step further and rebuild NetInfo.
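As an alternative to putting the two flush commands directly in the crontab line, here is a minimal sketch of a wrapper that runs them with a timeout and records what happened, so flushes can later be lined up against outages. The two command lines are the ones from the crontab entry above. Everything else (the function name, the 30-second timeout, and the availability of a Python 3.7+ interpreter, which is not part of a stock 10.4 install) is an assumption for illustration.

```python
import subprocess
import time

# The two cache-flush commands from the crontab entry in the post.
FLUSH_COMMANDS = [
    ["/usr/sbin/memberd", "-r"],
    ["/usr/sbin/lookupd", "-flushcache"],
]

def run_flush(commands=FLUSH_COMMANDS, timeout=30):
    """Run each command in turn; return (timestamp, command, status) tuples.

    status is the integer exit code, the string "timeout" if the command
    did not finish in time, or "error: ..." if it could not be started.
    The 30-second timeout is an arbitrary, assumed value.
    """
    results = []
    for argv in commands:
        started = time.strftime("%Y-%m-%d %H:%M:%S")
        try:
            proc = subprocess.run(argv, capture_output=True, text=True,
                                  timeout=timeout)
            status = proc.returncode
        except subprocess.TimeoutExpired:
            status = "timeout"
        except OSError as exc:
            status = "error: %s" % exc.strerror
        results.append((started, " ".join(argv), status))
    return results

if __name__ == "__main__":
    for started, command, status in run_flush():
        print("%s %s -> %s" % (started, command, status))
```

The timeout matters here because the failure being chased is a hang: a flush that never returns is itself a data point worth recording.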
Does anyone have thoughts for other things to try?
<details>
We've seen the first type of failure (authentication stops working) four times in the past two weeks, and it's happened in a number of different ways. Here are some of the details, with log snippets (the only interesting activity is in /var/log/system.log):
The first time authentication stopped working was on June 6th at about 3:15am. I happened to be awake at the time working on something else -- checking for mail periodically but otherwise not logged onto mailsrvr. We use InterMapper to probe all of the server ports and make login checks, and I received a flurry of failures (on the ports listed above). The system had been running without incident for about 36 hours since the last reboot, with only a few mail users accessing the system plus InterMapper's probes every two minutes or so, and not much else going on. In the logs (below), it's clear that the original memberd daemon (process 55) exited abnormally. Launchd created a new instance of memberd (process 23431) which stayed up about eleven minutes before it also crashed. Launchd fired up one more instance of memberd (process 23901), which then stayed up for several days until we took it down for a backup. It was when this final instance of memberd came up that authentication started working again. I also verified that I could not log in via SSH while authentication services were down.
Here's a log snippet (I've omitted log entries which are clearly part of normal operations -- like Cyrus's checkpointing of the mail database every thirty minutes and InterMapper's probes):
Jun 6 03:13:44 mailsrvr memberd[55]: Fatal error -1 submitting to kernel (1: Operation not permitted)\n
Jun 6 03:13:44 mailsrvr memberd[23431]: memberd starting up
Jun 6 03:15:18 mailsrvr sshd[23399]: fatal: Timeout before authentication for <<my IP address>>
Jun 6 03:24:34 mailsrvr memberd[23431]: Fatal error -1 submitting to kernel (1: Operation not permitted)\n
Jun 6 03:24:34 mailsrvr launchd: Server 0 in bootstrap 1103 uid 0: "/usr/sbin/memberd -x"[23431]: exited with status: 1
Jun 6 03:24:34 mailsrvr memberd[23901]: memberd starting up
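To make the restart pattern in the snippet above easier to see, here is a minimal sketch that pulls the memberd lines out of system.log text and computes how long each instance survived between "starting up" and its "Fatal error". The regular expression is written against the lines shown above only. The year is an assumption (syslog timestamps carry none), and the function name is made up for illustration.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Jun 6 03:13:44 mailsrvr memberd[23431]: memberd starting up
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+"
    r"memberd\[(?P<pid>\d+)\]:\s+(?P<msg>.*)$"
)

def memberd_lifetimes(lines, year=2006):
    """Return {pid: seconds from 'starting up' to 'Fatal error'}.

    Instances whose start is not in the log (such as pid 55 above) or
    that never log a fatal error are left out. `year` is assumed.
    """
    started, lifetimes = {}, {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        stamp = " ".join(m.group("stamp").split())  # normalise spacing
        when = datetime.strptime("%d %s" % (year, stamp), "%Y %b %d %H:%M:%S")
        pid = int(m.group("pid"))
        msg = m.group("msg")
        if "starting up" in msg:
            started[pid] = when
        elif msg.startswith("Fatal error") and pid in started:
            lifetimes[pid] = (when - started[pid]).total_seconds()
    return lifetimes

# The memberd lines from the snippet above, copied verbatim.
SAMPLE = [
    r"Jun 6 03:13:44 mailsrvr memberd[55]: Fatal error -1 submitting to kernel (1: Operation not permitted)\n",
    r"Jun 6 03:13:44 mailsrvr memberd[23431]: memberd starting up",
    r"Jun 6 03:24:34 mailsrvr memberd[23431]: Fatal error -1 submitting to kernel (1: Operation not permitted)\n",
    r"Jun 6 03:24:34 mailsrvr memberd[23901]: memberd starting up",
]

if __name__ == "__main__":
    for pid, seconds in memberd_lifetimes(SAMPLE).items():
        print("memberd[%d] lived %.0f seconds" % (pid, seconds))
```

On the sample lines this reports 650 seconds for process 23431, which matches the "about eleven minutes" noted above.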
On June 12th, we suffered a similar failure. All of the authentication-based services listed above suddenly stopped responding. At the time, I had an open SSH connection running on root@mailsrvr, looking into unrelated issues, so while authentication was down I probed around a bit. I noticed right off that, unlike the previous time, there was nothing funny in the logs -- no smoking gun implicating the memberd daemon. In fact, there was nothing in the logs to indicate that there was any problem at all!
However, I noticed from my bash session that if I did an "ls", the shell would list files without problem, but if I did an "ls -l", the command would hang until I issued a CTRL+C. I surmised that the reverse lookup of the user ID or group ID wasn't working correctly, either because memberd or lookupd or something deep in the system related to user or group names wasn't running right. Since this hangup occurred during business hours, it was necessary to restart the server to get it back online ASAP, so unfortunately I couldn't spend more time investigating the state of the system.
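The "ls" versus "ls -l" observation can be turned into a repeatable check. The sketch below resolves one numeric user ID to a name through the system's user database, which is the lookup "ls -l" needs for each file owner, and gives up after a timeout instead of hanging the shell. The function name and the 5-second default are assumptions, and it presumes a Python 3.7+ interpreter is available.

```python
import subprocess
import sys

# Body of the child process: resolve one uid to a user name.
_CHILD = "import pwd, sys; print(pwd.getpwuid(int(sys.argv[1])).pw_name)"

def resolve_uid(uid, timeout=5.0):
    """Return ('ok', name), ('missing', None), or ('timeout', None).

    The lookup runs in a child process so that a lookup which never
    returns can be abandoned after `timeout` seconds (assumed value).
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", _CHILD, str(uid)],
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return ("timeout", None)
    if proc.returncode != 0:
        return ("missing", None)  # uid not known to the user database
    return ("ok", proc.stdout.strip())

if __name__ == "__main__":
    print(resolve_uid(0))
```

A child process is used rather than a thread because a blocked C-level lookup inside a thread cannot be cancelled from Python. A "timeout" result during an outage would support the suspicion that name resolution, rather than any single service, is what is stuck.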
The next glitch occurred on June 14th. Once again, all of the authentication-based services went down. The only spurious entry in our logs was related to servermgrd (which provides service on the Tiger Server port 311). I don't think this daemon was the cause of the problem, but rather an indicator that something else was going on at the time. At the time of the failure (about 16:02), we were receiving FTP probes from mainland China (a host at 219.136.191.93) at intervals of 75 seconds or so to keep under the radar. I also don't think this was the cause of the problem. By 16:10 the problem had cleared itself up without my interaction or the need to restart the server.
Jun 14 16:01:39 mailsrvr ftpd[14832]: FTP LOGIN REFUSED (PASS before USER) FROM 219.136.191.93 [219.136.191.93]
Jun 14 16:05:11 mailsrvr servermgrd: [41] error in getAndLockContext: flock(servermgr_netboot) FATAL time out
Jun 14 16:05:11 mailsrvr servermgrd: [41] process will force-quit to avoid deadlock
Jun 14 16:05:11 mailsrvr launchd: com.apple.servermgrd: exited with exit code: 1
Jun 14 16:05:11 mailsrvr launchd: com.apple.servermgrd: 9 more failures without living at least 60 seconds will cause job removal
Jun 14 16:05:21 mailsrvr servermgrd: servermgr_dns: hostname and DNS entries for this server are synchronized
Jun 14 16:09:26 mailsrvr ftpd[15037]: FTP LOGIN REFUSED (PASS before USER) FROM 219.136.191.93 [219.136.191.93]
Jun 14 16:09:27 mailsrvr servermgrd: TIME-CHECK: NSLXStandardRegisterService took 255.824166 seconds!\n
The next day (June 15th), we saw a similar outage which lasted a bit less than fifteen minutes. Our friends in China had been probing our FTP port for several hours before this occurred:
Jun 15 12:48:28 mailsrvr ftpd[10559]: FTP LOGIN REFUSED (PASS before USER) FROM 219.136.191.93 [219.136.191.93]
Jun 15 12:49:30 mailsrvr servermgrd: [15310] error in getAndLockContext: flock(servermgr_netboot) FATAL time out
Jun 15 12:49:30 mailsrvr servermgrd: [15310] process will force-quit to avoid deadlock
Jun 15 12:49:30 mailsrvr launchd: com.apple.servermgrd: exited with exit code: 1
Jun 15 12:49:30 mailsrvr launchd: com.apple.servermgrd: 9 more failures without living at least 60 seconds will cause job removal
Jun 15 12:49:36 mailsrvr servermgrd: servermgr_dns: hostname and DNS entries for this server are synchronized
Jun 15 12:49:44 mailsrvr ftpd[10600]: FTP LOGIN REFUSED (PASS before USER) FROM 219.136.191.93 [219.136.191.93]
Jun 15 12:50:53 mailsrvr ftpd[10635]: FTP LOGIN REFUSED (PASS before USER) FROM 219.136.191.93 [219.136.191.93]
Jun 15 12:55:31 mailsrvr servermgrd: [11272] error in getAndLockContext: flock(servermgr_dhcp) FATAL time out
Jun 15 12:55:31 mailsrvr servermgrd: [11272] process will force-quit to avoid deadlock
Jun 15 12:55:44 mailsrvr launchd: com.apple.servermgrd: exited with exit code: 1
Jun 15 12:55:44 mailsrvr launchd: com.apple.servermgrd: 9 more failures without living at least 60 seconds will cause job removal
Jun 15 12:56:03 mailsrvr sshd[11398]: fatal: Timeout before authentication for <<my IP address>>
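To check the "75 seconds or so" estimate against the log, here is a minimal sketch that extracts the refused FTP logins from one host and returns the gaps between consecutive entries. The pattern is written against the ftpd lines shown above. The year and the function name are assumptions, and because the snippets here omit routine entries, gaps computed from them are only indicative.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Jun 15 12:48:28 mailsrvr ftpd[10559]: FTP LOGIN REFUSED (...) FROM 1.2.3.4 [1.2.3.4]
FTP_RE = re.compile(
    r"^(\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+ftpd\[\d+\]:\s+"
    r"FTP LOGIN REFUSED .* FROM (\S+)"
)

def probe_intervals(lines, host, year=2006):
    """Return the gaps, in whole seconds, between refused logins from `host`.

    `year` is assumed because syslog timestamps do not include one.
    """
    times = []
    for line in lines:
        m = FTP_RE.match(line)
        if m and m.group(2) == host:
            stamp = " ".join(m.group(1).split())  # normalise spacing
            times.append(
                datetime.strptime("%d %s" % (year, stamp), "%Y %b %d %H:%M:%S")
            )
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]

# The three ftpd lines from the June 15th snippet above.
SAMPLE = [
    "Jun 15 12:48:28 mailsrvr ftpd[10559]: FTP LOGIN REFUSED (PASS before USER) FROM 219.136.191.93 [219.136.191.93]",
    "Jun 15 12:49:44 mailsrvr ftpd[10600]: FTP LOGIN REFUSED (PASS before USER) FROM 219.136.191.93 [219.136.191.93]",
    "Jun 15 12:50:53 mailsrvr ftpd[10635]: FTP LOGIN REFUSED (PASS before USER) FROM 219.136.191.93 [219.136.191.93]",
]

if __name__ == "__main__":
    print(probe_intervals(SAMPLE, "219.136.191.93"))
```

For the June 15th lines the gaps come out at 76 and 69 seconds. Run over the full, unfiltered system.log, a missing or stretched gap around an outage would show whether ftpd itself stalled.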
Since June 15th we've been running without incident, I'm happy to report. It's possible that the failures have been related to periods in which a number of changes have been made to the LDAP and NetInfo databases via Workgroup Manager. I'm going to keep a log from now on to see if there's a correlation there. Since June 15th, we've made only minimal changes.
</details>
1.42GHz Mac Mini   Mac OS X (10.4.6)   1GB RAM, SuperDrive, Airport

Ah, forgot to add that I'm running version 1.1, if that makes a difference. Sounds like it's been causing trouble...

Similar Messages

  • 24" iMac freezing occasionally/Please Help

    24" iMac freezing occasionally without allowing force quit. Sometimes when I attempt to restart it, when the computer comes back on it will freeze on a gray screen with a folder in the middle with a question mark on it. If I unplug all peripherals and the iMac power cord and let it sit for a while, it starts working again. Sometimes I might get 4 days of use before it happens again.  Then sometimes it might happen 2 or 3 times in a day.  I've always heard the gray screen with folder with question mark was the mark of death for the hard drive.  If this information is true, how am I able to get it to work occasionally with no problems?
    Thanks,
    34firefighter

    I found out what was happening... I was running screen dimming software called 'Shades' which was messing up the colour profile allocation. I stopped this software running and I was able to recalibrate! I'm so pleased as I thought there was something wrong with my graphics card!

  • My ipad mini freezes occasionally

    My iPad mini freezes occasionally while just browsing on the Internet, and hitting the sleep/wake button or home button doesn't do anything.  The screen is just frozen on the current page I was on. It won't even shut off by holding the sleep/wake button like you would normally do.  The only thing it will respond to is holding the sleep/wake button while simultaneously holding the home button, which will shut the device off in about 10 seconds.  What's going on?!

    Frozen or unresponsive iPad
    Resolve these most common issues:
        •    Display remains black or blank
        •    Touch screen not responding
        •    Application unexpectedly closes or freezes
    http://www.apple.com/support/ipad/assistant/ipad/
    iPad Frozen? How to Force Quit an App, Reset or Restart Your iPad
    http://ipadacademy.com/2010/11/ipad-frozen-how-to-force-quit-an-app-reset-or-restart-your-ipad
    iPad: Basic troubleshooting
    http://support.apple.com/kb/TS3274
     Cheers, Tom

  • OS X 10.8.3 freezes occasionally with Thunderbolt external LaCie boot drive

    When it works, it's awfully fast but occasionally it freezes.   Why?  Help!

    As far as I know, natural scrolling is only for OS X, not for Windows, unless Apple has added this feature with the most recent drivers. However, if you want the natural scrolling, you can try using Trackpad++ > http://trackpad.powerplan7.com/

  • DVD Player Freezes Occasionally

    My white MacBook (running OS X 10.4.11, purchased August 2006) occasionally freezes while playing DVDs. This happens very inconsistently, perhaps once for every five or more DVDs I play. While the film is playing, with no input from me, the computer becomes nonresponsive (i.e. I can't change the volume, none of the keyboard keys work, I can't control/pause the movie), the mouse still visibly moves on the screen, but it is a spinning rainbow wheel for several minutes. The funny thing is, the movie continues playing as if nothing is wrong, but I have no control over it. Eventually the problem either resolves itself, or I force quit (which does not work immediately, but does force the program to quit after a few minutes).
    I ran Software Update and nothing needs updating. I scoured Apple's help fields and these discussion forums and did not see anyone else having this problem. Please direct me to a helpful thread, if I missed it.

    I would first run disk utility and repair permissions. Then I would delete the plist file. It's here:
    /Users/YourName/Library/Preferences/com.apple.DVDplayer.plist
    Then restart DVD player and see if it works like it should.
    If it's still acting up I would try playing the dvd with VLC, it's free and you can get it here:
    https://www.versiontracker.com/dyn/moreinfo/macosx/10210434

  • Probook 450 G2 Touch Pad freezes Occasionally

    Brand new laptop. The touch pad will occasionally freeze, but I can still click and scroll using the arrows on the keyboard. It will start working again after maybe a couple of minutes. Any idea how to fix this? Running Windows 7 Professional.

    I have exactly the same problem with an Aspire Switch SW5-171. Following the instructions of an Acer tech on chat, I refreshed the system (driver reinstallation) to no avail, and then I reinstalled Windows, which didn't work either. I submitted a repair request, but I only got a reply to contact Acer by phone. The freezes can usually be resolved by putting the system to sleep and disconnecting and reconnecting the tablet from the keyboard. Freezes happen more often, but not exclusively, when using browsers. If this were a driver issue, I would agree with your comments, and an updated driver might resolve it, but I suspect it's a hardware issue with some kind of intermittent operation of the touchpad, because USB mice connected to the keyboard USB port work perfectly consistently all the time with my unit, and because the freezes can be resolved temporarily by the steps I mentioned above. Acer ought to test units for this issue and release an updated driver if it's established that interference with other software is the problem. Otherwise, the detachable touch pad/keyboard must be replaced under warranty.

  • My internet freezes occasionally after using iphoto

    Occasionally, after using iphoto, particulary the photo books, the internet will freeze.  The only way to get going again, is to reboot the computer.

    How is that related to Firefox support?

  • Screen Sharing and SSH sessions freeze occasionally on multiple mac minis

    I have 28 Mac Minis at work. With such a large number of minis, I obviously can't have a monitor attached to each of them so I've got them plugged into a network switch and access them via Screen Sharing (both via regular Screen Sharing and ARD) and SSH sessions.
    A few of them seem to suffer from intermittent problems however. I'll be using Screen Sharing when the session freezes. It may unfreeze eventually, but I can also usually just quit out and re-connect and it will be unfrozen. The same thing happens when I'm connected via SSH, it will freeze and I won't be able to type in any more commands.
    I need help troubleshooting (or if anyone knows what could be causing this, that'd be cool too).
    I've tried connecting from both a Mac Pro on the wired network and a MacBook Pro on the wireless network. The freezing seems to only happen on certain Mac Minis as well.
    I've tried switching network cables from a Mac Mini that doesn't suffer from this problem with one that does and nothing changed.
    I also thought it might be a bandwidth issue at first, despite being a gigabit switch connected via cat6 to the rest of our gigabit network, but even when no significant bandwidth is being used, the freezing still occurs.
    One more thing I want to test is the connection between the switch all these Mac Minis is plugged into and one of the other switches that all our other network traffic goes through. I didn't set it up myself so I fear that it might be an old, damaged cable or something. Failing that, I have no idea what the problem could be, which is why I'm posting here.
    So, does anyone have any idea what the problem could be? Or any other ideas for troubleshooting the problem? Thanks.
    (They're all running 10.6.8, and range from Mid-2007 to 2009 models).

    It would be in the system log. However, the next step would be to safe-boot in order to eliminate third-party system modifications. That goes for both client and server. If you can reproduce the problem in safe mode, then you probably have a network issue. Take everything offline except one client and one server, and test.

  • My MacBook freezes occasionally...is there a way to bring it out of frozen?  What would be causing it to freeze?

    My MacBook freezes @ least once a week.  I have to shut it down.  What might be causing it to freeze & is there another way to bring it out of frozen other than shutting it down?

    Not enough free space on the disk can cause the system to freeze...
    Right or control click the MacintoshHD icon. Click Get Info. In the Get Info window you will see Capacity and Available. Make sure there's a minimum of 15% free disk space
    Check the startup disk in case it needs repairing >  Using Disk Utility to verify or repair disks
    MacBook, Mac OS X (10.6.5)  <  your profile
    If that is correct, when you can, you need to update to v10.6.8
    Updates are available from your Apple menu > Software Update...
    Not enough RAM can cause the operating system to freeze also. You may want to consider increasing memory. To see how much memory you have installed click your Apple menu icon then click About This Mac. That will prompt the System Profiler. See Memory under Hardware on the left.

  • My dock(Mac) freezes occasionally while Photoshop CC Is open.

    My OS dock mouse over animation freezes sometimes while Photoshop CC Is open. (Mac) Cause? solution? The dock itself is still functional, just the mouse over animation stops working.

    Photoshop can't interfere with the Dock process. It could be system paging slowing everything down, or a bug in the Dock code.

  • Aperture 2 trial freezes occasionally

    I'm using the Aperture 2 trial, downloaded from the Apple site, and I would like to purchase it when my trial expires. The only thing is that about 75% of the time when I launch the trial from my dock it gets to the splash screen and then freezes. If I force quit and then open it again immediately after the initial freeze, it'll get to the splash screen and then ask me if I want to continue, register, etc. like it normally does on launch.
    Does anyone know why it usually freezes up at the initial splash screen?
    Thanks

    I have the same problem, but more frequently. Like, ALWAYS. It started after I had worked with Aperture enough to decide I wanted to buy it. I've got several thousand images imported already and all of a sudden the trial freezes at the splash screen and it's impossible to get beyond that point. Forcing Quit, even rebooting the Mac will not solve the problem. The last action I took was copying an existing Web theme folder and altering some of the HTML in the template pages of the copy but the freeze is unrelated to web gallery generation. There's something wrong at startup and now I'm faced with not being able to run the program, purchase it, and lose all the work I've already done with my photos in anticipation of buying the darned thing...

  • 2007 MBP Running Hot and Freezing occasionally after 10.6.3

    This is obviously not normal. Anyone else experiencing this or know of the cause or a fix?
    It's gotten so hot I've resorted to placing a towel in my lap. iStat says the CPU is running at 148 degrees, all the time, with the GPU frequently hitting 160, and both fans are running full bore.
    The freezes come and go, and usually only last for 20-30 seconds, but when they do pop up, they happen frequently, as many as 10 times per hour.
    I don't think I am doing anything differently than I have always done. No new software, etc.
    Thanks much in advance for any suggestions.

    I don't have 10.6 on anything but I believe this works for SL. In Applications > Utilities you will find Activity Monitor (hereafter AM to avoid the extra typing). Launch it and look at its "Show" selector at the top of the window. By default, AM launches to "User Processes" and that seldom shows enough. Change it to "All Processes" to get a good look at the true background picture.
    Look at the columns AM presents. Find the "%CPU" column and click on it to sort the list by usage. When the computer is idling (you've quit all applications so only the Finder is active), no process should be more than 4-6 percent (might be a little different in SL). Basically you're looking for things that are using double-digit percentages of the CPU cycles when the computer seems to be at idle.
    Processes with "mds" in the name are related to Spotlight indexing but don't run that often after the initial install. If "syslogd" is running at high numbers, something is writing a bunch of stuff to system.log. I had that happen on my MBP with the demo for the game "Prey." Even when the computer was only idling, the demo continued to write to the log file and ballooned it from its usual size of ~250K to over 60MB, and ran the heat up to boot. Deleting the demo and a restart cured that insanity.
    I found a process related to Safari 4 called Safari Webpage Preview Fetcher that ran a lot. It's how Top Sites gets page previews. I now avoid using Top Sites, especially on our slower G4 Macs which constitute the majority of our household Mac-ness. That process really gummed the works in those older Macs, all running 10.4.11.
    You might want to repeat the question in the Snow Leopard forums. Someone more familiar with SL may see a pattern in the shutdowns and stalls. The 10.6 forums are here:
    http://discussions.apple.com/category.jspa?categoryID=263
    Keep us posted.
    Allan

  • Desktop freezes occasionally and recovers after around 10s in 2.6.35-2

    Top shows i915 is taking 100% CPU. The system is a thinkpad SL410, running latest versions of Xorg, intel-video, openbox + xcompmgr. Killing xcompmgr seems to resolve the problem.
    No such issue while with 2.6.34. Perhaps some conflicts with out-of-date xcompmgr and the newest i915 features in 2.6.35?
    Last edited by lapa (2010-08-21 12:59:49)

    New update - freezing still happens without xcompmgr...

  • ITunes (v. 11.1.5.5) is slow, freezes occasionally, pauses songs every few seconds, and just generally stinks. What should I do?

    Ever since I updated iTunes to 11.1.5.5, it has become a mess. It's slow to start. Once it finally gets going and allows me to click something, it usually freezes up for a short time. This happens especially if I click the search box in my library. If I manage to get a song playing, it will skip constantly. After about 5 songs of doing this, it will usually correct itself and play right. If I try to go to the iTunes Store, it might take 5 minutes to come up. Altogether the entire program is just slow and unresponsive most of the time. If I buy an album and try to download it, it'll only let a few songs download and the others will have errors. I've had iTunes since the original version and I've never had this many issues with it.
    Here's what I'm working with:
    -HP Pavillion m7 Notebook PC
    -Windows 7 Home Premium
    -Intel(R) Core(TM) i7-3610QM CPU @ 2.30GHz
    -8GB of installed memory (RAM) 7.90Gb usable
    -64 bit operating system
    *I'll be happy to provide any more information
    I only use this computer for music and I have about 40GB worth. Could I have too much music? Is anyone else having these types of problems with this version?


  • Powerbook G4 random freezes & occasional black power button screen?

    Hi,
    I bought a secondhand Powerbook G4 a few months ago and it has been working a dream until recently...on boot up I occasionally get a screen with a black power button logo on it and some english/japanese text saying there has been an error. I have to hold down the power button to reboot when this happens...?
    Also in the last few days I've been working along happily and then ever so often the machine will just lock up and again the only way to get out of it is a power button hold down.
    Does anyone have any ideas what this could be? I thought it might be the HDD on the way out, but is there any way to test this? It's not been excessively noisy or anything.
    My specs for reference: -
    Powerbook G4 17" 1.33GHz (stock GFX, etc)
    512 MB RAM (1 stick PC2700U-25330)
    60GB HDD (fujitsu original)
    Leopard 10.5.1
    Thanks in advance for any advice.
    Jim

    Sounds like you're getting kernel panics, the number one cause are memory-related problems. I would run the Apple hardware test to see if it turns up any problems. You can also run the program rember to specifically test the memory. Rember is available from http://www.kelleycomputing.net:16080/rember/ .
    This FAQ provides more information on dealing with kernel panics: http://www.thexlab.com/faqs/kernelpanics.html.
