I've just deployed my new AD/OD system to about 50 Macs in the dept, and I'm having this problem too!! Generally things are fine, but there are usually around 5-10 machines (out of 30) in a suite that will not find the AD domain on first boot. A reboot will usually sort the problem, but a lot of the time it takes a fair few boots before I have both "Network Accounts Available". Not great in a school, where it is important that things "just work" straight away!
At the same time as me deploying the new mac server in the dept. the main IT network structure has changed and upgraded to Windows 2008 servers. It is only since this change that I have had these intermittent problems. It's such a pain that I get my problems when the mac server and clients are not bound to an RM build anymore!!
Based on the problems arising after the Windows 2008 upgrade, the only thing I can blame it on is that changes in 2008 do not bode well for dual directory mac clients. I hoped this article would solve my problem..
Mac OS X: Cannot authenticate when Active Directory includes Windows 2008 Servers
but it did nothing. Which was a shame as it describes the fault perfectly.
So, not a helpful comment to anybody, but I just wanted to join the gang. Hopefully something will arise soon, and I will continue to investigate and update if anything changes!!
First of all you need to find out what version of OS X you have deployed. If it is 10.6.2 have you tried updating to 10.6.3? It may resolve this issue. If this is indeed the version installed then try this.
When one fails to find the AD, log in as the local admin. Open Console and check the logs for any problems around that time. There may be some info in there that could point you in the right direction. If it's a time issue, I think it says something like "time skew too great". If it's a Kerberos issue with the machine account, I think it says something like "computer failed Pre-auth". If this doesn't lead to anything, then check Directory Utility. It may bring up further information, e.g. it may say "not responding". Or, as in my case, one of the binds may be missing. I have an issue where the OD bind is removed on each reboot of the machine.
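If it helps, here is a rough sketch of that log check as a shell function. The search terms and the default log path (/var/log/system.log) are assumptions based on the wording recalled above, not confirmed message text, and scan_bind_errors is just a made-up name.

```shell
#!/bin/sh
# Sketch only: search a log for the two failure types mentioned above.
# The patterns are guesses at the wording ("time skew too great",
# "failed Pre-auth"); adjust them to whatever your logs actually say.
scan_bind_errors() {
    log=${1:-/var/log/system.log}   # assumed default; pass another path if needed
    # -i ignores case, -E enables extended regex.
    # Matches "skew", "preauth" and "pre-auth".
    grep -i -E 'skew|pre-?auth' "$log"
}

# Example (run as the local admin):
#   scan_bind_errors /var/log/system.log
```

No output just means neither pattern appeared in that file, not that the bind is healthy.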
I don't have a Server 2008 set up so can't help much further than this. But IIRC you can put the Dir Util in to debug mode. Can't remember how though. Check AFP548 or Bombich site for some clues or even Apple's site.
Hope this helps in some way.
Sorry to hear you're having problems. If it's any consolation, I've successfully integrated OS X Server and clients in a Windows Server 2008 R2 AD environment with none of the problems you're seeing. More than once. There is a delay of 30 seconds or so before Network Accounts are available, but I think that's a natural consequence of an environment that contains 65000 active nodes. Apart from this it works as well as you would expect.
sudo killall -USR1 DirectoryService
This enables the DS debug log. Once enabled, you can view the log by launching Console and selecting /Library/Logs/DirectoryService/DirectoryService.debug.log, or by launching Terminal and issuing "tail -f /Library/Logs/DirectoryService/DirectoryService.debug.log". The "tail" command with the "-f" switch keeps printing new entries to the screen as they arrive. You use the same killall command to disable the log. Don't let it go on for too long; 30 minutes or so should be enough. It can get large fairly quickly if you forget.
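So you don't forget to switch it off, the toggle can be wrapped in a small timed helper. This is only a sketch: timed_ds_debug and the SIGNAL_CMD variable are names invented here (SIGNAL_CMD exists purely so you can do a dry run with echo), and the 1800 second default is just the 30 minutes suggested above.

```shell
#!/bin/sh
# Sketch: turn DirectoryService debug logging on, wait, then turn it off.
# SIGNAL_CMD is a hypothetical override for dry runs, e.g. SIGNAL_CMD=echo
SIGNAL_CMD=${SIGNAL_CMD:-"sudo killall"}

timed_ds_debug() {
    duration=${1:-1800}   # seconds to leave debug logging enabled
    # First USR1 enables the debug log
    $SIGNAL_CMD -USR1 DirectoryService || return 1
    sleep "$duration"
    # The same signal again disables it
    $SIGNAL_CMD -USR1 DirectoryService
}

# Example: log for 10 minutes, watching the log from another shell
#   timed_ds_debug 600
```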
There's also another command that enables API (Application Program Interface) logging and lasts only for 5 minutes which you may also find useful?
sudo killall -USR2 DirectoryService
You could also issue this command in another shell:
man DirectoryService
Which is the manual for DirectoryService and should list all the error codes. These may be useful in understanding the logs? You could also log what DirectoryService does at startup by creating two files (use TextEdit with the Plain Text option) and save them as .DSLogDebugAtStart and .DSLogAPIAtStart. Save them in /Library/Preferences/DirectoryService folder.
Restart for logging to take effect.
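If you'd rather not use TextEdit, the same two files can be created from Terminal. A minimal sketch, assuming empty files are enough (the post above doesn't say what, if anything, should go in them); the function names and the DS_PREFS variable are made up for illustration.

```shell
#!/bin/sh
# Sketch: create or remove the two startup-logging marker files.
# Assumption: DirectoryService only checks that the files exist.
DS_PREFS=${DS_PREFS:-/Library/Preferences/DirectoryService}

enable_ds_startup_logging() {
    dir=${1:-$DS_PREFS}
    touch "$dir/.DSLogDebugAtStart" "$dir/.DSLogAPIAtStart"
}

disable_ds_startup_logging() {
    dir=${1:-$DS_PREFS}
    rm -f "$dir/.DSLogDebugAtStart" "$dir/.DSLogAPIAtStart"
}

# Example (writing to the real folder needs root, so run the script with sudo):
#   enable_ds_startup_logging
#   ...restart, collect logs, then...
#   disable_ds_startup_logging
```

Remember to remove the files afterwards, otherwise logging starts again at every boot.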
It's advisable to not use hyphens when naming mac workstations. This could contribute to what you're seeing? Malformed SRV Records could also be a factor. Double-check they were created properly. I have been to sites where this was the case and macs were losing 'sight' of Directory Services as you've described. Avoid using user names that take this form: firstname.lastname. The full point between the names can cause login problems.
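For the naming and SRV checks, something like the following might save some clicking around. It is a sketch under assumptions: the function names are invented, example.com stands in for your AD domain, and the list covers only the usual AD locator records, so it is not exhaustive.

```shell
#!/bin/sh
# Sketch: flag hyphenated workstation names and list the SRV records
# that AD clients normally look up.

# Succeeds (exit 0) if the given name contains a hyphen.
name_has_hyphen() {
    case $1 in
        *-*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Prints the common AD locator SRV names for a domain, one per line.
ad_srv_names() {
    domain=$1
    for prefix in _ldap._tcp _kerberos._tcp _kpasswd._tcp \
                  _ldap._tcp.dc._msdcs _kerberos._tcp.dc._msdcs; do
        printf '%s.%s\n' "$prefix" "$domain"
    done
}

# Queries each record with dig; an empty answer suggests a missing record.
query_srv() {
    for name in $(ad_srv_names "$1"); do
        echo "== $name"
        dig +short -t SRV "$name"
    done
}

# Example:
#   name_has_hyphen "lab-mac01" && echo "consider renaming"
#   query_srv example.com
```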
Antonio Rocco (ACSA)
That might come in handy for the six weeks me thinks :)
Thank you for your replies.
Today was the first day that students used the environment, so i've been very busy today and haven't looked into any of your solutions, but i'll be sure to get back in the week and let you know how I get on. As a quick fix, i've set the computers not to sleep during the day, so the problem only occurs in the morning. Takes a few reboots on some machines to find both domains, but once it's done, the machine is completely available for the whole day.
Somehow I can't see the problem being timeskew. Although I haven't fully investigated this, I can't see why it would connect sometimes and not others? This means that the server or client must be drifting in and out of time, which is unlikely. Correct me if i'm wrong!!
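One rough way to rule time skew in or out is to compare the two clocks directly. The sketch below assumes the usual Kerberos default tolerance of 5 minutes (your domain policy may differ) and takes both times as epoch seconds; skew_ok and MAX_SKEW are names made up for this example.

```shell
#!/bin/sh
# Sketch: decide whether two clocks are within the Kerberos tolerance.
# Assumption: the default maximum skew of 300 seconds applies.
MAX_SKEW=${MAX_SKEW:-300}

# Usage: skew_ok CLIENT_EPOCH SERVER_EPOCH
# Succeeds if the absolute difference is within MAX_SKEW seconds.
skew_ok() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( -diff ))
    fi
    [ "$diff" -le "$MAX_SKEW" ]
}

# Example: the client time comes from `date +%s`; the server time is
# whatever you read off the domain controller, converted to epoch seconds.
#   skew_ok "$(date +%s)" 1271234567 || echo "clock skew too large"
```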
And I did an update to 10.6.3 last week in a hope that this would help, and it had no effect whatsoever. Shame really!
I would consult the logs. If it is time skew it will state in the logs. If it is pre-auth it will also mention it in the logs. It could just be that they take a little longer to connect than the others. Also what type of machines are they? Edu iMacs, standard off the shelf iMacs, Mac pros?
Also, are they dual-booted? If so, you could check in Windows to find out which network chipsets are used. Apple's System Profiler doesn't give you this info as far as I know. The reason I ask (and I keep mentioning this each time I hear of these issues) is that the Nvidia chipsets can take longer than usual to find the domain.
They are mac minis, of varying age. Some have Nvidia chips and others don't, but the problem doesn't seem to reside on any particular machine type.
I've done some debug logging, and the only errors I can find are the following...
14260 - eDSBufferTooSmall
14002 - eDSOpenNodeFailed
These errors both occur during an unsuccessful boot attempt (in that both directories are not found)
These don't show me anything in particular, but then my largely untrained mind might be missing something quite important!! I've looked around on the net for answers, but nothing obvious is sticking out.
How have you deployed these machines?
Did you use deploystudio?
If so take one of the machines that is behaving badly and re-image it. I've had a few where the Directory Services hasn't quite bound properly during the restore process.
I have deployed using DS.
Problem is, there aren't any particular machines that are behaving badly. It is a very intermittent and random problem.
In Directory Utility, even when the AD hasn't responded, the bind still exists, but just has a red light and says that the domain server does not respond. Things that can give a quick fix are re-binding the machine, restarting or unplugging the ethernet cable and replugging when the OD connection is lost. Sometimes these steps work, sometimes it takes bloody ages!!
As a quick fix, I have stopped the machines sleeping during the day, as a sleep seems to restart the directory service and sometimes after waking the AD Domain isn't found. So by stopping sleeping, once the machine is bound to both domains in the morning, it should stay that way for the rest of the day.
Not very carbon neutral I know, but I have no choice. One thing Apple could build into the next OS update is if the directory service can continually refresh a bad connection until it becomes good. This might mean a bit of time for machines that aren't playing ball, but at least they should all eventually connect.
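Until something like that exists, a crude stand-in is a retry loop run at startup. This is only a sketch: retry_until is an invented helper, and what counts as "the directory is up" is left to whatever check command you pass in (looking up a known network account with id is one possibility, and some_network_user below is a placeholder).

```shell
#!/bin/sh
# Sketch: keep re-running a check until it succeeds or attempts run out.
# Usage: retry_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until() {
    max=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$max" ]; do
        if "$@"; then
            return 0          # check passed, stop retrying
        fi
        n=$(( n + 1 ))
        if [ "$n" -le "$max" ]; then
            sleep "$delay"    # wait before the next attempt
        fi
    done
    return 1                  # never succeeded
}

# Example: try for up to 10 x 15 seconds to resolve a network account
#   retry_until 10 15 id some_network_user >/dev/null 2>&1
```

This only waits for the directory to come back; it doesn't do anything to nudge a stuck bind.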
Maybe try re-installing the OS from the DVD /cd drive to see if this helps one of the clients. After re-install try binding this machine.
Also update your DeployStudio and then try again. It does say, however, that you should rebuild the image/.NBI for the latest version. Maybe try rebuilding the client and then imaging that machine once it's built?
Just a few things to try. As you can probably guess, I do suspect that the re-imaging is a possible problem. I have seen a few things that have been a little strange after imaging, all related to directory binding (MacBooks losing the OD bind after restart, always the same 3 machines, all from exactly the same image).
At least we can remove another potential culprit.
finally got back to posting a reply and the good news is that my apple-no-login-shake problem now seems to have gone away, *joy* , so the rebuild of the DNS did the trick, I can use the windows side for weeks, then switch to the mac side for a lesson (no shake off), then back to windows.
To think of all the time I wasted rejoining the domain on each and every Mac, each and every time I needed to use it. Well, now it's fixed, and nobody, not even the Apple experts, ever suggested that the problem is all down to DNS.
It's pleasing you managed to fix your particular problem.
"nobody, not even the apple experts, ever suggested that the problem is all down to DNS"
Perhaps you choose to remember and understand only what you want to? These boards (as well as others) continually stress the importance of a properly configured DNS Service. An earlier post of mine (as well as others) in this thread actually mentions:
"slightly iffy DNS"
As being relevant to the problems the OP (and others) mention. Even if DNS is 'perfect' you can still have problems logging in that may be caused by something else.
Antonio Rocco (ACSA)
So which aspects of the DNS rebuild did the trick? Perhaps a summary of what you did (as opposed to the massive post you made; some may not want to read it all :)) might help. I for one am curious as to what you tried in the end and what was successful. Did you use reserved DNS entries coupled with reserved DHCP addresses?
I think the main thing was that it just looked a mess when compared with DNS at a different school, or perhaps I should say it just looked 'different'. When I rebuilt DNS I had the forward and reverse zones, my cached lookups and my events folders, nice and neat. On the old DNS I also had things like _msdcs on the same level, and possibly some other entries.
I guess the important thing was that the server had been through an upgrade path from Win2K, so some old functionality was being catered for, possibly to the detriment of a happy Win-Mac environment. So when I did the rebuild I took the opportunity to switch to Win 2003 functionality ONLY.
Another important point is probably not to add records to DNS manually while logged in as another user. I think the old records were added by our Apple contractor using Remote Desktop (no idea what he would have logged in as), and I believe Windows then treats those records as totally hands-off, or something like that [I can't remember exactly what I read, but it was in a Windows online doc]. Basically, I registered my servers, including the Macs, by instructing the servers to register themselves in DNS instead of adding records to DNS by hand.
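For anyone wanting to sanity-check their own zones after a rebuild, a quick forward/reverse comparison can be sketched like this. The function names are invented, server.example.com is a placeholder, and it only handles a plain IPv4 A record (if the name is a CNAME the first line of the answer won't be an address).

```shell
#!/bin/sh
# Sketch: check that a host's forward (A) and reverse (PTR) records agree.

# Builds the reverse-lookup name for a dotted-quad IPv4 address.
ptr_name() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into $1..$4 on the dots
    IFS=$old_ifs
    printf '%s.%s.%s.%s.in-addr.arpa\n' "$4" "$3" "$2" "$1"
}

# Prints the forward answer and the matching reverse answer for a host.
check_forward_reverse() {
    host_name=$1
    ip=$(dig +short A "$host_name" | head -n 1)
    if [ -z "$ip" ]; then
        echo "no A record for $host_name"
        return 1
    fi
    echo "$host_name -> $ip"
    echo "$(ptr_name "$ip") -> $(dig +short -x "$ip")"
}

# Example:
#   check_forward_reverse server.example.com
```

If the second line comes back empty or names a different host, the reverse zone is worth a look.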
Anyway, I'm still happy to say the network is continuing to behave itself now.
By the way, I do have reserved DHCP entries, although some of these don't actually get used since the servers have fixed/manual IPs. It doesn't do any harm to list the items anyway, since it puts everything in one place and I know what's free and what isn't.