Hi all, hope you can give me some ideas on things to look for.
Seem to be having lots of random things happening all over the network, and I'm having problems finding a connection between them.
We have a Windows Server 2003 R2 based network.
3 x Domain Controllers: 2 with DHCP and DNS, and the other with DFS.
We have around 700 pupils using mandatory profiles.
All staff use local profiles.
All clients are Win XP Pro SP3.
All switches are 3Com.
Everything was working fine before the six-week holiday, but when I came back I discovered that one of the HDDs had failed on DC2. It is a Dell PowerEdge 2850, so I called Dell and they sent me a replacement, and the RAID (5) appeared to rebuild OK.
However, I had had complaints from various teachers that pupils' drives would fail to map intermittently. This I had attributed to DC2 authentication issues relating to the faulty HDD. But the HDD was replaced over a week ago and the problem is still happening. There are no errors in any of the logs on the DCs. Perfmon shows no excessive usage.
I turned off DHCP on DC2 and pointed all clients to DC1, and it made no difference. The logon problems are intermittent so I can't seem to replicate them, and I have noticed that DC1 has disappeared from Network Neighbourhood but still pings by name.
Hmmm, I have moved my FSMO roles around, taken the offending mapped drive off the DFS and just created it as a standard network share. This has helped a bit, reducing the problem by about a third. Getting desperate now. Any ideas? I don't really care if they are unlikely, I just want to try anything at this point.
Amazingly enough, no. The RAID 5 system in it was hot-swappable, so I just took out the old HDD and replaced it, and the RAID system automatically resynced itself. I did give it a reboot afterwards; in fact I have rebooted every server since, and still no joy.
I can't find any evidence of replication issues either. All the logs say everything is working fine, and changes made on one server appear to replicate to the others straight away.
I.e. user accounts created on one appear on the others straight away. If I use WDS to install a new machine, it adds the name of the computer to Active Directory and creates the relevant DNS entries, and this also works if I add the machine manually, etc.
Do you know of any tools I can use to analyse this further?
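Since the failures are intermittent and hard to reproduce, one option is a small logging script left running on a spare client. Below is a minimal Python sketch, assuming Python is available on that machine and that the file server answers on TCP port 445 (SMB). The function names `port_open` and `watch` are made up for illustration. It only records when a plain TCP connection fails, so it would not catch authentication or DFS referral problems.

```python
import socket
import time
from datetime import datetime


def port_open(host, port=445, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def watch(host, checks, interval=30.0, probe=port_open,
          sleep=time.sleep, now=datetime.now):
    """Probe host `checks` times and return the timestamps of failed probes.

    probe, sleep and now are parameters only so the loop can be exercised
    without a real network or real waiting.
    """
    failures = []
    for i in range(checks):
        if not probe(host):
            failures.append(now())
        if i < checks - 1:
            sleep(interval)
    return failures
```

For example, `watch("DC1", checks=120, interval=30)` would probe for about an hour and return the times of any failed attempts, which could then be compared against the times the teachers report.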
Originally Posted by CHR1S
Is it safe to assume that DC2 had some downtime? Could be AD replication issues?
SYSVOLs appear to be replicating fine; anything added to one server instantly replicates across them all.
The last error in the FRS event log is on one DC and is from a couple of months ago, warning that write caching was enabled, which I remember disabling over the holidays. Other than that, nothing.
You can see why I am tearing my hair out.
Try some of these and see if they will help you see what it is.
Have you had a look at DNS? If it is a DNS problem, the client might not find the name in time and skip mapping the drive. Have you tried replacing the computer name in the script with the IP address?
Check the times on the PCs and the servers; if they are not within 5 minutes of each other, Kerberos will not work.
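As a rough illustration of the 5-minute rule, here is a minimal Python sketch that compares a client clock reading with a server reading. The function name and the sample times are hypothetical, and 5 minutes is the default Kerberos tolerance in a Windows domain, so your policy may differ.

```python
from datetime import datetime, timedelta

# Default Kerberos clock tolerance in a Windows domain (assumed unchanged).
MAX_SKEW = timedelta(minutes=5)


def within_kerberos_skew(client_time, server_time, max_skew=MAX_SKEW):
    """Return True if the two clock readings differ by no more than max_skew."""
    return abs(client_time - server_time) <= max_skew


# Hypothetical readings taken at the same moment on a server and two PCs.
server = datetime(2009, 10, 2, 11, 0, 0)
pc_ok = datetime(2009, 10, 2, 11, 4, 0)    # 4 minutes fast
pc_bad = datetime(2009, 10, 2, 10, 53, 0)  # 7 minutes slow

print(within_kerberos_skew(pc_ok, server))
print(within_kerberos_skew(pc_bad, server))
```

The check is symmetric, so a client that is too far ahead fails just like one that is too far behind.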
Are your switches having problems, say a loop, or is something like a nasty network card broadcasting all the time?
Lastly, which I am sure you have done: event logs.
Just a few ideas
I have attached the results of the three tests you recommended and they all seem to have passed without a hitch.
The DCs all time-sync with DC01, so all show the same time.
I will try replacing the name with the IP address, but I am suspecting more and more that it is a networking issue rather than an Active Directory or DNS one.
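Before swapping the name for the IP address everywhere, it may be worth timing name lookups from an affected client to see whether resolution is ever slow or failing. This is a minimal Python sketch under the assumption that Python is available on the client. `time_lookups` and `summarise` are illustrative names, and it uses the operating system's resolver, so it reflects whatever the client would normally use (DNS, hosts file and so on) rather than DNS alone.

```python
import socket
import time


def time_lookups(hostname, attempts=5, resolver=socket.gethostbyname,
                 clock=time.perf_counter):
    """Resolve hostname several times; return a list of (ok, seconds) tuples."""
    results = []
    for _ in range(attempts):
        start = clock()
        try:
            resolver(hostname)
            ok = True
        except OSError:
            # socket.gaierror (lookup failure) is a subclass of OSError.
            ok = False
        results.append((ok, clock() - start))
    return results


def summarise(results):
    """Return (number of failed lookups, slowest lookup in seconds)."""
    failures = sum(1 for ok, _ in results if not ok)
    slowest = max((secs for _, secs in results), default=0.0)
    return failures, slowest
```

Something like `summarise(time_lookups("DC01", attempts=20))` run at logon time on a problem machine would show whether any lookups failed or took several seconds. If they all come back quickly, that points away from name resolution.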
We have a 3Com network with mainly 4500 switches. Our core switch is a 7750; however, I have no access to it, as it doesn't have a web interface and the terminal is password protected by a company that went bust over a year ago. Oh the joy.
I'm not that good with hardware myself. Any ideas on how I should go about tracking down a loop or a faulty NIC, bearing in mind we have no cable diagrams and most of the wall sockets are not marked? Great, yeah, I know. Yep, I did inherit this network. Lol
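One way to look for a chattering NIC without cable diagrams is to capture traffic on a client port with a packet sniffer and count which source MAC addresses send the most broadcasts. The Python sketch below assumes the capture has already been exported as (source MAC, destination MAC) pairs. That input format and the function name `top_broadcasters` are assumptions for illustration, not the output of any particular tool.

```python
from collections import Counter

# Ethernet broadcast destination address.
BROADCAST = "ff:ff:ff:ff:ff:ff"


def top_broadcasters(frames, limit=5):
    """Return the most frequent broadcast senders.

    frames: iterable of (src_mac, dst_mac) string pairs.
    Result: list of (src_mac, count) tuples, busiest first.
    """
    counts = Counter(
        src.lower() for src, dst in frames if dst.lower() == BROADCAST
    )
    return counts.most_common(limit)


# Hypothetical sample: one card sends three broadcasts, another sends one.
sample = [
    ("00:11:22:33:44:55", "FF:FF:FF:FF:FF:FF"),
    ("00:11:22:33:44:55", "ff:ff:ff:ff:ff:ff"),
    ("00:aa:bb:cc:dd:ee", "ff:ff:ff:ff:ff:ff"),
    ("00:11:22:33:44:55", "ff:ff:ff:ff:ff:ff"),
    ("00:aa:bb:cc:dd:ee", "00:11:22:33:44:55"),  # unicast, ignored
]
print(top_broadcasters(sample))
```

A single MAC address dominating the list would be a candidate for a faulty card, and the MAC can then be matched to a machine through the DHCP leases.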
Last edited by Bezwick; 2nd October 2009 at 11:18 AM.
I will have a look at the log files now. Is the problem all over the network or in just one area? If, say, the affected computers are all attached to one switch, pull the power on that switch at a low-usage time, leave the power off for 5 minutes, then plug it back in.