Windows Thread, Help - server disaster! in Technical
  1. #1
    eean's Avatar
    Join Date
    May 2006
    Location
    Kuala Lumpur
    Posts
    559
    Thank Post
    65
    Thanked 52 Times in 37 Posts
    Rep Power
    29

    Help - server disaster!

    Hi,
    OK I know I should have a plan for this, but I don't. It was on my 'to do' list, really, it was.
    Setup:
    We're a primary school (55 computers). We've got a 5-year-old converted RM storebox now running a plain Windows 2000 Server AD domain: 1.7GHz, 700MB RAM, 2 hard drives. One holds the system and the other holds everything else; that second volume is then extended back onto the system drive, because we ran out of space.
    Server does everything: files, AD, printers, WSUS, SQL for Abacus maths planning software.
    Problem:
    4.30pm on Friday, I restart the server because it's running really slow. Get a BSOD on startup. System Disk error.
    Backup
    I have an external hard drive with an NT Backup of everything. No special 'open file options' though.
    I can also access all drives from the Recovery console via the Server 2000 CD.

    I need - a plan!
    Prior to this, the server ran fine; I don't like upgrading for the sake of it but, on the other hand, I'm leaving at the end of the year and I'd like to leave them set up for a few years to come, so I could use this as an opportunity to upgrade.
    As I see it, I've got 4 options:
    1. Replace the faulty hard drive, and somehow recover it all. This is not the first HD failure we've had (3 out of 3 RM server hard drives have now died on me!), so it leaves us very open to the same problem in future.
    2. Buy 3 new SATA HDs and a RAID card. Any recommendations on these - they seem to range from £30 to £300. What's the difference?
    3. Use this opportunity to buy a new server, with RAID. Servers with RAID already in, seem very expensive..
    4. (wildcard): I have a spare 1.7ghz server. Set both servers up (with new HDs) to provide some redundancy? Are two old servers better than one new one?


    All of them involve reinstalling everything! What's the best way of doing this? If I install Windows 2000, can I just start it in safe mode, recover as many files as it'll let me, restart and hope for the best? (I'm guessing this WON'T work very well!) I know all the user data is safe, but I'm worried about all those little 'fiddly bits' I've set up over the years - WSUS, RIS, SQL, permissions on files etc...
    Is there any way I can upgrade to Server 2003 while I'm at it?
    Ideas please!

  2. #2

    m25man's Avatar
    Join Date
    Oct 2005
    Location
    Romford, Essex
    Posts
    1,617
    Thank Post
    49
    Thanked 448 Times in 331 Posts
    Rep Power
    136
    Don't do anything in a hurry.

    Why not first boot from cd to the recovery console and run chkdsk!

    It sounds like your server is simple enough. If the disk is still being recognised, it's probably something simple that a chkdsk will fix.
    I had to fix a secondary school's DC with RAID 5 last weekend: not booting, 1 failed RAID volume, no hot spare, no backups of any kind!

    Took 3 days to recover it all but it's been running since Monday without a problem.

    Use the CMDCONS first then consider DR alternatives.

  3. 2 Thanks to m25man:

    eean (27th January 2008), FN-GM (27th January 2008)

  4. #3

    SYNACK's Avatar
    Join Date
    Oct 2007
    Posts
    10,991
    Thank Post
    851
    Thanked 2,653 Times in 2,253 Posts
    Blog Entries
    9
    Rep Power
    764
    My first plan would be to try and recover the existing hardware. I would get a replacement hard drive and use a robust imaging tool to mirror the corrupted system volume. I would then try running chkdsk on the recovered volume. If you can still see the files this could bring it back or at least change the error.

    If it does change the error it should give you a better idea of what is cooked and by replacing damaged files you may be able to reanimate it.

    I would then turn to prevention and at the very least get a RAID controller. The more expensive ones usually have more onboard memory and do the work themselves rather than offloading it to the CPU. This makes them way faster and totally worth the extra.

    It will be much easier to transfer all of the stuff if you can get the thing up and running first, then add RAID to the second one and transfer it all across that way.

    If you want the setup to be good and reliable for three years I would have to go with a new server if they can afford it. It will be faster, probably more reliable than the two older ones and, most importantly, it comes with a warranty.

  5. 2 Thanks to SYNACK:

    eean (27th January 2008), FN-GM (27th January 2008)

  6. #4

    dhicks's Avatar
    Join Date
    Aug 2005
    Location
    Knightsbridge
    Posts
    5,613
    Thank Post
    1,229
    Thanked 772 Times in 670 Posts
    Rep Power
    234
    Quote Originally Posted by eean View Post
    4.30pm on Friday, I restart the server because it's running really slow. Get a BSOD on startup. System Disk error.
    I did exactly the same thing to our old Windows 2000 server. Painstakingly imaged the hard drive before doing anything, then found that chkdsk fixed it in 10 minutes.

    Buy 3 new SATA HDs and a RAID card. Any recommendations on these - they seem to range from £30 to £300. What's the difference?
    The £30 ones are "fake RAID" - RAID that needs an OS driver to function properly. They're really best suited to gamers who want to improve their disk performance by having two read heads read the same data at the same time. Get the £300 version.

    --
    David Hicks

  7. Thanks to dhicks from:

    eean (27th January 2008)

  8. #5

    localzuk's Avatar
    Join Date
    Dec 2006
    Location
    Minehead
    Posts
    17,528
    Thank Post
    513
    Thanked 2,406 Times in 1,862 Posts
    Blog Entries
    24
    Rep Power
    822
    Personally, I would try and get the old unit up and running first, then buy a new server with raid built in and transfer all the services to it.

    You really need RAID nowadays for a school server, as it is so integral to the operation of a school. We've just ordered another new server from HP for £1900, and will get about £650 of that back via an offer they are running. This server has 2 SAS 146GB hard disks to be used as a mirrored pair, 4GB RAM and dual quad-core 2GHz processors. That is not a lot of money for what it is.

    Take a look at the HP range of servers at Insight (but remember you can get better prices by talking to them).

  9. Thanks to localzuk from:

    FN-GM (27th January 2008)

  10. #6
    eean's Avatar
    Join Date
    May 2006
    Location
    Kuala Lumpur
    Posts
    559
    Thank Post
    65
    Thanked 52 Times in 37 Posts
    Rep Power
    29
    Thanks. That sounds like a good plan. I left chkdsk running when I left. Hopefully it is all magically fixed!
    We have enough money for a new server, so I will get one.
    I'm guessing the hard drive is fried, or is on its way out, which has upset some system files. What's the best way of copying everything from the old hard drive to the new hard drive?... The old drive has an extended partition from the 2nd hard drive, which makes things more complicated.
    If I do manage to copy it to a new HD, but (say) some system files aren't working, can I run a Windows repair install? Will that just replace any damaged files or will it mess everything else up?

    When the 2nd HD failed, I created a scheduled task that ran with System rights to xcopy everything from the old to the new (minus the files it had lost, which I logged and recovered from backup). But then, I could log into Windows.
    How about: getting a new hard drive and a spare hard drive. Install a fresh Windows on the spare. Will my 2 existing HDs be visible from the fresh Windows, including the extended partitions? Will the system user of the fresh Windows be able to access them?
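
The copy-and-log approach described above (copy everything, note what could not be read, recover those files from backup) could be sketched roughly as follows. This is only an illustration in Python rather than the xcopy scheduled task that was actually used, and the function name `copy_tree_logging_failures` is made up for the example. It copies file contents and timestamps only; on Windows, `shutil.copy2` does not carry over file owners or NTFS permissions, so those would still need handling separately.

```python
import os
import shutil


def copy_tree_logging_failures(src_root, dst_root):
    """Copy every file under src_root into dst_root.

    Returns a list of (path, error message) pairs for anything that could
    not be read, so those items can be restored from backup afterwards.
    """
    failed = []

    def note_walk_error(exc):
        # Called by os.walk when a directory itself cannot be listed.
        failed.append((exc.filename, str(exc)))

    for dirpath, _dirnames, filenames in os.walk(src_root, onerror=note_walk_error):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            try:
                shutil.copy2(src, os.path.join(target_dir, name))
            except OSError as exc:
                # Unreadable file (for example, bad sectors): log it and carry on.
                failed.append((src, str(exc)))
    return failed
```

The returned list plays the role of the log: anything in it is a candidate for restoring from the NT Backup set.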

    RAID question:
    Say I got a server with built in RAID or even a RAID card. What would happen if the RAID card or the MB died? Can I simply pull out the HDs and put them into a new machine with any RAID controller?
    Or, what if 1 HD failed? Do I have to use identical drives to restore the RAID?
    I'm thinking that the new server will be expected to last a long time 5yrs+ and finding identical parts may be difficult.

  11. #7
    meastaugh1's Avatar
    Join Date
    Jul 2006
    Location
    London/Hertfordshire
    Posts
    889
    Thank Post
    69
    Thanked 85 Times in 70 Posts
    Rep Power
    32
    Quote Originally Posted by eean View Post
    RAID question:
    Say I got a server with built in RAID or even a RAID card. What would happen if the RAID card or the MB died? Can I simply pull out the HDs and put them into a new machine with any RAID controller?
    Which level of RAID are you looking at, 1 (mirror, two disks), 5 (striped with parity, at least 3 disks)?

    Yes, you should be able to. Having said that, I have had a card swapped (for the same card but with a different cache size) due to a manufacturer build error. When a disk later failed, it wouldn't take a replacement and therefore wouldn't rebuild the array. Had to DR from tape. This may have just been a coincidence, though.

    Or, what if 1 HD failed? Do I have to use identical drives to restore the RAID?
    I'm thinking that the new server will be expected to last a long time 5yrs+ and finding identical parts may be difficult.
    The disk would need to be at least the size of the smallest disk in the array. For example, if a disk failed in an array of 73GB disks, you could replace it with a 147GB disk if necessary, although only half of the new disk would actually be utilised.

    You may wish to consider adding a hotspare to the array. In the event that one online disk in the array fails, the hotspare can be used automatically. This would mean your array wouldn't need to operate in a degraded state while you try and source a replacement disk.
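
The sizing rule above can be put into a small worked example. This is a rough sketch in Python, assuming the usual textbook capacity formulas for RAID 1, 5 and 10 where every member is treated as being the size of the smallest disk; real controllers may reserve a little extra space for metadata. The function names are made up for illustration.

```python
def usable_capacity_gb(level, disk_sizes_gb):
    """Approximate usable capacity for a RAID 1, 5 or 10 array."""
    n = len(disk_sizes_gb)
    smallest = min(disk_sizes_gb)  # every member is treated as this size
    if level == 1:
        if n < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        return smallest  # all disks hold the same data
    if level == 5:
        if n < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (n - 1) * smallest  # one disk's worth goes to parity
    if level == 10:
        if n < 4 or n % 2:
            raise ValueError("RAID 10 needs an even number of disks, 4 or more")
        return (n // 2) * smallest  # half the disks are mirror copies
    raise ValueError("only RAID 1, 5 and 10 are covered in this sketch")


def replacement_fits(array_sizes_gb, new_disk_gb):
    """Return (fits, unused_gb) for a candidate replacement disk."""
    smallest = min(array_sizes_gb)
    return new_disk_gb >= smallest, max(new_disk_gb - smallest, 0)


print(usable_capacity_gb(5, [73, 73, 73]))   # 146
print(replacement_fits([73, 73, 73], 147))   # (True, 74)
```

So a 147GB disk dropped into a 73GB array works, but 74GB of it sits idle, which matches the "only half" figure above.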

  12. Thanks to meastaugh1 from:

    eean (27th January 2008)

  13. #8
    eean's Avatar
    Join Date
    May 2006
    Location
    Kuala Lumpur
    Posts
    559
    Thank Post
    65
    Thanked 52 Times in 37 Posts
    Rep Power
    29
    Quote Originally Posted by meastaugh1 View Post
    Which level of RAID are you looking at, 1 (mirror, two disks), 5 (striped with parity, at least 3 disks)?
    Err... I don't know. I suppose with RAID 1, it would be able to cope if the RAID controller failed because you've just got 2 identical hard drives with no fancy parity going on.
    Some servers have RAID 10 (or 1+0 - same thing?). I read Wikipedia, but I still don't understand what it means.

  14. #9

    localzuk's Avatar
    Join Date
    Dec 2006
    Location
    Minehead
    Posts
    17,528
    Thank Post
    513
    Thanked 2,406 Times in 1,862 Posts
    Blog Entries
    24
    Rep Power
    822
    Quote Originally Posted by eean View Post
    Err... I don't know. I suppose with RAID 1, it would be able to cope if the RAID controller failed because you've just got 2 identical hard drives with no fancy parity going on.
    Some servers have RAID 10 (or 1+0 - same thing?). I read Wikipedia, but I still don't understand what it means.
    RAID 1+0/10 is where you have 4 drives - 2 pairs of 2. Each pair makes up a single RAID 1 array (mirrored) and together the pairs are striped (RAID 0). The reverse arrangement, two striped pairs that mirror each other, is RAID 0+1.

    We have RAID 1+0 on our 2 DCs, RAID 1 on our application servers and RAID 5 on our NAS boxes.
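
For anyone still puzzling over RAID 10, here is a toy model in Python. It assumes the common 4-drive layout in which disks 0 and 1 form one mirrored pair, disks 2 and 3 form the other, and data is striped across the two pairs; the disk numbering and function names are invented for the example and do not describe any particular controller.

```python
def raid10_disks_for_block(block, num_pairs=2):
    """Return the two disk indexes that hold a given logical block."""
    pair = block % num_pairs          # RAID 0 part: blocks alternate between pairs
    return (2 * pair, 2 * pair + 1)   # RAID 1 part: both disks in the pair get a copy


def survives(failed_disks, num_pairs=2):
    """The array survives as long as no mirrored pair has lost both disks."""
    failed = set(failed_disks)
    return all(not {2 * p, 2 * p + 1} <= failed for p in range(num_pairs))


print(raid10_disks_for_block(0))  # (0, 1)
print(raid10_disks_for_block(1))  # (2, 3)
print(survives([0, 2]))           # True: one disk lost from each pair
print(survives([0, 1]))           # False: a whole pair is gone
```

In this model the array can lose up to two disks, provided they are not the two halves of the same mirror.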

  15. Thanks to localzuk from:

    eean (27th January 2008)

  16. #10
    eean's Avatar
    Join Date
    May 2006
    Location
    Kuala Lumpur
    Posts
    559
    Thank Post
    65
    Thanked 52 Times in 37 Posts
    Rep Power
    29
    Thanks to everyone so far. Anyone got any tips to answer my other question:
    Quote Originally Posted by eean
    What's the best way of copying everything from the old hard drive to the new hard drive?... The old drive has an extended partition from the 2nd hard drive, which makes things more complicated.
    If I do manage to copy it to a new HD, but (say) some system files aren't working, can I run a Windows repair install? Will that just replace any damaged files or will it mess everything else up?

    When the 2nd HD failed, I created a scheduled task that ran with System rights to xcopy everything from the old to the new (minus the files it had lost, which I logged and recovered from backup). But then, I could log into Windows.

  17. #11
    contink's Avatar
    Join Date
    Jul 2006
    Location
    South Yorkshire
    Posts
    3,791
    Thank Post
    303
    Thanked 327 Times in 233 Posts
    Rep Power
    118
    Quote Originally Posted by eean View Post
    I'm guessing the hard drive is fried, or is on its way out, which has upset some system files. What's the best way of copying everything from the old hard drive to the new hard drive?... The old drive has an extended partition from the 2nd hard drive, which makes things more complicated.
    If I do manage to copy it to a new HD, but (say) some system files aren't working, can I run a Windows repair install? Will that just replace any damaged files or will it mess everything else up?
    Using Ghost or Acronis Disk Director should solve the problem if you boot from the CD.

    I can't think of any reason why the extended partition won't continue to work with the other hard drive (been using dynamic disks?) once you've cloned it all. Just make sure that when you've cloned the original you take it out and replace it with the new one before booting again (thus avoiding any confusion to your OS or drive letter changes).

    As to the repair: once you have a good drive, you've done what you can to resolve the hardware issues. If the clone got most of what it needed, a repair may well solve the rest...

    Based on what everyone else has suggested so far though I'd go with the following.

    1. Clone the original "bad" drive using Ghost 2003 or whatever.
    2. On completion of the clone, power down the system and swap out the original for the new clone.
    3. Boot using the repair CD, but just to run a chkdsk routine to see if that solves it.
    4. If not, reboot using the repair CD and complete a standard repair, which won't reformat (unless you tell it to!).

    If that doesn't work you're going to need to start over regardless.
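
To show why imaging a failing drive is a separate step from repairing it, here is a very simplified sketch in Python of what a sector-by-sector clone does when it hits unreadable areas: it keeps going, fills the gap with zeros and records where the gaps are. This is not how Ghost or any other named tool is implemented; `read_block` is a hypothetical callable standing in for raw disk access, and real imaging tools add retries and finer-grained reads.

```python
def image_skipping_bad_blocks(read_block, total_size, out, block_size=64 * 1024):
    """Copy total_size bytes into `out`, zero-filling unreadable blocks.

    read_block(offset, size) is a hypothetical callable that returns `size`
    bytes or raises OSError when the region cannot be read.
    Returns a list of (offset, length) regions that were zero-filled.
    """
    bad_regions = []
    offset = 0
    while offset < total_size:
        size = min(block_size, total_size - offset)
        try:
            data = read_block(offset, size)
            if len(data) != size:
                raise OSError("short read")
        except OSError:
            # Keep the image the same size as the source so that everything
            # after the bad area stays at its correct offset.
            data = b"\x00" * size
            bad_regions.append((offset, size))
        out.write(data)
        offset += size
    return bad_regions
```

The list of bad regions gives a rough idea of how much damage chkdsk or a repair install will then have to deal with on the clone.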

  18. Thanks to contink from:

    eean (27th January 2008)

  19. #12

    SYNACK's Avatar
    Join Date
    Oct 2007
    Posts
    10,991
    Thank Post
    851
    Thanked 2,653 Times in 2,253 Posts
    Blog Entries
    9
    Rep Power
    764
    Same advice as contink, even down to the choice of software, but if you don't have access to those programs you could try using DriveImage XML, which is on the free UBCD for Windows (a version of Windows that runs off a CD), or something like Forensic Acquisition Utilities.

    There are also many free Linux-based alternatives that I'm sure someone else can point to.

  20. #13
    eean's Avatar
    Join Date
    May 2006
    Location
    Kuala Lumpur
    Posts
    559
    Thank Post
    65
    Thanked 52 Times in 37 Posts
    Rep Power
    29
    Using Ghost or Acronis Disk Director should solve the problem if you boot from the CD.
    I still fear that this spanned volume may cause me headaches...
    According to Symantec:
    Ghost does not support dynamic disks having spanned, striped, or RAID-5 volumes

  21. #14
    eean's Avatar
    Join Date
    May 2006
    Location
    Kuala Lumpur
    Posts
    559
    Thank Post
    65
    Thanked 52 Times in 37 Posts
    Rep Power
    29
    I left Chkdsk running in repair mode when I left. The server will now start and work but runs at 100% CPU utilisation. When you press CTRL-ALT-DEL everything claims to be 0% CPU.
    If I start in safe mode, the server is running OK.
    Server has an up-to-date virus scan running on it, so I don't think it's a virus, but I suppose I can't rule it out.

    Would a faulty hard drive cause this problem, or is it a red herring?

  22. #15

    SYNACK's Avatar
    Join Date
    Oct 2007
    Posts
    10,991
    Thank Post
    851
    Thanked 2,653 Times in 2,253 Posts
    Blog Entries
    9
    Rep Power
    764
    Could be; it sounds like some of the hardware in it is faulty. If a piece of hardware is malfunctioning and constantly raising its interrupt, it will use up the CPU just like this. You could use a Sysinternals tool called Process Explorer; it should show up increased interrupt activity.
