  1. #1
    ozydave

    Raid fails after drive swap

    ProLiant ML150 G2, Windows Server 2003 Standard
    RAID controller: Adaptec 2610SA with four 160 GB drives in a RAID 5 configuration

    HP Storage Manager Agent reported a failed drive. I checked the server and saw a constant amber light on the drive (the first one in the array, at the bottom). Then the server shut down. I pulled the drive, replaced it with a new 250 GB one and restarted the server. During startup no RAID was found; it then said that the BIOS was not installed and went into a network boot.
    I shut down the server and put the old drive back in (I had put the suspect drive into another HP server and it seemed to work fine).
    I restarted the server again. This time the RAID was found, but the message “drive is missing or degraded” came up on the screen. The restart got as far as “Windows is starting” and then came up with an error:
    “Lsass.exe – system error security accounts manager initialization failed because of the following: directory services cannot start. Error status 0x00002e1. Please click OK to shut down this system and reboot into directory services restore mode, check the event log for more detailed information”
    Another normal restart: the above error still came up, but there was another window in the background saying “active directory is rebuilding indices”, and it seemed to stay on this.
    Another restart, this time into Directory Services Restore Mode. The server is now running in this mode. The light on the suspect drive is now a constant green; the other drives are flashing green.
    HP Storage Manager shows the RAID controller and says all drives in the array are ‘optimal’, but it says that the logical drive is degraded.
    While in restore mode I thought I would hot swap the suspect drive and allow the RAID to rebuild. As soon as I pulled the first drive in the array the server went to a blue screen and shut down. I put the suspect drive back in and the server is again running in Directory Services Restore Mode.
    Why can’t I just swap the first drive in the array? Is Active Directory broken?

    Any help would be great.

  2. #2

    SYNACK
    Eeek. I suspect that the initial drive failed to write or read some data properly. When you swapped the drive while the server was off, the controller got confused trying to read the volume information off the disks and reset its error counts. When you put the old drive back in, it was able to detect the array properly again and initialise it, but it had lost the error count information and marked the bad drive as good. Because the drive was marked as good and the data was not consistent, the controller attempted a rebuild of the affected areas, unfortunately using the busted drive as the source drive.
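
    To illustrate why a rebuild that trusts a bad source drive does damage, here is a minimal sketch of RAID 5 style XOR parity on a single stripe. It is a toy model only: the block contents and the helper name `xor_blocks` are made up for the example, and it says nothing about how the Adaptec firmware actually lays out or rebuilds data.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a 4-drive RAID 5 set: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Healthy rebuild: drive 0 is missing, so its block is recomputed
# from the surviving data blocks and the parity block.
rebuilt = xor_blocks([data[1], data[2], parity])

# Bad-source rebuild: drive 0 returns a wrong block but is marked good,
# so it is used as a source when drive 1's block is regenerated.
bad_read = b"AxAA"
corrupted = xor_blocks([bad_read, data[2], parity])

print(rebuilt == data[0])    # the healthy rebuild recovers the lost block
print(corrupted == data[1])  # the bad-source rebuild does not
```

    The point of the sketch is that parity cannot tell which member is wrong; once a faulty drive is treated as a good source, whatever gets regenerated from it is silently wrong too.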

    The best procedure in that failed-drive situation is to replace the drive while the system is still on, if it is hot-swap capable.

    You may be able to recover the system so long as the data corruption has not spread too far, but I would be looking at backups for AD or, better yet, another domain controller if one is available.

    You may need to let it rebuild with the failed drive in place and hope that it does not cook all of the data. Alternatively, you should be able to boot it from just the good disks, without the replacement drive in, then add the replacement drive once it is up and rebuild the disk set.
    Last edited by SYNACK; 23rd September 2008 at 09:50 AM.

  3. #3
    torledo
    Another good thing to have done, if you're going with a RAID 5 setup for an OS volume, would have been to have a hot spare, so that if the controller detects a failed drive it can rebuild onto the spare.

    That's if reseating the troubled drive while the system is on didn't rectify the original problem. A lot of the time that's all that is required.

  4. #4
    Butuz
    Agree with torledo - hot spares are well worth having.

    One thing that could have happened: if you did not have Background Consistency Check enabled on the controller, the array parity could have slowly become slightly corrupt over time, and now that a drive has actually failed it is too corrupt to rebuild the array properly.
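
    As a rough picture of what a consistency check does, the sketch below walks over stripes and flags any where the data blocks and the parity block no longer XOR to zero. The function names and the sample stripes are invented for illustration; a real controller does this in firmware against the disks, not in Python.

```python
def stripe_is_consistent(blocks):
    """Return True when all blocks in a stripe (data plus parity) XOR to zero."""
    acc = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            acc[i] ^= byte
    return not any(acc)

def find_inconsistent_stripes(stripes):
    """Return the indexes of stripes whose parity no longer matches the data."""
    return [i for i, stripe in enumerate(stripes) if not stripe_is_consistent(stripe)]

# Two toy stripes of two data blocks plus parity (the last block in each).
stripes = [
    [b"\x01\x02", b"\x03\x04", b"\x02\x06"],  # parity matches the data
    [b"\x0f\x0f", b"\xf0\xf0", b"\xff\xfe"],  # last parity byte has drifted
]

bad_stripes = find_inconsistent_stripes(stripes)
print(bad_stripes)
```

    Running a check like this in the background means mismatches are found and repaired while every drive is still readable, rather than being discovered halfway through a rebuild.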

  5. #5
    ozydave

    Update - I now have a little less hair.

    Promoted the backup domain controller and demoted the other. The server with the Active Directory error has had AD removed and restored.
    The old PDC still has the RAID error. Management software still says the logical drive is degraded. I have tried replacing the suspect drive with a new one. As soon as I pull the suspect disk the server crashes. With the new drive in place the server will not boot, saying no RAID installed.
    Put the old suspect drive back in and the server starts normally. The suspect drive has a constant green light on. The other drives have flashing green lights. I put an additional new drive in (so that now makes 5 disks) and created a hot spare out of it. I rebooted the server, which started normally.
    I was hoping the RAID would now use the hot spare, but it doesn’t.
    Questions:
    I think that the constant green light on the suspect drive means the disk is online but inactive. I could initialize the drive, but will that delete the data in the whole array or just on that disk?
    The suspect drive is the first in the array. Does the controller write boot info to this drive? That’s the only reason I can think of why I can’t hot swap.
    Can I add another disk and make that part of the array? Will this delete data in the whole array? It will be 5 disks in the array including the suspect disk. When the array has finished rebuilding, could this latest disk be removed and placed in the suspect drive slot?
    I am new to RAID, so go easy.
    Cheers

  6. #6

    SYNACK
    Originally posted by ozydave:
    I was hoping the RAID would now use the hot spare, but it doesn’t.
    Questions:
    I think that the constant green light on the suspect drive means the disk is online but inactive. I could initialize the drive, but will that delete the data in the whole array or just on that disk?
    The suspect drive is the first in the array. Does the controller write boot info to this drive? That’s the only reason I can think of why I can’t hot swap.
    Can I add another disk and make that part of the array? Will this delete data in the whole array? It will be 5 disks in the array including the suspect disk. When the array has finished rebuilding, could this latest disk be removed and placed in the suspect drive slot?
    I am new to RAID, so go easy.
    Cheers

    The constant green light usually indicates that the drive is fully online, and the blinking indicates activity/rebuilding.

    The array data for the volume set is written to the beginning of each drive in the set, or at least it should be, but I have never had much luck with the Adaptec cards in comparison to other brands.

    You can add the new disk to the array if the controller supports expansion, but probably not while it is still rebuilding. This will also leave you with a 6-disk set and still a possibly failed drive that would need to be replaced.
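
    For reference on what expansion or a larger replacement drive buys you, usable RAID 5 capacity is generally (number of drives - 1) x the smallest drive, since one drive's worth of space goes to parity. A small sketch, with a hypothetical helper name and nominal GB figures taken from the drive sizes mentioned in this thread:

```python
def raid5_usable_gb(drive_sizes_gb):
    """Nominal usable capacity of a RAID 5 set: (n - 1) x smallest member."""
    if len(drive_sizes_gb) < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (len(drive_sizes_gb) - 1) * min(drive_sizes_gb)

original_set = raid5_usable_gb([160, 160, 160, 160])
with_250_swap = raid5_usable_gb([250, 160, 160, 160])
expanded_set = raid5_usable_gb([160, 160, 160, 160, 160])

print(original_set, with_250_swap, expanded_set)
```

    So a 250 GB drive swapped into a set of 160 GB drives only contributes 160 GB, and the extra space goes unused by the array.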

    Your only options appear to be to let it finish rebuilding and then hot swap the drive out, or to attempt to boot with only the good disks in (but it is not a good plan to turn it off while it is rebuilding).

    When you finally get this sorted, I would recommend upgrading the RAID controller's firmware to the latest version to hopefully make it a little more reliable and, as the others have said, grabbing another disk as a hot spare.

    As the drives are SATA, you should be able to remove the suspect drive after the rebuild is complete and install it in a standard computer to run a manufacturer diagnostic on it, with something like SeaTools (Seagate), to see whether the drive or the controller is to blame.

    Googling the controller does not bring up many nice comments about its abilities under load, so you may want to look at replacing it outright with a higher-model Adaptec controller if the server is under much load. If you stick with Adaptec there is a good chance that the RAID disk set will be transportable and it will simply be a case of plugging the drives into the new controller.
