
RAID Dead Please help (Dell Adaptec Perc 5/i)



Posted (edited)

Came in to find the main file server had died. Tried to boot but the logical drive is broken causing the system to see no boot devices.

I go into the PERC RAID setup and find that 2 of the 4 physical disks which form the RAID 5 array are showing as missing/foreign configuration.

 

1 of the disks looks like it is well gone as the warning lights are flashing, but the other may be OK. Is there any way of joining these 2 disks (or just 1) back to the array in order to recover the data?

 

To make things worse this server may not have a working backup :rolleyes:

 

 

The RAID is just a plain old RAID 5 with no hot spare.

Edited by Guest
Posted

Can you reinsert the drive that may not have physically failed to see if you can get the array back to a degraded state ready for a new hard drive?

 

Ben

Posted

It is possible that two disks have failed simultaneously but it is unlikely. I have been in a similar situation before. I suspect that you may have a firmware error in all four disks even though it is presently only manifesting itself in two of them.

 

Do you still have warranty support with Dell? If so, I would contact them for detailed hand-holding as sorting this out is both complex and also involves a fairly high risk of data loss as it is not guaranteed that the solution will fix the problem.

 

I would also get the system tag number and look up Dell support online for tech advisories concerning the disks. You may have to identify the disk manufacturer and model.

 

If you are suffering the same problem I did, the annoying thing is that Dell classified the fix as non-urgent. The fix involves identifying the order in which the disks failed, attempting to restart them in the same order, then updating the firmware on those disks, and finally updating the firmware on the disks that are still working. The devil is in the detail, as you would expect, hence the suggestion to speak to Dell support.

 

If the fix fails then the only other solution is a full tape restore.....

 

If you want to take this 'conversation' outside the public arena, post an email address.

Posted

Pay someone £££ to recover data. You should have a backup. I've just been fixing two corrupt SIMS databases after the hdd failed, not fun.

 

Not sure whether, if you slap another disc in, it'll either repair itself (which it should) or destroy it. You should have tested the RAID before making it live, so you should be OK here ;)

 

Good luck. We've all made these mistakes, and smart people rush out and get a good backup solution after/during it (or start using it :p)

Posted (edited)
Cheers for the replies, people. Will look into Dell support.

Edited by Guest
Posted (edited)

Also gonna try the Dell tools tomorrow to see what they say.

 

 

Does anyone know of any methods of adding disks back to arrays even when they think they don't belong? I'm sure at least one disk must contain at least most of the data. If we lose some data then so be it, but that's still better than nothing at all.

Edited by Guest
Posted

In a RAID array the data is spread across the drives entirely: one chunk of a file will be on one drive and the next chunk will be on the next drive. They don't fill one drive up and then start using the next. Data recovery from one drive would probably yield very, very little usable data.
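
As a rough illustration of the striping described above, here is a minimal Python sketch. The 4-byte chunk size and the parity rotation are assumptions chosen for readability; a real PERC controller uses its own stripe size and layout, and `stripe` is a hypothetical helper, not anything from Dell's tools.

```python
# Hypothetical illustration of RAID 5 striping across 4 disks.
# CHUNK and the parity rotation are assumptions made for readability.
CHUNK = 4   # bytes per chunk (real arrays use far larger chunks)
DISKS = 4   # disks in the array


def stripe(data: bytes, disks: int = DISKS, chunk: int = CHUNK):
    """Return one list of chunks per disk; the string "P" marks a parity chunk."""
    layout = [[] for _ in range(disks)]
    chunks = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    per_stripe = disks - 1  # one chunk in every stripe is parity
    for start in range(0, len(chunks), per_stripe):
        row = iter(chunks[start:start + per_stripe])
        stripe_no = start // per_stripe
        parity_disk = (disks - 1 - stripe_no) % disks  # parity moves each stripe
        for d in range(disks):
            if d == parity_disk:
                layout[d].append("P")
            else:
                layout[d].append(next(row, b""))  # b"" pads a short last stripe
    return layout


layout = stripe(b"The quick brown fox jumps over the lazy dog!")
for d, contents in enumerate(layout):
    print(f"disk {d}: {contents}")
```

In this toy layout disk 0 ends up holding only `b"The "`, `b"own "`, `b"s ov"` and one parity chunk, which is why a single disk on its own gives back so little readable data.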

 

I would get in touch with Dell and see what they can suggest. Failing that, if the data MUST be recovered, contact a data recovery company ASAP. It's expensive, but then again so is losing the data.

Posted (edited)

 

Yeah, I know that, cheers, but I'm quite sure at least 3 drives of the 4 will be recoverable, which is all you need for a RAID 5 array to recover the data. I was wondering if anyone knew the methods a data recovery company would use to force these 3 disks back together.
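
The reason 3 of 4 disks is enough comes down to XOR parity. The sketch below shows only that arithmetic on one stripe; it is not how a recovery company or the PERC firmware actually reassembles an array, and `xor_blocks` is a hypothetical helper name.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical 4-disk RAID 5: three data chunks plus parity.
d0, d1, d2 = b"The ", b"quic", b"k br"
parity = xor_blocks([d0, d1, d2])

# Suppose the disk holding d1 is dead: rebuild it from the three survivors.
rebuilt = xor_blocks([d0, d2, parity])
print(rebuilt)  # b'quic'
```

With two disks missing from the same stripe there are two unknowns and only one parity equation, so the same trick no longer works.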

 

Cheers

Edited by Guest
Posted

Proper data recovery companies use specialist equipment to either repair the drives or recover the data off the hard drives.

My best recommendation is to switch off the file server, unplug it from the mains and leave well alone until you can get specialist support from either Dell or a data recovery company.

Be warned that Dell will take no responsibility for loss of data and what they get you to do may make it harder to recover the data.

The more you fiddle, the greater the risk of damaging the data that is on there and making it much more difficult and therefore more expensive to recover the data.

My advice is, if the data is critical and you have no backup, seek the help of a decent data recovery company. It will be expensive, probably £1-2k.

Posted (edited)

Sorted.

 

As I suspected, only 1 of the disks was actually broken; the other had just fallen off the array. To fix the problem I had to go into the PERC BIOS and "Import" the "Foreign" configuration. (Quite easy and obvious really, but not something I could have taken upon myself to do without the go-ahead of someone in the know.)

 

Booted the server up and all is well, barring the obvious implications of having a RAID5 array with a missing disk.

 

:D

Edited by Guest
Posted
Good news! What I would do now is get on to Dell and get a replacement hard drive for the dead one and, to avoid this problem in the future, if there is space, get an extra hard drive to act as a hot spare. We've done this in all our servers now as we had a spate of hard drive failures. Of course, since putting in the hot spares none have failed!
Posted (edited)

 

Yep, that's my recommendation. The Dell support guy actually looked into the firmware as you suggested. He didn't see any reported problems but got me to send him a diagnostic report on all the hardware which forms the array just to make sure; he is analysing it now. You can't knock Dell server support! :D

 

 

On the backup front it looks as though we would have been OK: missing a few days, but most of it would be fine. Still not good enough IMO, so something to look at. The problem we have had is that our SDLT320 will only hold 240GB, give or take, after compression, which meant we can only back up half of the server at a time, resulting in an absolute best-case scenario of "missing a day or 2 on half the server, other half OK", which still isn't good enough IMO. Found that 1 tape which should have been fine had failed.

The tape with most of the data on is still trundling away after 3 hours and has only completed 19% of its restore. It still may fail, and in either case it would have meant downtime of at least a couple of days, again not good enough IMO.
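
For a rough feel of what 19% in 3 hours implies, here is a back-of-the-envelope estimate in Python. It assumes the restore keeps going at a constant rate, which may well not hold for a tape job.

```python
# Figures from the post: 19% of the restore completed after 3 hours.
percent_done = 19
elapsed_hours = 3.0

# Assumption: progress stays linear for the rest of the job.
total_hours = elapsed_hours * 100 / percent_done
remaining_hours = total_hours - elapsed_hours

print(f"estimated total: {total_hours:.1f} h, remaining: {remaining_hours:.1f} h")
# estimated total: 15.8 h, remaining: 12.8 h
```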

Edited by Guest
Posted
What we've done, which worked out to be the cheapest and easiest solution, was to buy a server maxed out with a load of big SATA drives to give us a huge storage area to back up everything onto. We run a full backup of all servers over the weekend and then differential backups on weekdays, and take a copy onto removable HDDs once a week. I think it worked out at about £2k in total 3 years ago, including a dozen external HDDs, so it would probably be a lot less now. Much easier, and it seems much more reliable than tape.
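
To show what a full-plus-differential scheme like this means at restore time, here is a small Python sketch. It assumes the full backup runs on Saturday and the differentials on Monday to Friday nights; the exact days are not stated in the post, and `restore_chain` is a hypothetical helper. Because a differential holds everything changed since the last full, a restore only needs the latest full plus the latest differential.

```python
from datetime import date, timedelta

SATURDAY = 5  # date.weekday() value; assumed day of the weekly full backup


def restore_chain(failure_day: date):
    """Backups needed to restore to the last backup taken before failure_day."""
    day = failure_day - timedelta(days=1)
    latest_differential = None
    # Walk back to the most recent Saturday, noting the newest weekday backup.
    while day.weekday() != SATURDAY:
        if day.weekday() < 5 and latest_differential is None:
            latest_differential = day
        day -= timedelta(days=1)
    chain = [("full", day)]
    if latest_differential is not None:
        chain.append(("differential", latest_differential))
    return chain


# Arbitrary example date: a server that dies on Thursday 4 January 2024.
for kind, day in restore_chain(date(2024, 1, 4)):
    print(kind, day.isoformat())
# full 2023-12-30
# differential 2024-01-03
```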
Posted

That's my ultimate plan too (great minds think alike, eh?)

 

We actually have an external backup solution on its way for the file server and 1 DC. No plans as yet for backing up the app server, which is currently backed up onto an external HD, again something I am not comfortable with.

Posted

Yes, great minds do think alike! It has been so much less hassle since we moved to this solution, as before we were using 5 different tape backup drives attached to different servers. It was a right PITA and extremely unreliable! What prompted the switch was a similar situation to what you had, especially when I went through the server backups for the problem server and found that the tapes which appeared to have backed up correctly would not restore :eek:. Managed to get the server back up and running with no data loss, but after that scare I demanded a decent backup solution (showing the head a quote from a data recovery agency to recover the server helped!) and we also now have a much more stringent disaster recovery plan.

Another thing I do is use VMware Converter to make a VM image of all the key servers during any holiday, so that in a disaster I can quickly get some VMs up and running and restore the latest backup.

Posted

Anyone know if an HP SDLT320 should be able to restore data at the same rate as it backs it up?

 

The rated sustained backup speed is 57GB/hour native (uncompressed). So far my restore job has done only 28GB and has taken 3 and a half hours. At the rated speed it should have done knocking on 300GB after decompression.

 

Looks like I'm right not to have faith in our backup situation.
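
The numbers above can be checked with a few lines of Python. The 1.5:1 compression ratio is an assumption, taken from the roughly 240GB per tape mentioned earlier in the thread against the SDLT 320's 160GB native capacity; the real ratio depends on the data.

```python
RATED_GB_PER_HOUR = 57      # native rate quoted in the post
ASSUMED_COMPRESSION = 1.5   # assumption: ~240GB on a 160GB native tape

restored_gb = 28
elapsed_hours = 3.5

actual_rate = restored_gb / elapsed_hours              # what the job is achieving
expected_native = RATED_GB_PER_HOUR * elapsed_hours    # off the tape at rated speed
expected_restored = expected_native * ASSUMED_COMPRESSION

print(f"actual rate: {actual_rate:.1f} GB/hour")                    # 8.0
print(f"expected native: {expected_native:.1f} GB")                 # 199.5
print(f"expected after decompression: {expected_restored:.2f} GB")  # 299.25
```

So the job is running at about 8GB/hour, against roughly 300GB that the rated speed would have delivered in the same time under that assumed ratio.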

Posted

Depends on your load and how the tape drive is connected to the server.

 

Normally "Dell" don't give you a dedicated SCSI card for the tape drive, despite the huge performance issue.

Posted (edited)

 

Oops, sorry. It's on its own dedicated SCSI 320MB/s PCI-X card, restoring onto a SATA mirrored RAID. It's a HW RAID AFAIK (never actually looked much at the server TBH); I'll look tomorrow.

 

Surely, though, any HD I/O can handle more than 8GB/hour (the speed at which it is currently restoring). The tape drive's SCSI card is certainly not the bottleneck.

Edited by Guest
Posted
If your RAID array is still rebuilding it will slow it down quite a bit. Also writing to RAID 5 is quite a bit slower than reading (but not as slow as the figures you are getting).
  • 3 years later...
Posted

Waking a dead thread: I had a similar issue happen this morning with our MD1000. I got a notice from Symantec Backup that the volume was offline and came in this beautiful Monday morning to find 13 of the 15 1TB drives flashing failed on the chassis, along with the warning beep. I booted into the PERC 5/E BIOS/config and it showed the 2 remaining non-failed drives as OK and foreign configuration. I have never had this issue before, but I'm glad I found this thread. I imported the foreign configuration and forced the 13 drives back online one at a time (in no particular order). It appears to be healthy, but I'm running a consistency check right now. Will probably grab the newest firmware and update. It's been stable for 4-5 years without issue before this on the same server/setup, etc. --Mike

 

 
