13th August 2009, 01:36 PM #1
Imminent Hard Drive Failure on SIMS Server!
Planned power outage last night, so I shut down all the servers, ready to come in a little bit earlier today and power them back up.
The Compaq ML250 server that hosts SIMS refused to start, saying no logical drives found! Then various other S.M.A.R.T. messages came up about accepting data loss, or drives suddenly appearing. Me, at this point, panicking.
After a couple of reboots, the server does boot up, but with two of the four hard drives showing flashing red lights. The management console is reporting 'Imminent Hard Drive Failure' for both.
Took an opportunity to do a quick SIMS backup (with dbattach), and copied all resulting files along with other files (docserver, setups, etc) to another server. Does appear to be working ok at the moment though.
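If you want some reassurance that an emergency copy like this actually landed intact on the other server, one option is to compare checksums of the originals against the copies. The following is only a rough sketch: `sha256_of` and `find_bad_copies` are made-up helper names, the paths in the usage note are placeholders, and it assumes both folders are reachable from the machine running the script. It says nothing about whether the data was already damaged before it was copied.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_bad_copies(source_dir, copy_dir):
    """List files under source_dir that are missing or different in copy_dir.

    Returns (relative_path, reason) tuples; an empty list means every
    source file has a byte-identical copy.
    """
    source_dir, copy_dir = Path(source_dir), Path(copy_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = copy_dir / rel
        if not dst.is_file():
            problems.append((rel.as_posix(), "missing"))
        elif sha256_of(src) != sha256_of(dst):
            problems.append((rel.as_posix(), "checksum mismatch"))
    return problems
```

Usage would be something like `find_bad_copies(r"D:\backup", r"\\otherserver\share\backup")` (both paths hypothetical), then checking that the returned list is empty.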
Phone call to HP as server is still under warranty, and two new hard drives should be here tomorrow, but could be anytime. Obviously going to have to do one at a time and let them rebuild, but I'm due on holiday tomorrow!!! Typical!
SWGfL connection didn't come up straight away either; the media converter just seemed to click away madly for about half an hour before it started working again!
Remote Installation Services seemed broken as well, but I fixed that after seeing that the Windows Deployment Services service for some reason hadn't started correctly. Also seems ok at the mo.
Feeling a little stressed at the moment! And breathe.....
13th August 2009, 02:10 PM #2
I wouldn't trust the current data on the drives now; it's probably best to go back to the backup from the night before.
13th August 2009, 02:22 PM #3
Go get a beer
13th August 2009, 06:57 PM #4
I lost 3 drives out of a 6-drive RAID 5 array when I downed a server at the beginning of the holidays that had been on pretty much constantly for over 2 years. Drives that have been running for a long time always seem to pick the moment you power them down and back up to fail.
Always make sure you have a good backup before turning off a server, just in case!
13th August 2009, 07:02 PM #5
I used to lose RAID controllers instead. The Compaq ones had non-serviceable batteries which would die and corrupt the controller memory, so the controller would no longer boot or show in the BIOS. The choices were to replace it, or leave the server off for a couple of weeks and hope the battery went completely flat again.
14th August 2009, 05:11 PM #6
Well, new disks arrived; I have replaced one, and the server happily recognised it and started to rebuild. It's still showing as rebuilding parity info, so I'm going to have to leave it until it finishes. Will put the other hard drive in on Monday, but all is backed up so should be good.
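For anyone wondering why the drives have to go in one at a time: the thread doesn't state the RAID level on this server, but assuming it is a single-parity array along the lines of RAID 5, a rebuild works out the missing drive's contents by XORing what is left on the surviving drives. That only works while exactly one member is missing. A toy sketch of the idea (the `xor_blocks` helper and the sample data are purely illustrative, and real controllers stripe and rotate parity across the drives):

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data "drives" plus one parity "drive" (made-up contents).
data = [b"SIMS", b"DOCS", b"SETU"]
parity = xor_blocks(data)

# Lose one drive: XOR of all the survivors gives its contents back.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[1]

# Lose a second drive before the rebuild finishes and a single parity
# block no longer holds enough information to recover either of them.
```

So pulling the second failing drive before the first rebuild has finished would leave the array with nothing to rebuild from.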