RAID controller drive drop out weirdness
We have an RM Server fitted with an LSI MegaRAID SAS 830ELP controller. There are 4 drives: 3 configured as a RAID 5 array and one as a global hot spare.
Last week the controller started beeping. We looked at the logs and the drive in slot 0 came up as: Foreign (Unconfigured) Good. While we investigated, I swapped the drive with an identical spare. We tested the drive that had dropped out using chkdsk and it came back fine, so we configured it as a global hot spare in slot 3.
This week the replacement drive in slot 0 has also come up as: Foreign (Unconfigured) Good, and the hot spare has jumped into action.
I've taken the side off the server to check for loose cables to slot 0, but the drives all plug directly into a daughter board at the front, with a single chunky cable going to a PCI-type slot at the back.
I don't really want to think about it, but if it happens one more time I think I'm going to have to buy a new RAID controller and rebuild the server.
Before it comes to that, though, has anyone got any suggestions or other things to check?
Also, if I do replace the controller, will I have to rebuild the server from backups, or is there a non-destructive method?