Hardware Thread, RAID issues in Technical; Experienced a drive failure in one of our servers last week however I'm rather disappointed in the resiliency side of ...
  1. #1

    synaesthesia's Avatar
    Join Date
    Jan 2009
    Location
    Northamptonshire
    Posts
    6,513
    Thank Post
    627
    Thanked 1,173 Times in 900 Posts
    Blog Entries
    15
    Rep Power
    524

    RAID issues

    Experienced a drive failure in one of our servers last week, and I'm rather disappointed in the resiliency side of things. I wonder if there's something I'm doing wrong - the only thing I can think of is that I should try some different "drivers" for ESXi.

    Single Intel server with an Adaptec 6805 8 port SAS/SATA3 RAID controller with 2 drive arrays - a 2 drive RAID1 and a 4 drive RAID10.
    The RAID1 is home to the first domain controller, the RAID10 is home to the first file server. The drive that failed was in the RAID10. This shouldn't be an issue: it's a single failure, and the array should just run degraded until the drive is replaced. Whipped the drive out so I know which one it is, and replacements are on order for Wednesday; however, the VM guest only lasted a day before it just died a death. The datastore is "live" albeit extremely slow, and the guest is inaccessible - not even pinging.
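
    To illustrate why a single failure in a 4-drive RAID10 should only degrade the array, here is a minimal Python sketch that counts which drive-failure combinations leave every mirror pair with at least one working member. The drive numbering and the pair layout (0,1) and (2,3) are assumptions for illustration, not the actual layout on the Adaptec 6805.

```python
from itertools import combinations


def survives(mirror_pairs, failed):
    """A RAID10 set survives as long as no mirror pair has lost both members."""
    failed = set(failed)
    return all(not set(pair) <= failed for pair in mirror_pairs)


def survivable_fraction(mirror_pairs, failures):
    """Fraction of all possible `failures`-drive combinations the set survives."""
    drives = [d for pair in mirror_pairs for d in pair]
    combos = list(combinations(drives, failures))
    ok = sum(survives(mirror_pairs, c) for c in combos)
    return ok / len(combos)


# Hypothetical layout: two mirror pairs striped together (4 drives).
pairs = [(0, 1), (2, 3)]
one_failure = survivable_fraction(pairs, 1)   # every single failure is survivable
two_failures = survivable_fraction(pairs, 2)  # 4 of the 6 combinations survive
```

    Under this model any single failure is survivable, and a second failure is only fatal if it hits the mirror partner of the drive that is already out.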

    This will be resolved on Wednesday however I will have this nagging doubt in my mind should another drive fail. I don't want to interrupt my holiday time off again with things that should take care of themselves or at least tick over until we can resolve them! Any ideas on how I can resolve this permanently?

    (For reference, the failing drives are Seagate. Never, ever again.)
    Last edited by synaesthesia; 24th August 2014 at 01:22 PM.

  2. #2

    SYNACK's Avatar
    Join Date
    Oct 2007
    Posts
    11,271
    Thank Post
    884
    Thanked 2,749 Times in 2,322 Posts
    Blog Entries
    11
    Rep Power
    785
    Seagate has gone downhill and are now rubbish. I have never found an Adaptec RAID I am happy with; HP rebrands and LSI rebrands in IBM seem to do an all right job, but I have never had good luck with Adaptec. Perhaps a different and beefier RAID controller with more memory would help; RAID 10 should not slow down that much with a single dropped drive. I've had RAID 5 sets fail one drive and keep going at almost full speed with the HP Smart Array stuff.

  3. #3

    Join Date
    May 2014
    Location
    San Ramon
    Posts
    5
    Thank Post
    0
    Thanked 0 Times in 0 Posts
    Rep Power
    0
    You should always have a drive available to pop in a RAID in case of drive failure. Or you could move everything to a RAID 6 configuration, where you could survive TWO drive failures. I personally don't see much of a point in doing RAID10. I'd rather do RAID5 with a hot spare or a RAID6 configuration.
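
    For comparison, the capacity and worst-case fault tolerance of the RAID levels mentioned in this thread can be sketched in a few lines of Python. The `RaidLayout` class and `raid_layout` function are hypothetical names for illustration; the figures are the standard textbook ones and ignore hot spares and controller specifics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RaidLayout:
    level: str
    drives: int               # member drives (hot spares not counted)
    usable_drives: int        # usable capacity, in units of one drive
    guaranteed_failures: int  # failures survived in the worst case


def raid_layout(level: str, drives: int) -> RaidLayout:
    """Return capacity and worst-case fault tolerance for a RAID level."""
    if level == "RAID1":
        if drives != 2:
            raise ValueError("this sketch models RAID1 as a 2-drive mirror")
        return RaidLayout(level, drives, 1, 1)
    if level == "RAID5":
        if drives < 3:
            raise ValueError("RAID5 needs at least 3 drives")
        return RaidLayout(level, drives, drives - 1, 1)
    if level == "RAID6":
        if drives < 4:
            raise ValueError("RAID6 needs at least 4 drives")
        return RaidLayout(level, drives, drives - 2, 2)
    if level == "RAID10":
        if drives < 4 or drives % 2:
            raise ValueError("RAID10 needs an even number of drives, 4 or more")
        # Worst case: the second failure hits the partner of the first.
        return RaidLayout(level, drives, drives // 2, 1)
    raise ValueError(f"unsupported level: {level}")


raid10 = raid_layout("RAID10", 4)
raid6 = raid_layout("RAID6", 4)
```

    On four drives, RAID10 and RAID6 give the same usable capacity; RAID6 guarantees survival of any two failures, while RAID10 only guarantees one.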

  4. #4
    DMcCoy's Avatar
    Join Date
    Oct 2005
    Location
    Isle of Wight
    Posts
    3,505
    Thank Post
    10
    Thanked 508 Times in 445 Posts
    Rep Power
    116
    I'm going to make a guess. SATA drives? Do you still get write caching with a failed array?

  5. #5

    synaesthesia's Avatar
    Yeah they're SATA drives - a server built on rather a tight budget. And yes, write caching with a failed array apparently.
    A little research shows this could be an ESX issue; there are problems relating to this sort of thing since 5.1 - will need to do some more digging, but not until start of term!

  6. #6

    synaesthesia's Avatar
    Secondarily, @ericdano we don't need lectures that are not relevant to the problem in hand thanks! We should have a spare, but personally I'm glad we didn't. A spare would have meant putting in another Seafail drive. We do keep spares for all arrays, just luck of the draw we didn't for this one. I would not use a RAID6 on a VM host, the write speed drop is far too harsh.
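
    The write-speed concern can be put into rough numbers with the usual "write penalty" rule of thumb: each small random write costs 2 back-end I/Os on RAID10, 4 on RAID5 and 6 on RAID6. The sketch below is a back-of-envelope estimate only; the 75 IOPS per SATA drive is an assumed figure, and the model ignores controller cache, which can mask much of the penalty on better hardware.

```python
# Back-end I/Os needed per small random host write (common rule of thumb).
WRITE_PENALTY = {"RAID10": 2, "RAID5": 4, "RAID6": 6}


def random_write_iops(level: str, drives: int, iops_per_drive: float) -> float:
    """Estimate sustained random-write IOPS, ignoring any controller cache."""
    return drives * iops_per_drive / WRITE_PENALTY[level]


# Assumed figure: roughly 75 IOPS for a 7.2k SATA drive (not from the thread).
raid10_writes = random_write_iops("RAID10", 4, 75)  # 150.0
raid6_writes = random_write_iops("RAID6", 4, 75)    # 50.0
```

    On the same four drives this model puts RAID6 at a third of the RAID10 random-write rate, which is the gap being objected to here.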

  7. #7

    SYNACK's Avatar
    Just to be a pedant

    Quote Originally Posted by synaesthesia View Post
    I would not use a RAID6 on a VM host, the write speed drop is far too harsh.
    on your available hardware.

    Again, I agree with the Seagate assessment; even their Enterprise RAID SATA drives are shocking. I had three fail with weird SMART errors inside a year and a half.

  8. #8

    synaesthesia's Avatar
    Indeed, these were enterprise drives. Annoying really. SeaTools doesn't even find the drive defective, yet every other tool available says it's toast.

  9. #9
    AButters's Avatar
    Join Date
    Feb 2012
    Location
    Wales
    Posts
    562
    Thank Post
    187
    Thanked 126 Times in 97 Posts
    Rep Power
    46
    This is where a hotspare is worth its weight in gold (literally!).

    IMHO you'd be better off flattening that server (at some point) and losing the RAID1 array as it's just a waste of drives. Use 5 drives in RAID6 and keep the 6th one online, set as a hotspare.

  10. #10

    synaesthesia's Avatar
    Not on SATA drives I'm not! I would prefer to keep the DC as physically separate from the rest of it as possible hence the separate array. As said re the RAID10 setup, we have a very limited budget and the setup is built around it, so bearing in mind the lower end controller and drive combination we need to keep performance up as much as possible. Not fussed about write performance on a DC but the file server concerns me greatly - that's home drives and shared areas. Maybe in a few years when those shared areas are done away with, but not yet!

  11. #11


    Join Date
    May 2009
    Posts
    3,395
    Thank Post
    301
    Thanked 917 Times in 684 Posts
    Rep Power
    346
    Quote Originally Posted by ericdano View Post
    I'd rather do RAID5 with a hot spare
    It has been discussed here a number of times but it is probably worth repeating. RAID5 should never be provisioned with a hot spare. A failure of a drive on a RAID5 should trigger a copy of the data, and only then should you attempt to put in a new drive and rebuild the array. RAID5 is really best avoided in any kind of production environment.
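
    The post does not spell out the reasoning, but one commonly cited argument is that a RAID5 rebuild has to read every surviving drive in full, so a single unrecoverable read error (URE) during the rebuild can lose data, and a hot spare starts that rebuild automatically before anything has been copied off. A minimal sketch of the arithmetic follows; the URE rate of 1 in 10^14 bits is a typical spec-sheet figure for desktop-class SATA drives, and the drive count and size are hypothetical.

```python
import math


def p_ure_during_rebuild(surviving_drives: int, drive_bytes: int,
                         ure_per_bit: float = 1e-14) -> float:
    """Probability of at least one URE while reading all surviving drives.

    Assumes independent bit errors at a constant rate, which is a
    simplification of how real drives fail.
    """
    bits_read = surviving_drives * drive_bytes * 8
    # 1 - (1 - rate) ** bits_read, computed in a numerically stable way.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


# Hypothetical example: 4-drive RAID5 of 2 TB drives with one drive failed.
p = p_ure_during_rebuild(surviving_drives=3, drive_bytes=2 * 10**12)
```

    Under these assumptions the chance of hitting a URE during the rebuild comes out at roughly 0.38, which is why copying the data off first is the more cautious order of operations.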

    (Apologies for the slight hijack, we now return you to your normal viewing).

  12. #12

    synaesthesia's Avatar
    Funnily enough, I might not really have any other option but to accept the performance hit. It won't take up the slack of the failed drive on the hotspare >:| Currently trying to force it online to recover it.

  13. #13
    AButters's Avatar
    To be honest, once you're virtualised, keeping resources physically separate is kind of defeating the point. I would suggest it is much better to have one pool of more reliable storage complete with hotspare, rather than multiple pools with less resiliency and no hot spares.

    Either way - it is a bit worrying that the guest has become inaccessible; that shouldn't happen (and hasn't happened when I've had degraded arrays). I could understand it happening during the rebuild process, as most low-end RAID cards can't cope with rebuilding an array whilst serving data off it, but it shouldn't happen during the initial degraded stage. It may be highlighting a further problem with other drive(s).

    Hate to be the one to ask this - but do you have backups?

  14. #14

    synaesthesia's Avatar
    Yeah, backups are good. I'm halfway through recovering from the array as a "live" backup recovery; if that fails I'm not too worried (other than losing my holiday to resolve this!) as I can recover from the backups. Waiting on Parcelfarce to deliver the replacement drive (plus extra spare) that they should have delivered yesterday (GRR!), then it looks like I'll be flattening the arrays and starting from scratch. Performance copying from the drives whilst it's "rebuilding" is oddly fine. How or why it impacted on the running of the server does indeed concern me, especially on the other array (which is what convinced me that it's probably best to do as you suggested, as we then gain absolutely nothing by keeping them separate).

    It would be so nice to even believe we'd be listened to if we said we needed a SAN - a couple of years ago yes, but I really don't believe *anyone* in schools should be doing that currently, not with the way things are going. Hence I'm not worrying too much, get this back up and running and it'll tide us over nicely until we basically have machines in school that are only a physical gateway into servers hosted elsewhere!

  16. #15
    AButters's Avatar
    Yeh, making do and mending here too. I'm looking at VMware VSAN next year for our server refresh. No SAN needed, yet all of the benefits of a SAN (and fewer drawbacks).


