Results 16 to 30 of 59
Hardware Thread: Please help with dying server, in Technical
  1. #16

    LeMarchand's Avatar
    Join Date
    Jan 2008
    Location
    The deepest pits of hell
    Posts
    2,171
    Thank Post
    303
    Thanked 332 Times in 236 Posts
    Rep Power
    141
    Quote Originally Posted by swgeek View Post
    Afternoon all..
    If it is RAID anyway, can you not just remove the drives one at a time? If the server stops restarting, you will know which drive it is, if it is a drive at all.
    Cheers Mark
    That's what I did, but it was at the "I'll try anything to get it back up" stage and may not have been the best course of action.

  2. #17

    matt40k's Avatar
    Join Date
    Jun 2008
    Location
    Ipswich
    Posts
    4,390
    Thank Post
    368
    Thanked 637 Times in 519 Posts
    Rep Power
    158
    Pay for HP's 4-hour onsite response and get them to sort it. (Not sure how HP support compares to Dell's.)

  3. #18

    tmcd35's Avatar
    Join Date
    Jul 2005
    Location
    Norfolk
    Posts
    5,628
    Thank Post
    845
    Thanked 884 Times in 732 Posts
    Blog Entries
    9
    Rep Power
    326
    Actually, there is an idea there! I've been too worried about the system rebuilding on the hot spare to try it. Sometimes you just can't see the wood for the trees. Yes, I'll add it to the list of tests to do (after I've finished the PSU checks).

    If I pull the hot spare first, it can't rebuild the array. If I then pull the other drives one at a time I'll soon stumble across the culprit - if it is indeed a drive problem.

  4. #19

    localzuk's Avatar
    Join Date
    Dec 2006
    Location
    Minehead
    Posts
    17,655
    Thank Post
    516
    Thanked 2,443 Times in 1,891 Posts
    Blog Entries
    24
    Rep Power
    831
    A faulty drive should not cause any reboots, as the whole point of RAID is to have redundancy, i.e. if one drive dies, the other 4 should handle it fine.

    This definitely sounds like a PSU issue.
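
To illustrate the redundancy point, here is a minimal sketch of parity-based recovery. It assumes a RAID 5 style layout across five drives; the thread never states the RAID level, so that is an assumption, and the block contents and function names are made up for the example. Real controllers stripe data and rotate parity across drives, which this deliberately ignores.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical contents: four data blocks plus one parity block (five drives).
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)
drives = data + [parity]

def rebuild(drives, failed_index):
    """Recompute the block of one failed drive from the four survivors."""
    survivors = [blk for i, blk in enumerate(drives) if i != failed_index]
    return xor_blocks(survivors)

# Losing any single drive leaves enough information to recover its block.
recovered = rebuild(drives, 2)
```

Because the parity block is the XOR of the data blocks, XOR-ing any four surviving blocks yields the missing fifth, which is why a single dead drive should degrade the array rather than take the server down.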

  5. #20
    swgeek's Avatar
    Join Date
    Jun 2009
    Location
    cornwall
    Posts
    6
    Thank Post
    0
    Thanked 1 Time in 1 Post
    Rep Power
    0
    Thanks TMCD35... Cool, that was my first post here too!

  6. #21

    tmcd35's Avatar
    Join Date
    Jul 2005
    Location
    Norfolk
    Posts
    5,628
    Thank Post
    845
    Thanked 884 Times in 732 Posts
    Blog Entries
    9
    Rep Power
    326
    Okay,

    What's the likelihood of both PSUs being faulty? Should I consider swapping them out, one at a time, into the other two servers, to see if the reboot problem follows the PSUs?

    As it stands, I've tried running off one PSU and then off the other. Reboots occurred both times. So it's either not a PSU problem or both PSUs are at fault...
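
As a rough way to think about that likelihood, the sketch below multiplies the chance of a single PSU being faulty by itself. It assumes the two failures are independent, and the 2% figure is purely hypothetical, not a measured failure rate for these units. A shared cause such as a power surge would break the independence assumption and make a double failure more plausible than this arithmetic suggests.

```python
def both_faulty_probability(p_single: float) -> float:
    """Chance that two PSUs are both faulty, assuming independent failures."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    return p_single * p_single

# Hypothetical figure for illustration only, not a real failure rate.
p_single = 0.02
p_both = both_faulty_probability(p_single)
print(f"one PSU faulty: {p_single:.2%}, both faulty: {p_both:.4%}")
```

Under those assumptions a double failure is far rarer than a single one, which supports looking beyond the PSUs when the reboots happen on either unit alone.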

  7. #22

    Michael's Avatar
    Join Date
    Dec 2005
    Location
    Birmingham
    Posts
    9,262
    Thank Post
    242
    Thanked 1,568 Times in 1,250 Posts
    Rep Power
    340
    From all the tests and replacements you've made, I would have to agree that the fault is unlikely to be related to your RAID array. Most RAID controllers come with software that shows various bits of information, including up-to-date status, within Windows itself. If there were a fault, I'd expect something to be picked up.

    If you're using your PSU near its limit, this could trigger a reboot in theory. Your server should be connected to a UPS, and this can also give clues about power problems.

  8. #23

    tmcd35's Avatar
    Join Date
    Jul 2005
    Location
    Norfolk
    Posts
    5,628
    Thank Post
    845
    Thanked 884 Times in 732 Posts
    Blog Entries
    9
    Rep Power
    326
    Quote Originally Posted by swgeek View Post
    Thanks TMCD35... Cool, that was my first post here too!
    No prob! You made me re-think how I was looking at the problem - always worth a thank-you in my book

  9. #24

    tmcd35's Avatar
    Join Date
    Jul 2005
    Location
    Norfolk
    Posts
    5,628
    Thank Post
    845
    Thanked 884 Times in 732 Posts
    Blog Entries
    9
    Rep Power
    326
    Quote Originally Posted by Michael View Post
    If you're using your PSU near its limit, this could trigger a reboot in theory. Your server should be connected to a UPS, and this can also give clues about power problems.
    Herein lies the problem. I have three identical servers plugged into one uber-UPS. I need to check if the PSUs are hot swappable. If they are, I'm going to live swap them with another server. If it's a problem with the PSUs, then the problem should follow them onto another server.

    As all three systems are configured identically, if one PSU were near its limit, then surely the others would be as well?

  10. #25

    Michael's Avatar
    Join Date
    Dec 2005
    Location
    Birmingham
    Posts
    9,262
    Thank Post
    242
    Thanked 1,568 Times in 1,250 Posts
    Rep Power
    340
    As all three systems are configured identically, if one PSU were near its limit, then surely the others would be as well?
    Very true, but who knows: the PSUs might be different brands or have different power ratings. It is possible, but generally speaking they should be the same, you're right.

  11. #26

    tmcd35's Avatar
    Join Date
    Jul 2005
    Location
    Norfolk
    Posts
    5,628
    Thank Post
    845
    Thanked 884 Times in 732 Posts
    Blog Entries
    9
    Rep Power
    326
    STATUS UPDATE:

    Okay, after plugging both PSUs back in, the server entered a "boot Windows and immediately reboot" cycle. After a while I got the very (un)helpful screenshot attached!

    I have now removed BOTH PSUs. I swapped one with the hot spare in one server and the other with the hot spare in another server.

    IF it's a PSU prob, then I should now get either or both of the other servers randomly rebooting. Also, since I'm sure the two PSUs now in the first server are fine, it should no longer reboot every chance it gets?

    If it does reboot, I suppose I'm onto the HDDs next.
    Attached Images
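
The reasoning behind the swap test can be written down as a small decision table. This is only a sketch of the logic described in the post; the function name and the wording of the conclusions are illustrative, and real-world results may be murkier, since intermittent faults can take a while to show.

```python
def interpret_swap_test(original_still_reboots: bool, recipients_reboot: bool) -> str:
    """Map the outcome of the PSU swap to the most likely conclusion.

    original_still_reboots: the first server, now on known-good PSUs, still reboots.
    recipients_reboot: either server that received a suspect PSU starts rebooting.
    """
    if recipients_reboot and not original_still_reboots:
        return "fault followed the PSUs: suspect the PSUs"
    if original_still_reboots and not recipients_reboot:
        return "fault stayed with the server: PSUs unlikely, check HDDs next"
    if original_still_reboots and recipients_reboot:
        return "inconclusive: possibly more than one fault"
    return "inconclusive: no reboots seen yet, keep observing"

# Example: the first server keeps rebooting and the other two stay up.
print(interpret_swap_test(original_still_reboots=True, recipients_reboot=False))
```

The test is only decisive when the fault clearly moves with the PSUs or clearly stays with the chassis; the other two outcomes call for more observation time or further component swaps.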

  12. #27

    matt40k's Avatar
    Join Date
    Jun 2008
    Location
    Ipswich
    Posts
    4,390
    Thank Post
    368
    Thanked 637 Times in 519 Posts
    Rep Power
    158
    Looks like a CPU error. I assume it was correctly installed into the replacement board?

    Am I correct in thinking you've had this problem for over a week? Can't you just get HP out and let them sort it, rather than spending your time fixing it? I tended to do this with Dell a lot: as soon as I knew it was hardware related, I'd just phone them up and basically say "come fix it". That way I've spent about 1 hr on it and I know it'll be sorted by the next day. OK, you pay a little extra, but when you add up the hours you save, it pays for itself when stuff like this happens. Plus it keeps SMT happy.

  13. #28

    tmcd35's Avatar
    Join Date
    Jul 2005
    Location
    Norfolk
    Posts
    5,628
    Thank Post
    845
    Thanked 884 Times in 732 Posts
    Blog Entries
    9
    Rep Power
    326
    I'm still not convinced by a CPU error.

    1. The system board was replaced by an HP engineer last week
    2. The same HP engineer fitted the CPUs in the new system - all 8 threads appear in Windows
    3. A 30-minute run of the Prime95 stress test on all 8 threads, at 100% CPU usage, showed no signs of error at all


    I'm not counting it out completely, but given how much has been tested/changed, I'm still rooting for an HD problem (or maybe NIC related).

    I could pull the CPUs one at a time and see what gives?

    EDIT: I'm actually (in a perverse way) enjoying trying to hunt down the problem. I'm lucky enough to have a relatively free jobs list at the moment. This is my number 1 priority. I'm here till it's fixed

  14. #29

    Michael's Avatar
    Join Date
    Dec 2005
    Location
    Birmingham
    Posts
    9,262
    Thank Post
    242
    Thanked 1,568 Times in 1,250 Posts
    Rep Power
    340
    In my experience you don't get intermittent CPU problems. It either works or it doesn't, unless you count overheating, but this can create a whole range of problems.

  15. #30

    plexer's Avatar
    Join Date
    Dec 2005
    Location
    Norfolk
    Posts
    13,344
    Thank Post
    624
    Thanked 1,584 Times in 1,421 Posts
    Rep Power
    414
    We have a laptop that occasionally gives a hardware malfunction error with the NIC plugged in.

    Ben

  16. Thanks to plexer from:

    tmcd35 (29th June 2009)
