Windows Server 2008 Thread: Server reboots, in Technical
  1. #1

    Join Date
    May 2010
    Location
    UK
    Posts
    160
    Thank Post
    39
    Thanked 8 Times in 8 Posts
    Rep Power
    9

    Server reboots

    Hi All,

    We have been experiencing some strange issues with one of our servers, an HP ProLiant ML350 running Windows Server 2008: it randomly switches off and starts up again. This isn't a clean shutdown and it happens mainly at night. In fact it has only happened twice or thereabouts during the day in the last 6 months, but 19 times during the night in the same period.

    Checking Event Viewer / the system logs, all I'm left with is "The previous system shutdown at 05:06:51 on 24/10/2011 was unexpected." Looking at all the other logs, there are none that are even close to the shutdown times.

    After this, last Thursday during the night, the server switched off, but this time it was stuck on a blank screen with "internal health problem" and "external (power supply) health problem" showing. I was unaware of the internal health LEDs so I didn't check them, but after switching the UPS off and back on the server booted fine. This is the first time it's done this so, as you can understand, I'm quite worried! I checked the temperature both with software and with a temperature probe. It seems to be fine: nothing over 40 degrees. All fans are running. Everything works OK even when the server is under load.

    There are no scheduled tasks, and as I said before the times are completely random.
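
    One way to check whether the times really are random is to tally the "unexpected shutdown" messages by hour. The sketch below is only an illustration: it assumes the messages have been copied out of Event Viewer as plain text in the UK date format quoted above, and the day_start and day_end working hours are made-up defaults, not values from this server. Only the one message quoted above is used as input.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the message text quoted above (day/month/year order assumed).
PATTERN = re.compile(
    r"previous system shutdown at (\d{2}:\d{2}:\d{2}) "
    r"on (\d{2}/\d{2}/\d{4}) was unexpected"
)


def parse_shutdown(message):
    """Return the shutdown time from one message, or None if it does not match."""
    match = PATTERN.search(message)
    if match is None:
        return None
    stamp = f"{match.group(2)} {match.group(1)}"
    return datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S")


def tally(messages, day_start=8, day_end=18):
    """Count shutdowns per hour of day and per day/night period.

    day_start and day_end are assumed working hours (hypothetical defaults).
    """
    hours = Counter()
    periods = Counter()
    for message in messages:
        when = parse_shutdown(message)
        if when is None:
            continue  # skip lines that are not unexpected-shutdown messages
        hours[when.hour] += 1
        is_day = day_start <= when.hour < day_end
        periods["day" if is_day else "night"] += 1
    return hours, periods


# The only message quoted in this thread; paste the rest in the same form.
messages = [
    "The previous system shutdown at 05:06:51 on 24/10/2011 was unexpected.",
]
hours, periods = tally(messages)
print(dict(hours), dict(periods))
```

    With all of the shutdown messages pasted into the list, a cluster around one or two hours would suggest something time-related rather than truly random.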

    I was thinking it could be a UPS problem, but there is another server attached to the same UPS which doesn't reboot, and it's just had a new battery.

    So any ideas what I could check or do?

    Cheers!

  2. #2

    glennda's Avatar
    Join Date
    Jun 2009
    Location
    Sussex
    Posts
    7,714
    Thank Post
    269
    Thanked 1,116 Times in 1,012 Posts
    Rep Power
    345
    Are there any blue screen notices (event ID 1001)?

    Also, I have had a few problems recently with HP servers and drivers, so it's worth updating all the drivers; if they are already the latest version, maybe roll back to the previous release and see if it carries on.

  3. #3

    Join Date
    May 2010
    Location
    UK
    Posts
    160
    Thank Post
    39
    Thanked 8 Times in 8 Posts
    Rep Power
    9
    Hi Glennda,

    I filtered the system log for ID 1001; all it showed was events for SNMP starting. I also tried ID 1003.

    I literally haven't touched this server for ages apart from making changes in GP, but I will try updating the drivers.

    Cheers!

  4. #4
    IanT's Avatar
    Join Date
    Aug 2008
    Location
    @ the back of my server racks farting.....
    Posts
    1,887
    Thank Post
    2
    Thanked 118 Times in 109 Posts
    Rep Power
    59
    Is it fully up to date: Windows updates, ProLiant Support Pack, firmware?

  5. #5

    Join Date
    May 2010
    Location
    UK
    Posts
    160
    Thank Post
    39
    Thanked 8 Times in 8 Posts
    Rep Power
    9
    Cheers for the ideas.

    The reboots are really intermittent; some have been a week apart. So I'm doing it slowly to try to find out the problem. Still nothing showing in the logs. Did a Windows update on Friday and so far so good, but it could just be waiting!

  6. #6
    kevin_lane's Avatar
    Join Date
    Mar 2007
    Location
    Derby
    Posts
    486
    Thank Post
    21
    Thanked 16 Times in 16 Posts
    Blog Entries
    5
    Rep Power
    17
    What changes did you make to the GP? Also, do you have Windows set to install updates automatically?

  7. #7

    Join Date
    May 2010
    Location
    UK
    Posts
    160
    Thank Post
    39
    Thanked 8 Times in 8 Posts
    Rep Power
    9
    Sorry, I meant that I've only gone on that server to make some changes to GP for normal workstations or AD; I haven't touched the domain controller GPOs.

    Auto updates were off, but it's now completely up to date.

    Not sure what to check next. Hopefully the windows updates fixed it?!

  8. #8

    Join Date
    Aug 2009
    Location
    Huddersfield
    Posts
    55
    Thank Post
    10
    Thanked 14 Times in 10 Posts
    Rep Power
    12
    Not familiar with the ML350, but does it come with iLO? The iLO management log should tell you whether you have had a hardware issue or not.

  9. #9

    plexer's Avatar
    Join Date
    Dec 2005
    Location
    Norfolk
    Posts
    12,964
    Thank Post
    586
    Thanked 1,494 Times in 1,340 Posts
    Rep Power
    397
    Have you run any hardware diagnostics on it: disk scan, memtest?

    Ben

  10. #10

    Join Date
    May 2010
    Location
    UK
    Posts
    160
    Thank Post
    39
    Thanked 8 Times in 8 Posts
    Rep Power
    9
    Dave - it does come with iLO. I wasn't aware it could do that and never got round to setting it up, so I will have a look at that tomorrow! Thanks

    Ben - I haven't, no. The first time it appeared to be a hardware issue was the other week, and I haven't been able to power down the server since then. I did upgrade the RAM last Christmas and ran a memtest on the new and old RAM, and it was all fine, so I'm hoping it hasn't only lasted 6 months! I'll make sure these are the next things I check, though; hopefully iLO will indicate if there are problems. Cheers

  11. #11

    Join Date
    May 2010
    Location
    UK
    Posts
    160
    Thank Post
    39
    Thanked 8 Times in 8 Posts
    Rep Power
    9
    Damn - another restart last night. The server literally just sits there; I haven't made any changes to it. I've had Remote Desktop open to it and checked it's still there every 15 minutes like a madman.

    I've checked iLO - I'm assuming you mean the iLO 2 log on the System Status page?

    Informational iLO 2 11/17/2011 22:51 11/17/2011 22:51 1 Server power restored.
    Informational iLO 2 11/17/2011 22:51 11/17/2011 22:51 1 Server power removed.

    So this refers to the reboot yesterday. The only log entries before this are:

    Informational iLO 2 11/11/2011 19:01 11/11/2011 19:01 1 Server power restored.
    Caution iLO 2 11/11/2011 19:01 11/11/2011 19:01 1 Server reset.

    Which is when I restarted the server for Windows Updates.
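
    For comparing these entries against the Windows shutdown times, the four log lines above can be parsed into records. This is a rough sketch, not an HP tool: the column meanings (severity, class, last update, initial update, count, description) and the month/day/year date order are assumptions read off the pasted text, and parse_line and power_removed_times are made-up helper names. It simply lists the times of "Server power removed" entries, which appear for the unexplained restart but not for the manual one.

```python
import re
from datetime import datetime
from typing import NamedTuple


class IloEvent(NamedTuple):
    severity: str
    source: str
    last_update: datetime
    initial_update: datetime
    count: int
    description: str


# Assumed layout: severity, class, last update, initial update, count, text.
LINE = re.compile(
    r"^(?P<severity>\w+) (?P<source>iLO 2) "
    r"(?P<last>\d{2}/\d{2}/\d{4} \d{2}:\d{2}) "
    r"(?P<initial>\d{2}/\d{2}/\d{4} \d{2}:\d{2}) "
    r"(?P<count>\d+) (?P<description>.+)$"
)
DATE_FORMAT = "%m/%d/%Y %H:%M"  # month first, as the pasted dates suggest


def parse_line(line):
    """Turn one pasted log line into an IloEvent, or None if it does not match."""
    match = LINE.match(line.strip())
    if match is None:
        return None
    return IloEvent(
        severity=match["severity"],
        source=match["source"],
        last_update=datetime.strptime(match["last"], DATE_FORMAT),
        initial_update=datetime.strptime(match["initial"], DATE_FORMAT),
        count=int(match["count"]),
        description=match["description"].rstrip("."),
    )


def power_removed_times(events):
    """Return the times at which the log recorded power being removed."""
    return [e.last_update for e in events if e.description == "Server power removed"]


# The four lines exactly as pasted above.
LOG_LINES = [
    "Informational iLO 2 11/17/2011 22:51 11/17/2011 22:51 1 Server power restored.",
    "Informational iLO 2 11/17/2011 22:51 11/17/2011 22:51 1 Server power removed.",
    "Informational iLO 2 11/11/2011 19:01 11/11/2011 19:01 1 Server power restored.",
    "Caution iLO 2 11/11/2011 19:01 11/11/2011 19:01 1 Server reset.",
]
events = [e for e in map(parse_line, LOG_LINES) if e is not None]
for when in power_removed_times(events):
    print(when.isoformat(sep=" "))
```

    Note that the iLO dates here are month first, whereas the Windows event message uses day first, so the two need converting before they are compared.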

    In the IML the last entry is on the 6th:
    Caution POST Message 11/06/2011 05:56 11/06/2011 05:56 1 POST Error: 1778-Drive Array Resuming Automatic Data Recovery Process
    Which coincides with another crash.

    Everything in System information is OK.

    I've upgraded the ILO firmware to the latest.

    So does this mean I don't have hardware issues? Or could I still have them, but they're not registering?

  12. #12

    plexer's Avatar
    Join Date
    Dec 2005
    Location
    Norfolk
    Posts
    12,964
    Thank Post
    586
    Thanked 1,494 Times in 1,340 Posts
    Rep Power
    397
    What about the PSU in the server itself?

    Ben

  13. #13

    glennda's Avatar
    Join Date
    Jun 2009
    Location
    Sussex
    Posts
    7,714
    Thank Post
    269
    Thanked 1,116 Times in 1,012 Posts
    Rep Power
    345
    Is it connected to a UPS? Could also be the UPS failing.

    Or is it directly connected into mains?

    Toby

  14. #14

    Join Date
    May 2010
    Location
    UK
    Posts
    160
    Thank Post
    39
    Thanked 8 Times in 8 Posts
    Rep Power
    9
    Not sure how to check, Plexer. According to iLO it's OK; surely if it was on its way out it would cut out when the server is under load?

    However, the hardware lights (which have only come on once) did indicate an internal problem and an external problem. Apparently an external problem is the PSU.

    It is connected to a UPS, which has just had a new battery; according to the software the UPS is fine. Running the tests, it can keep the servers powered up. Another server is connected to the same UPS and isn't rebooting, so I scrapped the idea of it being the UPS.

  15. #15

    SYNACK's Avatar
    Join Date
    Oct 2007
    Posts
    10,686
    Thank Post
    824
    Thanked 2,570 Times in 2,187 Posts
    Blog Entries
    9
    Rep Power
    731
    Could be RAM and a dodgy PSU; other things to try:

    Install the latest drivers.
    Install the latest firmware for all components (NICs, BIOS, power management controller, RAID controller); you can use the firmware update CD from HP or do it manually in Windows with the HP downloads.
    Use Insight Diagnostics from the latest SmartStart CD to run a memory test that allows for ECC RAM, plus the other available tests.
    Ramp the CPU up to 100% and leave it there for a few hours (folding@home SMP is a good one for this) to check for CPU overheating or point overheating (areas of the CPU not near the temperature sensor overheating before the sensor registers it).

    Swap the PSU to the other PSU bay, or swap the PSU with one from another identical server.
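
    If folding@home isn't an option, the CPU load step can be roughly approximated with a small script. This is a minimal sketch under assumptions: it needs Python installed on the server, burn_cpu is a made-up helper name, and a plain busy loop is a stand-in that will not stress the CPU in the same way as folding@home. It starts one busy-loop process per logical CPU and waits for them all to finish.

```python
import os
import subprocess
import sys

# Source of each worker: spin in a tight loop until the deadline passes.
WORKER_SOURCE = (
    "import sys, time\n"
    "deadline = time.monotonic() + float(sys.argv[1])\n"
    "x = 0\n"
    "while time.monotonic() < deadline:\n"
    "    x = (x * 31 + 7) % 1000003\n"
)


def burn_cpu(seconds, workers=None):
    """Run one busy-loop process per logical CPU for the given time.

    Returns the exit code of each worker (0 means it ran to completion).
    """
    workers = workers or os.cpu_count() or 1
    procs = [
        subprocess.Popen([sys.executable, "-c", WORKER_SOURCE, str(seconds)])
        for _ in range(workers)
    ]
    return [proc.wait() for proc in procs]


# Example (not run automatically): load every core for two hours.
# burn_cpu(2 * 60 * 60)
```

    Watch the temperatures in iLO or with a probe while it runs, and check afterwards whether the server stayed up for the whole period.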


