Wired Networks Thread, SAN flatlining? in Technical
3rd October 2013, 02:03 PM #1
We are having major problems at the moment with our curriculum computers freezing and crashing once users are logged in and browsing network drives, saving files, etc.
We have been through the images with a fine-tooth comb and everything looks OK (we are running Windows 7 64-bit and most of our PCs are Intel i5 with 8GB RAM).
Therefore we are now looking at problems with our network/infrastructure, and today we came across this graph on our SAN:
It shows one of our iSCSI cards flatlining during a period when we were getting complaints of slowness/freezing/hanging in the ICT Suites.
Has anybody got any ideas why this would happen?
Bearing in mind our SAN turned itself off during the summer for no apparent reason, we are a bit worried.
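One way to line up a flatlining graph like the one described above with the times users complained is to export the throughput samples and search them for sustained near-zero runs. The following is only a minimal sketch: the function name `find_flatlines`, the threshold, and the sample data are all hypothetical, and it assumes the samples are available as (time, throughput) pairs, which the thread does not confirm.

```python
from typing import List, Tuple


def find_flatlines(
    samples: List[Tuple[int, float]],
    threshold: float,
    min_samples: int,
) -> List[Tuple[int, int]]:
    """Return (start_time, end_time) for each run of at least `min_samples`
    consecutive samples whose throughput is at or below `threshold`."""
    runs = []
    run_start = None   # time of the first sample in the current low run
    run_len = 0        # number of samples in the current low run
    last_time = None   # time of the most recent low sample
    for t, value in samples:
        if value <= threshold:
            if run_start is None:
                run_start = t
                run_len = 0
            run_len += 1
            last_time = t
        else:
            # Throughput recovered: keep the run only if it was long enough.
            if run_start is not None and run_len >= min_samples:
                runs.append((run_start, last_time))
            run_start = None
    # Handle a low run that lasts until the end of the data.
    if run_start is not None and run_len >= min_samples:
        runs.append((run_start, last_time))
    return runs


# Hypothetical data: (minutes since start of capture, throughput in MB/s).
samples = [
    (0, 42.0), (5, 38.5), (10, 0.0), (15, 0.1),
    (20, 0.0), (25, 35.2), (30, 0.0), (35, 40.1),
]
flatlines = find_flatlines(samples, threshold=0.5, min_samples=3)
print(flatlines)
```

The resulting windows can then be compared against the times of the complaints from the ICT Suites to see whether the flatline comes before or after the freezing starts.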
3rd October 2013, 02:11 PM #2
Can't help with the problem, but why anyone would use a SAN in a school is beyond me.
File storage is cheap these days and it's easy to make it redundant.
SANs are designed for data centers and huge-volume clients, not schools with a few thousand users.
3rd October 2013, 02:17 PM #3
Why SANs in school? Because until very recently that was the only sensible way to build a virtualised infrastructure without having to mess with NFS filers.
To return to the topic: call the vendor at once! That is not a happy-looking graph.
I'd also be tempted to break out Wireshark or NetMon to see if anything stands out at the packet level between servers and clients, and between VM hosts and the SAN (assuming you're virtualised).
3rd October 2013, 02:21 PM #4
Would be worth checking whether the firmware is up to date on the SAN.
As for who would use a SAN: they are very good tools for virtualisation in any business or school, and in some cases the SAN does more than just iSCSI. In my place our SAN did NFS, iSCSI, SMB, FTP and others out of the box. It was the NFS store for the Citrix XenServer farm's VMs, which gave redundancy in that VMs could move between hosts with no issues. The home areas were served directly off it via SMB, meaning there was no middleware between the SAN and the end user to slow things down. It worked reasonably well; it just had its moments of crap software and poor support in the end, but that is well discussed here. The idea, however, was excellent and I'd do it all over again, just with EMC storage this time.
3rd October 2013, 02:54 PM #5
Thanks for the replies. The SAN installation was before my time, but I believe it was, as said, an easy route to virtualisation, as most of our servers are virtual.
I am waiting for a reply from Dell Compellent and, yes, our firmware is out of date, so we will look at upgrading that in the holidays.
We are running Wireshark as I type this, so hopefully that flags something up. It hasn't previously when we ran it, though.
3rd October 2013, 03:45 PM #6
You could also take some traces using the Windows Performance Toolkit (which is a bit like the collision of procmon and perfmon with 100x the data points). It has some nice graphing features which might help you determine whether the SAN weirdness was the cause or an effect of the problem.
There is a product in beta called Microsoft Message Analyzer, which is the collision of the WPT and Network Monitor and has the potential to be the ultimate go-to tool for all Windows troubleshooting, if that sort of thing is of interest.
Also, unless they declare your SAN dead, I expect you'll be doing that firmware upgrade in the next 24hrs.
Last edited by psydii; 3rd October 2013 at 03:49 PM.
3rd October 2013, 05:30 PM #7
That chart is the throughput on your iSCSI card - what's the throughput for your physical drives? I assume this device has some kind of hardware RAID card, with a certain amount of RAM to act as a cache, and probably with the facility to be battery-backed. Your SAN server might also be on a UPS, but its RAID card won't know that. If the battery on the RAID card has failed, or the RAID card thinks it has failed, or possibly just after a random reboot, the RAID card might have switched to write-through caching as a safety measure. In that mode, every time something is written to the RAID controller it makes sure the data is on the physical disks before it carries on to the next disk write. If you have a lot of data writes, and they're scattered all over the disk array (as they will be if you have multiple users using multiple files), your array is going to run slowly as it seeks all over the place to write data. You should make sure your RAID card is set to use write-back caching, where the write to the RAID controller goes into the cache and gets written out when the RAID controller thinks best, preferably in a neatly-timed slot as the disks seek up and down in a steady manner.
Originally Posted by Scottyboy99
This might not be the problem, but I've had a server do this before with similar symptoms and it might be worth investigating. How you get to the hardware RAID card settings on your SAN server is something you'll have to figure out yourself, though - you might need to reboot and look at BIOS settings. If the RAID battery has failed you can generally override the settings and force write-back caching, and if your SAN server is on a UPS of its own that should be perfectly fine to do.
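To illustrate why write-through caching can hurt so much under scattered writes, here is a toy model that compares total disk head travel for the two modes. It is only a sketch under loud assumptions: a single disk, head travel used as a stand-in for seek time, and a made-up workload and cache size. It does not model rotational latency, RAID striping, or how any particular controller (including the one in this thread) actually schedules its flushes.

```python
from typing import List


def write_through_travel(addresses: List[int], head: int = 0) -> int:
    """Head travel when every write is committed to disk in arrival order."""
    travel = 0
    for addr in addresses:
        travel += abs(addr - head)
        head = addr
    return travel


def write_back_travel(addresses: List[int], cache_slots: int, head: int = 0) -> int:
    """Head travel when writes are held in cache and flushed in sorted sweeps.

    `cache_slots` is a hypothetical cache size, counted in pending writes.
    """
    travel = 0
    for start in range(0, len(addresses), cache_slots):
        batch = sorted(addresses[start:start + cache_slots])
        # Sweep from whichever end of the batch is closer to the head.
        if abs(head - batch[0]) > abs(head - batch[-1]):
            batch.reverse()
        for addr in batch:
            travel += abs(addr - head)
            head = addr
    return travel


# Hypothetical workload: 1000 writes scattered across 10007 block addresses.
writes = [(i * 7919) % 10007 for i in range(1000)]
through = write_through_travel(writes)
back = write_back_travel(writes, cache_slots=64)
print(f"write-through head travel: {through}")
print(f"write-back head travel:    {back}")
```

In this model the write-back total comes out far smaller, because each flush is one steady sweep across the disk instead of a jump for every individual write.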