26th March 2008, 08:07 PM #1
SCSI Connection to Backup Device keeps failing
Arrgg... Bloody nightmare.
Backups to the Sony LIB-81 Library system started failing last week, keep getting hardware errors within Backup Exec.
It would seem that the SCSI connection between the computer and the device keeps getting disconnected. Switching off the library unit and back on again seems to get things working again for a short while.
Various websites point to checking the SCSI Cable, re-seating SCSI card, etc, etc. Bit difficult as I can't get to the back of the Server to check cables as both servers are shoved in the corner of the server 'cupboard'. Not helped by the fact that the NAS unit rests on top of the two servers, followed by the Library system itself.
Servers are big brutes (pedestal units), and the other bits and bobs are a bit heavy too. Can't safely get to the back of these units without downing the network. Nightmare. I've inherited a bloody mess.
I think the problem hasn't been helped by the fact there is no air circulation in there and things are getting a tad hot. Going to have to start kicking up a fuss with SMT and Governors that some serious money has got to be spent in this place (i.e. proper racking, air con, etc.) or things are just going to start failing.
27th March 2008, 12:25 AM #2
The websites seem to be pointing you in the right direction. Troubleshooting this is much the same as troubleshooting a network: check the physical layer first. So that means checking your cables, cards and all the connections where weak points might have occurred.
You could also check what the Windows event log says, in case there are hardware errors pointing to the SCSI controller failing or something similar.
With regard to the mess in your cupboard: have you had Easter yet? If not, it'd be an ideal time to down the network and get some order in there!
27th March 2008, 02:19 PM #3
The only errors in the actual event log are when it attempts to connect to the library and 'times out'. A standard SCSI error is written.
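If the timeouts cluster at a particular time of day (say, the warmest part of the afternoon), that would point more towards heat than cabling. A rough sketch of tallying them by hour, assuming the System log has been exported to CSV; the column names, the date format and the `scsi_adapter` source name are all assumptions about the export, not something Windows or Backup Exec guarantees.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical export of the System event log. Column names, date format
# and the "scsi_adapter" source name are assumptions for illustration.
SAMPLE_CSV = """Date and Time,Source,Level
2008-03-25 14:05:11,scsi_adapter,Error
2008-03-25 14:40:02,scsi_adapter,Error
2008-03-26 02:10:45,disk,Warning
2008-03-26 15:01:30,scsi_adapter,Error
"""

def errors_by_hour(csv_text, source="scsi_adapter"):
    """Count Error-level events from one source, grouped by hour of day."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Source"] != source or row["Level"] != "Error":
            continue
        stamp = datetime.strptime(row["Date and Time"], "%Y-%m-%d %H:%M:%S")
        counts[stamp.hour] += 1
    return counts

if __name__ == "__main__":
    for hour, n in sorted(errors_by_hour(SAMPLE_CSV).items()):
        print(f"{hour:02d}:00  {n} error(s)")
```

Swap the sample text for the contents of a real export and adjust the column names to match whatever the export actually produces.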
Managed to nurse it into doing a full backup last night, and that went through OK. Just now doing a restore from that same backup (thank God it went through OK).
Yeah, Easter break in a couple of weeks, so my list of things to do is getting longer. I'm just going to have to re-arrange everything for easier access.
Got a meeting today to vent my spleen about what is needed in this place; things are going to fail big time if no action is taken.
27th March 2008, 02:23 PM #4
On many tape drives there's a small pin hole like you find on CD/DVD-ROM drives. As I'm sure you know, this is to serve as the manual override in case you need to remove a disc.
On a tape drive it permits you to reset the drive power without having to reset the whole server. Hope this makes sense. May be worth a try, followed by initiating a manual backup.
27th March 2008, 02:30 PM #5
I experienced a similar problem with my Tandberg library and Retrospect. In the end it turned out to be a duff SCSI card. It was very intermittent, as you described, so it is definitely worth trying another.
28th March 2008, 12:07 AM #6
All useful stuff, thanks.
End of term in a couple of weeks is going to be when I can switch this off and have a good look.
Hope it's not the SCSI card, haven't got a spare lying around. The current unit is an LSI PCI-X Ultra 320; quite pricey cards.
Behaved itself today, completed and verified a full backup last night, and merrily started the differential this evening as well.
Had an ICT Development meeting today and made it very clear that I can't do my job properly and support these servers unless the environment they are based in is improved (i.e. proper racking and cooling). They seemed to take notice.
12th April 2008, 12:47 PM #7
Well, the problem did seem to settle down. But on occasion the odd backup was 'missed', with Backup Exec claiming the window had been missed due to no devices being available.
Due to some electrical work going on, I had a chance to down all the servers and re-organise things a bit better. Unplugged everything, removed all the equipment from the 'cupboard', cleaned things up, moved it about a bit, plugged it all back in and rechecked all connections. It's amazing what I found in the tangled mess of cables. All the equipment has redundant PSUs, but for some reason only one PSU from each device was plugged into a UPS, with the others plugged directly into the wall!!! No surge protection, and with no communications set up between the servers and the UPSs it was a right mess!
Said a little prayer, sacrificed a chicken, banged a gong and switched it all back on. Thankfully everything came back up. The NAS unit had a bit of a 'wobble', but sorted itself out after I forced the controller to do another reset. The NAS unit uses 4Gb Fibre Channel, and those fibre tails can be so easily damaged, especially when tangled up with all the other cables and crap hiding behind the servers!
All seemed well and backups seemed OK again, but then last night I manually set the full backup running and once again it lost sight of the backup device!!!!
The SCSI connection the backup device uses comes off the motherboard of the server to a backplate, when right next door to it is an LSI Logic 320 SCSI card sitting there not doing much! Most odd.
Step two: get a couple of new cables to see if replacing them will solve the issue, and/or use the other SCSI card.
Quite liberating and rewarding pulling apart everything and getting it back together again. Gives me a chance to take proper ownership of what I've inherited and get a clear view of how to improve it.
13th April 2008, 12:19 AM #8
The idea behind one cable to the wall and one to the UPS is that, should the UPS or the feed that supplies it die, the servers may still run from the other feed, as those sockets should come from a different supply and phase. That's my understanding of it, anyway.
13th April 2008, 12:38 AM #9
Yeah that's sort of the idea. Take one from a UPS on one ring main and one from another UPS on another ring main. I mean, if you were designing a data centre you'd have each on different generators etc too. It's all about redundancy these days.
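On a small site, one way to keep track of this is to write down which feed each PSU is plugged into and check the list. A minimal sketch of that check; the device names, feed names and the idea of marking the wall socket as "unprotected" are all made up for illustration.

```python
# Hypothetical inventory: device -> the feed each of its PSUs is plugged into.
# Device and feed names are invented for illustration.
PSU_FEEDS = {
    "server1": ["ups_a", "ups_b"],
    "server2": ["ups_a", "ups_a"],
    "nas": ["ups_a", "wall"],
}

# Feeds with no UPS or surge protection behind them (an assumption).
UNPROTECTED = {"wall"}

def audit(feeds, unprotected=UNPROTECTED):
    """Return (device, problem) pairs for feeds that defeat the redundancy."""
    problems = []
    for device, sources in sorted(feeds.items()):
        if len(set(sources)) < 2:
            problems.append((device, "all PSUs share one feed"))
        for src in sorted(set(sources) & unprotected):
            problems.append((device, f"PSU on unprotected feed '{src}'"))
    return problems

if __name__ == "__main__":
    for device, problem in audit(PSU_FEEDS):
        print(f"{device}: {problem}")
```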
Teamed NICs and failover is good too.
13th April 2008, 09:45 AM #10
Actually most hosting centers use the N+1 UPS design, whereby the redundancy is built into the UPS itself with parallel power modules.
Having UPSs on separate ring mains really makes little difference. The policy is for a 2N or N+1 UPS configuration to be fed by separate phases....and even then they should be going to separate substations.
An automatic transfer switch (ATS) would then be used to connect both phases to the UPS, so that in the event of a power loss utility power is still being fed through the second phase via the ATS.
As for the connection in the event of UPS failure, ideally you'd use a bypass switch for this, either built into the UPS itself or a separate box, which will allow the equipment load to bypass to utility power in the event of a UPS fault. Redundant power connections from equipment should NOT be going into the wall sockets. Both power supplies should go into PDUs in the rack, with bypass power wired upstream from the PDUs, either in the UPS or externally.
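To put rough numbers on N+1 versus 2N: N is the minimum number of power modules that can carry the load, N+1 adds one spare, and 2N duplicates the whole set. A quick sketch of the arithmetic; the load and module sizes are illustrative figures, not taken from anyone's kit in this thread.

```python
import math

def modules_needed(load_kva, module_kva):
    """N: the minimum number of modules that can carry the load."""
    return math.ceil(load_kva / module_kva)

def n_plus_1(load_kva, module_kva):
    """N+1: one spare module beyond the minimum."""
    return modules_needed(load_kva, module_kva) + 1

def two_n(load_kva, module_kva):
    """2N: two complete, independent sets of N modules."""
    return 2 * modules_needed(load_kva, module_kva)

if __name__ == "__main__":
    load, module = 25.0, 10.0  # illustrative figures only
    print("N   =", modules_needed(load, module))
    print("N+1 =", n_plus_1(load, module))
    print("2N  =", two_n(load, module))
```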
I happen to agree with Ric and would suggest TS gets an Adaptec card or similar to try instead of the LSI card....just to rule out SCSI card fault which is the most likely explanation.
Last edited by torledo; 13th April 2008 at 09:51 AM.
13th April 2008, 02:03 PM #11
*sigh* How about you just wait there whilst I go and shoot someone. OK?
13th April 2008, 02:06 PM #12
13th April 2008, 02:19 PM #13
I was discussing DCs a while back with someone I know who's had a lot of experience in them, but it would appear that he likes to oversimplify stuff and miss things out.
Yes I've heard of N+1. But if you can't afford three phase power feeds and don't want to buy fancy UPSs then bodge jobbing is something I'm sure he's good at.
Last edited by Joedetic; 13th April 2008 at 02:21 PM.
13th April 2008, 03:18 PM #14
But fancy UPSs are the best sort.
Though I agree: if on a budget, best to get some sort of redundancy, even if it's two cheap UPSs on dedicated circuits.
13th April 2008, 08:57 PM #15
Interesting discussion on the UPS. I can see the sense, and it did cross my mind.
However, straight into the wall socket with no sort of surge protection? Errr.... No.
It's a good quality APC UPS-3000 so I'm not too worried about it.
Anyone recommend a good supplier of SCSI Cables for next day delivery?