Wonder if anybody has experienced this kind of thing:
In the last few months a company upgraded part of our aging hub-based Cat5e network with Cat6 switches. We now have gigabit to the desktop, which is great. All seemed OK for a bit...
Then we had some segments starting to lose some ports! This required resetting the switch in that area. This resolved the issue temporarily. Since that time all sorts of areas have been affected. We thought maybe some power fluctuations could affect the switches and indeed we have data via PowerChute which goes some way to confirm this so we have protected the most troublesome areas with SmartUPS units. However, this to our surprise has had little or no effect!
Almost daily now, three key areas exhibit this issue, and others do so at random.
We have a large network over a very spread out campus with many fibre segments going back to two central fibre switches!
We have had the network installers back to check for storming switches, transceivers etc., and they apparently did find an issue, although they were reluctant to share it with us at the time of discovery. Even with all this, the issue still exists. It is inconveniencing many staff and making the situation difficult for us in IT Support, as we have:
1. Checked the network for storming switches and hubs ourselves (A massive undertaking)
2. Checked for damaged cables
3. Checked for damaged network cards
4. Checked for damaged or storming WAPs, mini switches etc.
The only thing we have identified is that the new MAIN switch in question, an AT 9924SP, seems to fluctuate very quickly indeed. It is the main fibre core switch that the company installed along with a number of new AT (copper) switches around the campus. The existing MAIN switch, which is identical and which we had before, did not exhibit this behaviour. Only since the other switch was installed has the whole lot exhibited this manic behaviour. And another thing: we have to reset the remote segments first thing in the morning! Why is this?
We are considering purchasing a spare AT 9924SP switch (Not cheap) and changing it out to test the idea that the new core switch may be at fault so....
The upshot of all this is:
Our network seems less reliable since we upgraded to full gigabit. The company responsible for the upgrade has already attended, corrected an issue and felt nothing else was wrong, even though they accepted that the switch seemed overly busy. We have checked the other switches around the campus, all computers, printers and other devices for errors, and as much cabling as we can, and found nothing obvious except the (apparently chattering) main switch.
So the questions are:
1. Has anybody else experienced this kind of thing, if so, what was the resolution?
2. Has anybody else had experience with this switch (AT 9924SP) and the ATSPSX modules / mini-GBICs? What do you think of Allied Telesis switches?
3. Does power have an effect on switched networks in your experience, in relation to brownouts and spikes? Do you need to UPS-protect the switches against this? And if so, given that ours are SmartUPS units that protect against spikes, surges, over charge and under charge, why do they appear to have no effect?
Many thanks and apologies for the long post.
I appreciate any comments and answers.
Can you put the fibre connections from the new switch into the old one and turn the new one off to see if the issue vanishes?
Unfortunately we have too many connections to do that and not enough ports. Many thanks for the idea though and many thanks for looking and replying...
Originally Posted by wesleyw
Could you do it for the three key areas?
Have you got a loop, and is spanning tree turned on? Spanning tree would cut off ports if a loop was detected, to protect your network.
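For anyone wanting to check a cabling inventory for loops on paper before touching the switches, here is a minimal Python sketch. It assumes you can write down each uplink as a pair of switch names (the names below are made up, not taken from this network) and uses union-find to flag any link that closes a loop. Those are the links spanning tree would have to block; it is not a description of what the AT switches themselves do.

```python
def find_loop_links(links):
    """Return the links that close a loop in the topology.

    `links` is a list of (switch_a, switch_b) name pairs from a
    hypothetical cabling inventory. Uses union-find: a link whose two
    ends are already connected by earlier links closes a loop.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    redundant = []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            redundant.append((a, b))  # both ends already connected
        else:
            parent[root_a] = root_b   # merge the two groups
    return redundant


# Hypothetical example: two core switches and two edge switches.
example_links = [
    ("core1", "core2"),
    ("core1", "edgeA"),
    ("core2", "edgeA"),  # second path to edgeA closes a loop
    ("edgeA", "edgeB"),
]
loops = find_loop_links(example_links)
```

A redundant link is not a fault in itself; it only becomes a problem if spanning tree is off or misconfigured on the switches involved.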
How did you check for damaged network cards?
1) Without more info, it could be a few things - if spanning tree is turned on, what's configured as the root? (should have the lowest bridge ID). What does Wireshark tell you if you listen on the main switch? What do the error counters look like on the switches - take a note of any errors, clear the counters and look at them again in 30 mins.
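To make the two checks in point 1 concrete, here is a rough Python sketch. It assumes you have already read the bridge priority and MAC address of each switch, and two snapshots of the port error counters, from the switch management interface by hand; the switch names, MAC addresses and counter names are invented for illustration. The root bridge is the one with the lowest bridge ID, compared priority first and then MAC address.

```python
def root_bridge(bridges):
    """Pick the bridge spanning tree would elect as root.

    `bridges` maps a switch name to (priority, mac_address). The bridge
    ID is compared priority first, then MAC address; the lowest wins.
    """
    def bridge_id(item):
        _name, (priority, mac) = item
        return (priority, int(mac.replace(":", ""), 16))

    return min(bridges.items(), key=bridge_id)[0]


def counter_deltas(before, after):
    """Return only the error counters that rose between two snapshots."""
    return {
        key: after[key] - before.get(key, 0)
        for key in after
        if after[key] > before.get(key, 0)
    }


# Hypothetical values, not taken from the network in this thread.
bridges = {
    "core-new": (32768, "00:00:5e:00:53:10"),
    "core-old": (32768, "00:00:5e:00:53:02"),
    "edge-lab": (4096, "00:00:5e:00:53:ff"),
}
elected = root_bridge(bridges)

before = {("port1", "crc"): 5, ("port2", "crc"): 0}
after = {("port1", "crc"): 5, ("port2", "crc"): 120}
rising = counter_deltas(before, after)
```

In this made-up example an edge switch with a lowered priority wins the election, which is the kind of surprise worth ruling out: you normally want a core switch as root.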
2) No idea, just have a couple of fibre converters.
3) Yes (IME). All our switches are on UPS (area with unreliable power). It's still the best money I've spent in terms of £ > reliability improvement. If you're on UPS, chances are your issues aren't power related.
The installers have not set up the switch so that we may access it. I will have to refer to the company who did the work before I start attaching laptops to the management port. In terms of a loop, we have checked as much as we can for this, and the problem doesn't seem to affect everything; it seems limited to our two main fibre switches. Actually, I have to agree the behaviour is indicative of a loop. I will contact the company to see if they have any problem with me connecting a laptop to the management port... thanks very much for your thoughts and we'll look for the loop etc...
Many thanks. We'll connect the switch when we can and take a look...
Have you tried Network Supervisor from 3Com? It's free for a trial and might show a loop.
Hi, I'd try another switch, to be honest. This would be the best way to rule out your main switch and see if a loop occurs. I'd grab a cheap Cisco or HP switch or something to try it; you could also keep it in your store cupboard as a spare.
Have you tried something like Wireshark?
Do a bit of a packet sniff and see what the traffic is and where it's coming from. If there are lots of spanning tree changes you'll see it in Wireshark by filtering for STP, MSTP, or RSTP packets.
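If you end up with raw frames from a capture and want to count spanning tree changes outside Wireshark, a minimal Python sketch of what to look for is below. It assumes untagged IEEE 802.3 frames (an 802.1Q VLAN tag would shift the offsets) and only recognises the standard spanning tree BPDU layout; how you get the raw bytes out of your capture is left open, and the function names are my own.

```python
from collections import Counter

STP_MULTICAST = bytes.fromhex("0180c2000000")  # BPDU destination MAC
LLC_STP = bytes.fromhex("424203")              # DSAP 0x42, SSAP 0x42, UI


def classify_bpdu(frame):
    """Return a short label for a spanning tree BPDU, or None otherwise.

    `frame` is the raw bytes of an untagged IEEE 802.3 Ethernet frame.
    """
    if len(frame) < 21 or frame[0:6] != STP_MULTICAST:
        return None
    # Bytes 14-16 are the LLC header, bytes 17-18 the protocol ID (0).
    if frame[14:17] != LLC_STP or frame[17:19] != b"\x00\x00":
        return None
    version, bpdu_type = frame[19], frame[20]
    if bpdu_type == 0x80:
        return "TCN"  # topology change notification
    if len(frame) < 22:
        return None
    topology_change = bool(frame[21] & 0x01)  # TC bit in the flags byte
    name = {0: "STP", 2: "RSTP", 3: "MSTP"}.get(version, "unknown")
    return f"{name} BPDU" + (", topology change" if topology_change else "")


def summarise(frames):
    """Count BPDU labels across an iterable of raw frames."""
    labels = (classify_bpdu(frame) for frame in frames)
    return Counter(label for label in labels if label is not None)


# Hand-built example frames (source MAC and length fields are dummies).
header = STP_MULTICAST + bytes(6) + b"\x00\x27" + LLC_STP + b"\x00\x00"
tcn_frame = header + b"\x00\x80"
rstp_tc_frame = header + b"\x02\x02" + b"\x01"
summary = summarise([tcn_frame, rstp_tc_frame, bytes(64)])
```

A steady stream of TCNs or BPDUs with the topology change flag set would fit ports being repeatedly blocked and unblocked, which matches the symptoms described earlier in the thread.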
Did the providing company leave you with any documentation?
Yep, we agree....this is what we think too....
Originally Posted by cpjitservices
I agree it would be interesting to log on to the switch with a remote station and try Wireshark on it. Unfortunately the company did not leave any docs at all, so we'll have to download it all and try this.
Many thanks everyone for the ideas and thoughts so far....
Originally Posted by pantscat
If you want the manual for your switches, I have the link here. We have dealt with those switches before.
If the company who installed the infrastructure didn't document anything or provide you with details of how to access it, vent extreme displeasure in their general direction with a view to someone coming onsite PDQ to a) document their work and do a proper handover b) diagnose the problem.
Lack of documentation would also suggest that they didn't document it for their own reference (unless you have a maintenance agreement with them?).