Just thought I'd throw my two pence worth in. I've got a Cisco SG-300 as my main and only switch (small operation) and it's fantastic.
Highly configurable for a "budget switch". However, in this instance I wouldn't recommend it as a core switch. For that you need something with a bit more punch, like a Cisco 2900, 3500 or 3700 series: something that is 10 gig compatible for the future, if not now. A stacked switch array will work out quite well for you.
It's always better to get quality over quantity.
I don’t know a lot about switches but most of the posts seem off topic unless all of your switches are kaput.
Even if your core switch died the workstations should still connect at gigabit speeds to the edge switch they are plugged into.
I remember when schools couldn’t even afford to connect the server at gigabit and were lucky if they had 10/100 switches rather than 10/100 hubs, let alone being able to connect the workstations at gigabit speeds. The workstations didn’t suddenly drop down to a 10Mb link speed because the load on the switch / backbone was too great.
Yes, you might have bottlenecks in your network, and unless you have no other choice daisy chaining switches is a bad idea, but none of this should cause your problem.
What I would be looking at:
1. The driver on your workstations
2. Power management on your WORKSTATIONS (power management can drop the link speed)
3. Faulty Cabling / patch leads
4. The way your switches are configured (have you checked for firmware updates?)
Even if one or two of your switches are faulty (I’m not convinced that any of them are), they should only cause problems for the equipment connected directly to them, not your whole network.
Got a few spare PCI gigabit nics? If yes drop them in a few workstations and see how they behave.
Got a cheap (£30) 8 port unmanaged gigabit desktop switch? Then drop it between one of the switches and a few of your workstations and see what happens.
If you have the money, by all means replace your entire infrastructure. But I find it hard to believe that all of your switches are faulty, unless you have been flooded or have cooked them.
You're absolutely right. However, I'll admit I've let it go off topic because it also mirrors a lot of what else is actually going on - there are performance issues but we suspect it's entirely down to this one problem.
We've already checked drivers etc - there are many different types of machines, ranging from Celeron 2.6GHz units with onboard Realtek gig NICs to Intel i5 kit with the latest greatest Intel NICs onboard.
However we've had a bit of a breakthrough. Whilst monitoring D-View today, one of the switches in the library of Site 1 kept disappearing from view as if SNMP was playing up. OK, ping -t: pings ranged from 4ms to 400ms, with some timing out entirely at random. Interesting. Ping the other switch daisy chained off it: exactly the same. Pulled one switch to see if it was a PC or station cable: same. Swapped to the other switch: same. That means it's the cable between core and that room, which we're going to pull and re-do at lunchtime.
The same thing (disappearing switches) was happening to anything directly connected - so the core switch it was connected to, and the 2 switches in the room. As that core switch has the trunks to the other switches and the inter-site link, we wonder if that is causing a major problem. One way to find out, and that'll happen in an hour :D
We've already done the systematic approach of narrowing down stations and hardware etc - we've plenty of test equipment however I suspect at the time of the next budget we may invest in a fluke tester as well as training.
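For anyone wanting to compare rolling-ping results between switches, here is a minimal Python sketch of how the samples could be summarised. The function name, the sample values and the convention of using None for a timed-out ping are all assumptions made for illustration; none of this is output from D-View or from the network in this thread.

```python
from typing import Optional, Sequence


def summarise_pings(samples_ms: Sequence[Optional[float]]) -> dict:
    """Summarise round-trip times in ms; None marks a timed-out ping."""
    replies = [s for s in samples_ms if s is not None]
    lost = len(samples_ms) - len(replies)
    return {
        "sent": len(samples_ms),
        "lost": lost,
        # Guard against an empty sample list to avoid dividing by zero.
        "loss_pct": 100.0 * lost / len(samples_ms) if samples_ms else 0.0,
        "min_ms": min(replies) if replies else None,
        "max_ms": max(replies) if replies else None,
    }


# Made-up samples, only to show the shape of the result.
library_switch = [4, 12, None, 400, 7, None, 35, 4]
print(summarise_pings(library_switch))
```

Running the same summary against each switch in turn makes it easier to see whether loss follows one uplink cable or is spread across the site.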
Can someone explain to me the difference between:
Yes, one is value and one is enterprise. But from directly comparing them, other than lifetime warranty - the value one to me looks better? Better switching capacity, throughput, latency, features identical - the value one looks overall better?
Is there something obvious I'm missing?
The V series one is a newer model; HP is in the throes of replacing a bunch of its stuff with the 3Com equivalents. That said, the big difference is probably the software; the E series also has a bigger packet buffer. I have not used a V series one, but it is interesting that they handle limited layer 3 by the looks of it. Other than that, as you have said, it's a warranty thing.
Edit: it's also heavier, meaning it's probably made of sturdier stuff (case and electronics) than the V series.
Cheers. The buffer won't make a huge difference in real world performance, at least not on this small a network. I have little doubt about the weight and quality thing - the V series will be little more than a rebadged 3Com but that doesn't worry me too much - there are THOUSANDS of 3Com 4400s running most of this county and very few have passed away. If that continues then I shall not worry too much.
An update on today's findings. We noted that many of the D-Link switches were "disappearing" from SNMP monitors for a few seconds, so we set up some rolling pings to various switches and stations. Every now and then on each switch (probably 5 times a minute) a ping would time out. But not to all switches, and the affected switches were often only half-laden edge devices. We spent more time trying to improve each area by moving things around, and had the same after moving them to one of the ProCurves. Finished enabling STP sitewide this evening. Things are definitely improving.
We've now been able to accurately reproduce the problem of machines only connecting at 100Mbit rather than a gig, even if they're only briefly turned off. When the machines are off, they drop to 10Mbps. When on, they ramp up to 100 rather than a gig. As I type this, I realise I haven't tried isolating that switch and then trying again to see if it matters when it's not on the core. I suspect it will make a difference and will try tomorrow.
We did however spend some time in that particular location re-wiring nearly half the patch panel from 568A to B. This shouldn't actually be a huge issue as modern switches will automatically detect and sort that out, but we did also wonder if that detection and solving process would again throttle the device. It doesn't.
We find it difficult to try and pick out information that's useful, and certainly have to cherry-pick information from all of this thread. It's all useful, but people are often ignorant of budget and bias always comes into it. There's a hell of a lot of "omg, dlink/netgear/linksys/3com" etc yet at the same time we find plenty of schools running any combination and they've been extremely happy. Trouble is, you'll rarely hear the success stories, only the disasters - and some of those will be people blaming equipment rather than themselves. Same goes for all sources really so we're trying our hardest to manage what's affordable and what's reliable.
May I ask how you have enabled STP? If it wasn't enabled correctly and you didn't elect a root bridge, this will cause you even more issues, as all traffic will go via the root bridge that the switches elect themselves.
The switch-elected root bridge is normally the switch with the lowest MAC address, and this normally means your oldest and slowest switch.
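To illustrate why the lowest MAC address tends to win, here is a rough Python sketch of the standard spanning tree comparison: each switch has a bridge ID made of a priority followed by its MAC address, and the numerically lowest ID becomes root. The switch names and MAC addresses below are made up, and the sketch assumes the usual default priority of 32768; it is not taken from any of the switches discussed in this thread.

```python
from typing import NamedTuple, Sequence, Tuple


class Bridge(NamedTuple):
    name: str      # hypothetical label, for illustration only
    priority: int  # 32768 is the usual 802.1D default
    mac: str       # made-up address such as "00:1e:c1:00:00:10"


def bridge_id(bridge: Bridge) -> Tuple[int, int]:
    """Return (priority, MAC as an integer); the lowest value wins."""
    mac_value = int(bridge.mac.replace(":", "").replace("-", ""), 16)
    return (bridge.priority, mac_value)


def elect_root(bridges: Sequence[Bridge]) -> Bridge:
    """Pick the bridge with the numerically lowest bridge ID."""
    return min(bridges, key=bridge_id)


switches = [
    Bridge("core", 32768, "00:1e:c1:00:00:10"),
    Bridge("library-edge", 32768, "00:0d:54:00:00:05"),
]
print(elect_root(switches).name)  # library-edge: lowest MAC wins at equal priority

# Lowering the priority on the intended core overrides the MAC tie-break.
tuned = [switches[0]._replace(priority=4096), switches[1]]
print(elect_root(tuned).name)  # core
```

On switches that expose a bridge priority setting, lowering it on the intended core is the usual way to avoid leaving the choice to the MAC address.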
And can I ask what model the ProCurve switches are that you said were failing and you took out?
The ProCurves are 1800-24G. We're running about 5 currently to try and narrow down the issues (3 of them are direct replacements from HP on warranty due to failed ports).
STP settings are default on the switches - there is no option to elect the root bridge, the MAC address is automatically filled in. However the oldest switch will also be one of the same switches - there's nothing diabolically old left in the building so that's not too much of a worry. I shall check again in the morning and see what MAC address/switch they are electing out of idle interest.
As has been mentioned in an earlier post, STP sitewide could cause you more problems. I feel your problem lies in the fact that most of your edge switches are daisy chained instead of feeding directly back to your core switch.
I have only one edge switch which is daisy chained and this did cause me a little problem at one time due to having STP enabled on both switches which were linked together. I enabled STP on just the linked port on the uploading switch which is connected directly via fibre OM3 back to the core switch, all other ports have portfast enabled and this has been the case for the last 8 years.
Also, because you have STP enabled, this is probably the reason why your computer NICs keep dropping down to 100 or 10Mbps. You should have auto-negotiate turned on and STP turned off, with portfast enabled, on those ports which have only an end device on them; those switch ports which have another switch linked in should have STP enabled.
STP takes around 30 secs to allow packets through, and this is where the auto-negotiation between the switch ports and the NICs fails, so by default the ports tune down to the speed they can manage at that moment, and usually this is 10Mbps.
This can also affect both WOL and PXE builds, as the hosts or server cannot be reached in adequate time and so return a false to whatever requires a positive.
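For reference, the 30-second figure matches the classic 802.1D defaults: a port that comes up spends one forward-delay period (15 seconds by default) listening and another learning before it forwards traffic, whereas a portfast or edge port skips that wait. A small Python sketch of the arithmetic, with the function name invented here for illustration:

```python
DEFAULT_FORWARD_DELAY_S = 15  # 802.1D default forward delay, in seconds


def seconds_until_forwarding(
    forward_delay_s: int = DEFAULT_FORWARD_DELAY_S,
    edge_port: bool = False,
) -> int:
    """Time a newly linked port waits before it forwards frames."""
    if edge_port:
        # Portfast / edge ports go straight to forwarding.
        return 0
    # One forward-delay period listening plus one learning.
    return 2 * forward_delay_s


print(seconds_until_forwarding())                # 30
print(seconds_until_forwarding(edge_port=True))  # 0
```

This only covers how long a port blocks traffic after link-up; it says nothing about which link speed the NIC and switch negotiate.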
Hope this is readable as it is late and I don't have my glasses on.
Good luck anyway with whatever you decide to do, I personally would disconnect all edge switches and then start with one and test the workstations connected to that and work my way forward without daisy chaining any switches.
Yes, but if your root bridge is out at the edge of one side of your LAN, then traffic from the other side has to go through the root bridge, wasting bandwidth and sending traffic up uplinks it doesn't need to go up.
I'm not actually sure if the HP 1800 was an HP switch and not someone else's badged as HP. Please don't let your experience with this switch put you off HP, as their networking kit is fantastic.
The 1800 is also a very basic switch. My rule of thumb is that if it hasn't got a command line interface it's not a real switch, as there are normally major limitations to web-only managed switches.
The 1800 is definitely a pure HP switch (from before the 3Com buyout). Yeah, it's basic, but there's that budget thing again.
STP has only recently been enabled - the problems were happening when it was entirely disabled. I can certainly enable it only on the uplink ports though so that may be worth looking at too.
We are trying to get rid of as many daisy chains as possible, but with the distance involved it also means running new fibre. Again, for budget reasons, that's absolutely impossible; everything we're doing is purely to keep things running as smoothly as possible with the little resource we have.
To put it another way; the network industry is very competitive because it largely works on open standards meaning that anyone can enter the market and compete on a level playing field - unlike in other areas of IT. As a result of this you tend to get what you pay for because the more expensive vendors will tend to have less buggy software and more reliable components. Working within a budget is one thing, but trying to get a piece of equipment to do something that far exceeds what it is designed to do is another.
Originally Posted by synaesthesia
If you need a car that can do 200mph but don't have and can't get the funds, then you will probably settle for a car that does 150mph and try and force it to do the speed you need. Sound unlikely? Look at Brawn GP F1 team in 2009. There is absolutely no point, other than a mention in passing, in arguing about what we should and should not have. The same goes for all schools: there's a lot of making do and all of us have to make do.
My original point was that it would be better to concentrate on core services like networks rather than spending on Windows 7 upgrades. BTW to use your analogy: when I was 17 I had an RD125Mk2 which I managed to squeeze 110MPH out of. I also rebuilt the engine twice and the gearbox once. It wasn't reliable - just great fun.