Wireless Networks Thread, Have you ever seen anything like this.... strange switch behaviour... in Technical; hello everyone, i have recently run into a very, very strange problem and i'm hoping at least one of you ...
  1. #1
    35mm's Avatar
    Join Date
    Jan 2011
    Location
    Wonderful Norfolk
    Posts
    4
    Thank Post
    0
    Thanked 0 Times in 0 Posts
    Rep Power
    0

    Have you ever seen anything like this.... strange switch behaviour...

    hello everyone,

    i have recently run into a very, very strange problem and i'm hoping at least one of you has run into the same or similar... it might just save my sanity!

    let me explain:

    we've been having very brief periods of complete packet loss across our main backbone switch (all our switches are HP Procurves of various vintages). during this phenomenon the cpu on the switch spikes and the activity LEDs illuminate in a more solid fashion. This loss of traffic rarely lasts longer than 30 seconds, usually only 5-10. This happens very sporadically, sometimes 2-3 times a day, sometimes not at all.

    now, i appreciate this *looks* like it might be a loop in the network, but we have STP enabled on all our switches, and artificially inducing loops in the LAN, across switches, causes no such loss. the switch log, accessible over telnet, shows that the looped ports are instantly blocked by STP.

    Anyway, further investigation seems to have identified the root cause of this issue -

    It's *something* to do with the Marvell Yukon onboard LAN cards in some of our machines.

    I had a machine in the office which had been reported as periodically displaying the no-domain error and dropping off the network. when i tried to reimage it, it would, with 100% reproducibility, cause a complete loss of traffic on the backbone switch - every single time. replace the lan card and the problem goes away completely for a few weeks.

    now, we have over 60 of these machines, but fortunately, these are all (mostly) contained on a couple of switches, so for a test, i've disconnected the switch where all these computers are plugged in and the problem (so far) has gone away.

    so i'm speculating this is caused by the lan card spewing either loads of broadcasts or some other malformed transmissions, maybe during bootup, so that the switch is temporarily overloaded dealing with them and drops a few packets - the times of the network drops seem to mostly coincide with lesson start too. it doesn't seem to be related to a specific driver: i've tried several for the card, including the DOS one, and they all cause the effect on the defective machine. Note: i'm not using multicast or anything else like that on the machine, it was a simple image download.

    but when this happens, the switch seems to get so bogged down with it all that it's very difficult to see what's going on from the switch console, as the connection is usually dropped.

    is there going to be an easy way for me to work out where the rogue device is, other than by turning off each port in turn on the switch and seeing if the problem goes away?

    any info/feedback greatly appreciated.

    Mark
    Last edited by 35mm; 28th January 2011 at 07:23 PM.

  2. #2

    SYNACK's Avatar
    Join Date
    Oct 2007
    Posts
    10,691
    Thank Post
    824
    Thanked 2,570 Times in 2,187 Posts
    Blog Entries
    9
    Rep Power
    731
    Several things come to mind. First up, a firmware upgrade for the switches. Next would be forcing 100mbit duplex on the cards to see if that alleviates it. Third would be having Wireshark running constantly to see if there is any weird traffic causing it.

  3. #3
    35mm's Avatar
    Thanks Synack. i will indeed leave wireshark running on a laptop plugged into the switch - that's a great idea. it should at least reveal the source mac address of the broadcasts, so it might make the job of tracking down the faulty card a bit easier. will do that on monday.

    thanks again!
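
One way to pick the noisy sender out of a saved capture, as a minimal sketch rather than a tested tool: the Python below tallies frames per source MAC address and counts how many of them went to the broadcast address. It assumes the capture was saved in the classic pcap format (not pcapng) with an Ethernet link type; the function names, and the file name in the usage note, are made up for illustration.

```python
import struct
from collections import Counter

BROADCAST = b"\xff" * 6  # Ethernet broadcast destination address


def format_mac(raw: bytes) -> str:
    """Render six raw bytes as aa:bb:cc:dd:ee:ff."""
    return ":".join(f"{b:02x}" for b in raw)


def tally_sources(pcap_bytes: bytes):
    """Return (all frames per source MAC, broadcast frames per source MAC)."""
    # The magic number tells us the byte order the file was written in.
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic (microsecond) pcap file")

    # The link type is the last field of the 24-byte global header.
    (linktype,) = struct.unpack(endian + "I", pcap_bytes[20:24])
    if linktype != 1:
        raise ValueError("capture is not Ethernet")

    all_frames, broadcasts = Counter(), Counter()
    offset = 24
    while offset + 16 <= len(pcap_bytes):
        # Per-record header: ts_sec, ts_usec, captured length, original length.
        _, _, incl_len, _ = struct.unpack(
            endian + "IIII", pcap_bytes[offset:offset + 16]
        )
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len
        if len(frame) < 14:
            continue  # too short to hold an Ethernet header
        dst, src = frame[0:6], frame[6:12]
        all_frames[format_mac(src)] += 1
        if dst == BROADCAST:
            broadcasts[format_mac(src)] += 1
    return all_frames, broadcasts
```

Called as `tally_sources(open("capture.pcap", "rb").read())` (a hypothetical file name), the source with an unusually large broadcast count is the first candidate to trace back to a switch port.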

  4. #4
    35mm's Avatar
    think i've identified root cause of the issue.

    there are two machines causing the problem, both have mac addresses which are 000000000000

    when you ping the machine's ip address, the response gets broadcast to the entire network along with anything else you do (such as browse the file shares etc)

    there are loads of other really strange things just going round in a loop too. so it seems this strange problem is causing the switch to get overloaded under certain circumstances.

    very, very strange - thanks for the tips guys!

    never seen that before.
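
A toy model of why traffic for a 00:00:00:00:00:00 host might reach every port. It assumes the switch refuses to learn an all-zero source address, so the destination never enters its forwarding table and every frame for it is flooded as unknown unicast. That assumption is not confirmed for ProCurve firmware, and the class, port numbers and example addresses are illustrative only.

```python
ZERO_MAC = "00:00:00:00:00:00"


class LearningSwitch:
    """Very small model of a MAC-learning Ethernet switch."""

    def __init__(self, port_count):
        self.ports = list(range(1, port_count + 1))
        self.table = {}  # learned MAC address -> port

    def receive(self, in_port, src, dst):
        """Handle one frame and return the ports it is sent out of."""
        # Assumption: an all-zero source is treated as invalid and not learned.
        if src != ZERO_MAC:
            self.table[src] = in_port

        out_port = self.table.get(dst)
        if out_port is not None:
            # Known destination: forward to one port (or drop if same port).
            return [] if out_port == in_port else [out_port]

        # Unknown destination: flood to every port except the ingress port.
        return [p for p in self.ports if p != in_port]


switch = LearningSwitch(port_count=4)
server = "00:11:22:33:44:55"

# The faulty machine on port 2 talks to the server on port 1.
switch.receive(2, ZERO_MAC, server)
# Replies to the faulty machine are flooded, however often it has spoken.
flooded = switch.receive(1, server, ZERO_MAC)
print(flooded)
```

In this model a healthy machine's address is learned after its first frame, so replies to it leave on a single port, while replies to the all-zero address keep going to every port.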

  5. #5
    HallX's Avatar
    Join Date
    Mar 2007
    Location
    Doncaster
    Posts
    230
    Thank Post
    20
    Thanked 26 Times in 21 Posts
    Rep Power
    20
    If I remember correctly, we had this issue a few years ago with some RM computers we had thrust upon us. The NICs were SiS and the drivers were causing the problem.

    Downloading the drivers from the manufacturer's site cured it.

    Paul

  6. #6
    35mm's Avatar
    thanks Paul.

    it seems to be something very specific with those cards - i've tried the latest drivers, including the vista driver (the machines run XP normally) and even the old DOS driver - whichever one i use, the card still has a 00-00-00-00-00-00 mac address.

    i've picked up about 4 of these today now - it's such a strange thing seeing all the SMB traffic which is supposed to be going to that one machine being broadcast to everyone. as soon as the machine starts doing anything heavy you can see the cpu on the switch jump up to between 15-30% instantly.

    only under very rare circumstances are there enough broadcasts that the network just grinds to a halt - like when downloading a ghost image (not as a multicast, just as a file), where you can see all that traffic going over the lan. this really kills the switch.

    sadly, as i have discovered today... we have about 120 of these computers (they are Acer Power F1s purchased in 2007 iirc), i'm going to do a bit more digging to see if there is a known specific problem but realistically, it's just going to be a waiting game and just changing the cards as and when.
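
With around 120 machines to check, the affected ones could be picked out of an inventory rather than found one at a time. A minimal sketch, assuming a hostname-to-MAC mapping is already available from somewhere (the host names and the mapping below are invented for illustration, and the helper names are hypothetical):

```python
import re


def normalise_mac(text):
    """Strip common separators and return 12 lowercase hex digits."""
    digits = re.sub(r"[-:.\s]", "", text).lower()
    if not re.fullmatch(r"[0-9a-f]{12}", digits):
        raise ValueError(f"not a MAC address: {text!r}")
    return digits


def is_unprogrammed(mac):
    """True when the address is all zeros, as seen on the faulty cards."""
    return normalise_mac(mac) == "0" * 12


def suspect_hosts(inventory):
    """Return the sorted host names whose MAC address is all zeros."""
    return sorted(host for host, mac in inventory.items() if is_unprogrammed(mac))


# Invented example data; a real list would come from DHCP leases, ARP tables
# or an asset database.
inventory = {
    "pc-01": "00-00-00-00-00-00",
    "pc-02": "00:11:22:33:44:55",
    "pc-03": "000000000000",
}
print(suspect_hosts(inventory))
```

Normalising first means the same check catches the address whether it is written as 00-00-00-00-00-00 or as 000000000000, the two forms that appear in this thread.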

  7. #7

    plexer's Avatar
    Join Date
    Dec 2005
    Location
    Norfolk
    Posts
    12,968
    Thank Post
    587
    Thanked 1,495 Times in 1,341 Posts
    Rep Power
    398

  8. #8

    glennda's Avatar
    Join Date
    Jun 2009
    Location
    Sussex
    Posts
    7,714
    Thank Post
    269
    Thanked 1,116 Times in 1,012 Posts
    Rep Power
    345
    Maybe a mobo firmware flash/upgrade - if it's only a couple of machines it could either be dodgy firmware or just a couple of dodgy cards.

    but i have had strange issues with network cards before. I had a whole batch of PCs (intel nics) which would not work with a Nortel switch i had - swapped it for an HP and they worked fine, back on the nortel they didn't work. We were going to replace the switch in the summer anyway, but it just happened in may rather than august.

  9. #9

    SYNACK's Avatar
    Quote Originally Posted by 35mm View Post
    thanks Paul.

    it seems to be something very specific with those cards - i've tried the latest drivers, including the vista driver (the machines run XP normally) and even the old DOS driver - using either the card still has a 00-00-00-00-00-00 mac address.
    If these are integrated MB cards then this is a standard issue with junk oem gear that has not been fully configured. The MAC in those cases should be stored in the bios but never gets entered. The toolkit from the solution section here - The DMI Discontinuity and the Perils of Brand X Computing - Blogs - EduGeek.net - has tools to program this in. “BNOBTC v6” is what you want to search for, can't post a link as there are silly copyright issues.

