30th July 2009, 02:36 PM #1 DRBD, NIC bonding, load balancing, Openfiler...
We have 2 Openfiler SANs using DRBD for HA/mirroring. DRBD data flows between the 2 servers at 1 Gb/s over a directly connected crossover cable.
This poses the problem of that 1 Gb/s link being a bottleneck, as hard disk I/O could potentially be double what the link can handle.
So... you just set up a bonded interface, right? Well no, it doesn't seem to be as simple as that. From what I can gather, the only bonding mode which can spread a single stream of data across several NICs is balance-rr (round robin), but due to its overheads it's going to take 3 or 4 bonded NICs to reach the 2 Gb/s target.
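As a rough illustration of the arithmetic behind that "3 or 4 NICs" estimate, the sketch below works out how many bonded 1 Gb/s links a target rate needs for a given per-link efficiency. The efficiency figures are assumptions picked for illustration (nothing in this thread measures balance-rr overhead), and `nics_needed` is a hypothetical helper, not part of any bonding tool.

```python
import math

def nics_needed(target_gbps: float, link_gbps: float = 1.0, efficiency: float = 1.0) -> int:
    """Smallest number of bonded links whose combined usable rate meets the target."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    usable_per_link = link_gbps * efficiency  # assumed to be the same for every link
    return math.ceil(target_gbps / usable_per_link)

if __name__ == "__main__":
    # Assumed efficiencies: ideal, moderate overhead, heavy overhead.
    for eff in (1.0, 0.7, 0.5):
        print(f"efficiency {eff:.0%}: {nics_needed(2.0, efficiency=eff)} NICs for 2 Gb/s")
```

The model applies one flat efficiency to every link, which is a simplification: if overhead grows as links are added, the real count would be higher.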
I have set up balance-rr by editing the modprobe.conf file; is this the only file that needs editing? Under testing it's almost as though the bond believes it can only transmit at 1 Gb/s in total instead of 1 Gb/s per NIC.
Are there any other files which need editing? (Documentation on balance-rr is very limited, so any help is much appreciated.)
Has anyone done something similar? How did you do it?
Last edited by j17sparky; 30th July 2009 at 02:56 PM.
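One way to check what the bond is actually doing is to read the status file the Linux bonding driver exposes under /proc/net/bonding/. The sketch below parses that text for the active mode and the state of each slave. It assumes the bond is called bond0; the `SAMPLE` text is made up to resemble the file's layout, and real output varies by driver version (older drivers may not report a Speed line at all).

```python
from pathlib import Path

def parse_bond_status(text: str) -> dict:
    """Extract the bonding mode and a few per-slave fields from bonding status text."""
    status = {"mode": None, "slaves": []}
    current = None  # the slave entry currently being filled in, if any
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without a "key: value" shape
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            status["mode"] = value
        elif key == "Slave Interface":
            current = {"name": value}
            status["slaves"].append(current)
        elif current is not None and key in ("MII Status", "Speed", "Duplex"):
            current[key] = value
    return status

# Illustrative sample only; not captured from a real machine.
SAMPLE = """\
Bonding Mode: load balancing (round-robin)
MII Status: up

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full

Slave Interface: eth2
MII Status: up
Speed: 1000 Mbps
Duplex: full
"""

if __name__ == "__main__":
    path = Path("/proc/net/bonding/bond0")  # assumes the bond is named bond0
    text = path.read_text() if path.exists() else SAMPLE
    info = parse_bond_status(text)
    print("mode:", info["mode"])
    for slave in info["slaves"]:
        print(slave["name"], slave.get("MII Status"), slave.get("Speed", "speed not reported"))
```

If the reported mode is not round-robin, or a slave shows as down, the modprobe.conf options probably did not take effect the way you intended.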
-
-
30th July 2009, 03:08 PM #2 
Originally Posted by
j17sparky
This poses the problem of that 1 Gb/s link being a bottleneck, as hard disk I/O could potentially be double what the link can handle.
Are you sure this is where the bottleneck actually is? DRBD only needs to send writes to its mirror, not reads - are you going to hit peaks where that much data is going to be written? Can you reduce the bottleneck by adding more RAM for DRBD to cache writes should it need to? You might need to change the replication protocol that DRBD uses (from protocol C to protocol A, probably).
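To put a number on the "cache the writes" idea: with protocol C a write only completes once the peer has confirmed it, so sustained writes are capped by the link, whereas an asynchronous setup can soak up a burst until its buffer fills. The sketch below is a back-of-envelope model of that burst window, not a description of DRBD internals; the buffer sizes and the 250 MB/s write rate are assumed values, and `burst_seconds` is a hypothetical helper.

```python
def burst_seconds(buffer_mb: float, write_mb_s: float, link_mb_s: float) -> float:
    """Seconds a write burst can last before an asynchronous send buffer fills."""
    backlog_rate = write_mb_s - link_mb_s  # MB/s piling up in the buffer
    if backlog_rate <= 0:
        return float("inf")  # the link keeps up, so the buffer never fills
    return buffer_mb / backlog_rate

if __name__ == "__main__":
    link = 1000 / 8  # 1 Gb/s expressed in MB/s, ignoring protocol overhead
    for buf in (64, 512, 2048):  # assumed buffer sizes in MB
        print(f"{buf} MB buffer: {burst_seconds(buf, 250, link):.1f} s of 250 MB/s writes")
```

Anything still sitting in that buffer is lost to the mirror if the primary dies, which is the trade-off of moving away from protocol C.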
Are you mirroring a single volume between the two machines? I give our VMs a separate LVM volume each and let DRBD mirror them over separate network connections, each with its own IP address.
Do you have a switch that supports 802.1q VLANs (and flow control)? Would you do better to run the bonded connections from each machine to the switch? If the switch's backplane can handle the bandwidth okay, there might be less overhead than directly linking two bonded connections on each machine.
--
David Hicks
-
-
30th July 2009, 03:57 PM #3 
Originally Posted by
dhicks
Are you sure this is where the bottleneck actually is? DRBD only needs to send writes to its mirror, not reads - are you going to hit peaks where that much data is going to be written? Can you reduce the bottleneck by adding more RAM for DRBD to cache writes should it need to? You might need to change the replication protocol that DRBD uses (from protocol C to protocol A, probably).
Our SAN is running RAID 5, so there's overhead there, but I wouldn't expect writes to drop below 1000 Mbit/s, so next in line is the DRBD mirror link. TBH it isn't the end of the world if there is only a 1 Gb/s link, but for future proofing I'd rather have the best setup possible to start with than get stung later down the line.
TBH I'm not massively keen on the idea of having DRBD working on anything other than protocol C.

Originally Posted by
dhicks
I give our VMs a separate LVM volume each and let DRBD mirror them over separate network connections, each with its own IP address.
That was going to be my backup plan, but it still only allows for 1 Gb/s on any one VG. Again, not the end of the world.

Originally Posted by
dhicks
Do you have a switch that supports 802.1q VLANs (and flow control)? Would you do better to run the bonded connections from each machine to the switch? If the switch's backplane can handle the bandwidth okay, there might be less overhead than directly linking two bonded connections on each machine.
I assume you are talking about 802.11ad? This doesn't work, as the protocol sends all data for one client down a single link. It does this by design in order to prevent out-of-order packets and the associated overheads. Obviously, with me only having one "client", I can only use one link.
Anything else worth knowing from a man who's actually got DRBD in production?
Last edited by j17sparky; 30th July 2009 at 04:12 PM.
-
-
30th July 2009, 04:20 PM #4 
Originally Posted by
dhicks
Do you have a switch that supports 802.1q VLANs (and flow control)? Would you do better to run the bonded connections from each machine to the switch? If the switch's backplane can handle the bandwidth okay, there might be less overhead than directly linking two bonded connections on each machine.

Originally Posted by
j17sparky
I assume you are talking about 802.11ad? This doesn't work, as the protocol sends all data for one client down a single link. It does this by design in order to prevent out-of-order packets and the associated overheads. Obviously, with me only having one "client", I can only use one link.

No, 802.1q. I think 802.11ad is another wireless LAN protocol - admittedly, I'm only suggesting 802.1q after a Google search turned up a hint about it. Makes sense that 802.1q works as you describe above, though, which is a nuisance - I'd always just kind of assumed bonded links would give you double the bandwidth minus a bit of overhead; turns out that's not true. Unless any switches (er, probably the expensive ones?) have some feature where they will actually bond ports and let them be treated as one? Hmm, that's something I'll definitely have to ask the Cisco chap when he turns up (he's now coming in three weeks' time or so instead of today).
--
David Hicks
-
-
30th July 2009, 09:14 PM #5
j17sparky is right to suggest 802.3ad (written above as 802.11ad): it is the standard behind LACP, but it balances across IP sources and destinations, only utilising a maximum of one link for each conversation. The Cisco alternative, EtherChannel (http://en.wikipedia.org/wiki/EtherChannel), is also hindered by the same basic mechanism. EtherChannel does however allow balancing via layer 4 ports, so if you could set up two separate sync connections using different TCP/IP port numbers, this would span multiple links with about 70-80% efficiency.
802.1Q is VLAN trunking, which encapsulates traffic from multiple different VLANs through a single link. This allows multiple site-wide VLANs to be configured and accessed on an edge switch without having to provide a separate link for each VLAN back to the core. This standard does not inherently include aggregation of multiple physical links into a faster single link.
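To show why a single conversation sticks to one link, here is a minimal sketch of hash-based link selection. It is a simplified illustration, not the exact algorithm of any particular switch or driver: `l3_hash` picks a link from the IP pair alone, while `l4_hash` also mixes in the TCP ports. The addresses, source port and the two sync ports are hypothetical.

```python
import ipaddress

def l3_hash(src_ip: str, dst_ip: str, links: int) -> int:
    """Pick a link from the IP pair only: every flow between two hosts shares one link."""
    a = int(ipaddress.ip_address(src_ip))
    b = int(ipaddress.ip_address(dst_ip))
    return ((a ^ b) & 0xFFFF) % links

def l4_hash(src_ip: str, dst_ip: str, sport: int, dport: int, links: int) -> int:
    """Mix the TCP ports in, so separate connections between two hosts can land on different links."""
    a = int(ipaddress.ip_address(src_ip))
    b = int(ipaddress.ip_address(dst_ip))
    return ((sport ^ dport) ^ ((a ^ b) & 0xFFFF)) % links

if __name__ == "__main__":
    src, dst, links = "10.0.0.1", "10.0.0.2", 2  # hypothetical addresses, two bonded links
    for dport in (7788, 7789):                   # hypothetical sync ports
        print(dport, "ip-only ->", l3_hash(src, dst, links),
              "| ip+port ->", l4_hash(src, dst, 40000, dport, links))
```

With the IP-only hash both connections get the same link index, because the inputs never change; only the port-aware hash has anything to spread them on, and even then two connections can collide on the same link.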
-
-
30th July 2009, 10:05 PM #6 
Originally Posted by
SYNACK
j17sparky is right when suggesting 802.3ad, it is the LACP protocol
This is getting confusing: the Wikipedia page (http://en.wikipedia.org/wiki/Link_aggregation) mentions 802.3ad and 802.1AX. I think I prefer "LACP"!
but balances across IP sources and destinations, only utilising a maximum of one link for each conversation.
Drat.
And double drat.
EtherChannel (Cisco) does however allow balancing via layer 4 ports, so if you could set up two separate sync connections using different TCP/IP port numbers, this would span multiple links with about 70-80% efficiency.
Ah ha. I'll definitely check with the Cisco chap when he turns up (although that sounds like a good chance for him to flog some Cisco switches...).
802.1Q <snip>... does not inherently include aggregation of multiple physical links into a faster single link.
Ah, thanks for the clarification - I found the reference to 802.1q on a random blog post, guess that was wrong.
Also, this might come in handy:
Gigabit Teaming or Bonding - Server Fault
--
David Hicks
-
-
31st July 2009, 12:15 AM #7 
Originally Posted by
SYNACK
EtherChannel (Cisco) does however allow balancing via layer 4 ports
Does the CentOS / any other Linux bonding driver support Etherchannel, though? Can you configure the client to use 802.3ad and the switch to use Etherchannel, are the two compatible?
Is it worth trying the bonding driver's balance-alb bonding policy? I can't really figure out from the documentation how that goes about load balancing, but I'm guessing the original poster has already tried it.
so if you could set up two separate sync connections using different TCP/IP port numbers, this would span multiple links with about 70-80% efficiency.
The DRBD manual explicitly says you can't do that, which is annoying. Might as well split the disk volumes up and use a separate link to sync each one.
--
David Hicks
-
-
31st July 2009, 12:54 AM #8 
Originally Posted by
dhicks
Does the CentOS / any other Linux bonding driver support Etherchannel, though? Can you configure the client to use 802.3ad and the switch to use Etherchannel, are the two compatible?
CentOS does support it; there is some config here: Trunking CentOS 5 to a Cisco 3750 using etherchannel @ A Murder of Crows. And no, the two standards are not compatible, unfortunately. Even so, without support for two sync threads with different ports, EtherChannel would not help anyway.

Originally Posted by
dhicks
Is it worth trying the bonding driver's balance-alb bonding policy? I can't really figure out from the documentation how that goes about load balancing, but I'm guessing the original poster has already tried it.
The load balancing algorithm will probably be implemented in the same way as the other protocols, to avoid the out-of-order packets problem. Unless it physically splits the data payload in two and sends one half down each link, it will suffer the same issues. This approach is used in proper channel bonding applications for DSL and POTS (Channel bonding, http://en.wikipedia.org/wiki/Channel_bonding), but not for Ethernet it seems, despite the article's pointer.
This might help optimize TCP/IP though: How to achieve Gigabit speeds with Linux
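For anyone wondering what the out-of-order problem looks like, the toy simulation below stripes packets round-robin over links and sorts them by arrival time. It is only a sketch: the link delays and the send gap are made-up values in milliseconds, and real reordering depends on NIC queues, interrupt coalescing and the like.

```python
from typing import List

def arrival_order(n_packets: int, link_delays_ms: List[float], gap_ms: float = 0.01) -> List[int]:
    """Send packets round-robin over the links and return sequence numbers in arrival order."""
    arrivals = []
    for seq in range(n_packets):
        link = seq % len(link_delays_ms)  # round-robin link choice
        sent_at = seq * gap_ms            # packets leave back to back
        arrivals.append((sent_at + link_delays_ms[link], seq))
    return [seq for _, seq in sorted(arrivals)]

def count_out_of_order(order: List[int]) -> int:
    """Count packets that arrive after a packet with a higher sequence number."""
    highest, late = -1, 0
    for seq in order:
        if seq < highest:
            late += 1
        highest = max(highest, seq)
    return late

if __name__ == "__main__":
    for delays in ([0.10, 0.10], [0.10, 0.125]):  # assumed per-link delays in ms
        order = arrival_order(8, delays)
        print(delays, order, "late packets:", count_out_of_order(order))
```

With identical delays everything arrives in order; as soon as one link is slightly slower, the receiver sees packets interleaved out of sequence, and TCP has to buffer them or treat the gaps as loss.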

Originally Posted by
dhicks
The DRBD manual explicitly says you can't do that, which is annoying. Might as well split the disk volumes up and use a separate link to synch each one.
This sounds like the most supported solution at this stage, as it will divide up the sessions properly. This could be used with EtherChannel if each sync thread had its own port number; the connections could then be dynamically balanced over however many links you have in your EtherChannel rather than using the card-per-session method, though card per session may be cheaper and easier for a direct setup.
Last edited by SYNACK; 31st July 2009 at 12:59 AM.
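The "one link per volume" layout can be sketched as a simple assignment: each volume gets its own port, and volumes are dealt out round-robin across the available crossover links. Everything named here is hypothetical (the volume names, the addresses, the base port of 7788 and the `plan_replication` helper); it only produces a plan you would then translate into your own DRBD resource definitions.

```python
from itertools import cycle

def plan_replication(volumes, links, base_port=7788):
    """Give each volume its own port and spread volumes round-robin across links."""
    plan = []
    for index, (volume, link) in enumerate(zip(volumes, cycle(links))):
        plan.append({
            "volume": volume,
            "local": link[0],           # address on this node for the chosen link
            "peer": link[1],            # address on the other node for the same link
            "port": base_port + index,  # one port per volume, so each sync is its own connection
        })
    return plan

if __name__ == "__main__":
    volumes = ["vm1", "vm2", "vm3"]                               # hypothetical LVM volumes
    links = [("10.0.1.1", "10.0.1.2"), ("10.0.2.1", "10.0.2.2")]  # hypothetical crossover links
    for entry in plan_replication(volumes, links):
        print(entry)
```

Any single volume is still limited to the speed of the one link it is assigned to; the gain is only in aggregate across volumes.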