Hi all, just wanted to see if anyone here has had the pleasure of working with HP blade servers, running VMware on them with a SAN for storage?
I can see we have a few threads on VMware on here which are great, but wondered if anyone had any tips on them (good ones please, not bad, as they're ordered and on their way so no changing our minds!!!)
Anyway, any hints and tips, or has anyone got them and willing to share the ups and downs? Or if they're local to us (we're in Cumbria), maybe willing to let us pop over and drool over your setup!
We're going to, at a very simple level, chuck all the existing servers away and build a complete new network on the blade system using virtual machines, with the SAN for storage. Then, come the right time when we're happy with it, wheel it into the server room, plug it in, re-image all the PCs in school and start life with a shiny new network that's hopefully much quicker than the existing one, has more fail-safes thanks to the VMware setup, and has no silly NT4 legacy bits coming back to bite us in the rear end!
I should say and make very clear, when I say chuck away the existing servers, I don't mean literally throwing them all in the bin! We will re-use some of the newer ones: one as a backup box on another part of the site as part of our disaster recovery plan, and another for some other use where we just don't see the point of using the SAN, such as storing ISOs of all the software used in school. Until we get it all going we don't know exactly what we'll do with them, but there will be some re-use, so don't worry, I'm not making a huge pile of WEEE waste. We're still completing the nice diagram of what exactly we are doing and where (we do have a plan, but it's just missing the little things now).
So has anyone else done this, or are there any hints / tips / pitfalls that you want to share before myself and the A-team start on this mammoth project (all whilst fighting fires on the existing system, I will add!!)
Well we are using an IBM blade system and an IBM SAN and have started using ESXi.
Our strategy has been to do things slowly and manage the change rather than trying to change everything at once. Our virtualisation strategy is to be implemented over 18 months; this way we learn as we go. Of course we test things before deployment, but we have already made mistakes. Some of the blades have needed to be re-deployed into the Citrix farm due to under-provisioning, which has left us with a shortfall.
So far we have mostly been migrating fileserver data to a raw-mapped LUN that is accessible from Samba. We'll move SIMS over Christmas and web services after that. Lastly, we'll move the directory services and email.
Please bear with me it's late and I am about to brain dump....... ;-)
Here's the setup:
1 x c3000 chassis
4 x BL460c blades, each with 10GB RAM, 2 quad-core Xeons and 2 x 73GB SAS
1 x MSA 2012i array with 9.3TB of un-carved space, using iSCSI
The array was split into 6 x SAS (300GB) providing 1.5TB in RAID 5 for the VMs, and 6 x SATA (700GB) providing 4 volumes of 1TB each.
The MSA will now present 2 LUNs: one with 1 volume and one with 4 volumes.
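As a rough sanity check on the carving above, RAID 5 usable capacity is (drives - 1) x drive size. A quick sketch using the drive counts and sizes from the setup above (which suggests the four SATA volumes come out a shade under 1TB each):

```shell
#!/bin/sh
# Usable capacity (in GB) of a RAID 5 group: one drive's worth goes to parity
raid5_usable() {
    echo $(( ($1 - 1) * $2 ))
}

raid5_usable 6 300   # 6 x 300GB SAS  -> 1500 (the 1.5TB VM volume)
raid5_usable 6 700   # 6 x 700GB SATA -> 3500 (carved into the ~1TB volumes)
```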
3 blades have VMware ESX Server (v3.5.0) installed and 1 has Server 2003 Enterprise with VMware VirtualCenter installed.
We have VMWare VMotion, DRS, and Consolidated Backup.
The whole process to get the blades up and running took about 2 days, from unpacking the kit to pushing out the OSes via HP's Rapid Deployment Pack; Windows Server 2003 installed and running on one blade in 26 mins.
We have 8 VMs running, as follows:
- Domain Controller (2nd) - 100GB storage
- Server 2003 server as front end for the storage portion of the SAN - 100GB storage (C:\) plus 4TB (D:\, E:\, F:\, G:\)
- Ubuntu web server (VLE) - 250GB storage
- Ubuntu web server (Student Server) - 100GB storage
- Windows web server (Windows WebApps) - 100GB storage
- Windows applications server (non-web apps) - 100GB storage
- Ubuntu web server (Training VLE) - 100GB storage
- Ubuntu web server (Testing VLE) - 100GB storage
Physical servers retained:
- HP Server (Exchange)
- HP Server (SIMS)
- Dell Server (Domain Controller)
Servers Decommissioned: 15
I will look at possibly switching SIMS over to a VM, as we have enough power to do so, but that would be over the next 12 months.
What to look out for:
REALLY, REALLY plan your VM setup and how you are going to deploy your blades, as this can bring the whole process to a standstill. Also get your suppliers to talk to you on a regular basis; I was in contact with ours every other day with updates and timeframes.
So far the system is great. The HP Insight Control software is superb, with performance management, alerts and monitoring, and bandwidth control. Oh, and the switches in the blade chassis are layer 2/3, so you can do internal VLANing to separate the MSA traffic to the blades, mimicking a fibre setup.
Make sure you allocate a block of 20-30 IP addresses, and yes, you will need them, considering each blade needs several (server NICs x 2, iLO), plus the interconnects, remote management, chassis OA, switch remote access, 4 for the MSA (2 on a 10.x.x.x range) and so on...
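To put a rough number on that, here's a quick tally for a 4-blade chassis like the one above (the per-item counts are my reading of the list and may not match your kit exactly):

```shell
#!/bin/sh
# Rough management-IP tally for a half-populated c3000-style chassis
BLADES=4
PER_BLADE=3        # 2 server NICs + 1 iLO per blade
CHASSIS_FIXED=4    # 2 interconnect switches, chassis OA, remote management
MSA_PORTS=4        # 2 on the main range, 2 on a private 10.x.x.x range

TOTAL=$(( BLADES * PER_BLADE + CHASSIS_FIXED + MSA_PORTS ))
echo "$TOTAL"      # already 20 before you leave any headroom for growth
```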
Total Planning Time: 5 months
Total Implementation Time: 1 Week including data migration and VM setup
Shout if you need any more info; I'd be happy to help.
Last edited by ICTNUT; 24th October 2008 at 10:01 PM.
I've not used the HPs, but I do use a SAN, ESX and Dell blades. The following is just based on my personal experience over the last 3 1/2 years with the above. I have done no training for ESX, SANs, networking etc, so it may not be best practice, but I have tried to absorb as much as possible over the years :P
There are a few key areas to get sorted out before implementation of the VMs themselves. This is assuming ESX (not ESXi) and a Virtual Center server.
You will probably want a substantial UPS with a network card, so that it can send a shutdown signal to all the blade hosts at the same time. You will need an agent on the blades. I do know this can be done with ESX and APC, and it works with 3.5 (ESXi may be more difficult as there is no agent, afaik).
This also means that the switch which the blades and UPS are plugged into needs an uptime greater than that of the blades during a power cut.
You may need an IEC309 16 or 32A socket for the power distribution bar for the blades (if you have one) and also for the UPS (I use the socket as an alternative to a hardwired UPS, as it locks the plug in and is of sufficient capacity). You may want to consider an environmental monitor card too; I swapped my APC card for one that includes a temperature sensor in case of aircon failure, as you will be producing a considerable amount of heat.
VMs will need to be configured with the VI client to shut down automatically if the host begins a shutdown. This can be done by moving them to automatic startup, editing the time settings from the defaults, and moving them back to not being auto-started if desired.
It's also good to enable IPMI (if the blade stays on the halt screen after shutting down), so that the blade turns off after a temperature-initiated shutdown from the UPS rather than staying on.
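For what it's worth, on ESX 3.x you can also script the guest shutdowns from the service console with vmware-cmd, for example from a UPS-triggered script. A minimal sketch, assuming VMware Tools is installed in each guest so a soft shutdown can be attempted first:

```shell
#!/bin/sh
# Shut down every registered VM on this ESX 3.x host from the service
# console. "stop trysoft" asks the guest OS to shut down cleanly via
# VMware Tools, falling back to a hard power-off if that fails.
vmware-cmd -l | while read VMX; do
    vmware-cmd "$VMX" stop trysoft
done
```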
You will want vlans in place.
Mine is something like this:
10 VM Hardware
60 VM Servers A
70 VM Servers G
90 Client A
100 Client T
200-221 Switch/Room based for student machines.
This is only as a rough idea for how I have implemented things. The VM Hardware and VMotion (although VMKernel would be a better description these days) are non routable and only select servers can access them.
My VM physical hosts, SAN management infrastructure (i.e. all the management ports), UPS, tape libraries etc are all on the VM Hardware VLAN.
When using iSCSI with ESX you have two connections from VMware unless you have a hardware iSCSI HBA: one from the console and another from the VMkernel. I have had to add a second VMkernel connector for this in the VM Hardware VLAN; a second VMkernel interface is used in the VMotion VLAN when migrating between hosts.
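For reference, adding that second VMkernel connector for iSCSI on ESX 3.5 looks roughly like this from the service console (the vSwitch name, port group name and addresses here are examples only):

```shell
#!/bin/sh
# Add an iSCSI port group and VMkernel interface to an existing vSwitch,
# then enable the software iSCSI initiator (ESX 3.5 service console).
# Names and addresses are examples -- substitute your own.
esxcfg-vswitch -A "iSCSI" vSwitch1
esxcfg-vmknic -a -i 10.0.1.10 -n 255.255.255.0 "iSCSI"
esxcfg-swiscsi -e
```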
The key decisions here are backups, LUN layout and partition alignment:
Backup: I will be exporting disk images to a NAS every term or so, and currently use Backup Exec agents on the VMs I wish to back up, although there is now an ESX-targeted product from Symantec which will back up the images *and* use host-based agents to allow block-level application restore. You could also use SAN replication, I guess, but I haven't looked into that route myself.
I'm using an additional server as the media server with an LTO3 tape library, backing up a fairly respectable 490GB verified in 6 hours at the moment. The file server backs up at around 2.5GB/min.
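For comparison, the overall rate works out rather lower than the file server's 2.5GB/min, presumably because verification and slower jobs are in the mix:

```shell
#!/bin/sh
# Back-of-envelope throughput for the figures above:
# 490GB verified in 6 hours
awk 'BEGIN { printf "%.2f GB/min overall\n", 490 / (6 * 60) }'
```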
This is an area with a huge number of options and it will need some serious research, I've drifted towards my current solution because there was nothing else available when I started.
LUNs: You need to get this right first time, as moving LUNs and suchlike around after they are in use is quite tricky.
When you come to allocate space to your LUNs, remember that you don't want too many VM disk images active at the same time on each one, as there can be locking issues when writing and working with snapshots. I try to keep it to a maximum of 10 on mine, although the forums should give some up-to-date numbers and ideas for this.
Alignment: This is important. The VMFS volumes AND the VM guest file systems need to be aligned to a 32k boundary. Search for vmware and alignment and you should find the VMware whitepaper.
VMFS is aligned correctly for you if you created the volume with Virtual Center. Windows 2008 automatically aligns its partitions to 1024k. Some Linux will be, depending on the installer and distro. Windows 2003 and before will need aligning manually, although I now simply attach and format the disk on a 2008 VM temporarily. It can be done with diskpart or diskpar.
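For the manual route on older Windows guests, a diskpart script for a 32k-aligned data partition looks something like this (run with `diskpart /s <file>`; disk 1 and the drive letter are examples only, and the align parameter needs 2003 SP1 or later):

```text
rem 32k-align a data disk -- check the disk number with "list disk" first
select disk 1
create partition primary align=32
assign letter=E
```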
ESX is a lot easier to use these days; I only have to install and configure the APC agent, Dell OpenManage and Backup Exec (because I'm using an old version) from the console. When I was using 2.5 there were quite a lot of config files to modify manually!
TIME: This is important. You want your hosts and guests to synchronise with an authoritative time source (or another server that has) to make sure they stay as accurate as possible. My DCs are VMs, so they need to be right!
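On the host side, pointing an ESX 3.x box at an NTP source is just a few service console commands. A sketch, with a made-up time source name (and for virtualised DCs you generally want one consistent source, rather than VMware Tools sync fighting w32time):

```shell
#!/bin/sh
# Point an ESX 3.x host at an NTP server (service console commands).
# ntp.example.local is a placeholder -- use your own time source.
echo "server ntp.example.local" >> /etc/ntp.conf
esxcfg-firewall -e ntpClient   # open the service console firewall for NTP
service ntpd restart
chkconfig ntpd on              # make sure ntpd starts at boot
```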
Virtual Center: Virtual Center can be run as a VM itself, and this is a supported configuration. However, you really need one physical host with Windows, the VI client and access to the blades, so that it can be used to start up and configure anything if the DCs are virtual and powered off. I suggest it also has access to the switches so VLAN changes can be made if needed. Make sure it can log on with no DC available too (cached logins are OK for this).
I would suggest that if you do virtualise all your DCs, using AD authentication for any part of the Virtual hardware or software is *a very bad idea*.
And just a little info about my slightly elderly setup:
6 Dell 1855 blades: 8GB RAM, 2 x 3.0GHz P4 Xeons (missed out on dual cores and 64-bit virtual machines due to them not existing :/ )
1 Backup Server
LTO3 (soon 4) tape library.
EMC CX300 SAN with 2 Brocade 3250 fibre channel switches (14 x 146GB 10k rpm fibre channel drives, ouch!)
HP MSA2012i with 12 300GB 15k rpm SAS drives.
APC 5000 (environmental monitor card) for the blades and the CX300
APC 3000 for the ProCurve
APC 2000 for the MSA.
Wow, thanks for the great info already guys. Very impressive, and glad to see that others have done the same as us and got there. Also pleased to see that it's been done with near enough the same kit as we have coming (ICTNUT's kit is similar to ours). I will read in more depth next week when I'm back in the office, as I'm winding down for the weekend now, but it's very much appreciated and I'm sure we will have many questions once we get going with the kit.
Hi; do any of you using blades and VMware have any redundancy built into the system in terms of power, hardware failure (i.e. a circuit board) or even the whole blade chassis going down?
Also the same for VoIP, if you use it; do you have any meshing in place for redundancy reasons?
I've been asked to look into the pros and cons of these two technologies before we implement them in the school, as it's almost a half-a-million-pound investment, and we're interested to know if it's worth the money or whether to just stick to an old-style solution, i.e. get around 8 very powerful servers to support a network of 500 clients, fat and thin.
By the way can anyone give me an idiots guide to SAN interfaces? I am looking at various san's on HP website and I have no idea how they actually connect to a server?
The two main choices will be iSCSI or Fibre Channel; I'm not sure if you'd get anything fancy like FCoE or InfiniBand at the lower end yet.
iSCSI works over normal Ethernet, although you would want a dedicated switch or VLAN for it. There is a CPU hit, though, and you can get NICs with TOE (TCP offload engine) to help reduce it, but again unlikely at the lower end. My 2012i has 4 x 1Gb ports, 2 for each controller, but they can't be teamed, and redundancy is a pain to set up on VMware, but it can be done.
There are also iSCSI HBAs, which move most of the load to a card like the fibre ones, but not many out there are supported.
Fibre Channel is more expensive and will need a supported HBA (host bus adaptor) and a special fibre channel switch. With VMware these also can't be teamed, but redundancy works automatically for most configs. There is much less overhead with fibre channel, and it has a number of big advantages over iSCSI, such as data flow control. Most would be 4 or 8Gb now.
I prefer fibre, but the switches are expensive, as are the HBAs.
My fibre will switch over to the redundant path much quicker than my iscsi connector.
In terms of redundancy, the blade centre has 6 PSUs; these will be fed from different UPSs, so if one UPS fails it won't take it all out, and the PSUs are set to run in redundant mode, which also helps. With VMware you have High Availability options, where if a blade goes down, within a few minutes (if not less) the VMs that were running on the failed blade will auto-restart on another blade in the group, based on how you set it all up, so it will sort itself out fairly well.
As an update to my post above and in terms of failover:
We have only half-filled our chassis with blades (4); we have all 6 fans and 4 of the 6 PSUs.
The PSUs feed back into 2 APC 3000 UPS towers on a split-load basis, 2 PSUs into one APC and 2 into the other.
The UPSs are then plugged into 2 separate circuits, so if one circuit fails the whole lot does not go down.
The blades are doubled up on everything for failover: CPUs, HDDs (SAS dual connect), NICs (teamed).
We have 2 interconnects (L2/3 switches in the chassis) set up for failover, active/passive, and 2 iLO modules, again active/passive.
If you have HA in VMware, then if a blade does fail it is a matter of moments before the VMs on that blade are restarting on another in that cluster. If you also add DRS and VMotion, the change can happen almost instantly and the VMs don't even go down; this is due to triggers that can be set on the resources of a blade and even on services within a VM.
I can appreciate that most schools will only use 30% of what VMware can do, but if you spend a small amount of time looking into the many whitepapers VMware has on disaster solutions, you will be amazed at what this stuff can really do.
I have spent at least 2 months setting up the failover and testing it (actually pulling a blade during normal operation) to make sure it all works.
By the way can anyone give me an idiots guide to SAN interfaces? I am looking at various san's on HP website and I have no idea how they actually connect to a server?
It's easy really, well, depending on your choice of connection: fibre or iSCSI.
I opted for iSCSI, and my 10TB SAN has 4 NICs on it as I have a dual-controller configuration (REALLY RECOMMENDED). This means I have redundant network links running on a 192.168.x.x subnet, and then there are 2 others; these act as a lifeline or heartbeat, if you will, connecting directly to the blade chassis and running on a 10.0.x.x subnet.
As DMcoy said above, you really want your own VLAN for this. I have iSCSI HBAs, and the NICs I have all have TOE, so that does help.
The first 2 allow for network connectivity in the normal way, but the second 2 allow VMware to use them for VMFS; if, like me, you use the SAN for VM image storage as well, this provides a dedicated internal route to the images.
Now your SAN drives are worth thinking about with my setup being as follows:
6 x 300GB SAS dual connect drives
6 x 750GB SATA dual connect drives
Each drive has 2 fibre connections, one to each controller, so if a controller fails, or even better one of the drive connections, the whole thing will still function.
Hope this helps a bit.
Last edited by ICTNUT; 19th November 2008 at 09:31 PM.
Hi John, everyone seems to have the hardware sorted, so I will not blab on about this or that. However, our biggest problem was coping with the extra data created by having so much space; we went from a 50MB RM CC3 allocation to 1GB for students and 10GB for teachers on a vanilla network. Everything generally ticks over OK, and you won't believe how you coped before you changed to VMware, but your backups will be the sticking point.

Symantec couldn't handle the workload, and unfortunately we didn't have money for any fancy backup server etc, so we just used the old servers and plonked some bigger drives in. We didn't have any reliable backup for over 6 months, as Symantec bombed out as and when it felt like it, so we bought a backup and monitoring product called Veeam and it works really well. It uses VMware snapshot technology and allows you to recover a full VM in no time at all, and it works! Plus, as the backups are incremental on top of the main backup, it only takes 20 mins to back up a fileserver after the original has been created.
I don't normally do plugs, as it's unprofessional and every experience is different, but we purchased it from a company called Blue Coffee Networks and I can highly recommend them; they even got us an HP 24TB SAS SAN for 6K when HP had a big offer on.