Duke (29th April 2010)
My basic best-practice overview.
This is how I think about things today; it may well change tomorrow!
If the environment requires small/random (<32k) I/O then Mirror is best.
Databases fit into this area, as does VMware with iSCSI.
VMware's VMFS is usually random I/O (typically around 5-10 VMs per VMFS).
Never used; people feel hard done by when they lose 50% of their capacity, never mind 66%.
RAIDZ & RAIDZ2
Ideal for large sequential reads/writes (>128k), e.g. file shares and backup images.
Paranoid users only, or when 2TB drives are available in the S7000.
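The capacity trade-off behind these RAID choices can be sketched as a quick calculation. This is only an approximation: it ignores spares, metadata and filesystem overhead, and the profile names and the 6-disk RAIDZ2 stripe are illustrative assumptions, not appliance settings. A two-way mirror gives the 50% loss mentioned above, and a three-way mirror gives roughly two thirds.

```python
def usable_fraction(profile: str, width: int = 0) -> float:
    """Approximate fraction of raw capacity left for data.

    Ignores spares, metadata and filesystem overhead (assumption).
    `width` is the number of disks per RAIDZ stripe, parity included.
    """
    if profile == "mirror":
        return 1 / 2
    if profile == "triple_mirror":
        return 1 / 3
    parity = {"raidz": 1, "raidz2": 2, "raidz3": 3}
    if profile in parity:
        if width <= parity[profile]:
            raise ValueError("stripe must be wider than its parity count")
        return (width - parity[profile]) / width
    raise ValueError(f"unknown profile: {profile}")


# Hypothetical layouts, for comparison only.
for name, width in [("mirror", 0), ("triple_mirror", 0), ("raidz2", 6)]:
    lost = 1 - usable_fraction(name, width)
    print(f"{name}: about {lost:.0%} of raw capacity lost")
```

The wider the RAIDZ stripe, the less capacity goes to parity, which is part of why parity layouts are tempting even though they are weaker for random I/O.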
Ideal for Archive/Backup data.
Best on 7310/7410 with at least 1 Readzilla (100GB L2ARC).
Test before use, seriously consider compression as an alternative
Record (block) size is important: smaller gives a better dedup ratio, larger gives better overall system performance. (I'm still learning what the optimal number is for this, if any.)
Don't use when performance is paramount (e.g. databases).
Ideal for backups and VM templates.
Lower latency for random I/O - Oracle, MS Exchange and MS SQL (physical or VMs).
Use a MIRROR RAID level.
Set synchronous write bias to Latency (use an SSD ZIL if available).
Consider record (block) sizes of 8k (data/client dependent).
Higher latency, but good for sequential reads/writes.
Ideal for Linux home directories
VMware for general machines and VDI.
This caches writes to volatile memory; if you suffer a power failure you can lose data.
The performance increase is very tempting, particularly when you don't have a Logzilla.
The choice is yours.
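As a rough summary, the rules of thumb above could be written down like this. It is a sketch only: the function name and dictionary keys are made up for illustration, the 32k/128k thresholds are the ones quoted in the post, and pairing sequential workloads with a "throughput" write bias is my reading of the notes rather than something stated outright.

```python
from typing import Optional


def suggest_settings(io_size_kb: int, random_io: bool) -> dict:
    """Map a workload to the rule-of-thumb settings from the overview.

    Returns None values where the overview gives no clear guidance.
    """
    suggestion: dict[str, Optional[object]] = {
        "raid": None,
        "write_bias": None,
        "record_size_kb": None,
    }
    if random_io and io_size_kb < 32:
        # Small random I/O: databases, Exchange, VMware over iSCSI.
        suggestion.update(raid="mirror", write_bias="latency", record_size_kb=8)
    elif not random_io and io_size_kb > 128:
        # Large sequential I/O: file shares, backup images.
        # "throughput" here is an assumption, see the note above.
        suggestion.update(raid="raidz2", write_bias="throughput")
    return suggestion


print(suggest_settings(8, random_io=True))
print(suggest_settings(256, random_io=False))
print(suggest_settings(64, random_io=True))  # in between: test before deciding
```

Anything that falls between the two cases comes back empty on purpose; that is where testing with the real workload matters most.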
Duke (29th April 2010)
Very interesting read Andy. I currently have a 7110 which hosts a mix of Windows and Ubuntu VMs (XenServer). I've got things like WSUS, SAV, Fog, and Sharepoint running on this gear and it generally runs just like on bare metal. My Exchange 2007 box though is laggy when accessing it and I do see the odd disconnect when using Outlook. I'm using an NFS share to host all the VMs, with the default share settings. I am also using an iSCSI LUN on my SQL box to back up the DBs to it. Again, with the default settings. I have a dedicated Procurve assigned to SAN duties, and use LACP to create a 3 NIC datalink to the switch. Unfortunately XenServer does not use Jumbo frames, so I cannot enable that feature.
We are moving into a new building in the fall and I've asked for, and been given the go ahead, on getting a 7310 w/ a 4400 with 11 x 1TB disks and 1 x 18GB Logzilla. The existing VMs will migrate over to this box (the 7110 will be repurposed to user drives and backup). In addition I will be building a XenApp farm, which will utilize the 7310. Any future servers will be built on to this gear. I believe I have enough space for the near term and enough expandability for the long term.
I have to upgrade my SQL server this summer and was going to put it on hardware as the sluggishness of Exchange scared me away from using the 7110. Would the 7310 w/ Logzilla be enough to virtualize it, given all that I have going onto it? Should I be looking at a Readzilla as well? Or multiples of both? My budget is set but I may be able to get some things changed up, or purchase through my operating budget.
I mention above that I use default settings for all my shares/LUNs. This is because I've never seen a whitepaper, forum post, or blog on what tweaks would be best to use depending on the usage. Your post shines some light on it for me, but is anyone aware of up-to-date data on the subject?
I use the 7110 as well. I have not had much luck with it as far as write latency goes with my VMs. I did have a Citrix XenApp farm running off of it for about 2 days until I had to move it to local storage on the XenServers, ruining my HA, because of the horrible latency the users were experiencing.
I primarily use NFS with the exception of a couple of Exchange datastores using iSCSI. Network utilization is low and I am also using LACP/trunking. We are also looking for a new NAS/SAN solution. I am interested in getting the 7320 and moving the 7110 to a remote site, making use of the remote replication functionality to create an off-site disaster recovery location.
I think I screwed the pooch with the 7110 since the beginning choosing Double Parity Raid for the pool, which I think killed my write performance.
If I were to use this in a DR scenario and change the pool to RAID 10/Mirror, do you think I could pull some decent performance out of this thing for around 15 VMs, 4 of them being terminal servers, 2 SQL servers, 1 Exchange server, and other various small VMs? I'm not looking for native performance since it would be a DR scenario, just usable performance. Depending on when during the month the DR happens, the SQL servers are not that hard hit.
Love all the features of the 7110, but so far write latency is a big problem for me.
Double Parity and wide RAIDZ zpools are very poor for random I/O.
I agree: by recreating the pool as a mirrored zpool and enabling the write cache you will get a significant performance increase and reduced write latency.
Plus look at:
- Filesystem alignment, especially with SQL/Exchange on Windows 2003 (see Sun Storage 7000 Series - Articles & Blueprints).
- Latest firmware.
andy0789 (21st April 2011)
Just as a followup for the 7110, I switched one of my virtual machines from NFS to iSCSI with write cache enabled and have seen significant speed improvements.
VMware was reporting the following averages before and after the switch (taken from the same time period on separate days):
Before (NFS):
140ms Write Latency
80ms Read Latency
After (iSCSI with write cache):
14ms Write Latency
34ms Read Latency
It could still do better, but at least it is an improvement. Changing the pool to Mirror instead of Double Parity Raid could only increase performance.
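To put those numbers in proportion, here is a quick calculation of the relative drop. It assumes the first pair of figures (140ms/80ms) is the NFS "before" reading and the second pair (14ms/34ms) is the iSCSI "after" reading, which is how I read the post.

```python
def latency_reduction_pct(before_ms: float, after_ms: float) -> float:
    """Percentage drop in latency going from before_ms to after_ms."""
    if before_ms <= 0:
        raise ValueError("before_ms must be positive")
    return 100 * (before_ms - after_ms) / before_ms


# Averages quoted in the post (assumed order: NFS first, iSCSI second).
write_drop = latency_reduction_pct(140, 14)
read_drop = latency_reduction_pct(80, 34)

print(f"write latency reduced by {write_drop:.1f}%")
print(f"read latency reduced by {read_drop:.1f}%")
```

The write side improves far more than the read side, which fits with the write cache being the main change.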
7110 on 2010.08.17.2.0,1-1.18
I was looking at both Datastore and Virtual Disk, the one I documented below is for the datastore. Both showed improvement, mostly with write latency.
In my environment I have two of the 1GbE interfaces on the 7110 using LACP over 2 interconnected Cisco 3750 switches. These two interfaces are dedicated to VLAN 11 for our ESX servers to use NFS or iSCSI.
When I created a LUN on the 7110 I enabled the "write cache" feature. I would suggest starting with a datastore using NFS, putting load on it (specifically writes) and seeing what your latency is, then switching to iSCSI with "write cache" enabled on the 7110 LUN and comparing the two. If you are like me, you will see the write latency drop.
Duke (3rd May 2011)
Continuing on with the 7110 Performance, I have attached a link to a report of my I/O operations per second. I am still stumped as to why my iSCSI, NFS, and SMB operations remain low but the Disk I/O is always around 2000 ops/sec. What are the disks doing? Is it due to the raid level (Double Parity Raid)? Any insight would be appreciated.
I've been having a chat with Oracle Support over a good number of issues, and we were looking at performance and the disk pool setup. The rule of thumb seems to be that the Protocol I/O should be 4x that of the Disk I/O; if the Disk I/O is higher than the Protocol I/O then the disk pool isn't set up efficiently. This is very key on the 7110, as it has only a small number of spindles to spread the load across and very limited memory.
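For anyone wanting to check their own analytics against that rule of thumb, a minimal sketch follows. It treats the 4x figure as it was relayed in this thread, not as an official formula; the function name, the wording of the verdicts and the 300 ops/sec protocol figure are made up for illustration, while the 2000 ops/sec disk figure is the one reported earlier in the thread.

```python
def assess_pool(protocol_ops: float, disk_ops: float, target: float = 4.0) -> str:
    """Compare protocol ops/sec with disk ops/sec using the 4x rule of thumb."""
    if disk_ops <= 0:
        raise ValueError("disk_ops must be positive")
    ratio = protocol_ops / disk_ops
    if ratio >= target:
        return f"ratio {ratio:.2f}: meets the rule of thumb"
    if ratio >= 1:
        return f"ratio {ratio:.2f}: below the 4x target"
    return f"ratio {ratio:.2f}: disk I/O exceeds protocol I/O, pool layout is suspect"


# Hypothetical protocol load against the ~2000 disk ops/sec seen on the 7110.
print(assess_pool(protocol_ops=300, disk_ops=2000))
# Hypothetical healthy case, the "other way around".
print(assess_pool(protocol_ops=8000, disk_ops=2000))
```

Averages over a full week are probably the fairest input, since the protocol load swings a lot during the day.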
My 7110 is suffering now from the poor performance issue, but my 7120 which has bags more ram and a Flash Accelerator card is much better and the I/O is the other way around.
As you can see from the graphs below (which I sent to Oracle to look at and confirm about the 4x rule) I have marked up the 7110 and 7120 on them and have only shown SMB, however both are doing NFS as well.
7120 Disk Protocol.jpg
7110 Disk Protocol.jpg
As that shows, the 7120 is working optimally in terms of Disk vs Protocol I/O (as confirmed by Oracle Support today when I sent the images to them) and the 7110 is under a great deal of strain. The graphs are the analytics for those options over 1 week, as SMB obviously peaks and troughs during the day.
Last edited by john; 10th May 2011 at 10:24 PM.
I have a problem with a Sun Storage 7110.
When a user does nothing more than open a shared folder, the SMB operations graph shows that he generates about 3k ops per sec.
The graph broken down by type of operation shows that the operations are NtTransact.
What could it be?
Last edited by Hanson; 25th October 2011 at 03:01 PM.
Firstly, restart the SMB service. It shouldn't take long to go down and come back up, but warn people there will be a short outage. If your SMB usage drops off then this is probably it, even if SMB starts to grow again.
There is a binary patch available for older releases:
However, you'd be better off just using the latest release, which includes it.
The binary fix for 6991334 on the 2010.08.17.1.0,1-0.0 release of the firmware contains two files.
These files must not be installed on an appliance running any release other than 2010.08.17.1.0,1-0.0 (2010Q3.1)
Look on the bright side, my 7410 hit 92,100 SMB IOPS while I had this bug.
The current version is 2010.08.17.1.1.1-1.16, but there are 2 new updates available (I haven't had time to install them; I think it's time to do it).
Restarting SMB doesn't help; it was the first thing I did.
And one question unrelated to this problem:
I have 13 disks in Double Parity RAID.
What is the limit on Disk I/O for that?
I think that bug affected up to 2010.08.17.1.0,1-0.0 and the release after that, so it could still well be that bug. Applying updates is definitely the first route to go and Oracle will ask you to do that anyway probably. Have you contacted Oracle support about this problem? Over in the UK their support has been great so far.
Last edited by Duke; 26th October 2011 at 11:30 AM.
Hanson (27th October 2011)
One more question:
can I install the latest 2010.Q3.4.2 directly,
or do the previous releases (2.0, 2.1, 3.1, 4.0, 4.1) need to be installed too?
2010.Q3.4.2 requires you to be running 2010.Q3.2.1.
You'll need to upgrade to that first from where you are.
Software Updates - Fishworks - wikis.sun.com
The release notes show you which versions to follow through.
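The stepping-stone idea can be sketched as a small lookup. Only one prerequisite is known from this thread (2010.Q3.4.2 needs 2010.Q3.2.1); the function name is made up, "2010.Q3.1.1" is my shorthand for the currently installed release, and the sketch assumes a release with no recorded prerequisite can be installed directly, which the release notes may contradict.

```python
def upgrade_path(installed: str, target: str, requires: dict[str, str]) -> list[str]:
    """Return the releases to install, in order, to get from installed to target.

    `requires` maps a release to the release that must be running before
    it can be applied. Releases missing from the map are assumed to have
    no prerequisite (assumption, check the release notes).
    """
    if installed == target:
        return []
    path = [target]
    seen = {target}
    # Walk backwards through prerequisites until one is already installed
    # or no further prerequisite is recorded.
    while path[0] in requires and requires[path[0]] != installed:
        prereq = requires[path[0]]
        if prereq in seen:
            raise ValueError("circular prerequisite chain")
        seen.add(prereq)
        path.insert(0, prereq)
    return path


# The single prerequisite mentioned in this thread.
known = {"2010.Q3.4.2": "2010.Q3.2.1"}
print(upgrade_path("2010.Q3.1.1", "2010.Q3.4.2", known))
```

With more prerequisites filled in from the release notes, the same walk would give the full chain of intermediate releases.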
Duke (28th October 2011)