*nix Thread: ZFS dedupe performance hit? (and implications for dedupe on 7110), in Technical
  1. #1
    pete

    ZFS dedupe performance hit? (and implications for dedupe on 7110)

    Reading the OpenSolaris forums, I notice a few people are seeing significant performance hits when turning on dedupe (OpenSolaris Forums: [zfs-discuss] ZFS Dedup Performance ... and Re: [zfs-discuss] Troubleshooting dedup performance).

    The "fix" appears to be loads of RAM or an SSD for cache, though it looks like it still needs work ([zfs-discuss] $100 SSD = >5x faster dedupe).

    I'm building out a test VM now to see if I get similar issues.

    a) Has anyone else run into this?
    b) Bearing in mind the 7110s have no SSD, will it be sensible to turn on dedupe?

  2. #2
    apaton
    I've played with dedupe, but only under OpenSolaris build 128a running on VirtualBox, and I haven't done any performance testing yet. I'm building an X4200 with OpenSolaris b132 next week; I will let you know how I get on with performance.

    With regard to the OpenSolaris forum posts listed, I'm struggling to get my head around how an SSD used as L2ARC improves performance with dedupe. L2ARC is a read cache, and I thought the performance hit would come from writing deduped data to disk, not from reading it. So some homework for me to do.

    With reference to the 7110, I feel the limitation will be a combination of the 8GB maximum memory and the lack of an SSD for L2ARC. (L2ARC is a secondary read cache behind main system memory.)

    But we are still in development and the code will change, so it's a case of wait and see.

    Andy

  3. #3
    apaton
    Basic performance testing of dedupe

    Dedup on a Sun X4200: 2 x dual-core Opteron 2.6GHz, 8GB memory and 4 x 73GB SAS drives (10K rpm)

    OS
    Code:
    apaton@osol:/export/home/iso/linux$ uname -a
    SunOS osol 5.11 snv_131 i86pc i386 i86pc Solaris

    Created two ZFS pools (dedupe, zfs) and a UFS filesystem:

    • /dedupe - ZFS with Dedup enabled (SHA256 - Verify off)
    • /zfs - ZFS with Dedup disabled (Defaults)
    • /ufs - UFS filesystem (Defaults)


    Data files (7.5GB in total; the CentOS-4.4.ServerCD-x86_64 image is present three times):

    Code:
    apaton@osol:/export/home/iso/linux$ ls -ilh *
    462 -rwxrwxrwx 1 root   root  601M 2010-02-01 16:31 CentOS-4.4.ServerCD-x86_64_A.iso
    463 -rwxrwxrwx 1 root   root  601M 2010-02-01 16:48 CentOS-4.4.ServerCD-x86_64_B.iso
    459 -rwxrwxrwx 1 root   root  601M 2010-02-01 13:28 CentOS-4.4.ServerCD-x86_64.iso
    457 -rwxrwxrwx 1 root   root  3.6G 2010-02-01 13:27 CentOS-5.1-i386-bin-DVD.iso
    455 -rwxrwxrwx 1 root   root  628M 2010-02-01 13:25 rhel-3-u6-i386-es-disc2.iso
    464 -rwxr-xr-x 1 apaton staff  68M 2010-02-01 23:27 rhel-3-u7-i386-as-disc1.iso
    461 -rwxrwxrwx 1 root   root   36M 2010-02-01 13:29 rhel-5.2-server-x86_64-dvd.iso
    453 -rwxrwxrwx 1 root   root  177M 2010-02-01 13:22 RHEL4-U2-x86_64-ES-disc1.iso
    458 -rwxrwxrwx 1 root   root  699M 2010-02-01 13:28 ubuntu-8.10-desktop-i386.iso
    456 -rwxrwxrwx 1 root   root  638M 2010-02-01 13:25 ubuntu-8.10-server-i386.iso

    Initial copy of files to the dedup-enabled ZFS dataset:
    Code:
    apaton@osol:/export/home/iso/linux$ pfexec ptime tar cf - . | pv | ( cd /dedupe; tar xf - )
    real     5:28.688240194
    user        0.572326624
    sys        14.953198730
    7.53GB 0:05:28 [23.5MB/s]

    Initial copy of files to the default ZFS dataset:
    Code:
    apaton@osol:/export/home/iso/linux$ pfexec ptime tar cf - . | pv | ( cd /zfs; tar xf - )
    real     2:37.843718899
    user        0.575755074
    sys        14.397860363
    7.61GB 0:02:37 [49.3MB/s]

    Initial copy of files to the default UFS filesystem:
    Code:
    apaton@osol:/export/home/iso/linux$ pfexec ptime tar cf - . | pv | ( cd /ufs; tar xf - )
    real     2:53.149996234
    user        0.711298116
    sys        14.034587988
    7.61GB 0:02:53 [  45MB/s]

    Third copy of files to the dedup-enabled ZFS dataset:
    Code:
    apaton@osol:/export/home/iso/linux$ pfexec ptime tar cf - . | pv | ( cd /dedupe/d3; tar xf - )
    real     2:49.402522619
    user        0.859008763
    sys        11.766251301
    7.53GB 0:02:49 [45.5MB/s]
    
    apaton@osol:/$ zpool list dedupe
    NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
    dedupe    68G  6.38G  61.6G     9%  3.56x  ONLINE  -
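
    Summary

    The results above look conclusive: a ZFS dataset with dedup enabled slows the initial write down dramatically (23.5MB/s, against 49.3MB/s on plain ZFS and 45MB/s on UFS). But notice that performance on dedupe did improve on the second and third copies; the third copy ran at 45.5MB/s.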
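
As a sanity check on the figures above, here is a minimal Python sketch that recomputes the throughput from the ptime and pv output and estimates the dedup ratio. It assumes pv's "GB" means GiB, and that /dedupe ended up holding three full copies of the data (only the first and third are shown in the post); the function names are made up for illustration.

```python
# Figures copied from the ptime/pv output in the post.
# Assumption: pv's "GB" is really GiB (binary units).
RUNS = {
    "dedupe, first copy": (7.53, "5:28.688240194"),
    "zfs, first copy": (7.61, "2:37.843718899"),
    "ufs, first copy": (7.61, "2:53.149996234"),
    "dedupe, third copy": (7.53, "2:49.402522619"),
}


def ptime_to_seconds(real):
    """Convert ptime's 'm:ss.fraction' real-time field to seconds."""
    minutes, seconds = real.split(":")
    return int(minutes) * 60 + float(seconds)


def throughput_mib_s(gib, real):
    """Average throughput in MiB/s for a copy of `gib` GiB."""
    return gib * 1024 / ptime_to_seconds(real)


def estimated_dedup_ratio(copy_gib, dup_mib, copies):
    """Logical data divided by unique data.

    Ignores metadata and partial-block effects, so it is only an estimate.
    """
    logical_mib = copy_gib * 1024 * copies
    unique_mib = copy_gib * 1024 - dup_mib
    return logical_mib / unique_mib


if __name__ == "__main__":
    for name, (gib, real) in RUNS.items():
        print(f"{name:20s} {throughput_mib_s(gib, real):5.1f} MiB/s")
    # Within one copy, two of the three 601M CentOS 4.4 ISOs are redundant.
    # Assumption: three copies in total were written to /dedupe.
    ratio = estimated_dedup_ratio(7.53, 2 * 601, 3)
    print(f"estimated dedup ratio: {ratio:.2f}x")
```

The estimate comes out at about 3.55x, close to the 3.56x that zpool list reports, and the first dedupe copy took roughly 2.1 times as long as the same copy to plain ZFS.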

    I've researched the critical performance factors for dedup, and memory seems to be the key one. The deduplication table (DDT) needs to be held in memory (ARC), or at least on SSD (L2ARC/Readzilla). Simple answer: the more hardware the better!
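
To get a feel for how much memory the DDT might want, here is a rough sizing sketch. The figure of about 320 bytes per in-core DDT entry is a commonly quoted rule of thumb, not something measured in this test, and the pool sizes and block sizes are hypothetical, so treat the output as an order-of-magnitude guide only.

```python
# Assumption: roughly 320 bytes of RAM per in-core DDT entry (rule of thumb).
DDT_ENTRY_BYTES = 320

GIB = 1024 ** 3
TIB = 1024 ** 4


def ddt_ram_bytes(unique_bytes, avg_block_bytes, entry_bytes=DDT_ENTRY_BYTES):
    """Rough RAM needed to hold the whole DDT: one entry per unique block."""
    blocks = -(-unique_bytes // avg_block_bytes)  # ceiling division
    return blocks * entry_bytes


if __name__ == "__main__":
    # Hypothetical pools: same amount of unique data, different block sizes.
    for label, unique, block in [
        ("1 TiB unique, 128 KiB blocks", TIB, 128 * 1024),
        ("1 TiB unique, 8 KiB blocks", TIB, 8 * 1024),
    ]:
        print(f"{label}: {ddt_ram_bytes(unique, block) / GIB:.1f} GiB")
```

With these assumptions, 1 TiB of unique data needs about 2.5 GiB of DDT at 128 KiB blocks but about 40 GiB at 8 KiB blocks, which is why small-block workloads are the worry on a box limited to 8GB.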

    These are the same findings Pete pointed out in his post. (I now understand them better.)

    So in my opinion the 7110 will give usable but not great performance for dedup. A Sun 7310 or above will be the best option for serious dedupe performance requirements.

    Andy
    Last edited by apaton; 2nd February 2010 at 11:17 PM. Reason: spelling mistake

  4. 3 Thanks to apaton:

    CyberNerd (2nd February 2010), j17sparky (3rd February 2010), pete (2nd February 2010)

  5. #4
    webman
    It would be interesting to see if there was a difference in the initial write speed for lots of smaller files (but similar large overall size) instead of ISOs. For example, lots of office documents, images and audio files to represent as close to a "real world" environment as possible.

  6. #5
    dhicks
    Quote: Originally Posted by pete
    The "fix" appears to be loads of ram or an SSD for cache

    I've just been investigating using OpenSolaris for our backup server. Our backup system currently handles de-duplication at the file level, calculating a hash value for each file, but block-level de-duplication would seem rather more space-efficient. Searching Google, I came across this post:

    ZFS Deduplication : Jeff Bonwick's Blog

    It points out that you can change the hashing algorithm used from SHA256 to, say, fletcher4, but you should then make sure you turn on verification, i.e. make sure the file system checks for hash collisions. If memory or CPU performance is an issue, you could pick a hashing algorithm that produces a range that fits better into RAM and/or takes less time to calculate.
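
For anyone wondering what block-level dedupe with verification means in practice, here is a toy Python model. It is only a sketch of the idea, not how ZFS implements it: the DedupStore class, its in-memory dictionaries and the tiny block size are all invented for illustration.

```python
import hashlib


class DedupStore:
    """Toy block store that keeps one copy of each distinct block."""

    def __init__(self, block_size=4096, verify=True):
        self.block_size = block_size
        self.verify = verify
        self.blocks = {}    # checksum -> block bytes (stands in for DDT + disk)
        self.refcount = {}  # checksum -> number of references to that block

    def write(self, data):
        """Split data into blocks and return the checksums referencing them."""
        refs = []
        for off in range(0, len(data), self.block_size):
            block = data[off:off + self.block_size]
            key = hashlib.sha256(block).digest()
            stored = self.blocks.get(key)
            if stored is None:
                self.blocks[key] = block  # first time this content is seen
            elif self.verify and stored != block:
                # Verification: same checksum but different bytes. A real
                # filesystem would store the block separately; this toy
                # simply refuses the write.
                raise ValueError("checksum collision: blocks differ")
            self.refcount[key] = self.refcount.get(key, 0) + 1
            refs.append(key)
        return refs

    def read(self, refs):
        """Reassemble data from a list of block checksums."""
        return b"".join(self.blocks[key] for key in refs)

    def dedup_ratio(self):
        """Referenced blocks divided by blocks actually stored."""
        if not self.blocks:
            return 1.0
        return sum(self.refcount.values()) / len(self.blocks)


if __name__ == "__main__":
    store = DedupStore(block_size=4)
    image = b"AAAABBBBAAAA"
    first = store.write(image)
    store.write(image)  # second copy adds references but no new blocks
    print(len(store.blocks), "blocks stored, ratio", store.dedup_ratio())
    assert store.read(first) == image
```

Every write has to look up the checksum in the table before it can decide whether to store the block, which is why a table that does not fit in memory hurts write speed.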

    Going by the post above, block-level de-duplication has only been available in ZFS file systems since November just gone. The OpenSolaris install available on Sun's site seems to be from back in June. Can I actually use de-duplication right now on OpenSolaris? Does it install straight from the CD, or do I have to do something complicated to install de-duplication support?

    --
    David Hicks

  7. #6
    dhicks
    Quote: Originally Posted by dhicks
    you can change the hashing algorithm used from SHA256 to, say, fletcher4, but make sure you turn on verification

    Okay, scratch that, just read the FAQ - seems like that's not actually working yet.

    --
    David Hicks

  8. #7
    pete
    Quote: Originally Posted by dhicks
    Can I actually use de-duplication right now on OpenSolaris, does it install straight from the CD, or do I have to do something complicated to install de-duplication support?

    Look here for newer builds: Genunix

    I also found the EON NAS build, but only really poked around with it to see if it worked.

    I suppose we could always campaign for Sun to let us bump up the RAM on the 7110s. It's not like there aren't free slots in the chassis.

    On ZFS I was hoping for a delayed dedupe functionality (as you can have with file-based dedupe), so it could spend the weekend/overnight sorting out duplicates.
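
For comparison, a delayed file-based pass of the kind mentioned above can be sketched in a few lines of Python. This is a hypothetical overnight job, not a ZFS feature: find_duplicates and file_sha256 are names invented here, and the script only reports duplicate files rather than doing anything about them.

```python
import hashlib
import os
import tempfile
from collections import defaultdict


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large ISOs need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return sorted lists of paths under root with identical contents."""
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a file with a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_sha256(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups


if __name__ == "__main__":
    # Small self-contained demo with made-up file names.
    with tempfile.TemporaryDirectory() as demo:
        for name, data in [("a.iso", b"x" * 100),
                           ("b.iso", b"x" * 100),
                           ("c.iso", b"y" * 100)]:
            with open(os.path.join(demo, name), "wb") as handle:
                handle.write(data)
        for group in find_duplicates(demo):
            print([os.path.basename(p) for p in group])
```

Grouping by size first means only files that could possibly match get hashed, which keeps a weekend scan cheaper than hashing everything.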

  9. 2 Thanks to pete:

    dhicks (2nd February 2010), Soulfish (2nd February 2010)

  10. #8
    dhicks
    Quote: Originally Posted by dhicks
    Can I actually use de-duplication right now on OpenSolaris, does it install straight from the CD, or do I have to do something complicated to install de-duplication support

    Okay, so from reading the de-duplication FAQ, it seems I need "SXCE build 128 or 129". Does anyone know what SXCE is or how I go about installing it on a PC or virtual machine?

    --
    David Hicks

  11. #9
    Quote: Originally Posted by pete
    I suppose we could always campaign for Sun to let us bump up the ram on the 7110s. It's not like there aren't free slots in the chassis.

    I'd agree that it'd be a real shame if we couldn't use dedupe on the 7110 series. Here's hoping that we're allowed to bump the RAM up!

  12. #10


    Is anyone using ZFS deduplication with Samba home directories? We've been using Samba for some time now, and I'd certainly consider a Solaris migration; I just wondered if anyone had had success here.

  13. #11
    apaton
    Quote: Originally Posted by dhicks
    Okay, so from reading the de-duplication FAQ, it seems I need "SXCE build 128 or 129". Does anyone know what SXCE is or how I go about installing it on a PC or virtual machine?

    Build 128 was the first to have dedupe; 128a was released to fix a bug in fletcher4.
    The current binary release is build 131.

    To upgrade to the latest build release:

    pfexec pkg set-authority -O http://pkg.opensolaris.org/dev/ opensolaris.org
    pfexec pkg image-update

    Andy

  14. Thanks to apaton from:

    dhicks (2nd February 2010)

  15. #12
    dhicks
    Quote: Originally Posted by dhicks
    you can change the hashing algorithm used from SHA256 to, say, fletcher4, but make sure you turn on verification
    Quote: Originally Posted by dhicks
    Okay, scratch that, just read the FAQ - seems like that's not actually working yet.
    Quote: Originally Posted by apaton
    Build 128 was first to have Dedupe, 128a was released to fix bug in Fletcher4.

    Okay, un-scratch the above, then; it seems it does work. If / when I get OpenSolaris working I guess I'll find out myself. No luck so far: EON booted just fine as a live CD but seems to be missing instructions on how to get it on to my hard drive, the OpenSolaris CD didn't get past GRUB, and I managed to screw up the CentOS install (to run OpenSolaris as a Xen VM) 3 times in one day. I'll try again tomorrow.

    --
    David Hicks

  16. #13
    apaton
    Quote: Originally Posted by dhicks
    ... and I managed to screw up the CentOS install to run OpenSolaris as a Xen VM 3 times in one day. I'll try again tomorrow.

    Give VirtualBox a try, worked first time for me.

  17. #14
    pete
    Quote: Originally Posted by dhicks
    Okay, un-scratch the above, then, it seems it does work. If / when I get OpenSolaris working I guess I'll find out myself - no luck so far, EON booted just fine as a live CD but seems to be missing instructions on how to get it on to my harddrive,

    Start here: EON ZFS Storage (NAS), halfway down the page. Specifically:
    After the image (eon.iso) is burned to a CD and booted, the login info is:
    user: admin pass: eonstore
    user: root pass: eonsolaris

    Type and run the following. This script prompts the user through configuration questions like hostname, IP/DHCP, netmask, domain name and more. This step will ask questions to configure and ID the system for live image use.
    # /usr/bin/setup

    This step is optional but necessary if the configuration changes made are to be preserved beyond a reboot or power off. This requires a writable destination, USB or CF drive attached before the command is run. The command will facilitate formatting and installing the live image (image on the CD) to the USB or CF drive.
    # /usr/bin/install.sh

    This step should be done after install.sh, or whenever configuration changes made to the image need to be preserved. It preserves the original image to /mnt/eonX/boot/x86.eon.orig (bootable by the OEM choice from GRUB) and saves a new default boot image to /boot/x86.eon. It will move the live image to x86.eon.1, x86.eon.2 and so on each time it is run.
    # /usr/bin/updimg.sh

  18. Thanks to pete from:

    dhicks (3rd February 2010)

  19. #15
    Quote: Originally Posted by pete
    I also found the EON nas build, but only really poked around with it to see if it worked.

    I've found EON to be by far the easiest and most reliable for doing testing in VirtualBox.
    There's a web interface for it too. No substitute for the CLI quite yet, but great for doing general maintenance on an OS you will most likely hardly ever touch.

    // napp-it free ZFS NAS-SAN-Server: installed quickly - ready to run - easy to manage
