Hardware Thread, NAS Box for backup in Technical; Originally Posted by dhicks
I think for block-level deduplication I'd prefer to have a file system that I could use ...
16th November 2011, 08:56 AM #16
That's fair enough. I don't have the space to do that for the amount of data we have! Even the 28TB would only keep a month or so's stuff in that case!
Originally Posted by dhicks
16th November 2011, 09:15 AM #17
No, that's the whole point of a deduplicating file system - you store replicated data once and have multiple pointers to it. At the block device / dedicated deduplicating file system level this means calculating a checksum on each block of data. This is the approach used by ZFS and a bunch of assorted FUSE-based filesystems available for Linux. Unfortunately, I found ZFS for Linux rather unreliable, and I think development has now stalled, and the FUSE-based systems available are experimental and/or horribly slow. Therefore, the best approach is probably file-level deduplication, where you simply calculate a checksum on each individual file and replace duplicates with a hard link. This won't work for a live filesystem (because you might end up changing a file that you think is a separate file but is actually a hard link to shared data), but it's ideal for a read-only backup.
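A minimal sketch of the file-level approach described above, assuming Python and SHA-256 as the checksum (neither is specified in the thread; `file_digest` and `dedupe_tree` are made-up names for illustration). It only works within a single filesystem, since hard links cannot cross filesystems, and linked files end up sharing ownership, permissions and timestamps.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def dedupe_tree(root):
    """Replace duplicate regular files under root with hard links.

    Returns the number of files that were replaced by a link.
    Intended for read-only backup trees, not live file systems.
    """
    seen = {}  # digest -> path of the first file seen with that content
    replaced = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            digest = file_digest(path)
            original = seen.get(digest)
            if original is None:
                seen[digest] = path
                continue
            if os.path.samefile(original, path):
                continue  # already a hard link to the same data
            # Create the link under a temporary name, then rename it over
            # the duplicate so the path never disappears part-way through.
            tmp = path + ".dedupe-tmp"
            os.link(original, tmp)
            os.replace(tmp, path)
            replaced += 1
    return replaced
```

Calling `dedupe_tree("/backup")` (a hypothetical path) would walk the tree once. A more cautious version might compare file contents byte-for-byte before linking rather than trusting the checksum alone.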
Originally Posted by glennda
rsync seems to operate in exactly the way you'd want for the above system, i.e. it creates a whole new file when updating an existing file, so it doesn't end up writing data into the linked file. Therefore, all you need is a very simple script that, every night, duplicates the previous day's backup folder into a new folder by duplicating the directory structure and creating hard links to files, then runs rsync to pull in any changes from your live file server. Each day you only use however much space it takes to duplicate the directory structure, plus space for changed files. I'm guessing your 28TB server could store daily backups of a 1TB file share going back several years.
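The nightly script described above could be sketched roughly as follows, assuming Python and an `rsync` binary on the PATH. The function names and any paths are hypothetical. `-a` and `--delete` are standard rsync options; the sketch deliberately avoids `--inplace`, because writing in place would modify the data shared with older backups.

```python
import os
import subprocess


def link_tree(src, dst):
    """Recreate src's directory structure under dst, hard-linking each file.

    Symlinks are not handled specially in this sketch; the rsync run
    afterwards is expected to fill in anything missing.
    """
    for dirpath, _dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            os.link(os.path.join(dirpath, name),
                    os.path.join(target_dir, name))


def rsync_command(source, dest):
    """Build the rsync call that brings dest up to date with source."""
    # Trailing slashes make rsync copy the contents of source into dest.
    return ["rsync", "-a", "--delete",
            source.rstrip("/") + "/", dest.rstrip("/") + "/"]


def nightly_backup(source, previous, today):
    """Hard-link yesterday's backup into today's folder, then rsync changes."""
    if os.path.isdir(previous):
        link_tree(previous, today)
    else:
        os.makedirs(today, exist_ok=True)  # first run: nothing to link from
    subprocess.run(rsync_command(source, today), check=True)
```

For example, `nightly_backup("/srv/share", "/backup/2011-11-15", "/backup/2011-11-16")` would leave unchanged files as extra links to the same data and store new copies only of files rsync had to rewrite. rsync also has a `--link-dest=DIR` option that produces a similar hard-linked result in a single step, which may be worth comparing.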
16th November 2011, 10:11 AM #18
Interestingly, I built a backup box a little while ago - 8x2TB disks (plus 2x80GB for the OS) with space for 2 more disks if I need them - total cost around £1.5k, though it would cost a lot more now due to HDD prices. The case was purchased from xcase (significantly cheaper than the Supermicros, a little lower build quality, but does the job), with low-power hard drives (it's only a backup server, we're not expecting significant IOPS), and I'm hitting fairly decent speeds (about 80-90MB/s sustained across a 1Gb link). OS-wise, I grabbed NexentaStor Community Edition, though Nexenta Core would have done the job as well. Actually, just remembered I blogged about it - Willog » Terabytes on a budget… 2U 14.5TB usable backup device - although I haven't updated that post with the OS HDDs (which increased the network throughput; the USBs were causing some issues on some aspect of the system).
Depending on what exactly you're backing up, @dhicks's idea is a good one; we're looking at backup software with dedupe capability because of our backup arrangements. An expansion of dhicks's idea could be to use the built-in ZFS snapshotting (so rsync just has to do the update, and you don't have it doing the move of older files), and possibly robocopy on Windows (to keep ACLs and use CIFS for file transfers).
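A rough sketch of the snapshot variant, assuming Python, the standard `zfs snapshot` and `rsync` commands, and a hypothetical dataset called `tank/backup` mounted at `/tank/backup` (none of these names come from the thread). rsync updates a single directory, and a dated snapshot preserves each night's state.

```python
import datetime
import subprocess


def snapshot_command(dataset, day):
    """Build the 'zfs snapshot' call for a dataset, named after the date."""
    return ["zfs", "snapshot", "{}@{}".format(dataset, day.isoformat())]


def backup_and_snapshot(source, mountpoint, dataset, day=None):
    """Update the backup in place with rsync, then take a dated snapshot."""
    day = day or datetime.date.today()
    subprocess.run(
        ["rsync", "-a", "--delete",
         source.rstrip("/") + "/", mountpoint.rstrip("/") + "/"],
        check=True,
    )
    subprocess.run(snapshot_command(dataset, day), check=True)
```

Something like `backup_and_snapshot("/srv/share", "/tank/backup", "tank/backup")` would then replace the nightly hard-link step, with older versions browsable via the snapshots rather than separate folders.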