Windows Server 2012 Thread: 'Size on Disk' has to be wrong (in Technical)
  1. #1

    localzuk's Avatar
    Join Date
    Dec 2006
    Location
    Minehead
    Posts
    18,530
    Thank Post
    527
    Thanked 2,648 Times in 2,049 Posts
    Blog Entries
    24
    Rep Power
    925

    'Size on Disk' has to be wrong

    Overall, dedupe is working great for us. However, if I go to an individual file (for example, an image of a PC captured in Clonezilla), it tells me the file is 256 MB in Size and 4 KB in Size on Disk.

    The latter *must* be wrong. There are no other copies of this file on the disk.

    So, somewhere, there's something odd going on. Any ideas what is causing this?

  2. #2

    sonofsanta's Avatar
    Join Date
    Dec 2009
    Location
    Lincolnshire, UK
    Posts
    5,375
    Thank Post
    958
    Thanked 1,630 Times in 1,103 Posts
    Blog Entries
    47
    Rep Power
    711
    Dedupe works on sectors, not files, so it could have enough similar data to another file, perhaps?

    4KB sounds too neat though, as I'm guessing that would be the block size on the HDD - so it could actually be almost 0 in size.

  3. #3

    localzuk's Avatar
    Ah, just read up on it - it uses 'sub file chunking', so not sector dedupe as such. It chops files up into 32-128 KB chunks and works on those.
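
For a sense of scale, here is a rough estimate of how many chunks a single file would be split into. It is only a sketch: it assumes the 32-128 KB chunk range quoted above, binary units, and the 256 MB image from the first post; the variable names are made up for illustration.

```python
# Rough chunk-count estimate for one file, assuming the 32-128 KB chunk
# size range mentioned above and binary units (1 KB = 1024 bytes).
KB = 1024
MB = 1024 * KB

file_size = 256 * MB          # the Clonezilla image from the first post
min_chunk = 32 * KB
max_chunk = 128 * KB

fewest_chunks = file_size // max_chunk   # every chunk at the maximum size
most_chunks = file_size // min_chunk     # every chunk at the minimum size

print(f"between {fewest_chunks} and {most_chunks} chunks")
```

Either way, that is thousands of chunks for one image, each of which would have to match a chunk somewhere else on the volume to be saved.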

    However, it's definitely not right - if I select everything on the deduped disk, I get a 'size on disk' of 3.89 GB and a size of 400 GB. If I go to disk properties, it lists used space as 284 GB, and Server Manager says my dedupe savings are 148 GB...
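
To make the mismatch concrete, here is a quick sanity check of the figures quoted above. It is only a sketch: the variable names are made up for illustration, and it assumes all four numbers are in the same units as displayed.

```python
# Figures quoted in the post above, all in GB (assumed to be the same units).
logical_size = 400.0      # 'Size' with everything on the volume selected
size_on_disk = 3.89       # 'Size on disk' for the same selection
used_space = 284.0        # used space from disk properties
dedupe_savings = 148.0    # savings reported by Server Manager

# If the savings figure is right, used space plus savings should be
# roughly the logical size of the data.
implied_logical = used_space + dedupe_savings
gap_vs_logical = implied_logical - logical_size

# How far 'size on disk' is from the used space it would be expected to match.
shortfall = used_space - size_on_disk

print(f"used + savings = {implied_logical:.0f} GB (vs {logical_size:.0f} GB selected)")
print(f"difference     = {gap_vs_logical:.0f} GB")
print(f"'size on disk' is {shortfall:.2f} GB below used space")
```

Used space plus savings lands within a few tens of GB of the selected size, whereas the 'size on disk' total is off by two orders of magnitude.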

  4. #4

    sonofsanta's Avatar
    I would imagine that if you took the number of files on the drive and multiplied it by 4 KB, you'd get 3.89 GB.

    No idea why though, not done dedupe yet. It sounds very much like a bug.

  5. #5


    Join Date
    Mar 2009
    Location
    Leeds
    Posts
    7,060
    Thank Post
    232
    Thanked 926 Times in 795 Posts
    Rep Power
    309
    Perhaps it's looked inside a load of similar capture files, found umpteen hal.dll's etc., and can recognise that they are part of that file when it's used?

  6. #6

    localzuk's Avatar
    Nah, it's student files, images of computers, WSUS files, shared files and software installers. So it's definitely malfunctioning.

  7. #7


    But computer images are compressed single files. If it's a format it understands, it might see that image1.gho, image2.gho etc. all contain hal.dll.

  8. #8

    localzuk's Avatar
    Quote: Originally posted by sted
    But computer images are compressed single files. If it's a format it understands, it might see that image1.gho, image2.gho etc. all contain hal.dll.
    Yes, but they are all very different images - I have a Citrix thin client image (the default that came from a thin client), an Ubuntu image, and a Windows image. There's no overlap.

    The issue still remains - nearly all files report 4 KB as their size on disk. The numbers above prove there's something wrong. The 3.89 GB 'size on disk' total should be 284 GB, since there are only 148 GB of savings.

  9. #9

    tmcd35's Avatar
    Join Date
    Jul 2005
    Location
    Norfolk
    Posts
    6,069
    Thank Post
    902
    Thanked 1,013 Times in 825 Posts
    Blog Entries
    9
    Rep Power
    350
    Silly thought: aren't the 4 KB files you're looking at just pointer records to where the data is actually being stored?

    After seeing this thread I've tried to do some reading up to see if it's worth implementing here (no real need (yet)), and thought of one question. Is it working at a binary level, or is it seeing something more in the data before it decides if it needs deduplicating?

    What I mean is - if it chops a file into 32 KB binary chunks, it's entirely possible that a music file might have the same 32 KB chunk as a Word document or a movie file. Of course that chunk might appear at a different location in the other file's data stream, and the data will be interpreted differently by a running program accordingly, but there still only needs to be one copy of that 32 KB chunk?

    Eg. File A (.mp3) might use chunks - A, B, E, J, K, L, F
    and File B (.doc) might be made of chunks - Z, W, R, S, A, T

    for instance? Only one copy of chunk A is needed for the two files?
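
The example above can be sketched as a tiny chunk store. This is an illustration of the idea only, not how Windows actually stores chunks: the letters stand in for chunk contents, and `chunk_store` is a made-up name.

```python
# Hypothetical chunk layout taken from the example above; the letters stand
# in for 32 KB chunks of data, not real file contents.
file_a = ["A", "B", "E", "J", "K", "L", "F"]   # File A (.mp3)
file_b = ["Z", "W", "R", "S", "A", "T"]        # File B (.doc)

chunk_store = {}   # chunk id -> number of references to it
for chunks in (file_a, file_b):
    for chunk in chunks:
        chunk_store[chunk] = chunk_store.get(chunk, 0) + 1

total_references = len(file_a) + len(file_b)
unique_chunks = len(chunk_store)
shared = sorted(c for c, refs in chunk_store.items() if refs > 1)

print(f"{total_references} chunk references, {unique_chunks} chunks stored")
print(f"shared chunks: {shared}")
```

Each file keeps its own ordered list of chunk references, so chunk A can sit at a different position in each file while being stored only once.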

  10. #10

    localzuk's Avatar
    Indeed, that's how it works, but as I say, I should still have 284GB of 'size on disk', but only get 3.89GB.

  11. #11

    sonofsanta's Avatar
    Silly question, but... have you tried restarting your server? It just "feels" (for what that's worth) like a display bug, or miscalculation, and it might resolve itself if Google has no answers.

    "nearly all files are 4 KB" - does there seem to be any pattern to those that are OK? I'd have thought it'd affect them all.

  12. #12

    tmcd35's Avatar
    More silly questions...

    Have you checked the data is there? Can you go to any file at random and view it as if it hadn't been deduped?

    If so, I go back to wondering if - for whatever reason - the 4 KB files and 3.89 GB you are seeing are little more than the pointers to the actual data. The hard disk itself is reporting 284 GB of used space, and dedupe is saying that you should have used 432 GB of space (284 GB used plus 148 GB saved)?

    So that suggests you have a little over 1 million files on your file system? (3.89 GB / 4 KB)
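
The division behind that estimate, as a quick sketch. It assumes binary units (1 KB = 1024 bytes) and that every file reports exactly 4 KB; the variable names are made up for illustration.

```python
KB = 1024
GB = 1024 ** 3

total_size_on_disk = 3.89 * GB   # 'size on disk' for the whole selection
per_file_size = 4 * KB           # what nearly every file reports

# Number of files needed to reach the total if each one took exactly 4 KB.
implied_files = total_size_on_disk / per_file_size
print(f"about {implied_files:,.0f} files")
```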

  13. #13

    localzuk's Avatar
    It's not quite a complete match - the data is 400 GB almost exactly, so the numbers don't quite add up.

    The disk has about 400k files on it.

    The server has been rebooted multiple times too.
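
Plugging the 400k file count into the same arithmetic shows how far off a flat 4 KB per file would be. Again only a sketch, assuming binary units and treating "about 400k" as exactly 400,000; the variable names are made up for illustration.

```python
KB = 1024
GB = 1024 ** 3

total_size_on_disk = 3.89 * GB   # 'size on disk' for the whole selection
file_count = 400_000             # "about 400k files"
cluster = 4 * KB                 # what nearly every file reports

# Average 'size on disk' per file, and the total if every file took 4 KB.
average_per_file = total_size_on_disk / file_count
expected_if_all_4kb = file_count * cluster

print(f"average size on disk per file: {average_per_file / KB:.1f} KB")
print(f"400k files at 4 KB each: {expected_if_all_4kb / GB:.2f} GB")
```

The average comes out at more than one 4 KB cluster per file, which fits "nearly all" rather than all files reporting 4 KB.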


