# Christmas Lectures - compression

• 31st December 2008, 10:03 PM
dbrown
Christmas Lectures - compression
Just listening to RI Xmas lectures and heard an explanation of data compression which explained how numbers 0-31 can be represented in 5 bits (this followed an earlier explanation of natural binary showing 0-15 represented by 4 bits and 0-1023 by ten bits.)
Since natural binary represents 0-31 in 5 bits, this does not seem a very convincing explanation of data compression.

Does anyone know of a good, simple explanation of data compression?

Don.
• 31st December 2008, 10:17 PM
russdev
I had forgotten about the Christmas Lectures thanks for the reminder and looks like they are free on five.tv

Russ
• 1st January 2009, 06:40 PM
srochford
I haven't seen the programmes so I'm not sure what point they were trying to make with the 0-31 in 5 bits but I'm guessing it's to do with the fact that normally you'd store numbers in at least 8 bits. If you can be sure that you only need the values 0-31 then you can save space by packing stuff more tightly (but it's not a good example!)

Suppose you have a bitmap image. the "standard" way of storing it is to look at each pixel, work out how much red, green and blue and allocate 1 byte for that (so a pure red pixel might be stored as 3 bytes - 255,0,0; yellow would be 255,255,0; magenta would be 255,0,255 and so on). This means that if you have a 1024 x 768 image you need 1024 x 768 x 3 bytes to store it - a total of 2304 kilobytes which is a big file!

A simple example of compression is something called run length encoding. Instead of looking at each pixel, you look for patterns. Suppose the picture has a lot of sky in it. It might be that you have 200 pixels in a row which are all (0,100,200) (in RGB values). What you can now do is store this as 4 numbers - a value 200 followed by 0,100,200 - this reduces the 600 bytes to just 4.

What you're now doing is storing blocks of 4 bytes; the first is a count, the next 3 are the RGB values. If there are lots of blocks of identical colour then this works well. If the colours are more varied then it falls over - you can end up having to use 4 bytes instead of 3 if you have 2 different colours alternating, for example.

text compression is often done using dictionary type tables. If you are just using a simple alphabet (ie no accented letters) then there are spare numbers in the ASCII set from 128 to 255. This allows you to say that instead of storing (say) "THE" as 84,72, 69 you store it as 128; "AND" can be stored as 129 instead of 65, 78, 68 and so on. In these examples, you use 1 byte instead of 3 for the words; if you can replace longer words then you get even better compression but you do also have to store the dictionary which says "128=THE; 129=AND; 130= anti-disestablishmentarianism" etc so again, if you don't have many words which are repeated then you don't gain.

I think that the same sort of technique is used by zip compression but instead of looking for words, you look for patterns of bytes and replace them with single codes.

Each of these techniques is lossless - you can get back exactly what you started with. mp3 and jpg compression are lossy - you don't get back what you started with but what you get is probably good enough. The maths behind these is a bit difficult (look up discrete cosine transform and be boggled!)
• 2nd January 2009, 12:11 AM
mattx
If anyone wants these [ and some of them are quite good ] - the explanation on quantum computing was well thought out - then I have the AVI's
They will be going on our media server when I return to work.......
• 2nd January 2009, 12:12 AM
mattx
Quote:

Originally Posted by dbrown
Just listening to RI Xmas lectures and heard an explanation of data compression which explained how numbers 0-31 can be represented in 5 bits (this followed an earlier explanation of natural binary showing 0-15 represented by 4 bits and 0-1023 by ten bits.)
Since natural binary represents 0-31 in 5 bits, this does not seem a very convincing explanation of data compression.

Does anyone know of a good, simple explanation of data compression?

Don.

Wikipedia has quite a good entry on it.....

[ame=http://en.wikipedia.org/wiki/Data_compression]Data compression - Wikipedia, the free encyclopedia[/ame]
• 2nd January 2009, 12:46 AM
Trapper
Very good lectures this year.

I agree with Mattx, even I understood the basic theory behind quantum computing :)