I know that it is used for compression and you need to construct a tree. But I don't understand where the compression is used. From lectures and what I've found by googling, it seems everything stops after the tree is constructed with binary values assigned.

But when do you compress? Or is the binary value the compression?

# Confused about the uses of Huffman Coding

Page 1 of 1## 6 Replies - 732 Views - Last Post: 17 August 2016 - 02:38 PM

##
**Replies To:** Confused about the uses of Huffman Coding

### #2

## Re: Confused about the uses of Huffman Coding

Posted 17 August 2016 - 03:05 AM

The binary value is the compression.

If you compare the space required to represent the original data, and the encoded data, you'll find that the encoded data requires less space. Hence, the original data has been compressed.

If you compare the space required to represent the original data, and the encoded data, you'll find that the encoded data requires less space. Hence, the original data has been compressed.

### #3

## Re: Confused about the uses of Huffman Coding

Posted 17 August 2016 - 08:24 AM

gzip uses Huffman encoding: https://en.wikipedia.org/wiki/Gzip

the trick is to write bits rather than bytes. By encoding your data as a string of bits you get compression.

the trick is to write bits rather than bytes. By encoding your data as a string of bits you get compression.

### #4

## Re: Confused about the uses of Huffman Coding

Posted 17 August 2016 - 09:25 AM

So using gzip as an example. Let's say I write some code to take an input file (e.g text file), do I read the letters in the file and construct the tree with those letters? Once the tree is complete, what do I output as the compressed file?

### #5

## Re: Confused about the uses of Huffman Coding

Posted 17 August 2016 - 01:23 PM

The tree gives you a function to encode your string. Each letter maps to a binary string. You construct the binary string by starting at the root and searching for your letter. You append a 0 for visiting the left child and a 1 for visiting the right child. The path to the desired letter is the corresponding encoding of that letter.

So now given your input string, you encode each letter using the above procedure. To decode a binary string, you simply start at the root and follow the prescribed path (0 for left, 1 for right) until you hit a leaf. That leaf is the corresponding letter. You then start at the root with the next available binary digit.

The Huffman Coding Algorithm is really for prefix-free encoding of letters. That is, any two binary strings will be uniquely determined for the given Huffman tree. There is no ambiguity if you have an 'a' vs. 'e' for example.

So now given your input string, you encode each letter using the above procedure. To decode a binary string, you simply start at the root and follow the prescribed path (0 for left, 1 for right) until you hit a leaf. That leaf is the corresponding letter. You then start at the root with the next available binary digit.

The Huffman Coding Algorithm is really for prefix-free encoding of letters. That is, any two binary strings will be uniquely determined for the given Huffman tree. There is no ambiguity if you have an 'a' vs. 'e' for example.

### #6

## Re: Confused about the uses of Huffman Coding

Posted 17 August 2016 - 02:19 PM

macosxnerd101, on 17 August 2016 - 01:23 PM, said:

The tree gives you a function to encode your string. Each letter maps to a binary string. You construct the binary string by starting at the root and searching for your letter. You append a 0 for visiting the left child and a 1 for visiting the right child. The path to the desired letter is the corresponding encoding of that letter.

So now given your input string, you encode each letter using the above procedure. To decode a binary string, you simply start at the root and follow the prescribed path (0 for left, 1 for right) until you hit a leaf. That leaf is the corresponding letter. You then start at the root with the next available binary digit.

The Huffman Coding Algorithm is really for prefix-free encoding of letters. That is, any two binary strings will be uniquely determined for the given Huffman tree. There is no ambiguity if you have an 'a' vs. 'e' for example.

So now given your input string, you encode each letter using the above procedure. To decode a binary string, you simply start at the root and follow the prescribed path (0 for left, 1 for right) until you hit a leaf. That leaf is the corresponding letter. You then start at the root with the next available binary digit.

The Huffman Coding Algorithm is really for prefix-free encoding of letters. That is, any two binary strings will be uniquely determined for the given Huffman tree. There is no ambiguity if you have an 'a' vs. 'e' for example.

I completely understand every you said. What I don't understand is what happens after the tree is constructed.

Like gzip for example. It compresses a file and gives you a zipped file. So that's my main question, do I output the binary values into a file after? What kind of file do I output etc.

### #7

## Re: Confused about the uses of Huffman Coding

Posted 17 August 2016 - 02:38 PM

You would need to output an encoding of the tree and your encoded word so that the original string can be recovered.

A text file works. The Huffman algorithm doesn't specify a file type. This is an abstract algorithm, and details about what to do with the output are left for those implementing the algorithm.

Quote

What kind of file do I output etc.

A text file works. The Huffman algorithm doesn't specify a file type. This is an abstract algorithm, and details about what to do with the output are left for those implementing the algorithm.

Page 1 of 1