# Help needed in Dedoding using Huffman codes

• 04-10-2011, 09:42 AM
warlock
Help needed in Dedoding using Huffman codes
In my program to implement huffman algorithm.I have created the huffman codes and stored the ascii values and corresponding codes in a map.While creating the encoded file I followed this approach:I firstly converted the ascii codes in to their corresponding integer values and wrote them in a file.Now ,while decoding I read each byte from that file and converted it into its binary equivalent and searched for their ascii values in the map.This approach is creating the following problem.Since,01,001,1 all have integer equivalent as 1,so when I convert the codes into integer it no longer remains unique.I have ran out of ideas. And I faled to store the actual huffman tree.So I could not follow the decoding scheme where I can traverse the tree a file the codes Help!:(
• 04-10-2011, 11:18 AM
JosAH
You can't read ints, you have to read bits because a Huffman encoding encodes characters (or whatever) in sequences of bits. After reading a single bit you make a decision for going to a left or right subtree of the Huffman tree; if you have reached a leaf node you have decoded a chararcter.

kind regards,

Jos
• 04-10-2011, 03:02 PM
warlock
As I said I could not store the tree.All that I can use in my decoding scheme is the map where I have the characters and their corresponding frequency stored in it.Then what to do?And how will I read bits.?Suppose the encoded file is 10101,where 101 is code for a and 01 is code for b.Now,I can read a character at a time but not bits.And I changed the codes into their integer equivalent so that the encoded file does not get bigger than the actual file.
• 04-10-2011, 03:18 PM
JosAH
Quote:

Originally Posted by warlock
As I said I could not store the tree.All that I can use in my decoding scheme is the map where I have the characters and their corresponding frequency stored in it.Then what to do?

Then your Huffman algorithm is of no value; in general a Huffman (de)compression scheme is a so called 'off line' algorithm: first you determine the tree given the text, next you save the tree and the compressed text and the decompressor reads the tree and the compressed text and decompresses it.

kind regards,

Jos
• 04-10-2011, 03:26 PM
warlock
Ok.I feel lost now.looks like I have to redo it completely.How will I store the tree?What I did was stored the leaf nodes in a TreeSet<Node>.Then I combined the nodes to get the first non-leaf node and while doing that give its left and right child 0 and 1 .and so on.Any ideas what i should do next?
• 04-10-2011, 03:34 PM
JosAH
Quote:

Originally Posted by warlock
Ok.I feel lost now.looks like I have to redo it completely.How will I store the tree?What I did was stored the leaf nodes in a TreeSet<Node>.Then I combined the nodes to get the first non-leaf node and while doing that give its left and right child 0 and 1 .and so on.Any ideas what i should do next?

You can still store the letter combinations together with their encoding bit sequences but you have to know the length (measured in bits) of each bit sequence. Thanks to a property of Huffman encoding (prefix free encoding) you can do that; you have to rebuild your tree from those bit sequences before you can decode the text in an efficient way.

kind regards,

Jos
• 04-10-2011, 03:43 PM
warlock
Thanks again.But how will I build this tree.And by bits do you mean characters ?Because my codes are String.So,I can read each character from it but how bits?
• 04-10-2011, 04:00 PM
JosAH
Quote:

Originally Posted by warlock
Thanks again.But how will I build this tree.And by bits do you mean characters ?Because my codes are String.So,I can read each character from it but how bits?

If you want to store a sequence of bits as a String with the characters '1' and '0', your encoded 'compression' (mind the quotes) will end up a much larger file than the original one. Reading individual bits takes a bit of bit fiddling but can be done ...

kind regards,

Jos
• 04-10-2011, 04:32 PM
warlock
Quote:

Originally Posted by JosAH
You can still store the letter combinations together with their encoding bit sequences but you have to know the length (measured in bits) of each bit sequence. Thanks to a property of Huffman encoding (prefix free encoding) you can do that; you have to rebuild your tree from those bit sequences before you can decode the text in an efficient way.

kind regards,

Jos

Quote:

Originally Posted by JosAH
If you want to store a sequence of bits as a String with the characters '1' and '0', your encoded 'compression' (mind the quotes) will end up a much larger file than the original one. Reading individual bits takes a bit of bit fiddling but can be done ...

kind regards,

Jos

Can you please guide me on how to build the tree & how to read bits?
• 04-10-2011, 05:12 PM
JosAH
Quote:

Originally Posted by warlock
Can you please guide me on how to build the tree & how to read bits?

Sure, you show us what you have done and tell us what's troubling you and maybe (probably?) we can help you out.

kind regards,

Jos
• 04-10-2011, 05:34 PM
warlock
Here's my code to decode the encoded file.
class HuffmanDeCoding{
String src="encoded file.txt";
String op="decoded file.txt";
int bytes;
TreeMap<String,Integer>tm1;
HuffmanDeCoding(TreeMap<String,Integer>tm1)
{
this.tm1=tm1;}

void decode(){ try{
FileInputStream fin=new FileInputStream(src);
FileOutputStream fout=new FileOutputStream(op);
System.out.println("Given File----");
do{
if(bytes!=-1){
System.out.print(bytes);
String binary=Integer.toBinaryString(bytes);
if(tm1.containsKey(binary))
{ int x=tm1.get(binary);
char ch=(char)x;
fout.write(ch);
}}

}while(bytes!=-1);
}
catch(Exception ob){
System.out.println("Exception while decoding");
}
}
}
P.S. TreeMap tm1 stores the huffman codes(String) and corresponding ascii codes (integer)
• 04-15-2011, 04:15 PM
warlock
I have solved the problem of decoding an encoded file using huffman tree traversal mechanism but now I am facing problem with compression of the file.
Since my huffman codes are of string types ,so suppose huffman code for a=101
now ,this takes 3 bytes.So,the encoded file becomes larger than the actual file.
But how to read and write bits in a file using java??:confused:

Here is my code for decoding...
class Decode{
public int firstbyte=0;Node r;int asc,size;char c;int count=0;Node root,r1;
FileInputStream in;FileWriter out;
int x;
Decode(FileInputStream in,FileWriter out,Node r)
{this.in=in;
this.out=out;
this.r=r;

}
void aux_d(Decode d){
try {root =r;
size=in.available();
System.out.println("Size=="+size);
{System.out.println("Moving in left branch");
dec(r.left,48);
}
else
{System.out.println("Moving in right branch");
dec(r.right,49);}
} catch (IOException e) {

e.printStackTrace();
}
}
void dec(Node r,int x){
count++;
try{
System.out.println("count=="+count);
//System.out.print("root=="+root);
while(count<=size)
{
if(x==48)
{System.out.println("In left Branch");
if((r.left==null)&&(r.right==null))
{System.out.print("leaf..");
asc=r.ascii;
char c1=(char)asc;
System.out.println("Writing char=="+c1);
out.write(c1);
r1=root;
dec(r1.left,48);
else
dec(r1.right,49);
}
else{System.out.print("Nonleaf..");
dec(r.left,48);
else
dec(r.right,49);}
}
else if(x==49){System.out.println("In Right Branch");
if(((r.left==null)&&(r.right==null))){
System.out.print("leaf..");
asc=r.ascii;
char c1=(char)asc;
System.out.println("Writing char"+c1);
out.write(c1);
r1=root;
dec(r1.left,48);
else
dec(r1.right,49);

}
else{System.out.print("Nonleaf..");
dec(r.left,48);
else
dec(r.right,49);
}

}
}
}
catch(Exception o)
{}

}
}
• 04-18-2011, 10:03 AM
warlock
here is the problem i am facing now.I have a file consist of a b c d.codes a=100,b=101,c=110,d=111 and space=0.Now,the encoded file is 100010101100111.so ,I take 8 bytes(of 0 and 1) from the encoded file (in order to compress it)convert it into integer and store it in another file(say auxfile.txt) so, auxfile.txt contains 138 and 103.Now while decoding I first convert 138 and 103 into its bainary equivalent.But to my horror ,while reading the first byte(from auxfile.txt) I get 63 and not 138!The next byte is coming properly as 103.Please help
Here is the code where I am converting strings of 0 ,1 into numbers.

size=fin1.available();
while(count<=size){
count++;
if(bytes!=-1)
{
if(tm.containsKey(bytes))
{ str1+=tm.get(bytes);
while((str1.length()<8) && (count<=size))
{
count++;

if(bytes!=-1){
str2=tm.get(bytes);
str1 +=str2;}
}
if(str1.length()>8)
{str2=str1.substring(0,8);
int x=Integer.parseInt(str2,2);
fout2.write(x);
str1=str1.substring(8);

}

else if(str1.length()==8)

{

int x=Integer.parseInt(str1,2);
fout2.write(x);
str1="";

}
}
}

}
if((str1.length()<8))
{
int x=Integer.parseInt(str1,2);
fout2.write(x);
}
fin1.close();fout2.close();