Reply
 
LinkBack Thread Tools Display Modes
  #1 (permalink)  
Old 07-11-2009, 05:14 AM
Member
 
Join Date: Jul 2009
Posts: 2
Rep Power: 0
hellochar is on a distinguished road
Default Gif decoding/LZW troubles
Hi. As a project, I'm trying to write a GIF parser that takes a file as an input and displays the image as output. I've been going pretty steadily through the GIF documentation and etc. etc. I've been looking at the "Graphics Interchange Format" article on wikipedia and I've recreated the image file that wikipedia gives the bytes for.

Now, here's my problem: the raster data that the GIF contains doesn't make sense. I'm using a very simple 3x5 GIF image that has one black pixel in the top corner and one black pixel directly below and to the left.

For those of you unfamiliar with GIF, the image data is stored using the LZW compression algorithm, which saves space by compressing multiple pixels into single entries. The encoded data is "00 51 FC 1B 28 70 A0 C1 83 01 01". Taking a look at the actual bits, the data is (underscores seperate the bytes)

00000000_01010001_11111100_00011011_00101000_01110 000_10100000_11000001_10000011_00000001_00000001

I'm supposed to be reading 9 bit codes, as per the GIF specification. Each code corresponds to a list of pixels to be output. For instance, if I read the hex code "0x028", that translates to color 40, which maps to a black pixel. So I'd output a black pixel. The map starts with predefined mappings for codes 0-255, and special treatment for codes 256 and 257. More mappings are created as you read through the data. If I'm reading 9 bits per code, the data looks like this (underscores seperate the 9 bit codes)

000000000_101000111_111100000_110110010_100001110_ 000101000_001100000_110000011_000000010_0000001

The first code I read is 0, which is at the start of every GIF raster data stream. The next code I read is "101000111", which is code #327. That doesn't make sense though, because I don't have a mapping for code #327 yet. #327 doesn't exist. What is going on? Also, there's that last 7 bits that I'm not sure what to do about.

I wrote my own LZW compressor to help me, and plugged in the pixels to see the data that came out of it. My compressor uses the exact same algorithm that is described by the various GIF documentations I have looked at. The coded raster data that was outputted was different than the one saved in the GIF file;

Bits found in GIF file
00000000010100011111110000011011001010000111000010 10000011000001100000110000000100000001


Bits given by algorithm
10000000000010100001111111110000001110000001010000 0011100000110100000111100000001

They're even of different lengths sizes!
If I plugged the bits given by the algorithm into my GIF parser, it decoded the data with no problem. It almost seems like the GIF's data was encoded using a different algorithm! (I just used Microsoft Paint to make the GIF). But there's no way that's possible; I know that there's something wrong with the decompressor, even though I followed the documentation to the letter.

Here's the code for the decompressor:

Code:
	void readImageData(InputStream fio, int[] table) throws Exception {
		println("Reading image data...");
		curBlock = readBlock(fio); //reads the raster data.

		dictionary = new LinkedList<List<Integer>>();
		for (int i : table) {
			List<Integer> l = new LinkedList<Integer>();
			l.add(i);
			dictionary.add(l);
		}
		dictionary.add(new LinkedList()); //placeholder for clr
		dictionary.add(new LinkedList()); //placeholder for end

		println("Starting code size: " + codeSize);
		int oldCode = readBits(codeSize);
		output(dictionary.get(oldCode));
		int curCode;
		println();
		while (index < curBlock.length) {
			curCode = readBits(codeSize);
			List<Integer> string;
			int c = 0;
			if(curCode < dictionary.size()) {
				print("Code exists -- ");
				string = dictionary.get(curCode);
			}
			else {
				print("Code doesn't exist -- ");
				string = dictionary.get(oldCode);											 //line 177
				string.add(c); //this shouldn't happen on the first run.
			}
			output(string);
			c = string.get(0);
			List<Integer> entry = new LinkedList(dictionary.get(oldCode));
			entry.add(c);
			addToDictionary(entry);
			oldCode = curCode;
			println();
		}
		loop();
	}

...
...

	byte[] curBlock;
	List<List<Integer>> dictionary;
	int codeSize;

	void addToDictionary(List<Integer> str) {
		dictionary.add(str);
		if(dictionary.size() > pow(2, codeSize)) codeSize++;
	}
Here's the output of the program:

Reading image data...
Reading block of size 11
Starting code size: 9
Read code 0(9) -- output (0, 0, 0) --
Read code 327(9) -- Code doesn't exist -- output (0, 0, 0) -- output (0, 0, 0) --
Read code 480(9) -- Code doesn't exist --

java.lang.IndexOutOfBoundsException: Index: 327, Size: 259
at java.util.LinkedList.entry(LinkedList.java:365)
at java.util.LinkedList.get(LinkedList.java:315)
at gifparser.Main.readImageData(Main.java:177)
at gifparser.Main.open(Main.java:108)


I know this is all quite convoluted, but I'm thoroughly confused at this point. What is going on?!
Bookmark Post in Technorati
Reply With Quote
  #2 (permalink)  
Old 07-12-2009, 11:37 PM
OrangeDog's Avatar
Senior Member
 
Join Date: Jan 2009
Location: Cambridge, UK
Posts: 838
Rep Power: 2
OrangeDog is on a distinguished road
Default
How is index initialised/updated?
Likewise with codesize.
What is the purpose of having a linked lists of linked lists containing one integer each? An actual dictionary should implement some kind of Map.
__________________
Don't forget to mark threads as [SOLVED] and give reps to helpful posts.
How To Ask Questions The Smart Way
Bookmark Post in Technorati
Reply With Quote
  #3 (permalink)  
Old 07-15-2009, 12:26 AM
Member
 
Join Date: Jul 2009
Posts: 2
Rep Power: 0
hellochar is on a distinguished road
Default
Oops, forgot to add that. index is defined as a class member. It gets incremented through a method called readBits().
Codesize is set to the value 9 right before the call to readImageData.

Only the reset list of linked lists contains only one integer (when dictionary is first set to a new list). As the algorithm runs, it adds linked lists with multiple integers onto dictionary, and it's easier just to store everything in one dictionary.
There basically is a Map; it's just mapping integers to linked lists, and all adds simply associate the next integer with a new linked list, so there's no need to use an actual Map.

Code:
    byte[] curBlock;
    String curBits = "";
    int index = 0;
...
...
    int readBits(int bits) {
        while (bits > curBits.length()) {
            String s = formatByte(curBlock[index++]);
//            println("Adding string ("+s+") to curBits");
            curBits += s;
        }
        String s = curBits.substring(0, bits);
        curBits = curBits.substring(bits);
        print("Read code "+Integer.parseInt(s, 2)+"("+bits+") -- ");
//        println("Read code "+Integer.parseInt(s, 2)+" as "+s);
        return Integer.parseInt(s, 2);
    }
...
...
String formatByte(byte b) {
        return lpad(Integer.toString(uint(b), 2), '0', 8);
    }

    public static String makeStr(char c, int len) {
        StringBuilder b = new StringBuilder(len);
        for (int i = 0; i < len; i++) {
            b.append(c);
        }
        return b.toString();
    }

    public static String lpad(String s, char c, int wantLen) {
        return makeStr(c, wantLen - s.length()) + s;
    }
Bookmark Post in Technorati
Reply With Quote
Reply

Bookmarks

Tags
decompression, file, gif, image, lzw

Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Multi Client Server applications, troubles with File I/O zackpudil Networking 3 01-29-2010 07:53 AM
Translation/Decoding Program wyldstyle New To Java 14 05-05-2009 10:17 PM
subclass troubles xf021209 New To Java 12 04-21-2009 12:46 AM
[SOLVED] Array troubles, yes I searched... Reiyn New To Java 11 04-17-2009 12:28 AM
StreamCorruptedException and Casting troubles Wassa Networking 2 02-18-2009 04:07 PM


All times are GMT +2. The time now is 05:27 PM.



VBulletin, Copyright ©2000 - 2010, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2009, Crawlability, Inc.
Copyright ©2006 - 2007, www.java-forums.org