Results 1 to 3 of 3
  1. #1
    hellochar is offline Member
    Join Date
    Jul 2009
    Rep Power

    Default Gif decoding/LZW troubles

    Hi. As a project, I'm trying to write a GIF parser that takes a file as an input and displays the image as output. I've been going pretty steadily through the GIF documentation and etc. etc. I've been looking at the "Graphics Interchange Format" article on wikipedia and I've recreated the image file that wikipedia gives the bytes for.

    Now, here's my problem: the raster data that the GIF contains doesn't make sense. I'm using a very simple 3x5 GIF image that has one black pixel in the top corner and one black pixel directly below and to the left.

    For those of you unfamiliar with GIF, the image data is stored using the LZW compression algorithm, which saves space by compressing multiple pixels into single entries. The encoded data is "00 51 FC 1B 28 70 A0 C1 83 01 01". Taking a look at the actual bits, the data is (underscores seperate the bytes)

    00000000_01010001_11111100_00011011_00101000_01110 000_10100000_11000001_10000011_00000001_00000001

    I'm supposed to be reading 9 bit codes, as per the GIF specification. Each code corresponds to a list of pixels to be output. For instance, if I read the hex code "0x028", that translates to color 40, which maps to a black pixel. So I'd output a black pixel. The map starts with predefined mappings for codes 0-255, and special treatment for codes 256 and 257. More mappings are created as you read through the data. If I'm reading 9 bits per code, the data looks like this (underscores seperate the 9 bit codes)

    000000000_101000111_111100000_110110010_100001110_ 000101000_001100000_110000011_000000010_0000001

    The first code I read is 0, which is at the start of every GIF raster data stream. The next code I read is "101000111", which is code #327. That doesn't make sense though, because I don't have a mapping for code #327 yet. #327 doesn't exist. What is going on? Also, there's that last 7 bits that I'm not sure what to do about.

    I wrote my own LZW compressor to help me, and plugged in the pixels to see the data that came out of it. My compressor uses the exact same algorithm that is described by the various GIF documentations I have looked at. The coded raster data that was outputted was different than the one saved in the GIF file;

    Bits found in GIF file
    00000000010100011111110000011011001010000111000010 10000011000001100000110000000100000001

    Bits given by algorithm
    10000000000010100001111111110000001110000001010000 0011100000110100000111100000001

    They're even of different lengths sizes!
    If I plugged the bits given by the algorithm into my GIF parser, it decoded the data with no problem. It almost seems like the GIF's data was encoded using a different algorithm! (I just used Microsoft Paint to make the GIF). But there's no way that's possible; I know that there's something wrong with the decompressor, even though I followed the documentation to the letter.

    Here's the code for the decompressor:

    Java Code:
    	void readImageData(InputStream fio, int[] table) throws Exception {
    		println("Reading image data...");
    		curBlock = readBlock(fio); //reads the raster data.
    		dictionary = new LinkedList<List<Integer>>();
    		for (int i : table) {
    			List<Integer> l = new LinkedList<Integer>();
    		dictionary.add(new LinkedList()); //placeholder for clr
    		dictionary.add(new LinkedList()); //placeholder for end
    		println("Starting code size: " + codeSize);
    		int oldCode = readBits(codeSize);
    		int curCode;
    		while (index < curBlock.length) {
    			curCode = readBits(codeSize);
    			List<Integer> string;
    			int c = 0;
    			if(curCode < dictionary.size()) {
    				print("Code exists -- ");
    				string = dictionary.get(curCode);
    			else {
    				print("Code doesn't exist -- ");
    				string = dictionary.get(oldCode);											 //line 177
    				string.add(c); //this shouldn't happen on the first run.
    			c = string.get(0);
    			List<Integer> entry = new LinkedList(dictionary.get(oldCode));
    			oldCode = curCode;
    	byte[] curBlock;
    	List<List<Integer>> dictionary;
    	int codeSize;
    	void addToDictionary(List<Integer> str) {
    		if(dictionary.size() > pow(2, codeSize)) codeSize++;
    Here's the output of the program:

    Reading image data...
    Reading block of size 11
    Starting code size: 9
    Read code 0(9) -- output (0, 0, 0) --
    Read code 327(9) -- Code doesn't exist -- output (0, 0, 0) -- output (0, 0, 0) --
    Read code 480(9) -- Code doesn't exist --

    java.lang.IndexOutOfBoundsException: Index: 327, Size: 259
    at java.util.LinkedList.entry(
    at java.util.LinkedList.get(
    at gifparser.Main.readImageData(

    I know this is all quite convoluted, but I'm thoroughly confused at this point. What is going on?!

  2. #2
    OrangeDog's Avatar
    OrangeDog is offline Senior Member
    Join Date
    Jan 2009
    Cambridge, UK
    Rep Power


    How is index initialised/updated?
    Likewise with codesize.
    What is the purpose of having a linked lists of linked lists containing one integer each? An actual dictionary should implement some kind of Map.
    Don't forget to mark threads as [SOLVED] and give reps to helpful posts.
    How To Ask Questions The Smart Way

  3. #3
    hellochar is offline Member
    Join Date
    Jul 2009
    Rep Power


    Oops, forgot to add that. index is defined as a class member. It gets incremented through a method called readBits().
    Codesize is set to the value 9 right before the call to readImageData.

    Only the reset list of linked lists contains only one integer (when dictionary is first set to a new list). As the algorithm runs, it adds linked lists with multiple integers onto dictionary, and it's easier just to store everything in one dictionary.
    There basically is a Map; it's just mapping integers to linked lists, and all adds simply associate the next integer with a new linked list, so there's no need to use an actual Map.

    Java Code:
        byte[] curBlock;
        String curBits = "";
        int index = 0;
        int readBits(int bits) {
            while (bits > curBits.length()) {
                String s = formatByte(curBlock[index++]);
    //            println("Adding string ("+s+") to curBits");
                curBits += s;
            String s = curBits.substring(0, bits);
            curBits = curBits.substring(bits);
            print("Read code "+Integer.parseInt(s, 2)+"("+bits+") -- ");
    //        println("Read code "+Integer.parseInt(s, 2)+" as "+s);
            return Integer.parseInt(s, 2);
    String formatByte(byte b) {
            return lpad(Integer.toString(uint(b), 2), '0', 8);
        public static String makeStr(char c, int len) {
            StringBuilder b = new StringBuilder(len);
            for (int i = 0; i < len; i++) {
            return b.toString();
        public static String lpad(String s, char c, int wantLen) {
            return makeStr(c, wantLen - s.length()) + s;

Similar Threads

  1. Replies: 3
    Last Post: 01-29-2010, 06:53 AM
  2. Translation/Decoding Program
    By wyldstyle in forum New To Java
    Replies: 14
    Last Post: 05-05-2009, 09:17 PM
  3. subclass troubles
    By xf021209 in forum New To Java
    Replies: 12
    Last Post: 04-20-2009, 11:46 PM
  4. [SOLVED] Array troubles, yes I searched...
    By Reiyn in forum New To Java
    Replies: 11
    Last Post: 04-16-2009, 11:28 PM
  5. StreamCorruptedException and Casting troubles
    By Wassa in forum Networking
    Replies: 2
    Last Post: 02-18-2009, 03:07 PM

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts