Hi. As a project, I'm trying to write a GIF parser that takes a file as an input and displays the image as output. I've been going pretty steadily through the GIF documentation and etc. etc. I've been looking at the "Graphics Interchange Format" article on wikipedia and I've recreated the image file that wikipedia gives the bytes for.
Now, here's my problem: the raster data that the GIF contains doesn't make sense. I'm using a very simple 3x5 GIF image that has one black pixel in the top corner and one black pixel directly below and to the left.
For those of you unfamiliar with GIF, the image data is stored using the LZW compression algorithm, which saves space by compressing multiple pixels into single entries. The encoded data is "00 51 FC 1B 28 70 A0 C1 83 01 01". Taking a look at the actual bits, the data is (underscores seperate the bytes)
00000000_01010001_11111100_00011011_00101000_01110 000_10100000_11000001_10000011_00000001_00000001
I'm supposed to be reading 9 bit codes, as per the GIF specification. Each code corresponds to a list of pixels to be output. For instance, if I read the hex code "0x028", that translates to color 40, which maps to a black pixel. So I'd output a black pixel. The map starts with predefined mappings for codes 0-255, and special treatment for codes 256 and 257. More mappings are created as you read through the data. If I'm reading 9 bits per code, the data looks like this (underscores seperate the 9 bit codes)
000000000_101000111_111100000_110110010_100001110_ 000101000_001100000_110000011_000000010_0000001
The first code I read is 0, which is at the start of every GIF raster data stream. The next code I read is "101000111", which is code #327. That doesn't make sense though, because I don't have a mapping for code #327 yet. #327 doesn't exist. What is going on? Also, there's that last 7 bits that I'm not sure what to do about.
I wrote my own LZW compressor to help me, and plugged in the pixels to see the data that came out of it. My compressor uses the exact same algorithm that is described by the various GIF documentations I have looked at. The coded raster data that was outputted was different than the one saved in the GIF file;
Bits found in GIF file
00000000010100011111110000011011001010000111000010 10000011000001100000110000000100000001
Bits given by algorithm
10000000000010100001111111110000001110000001010000 0011100000110100000111100000001
They're even of different lengths sizes!
If I plugged the bits given by the algorithm into my GIF parser, it decoded the data with no problem. It almost seems like the GIF's data was encoded using a different algorithm! (I just used Microsoft Paint to make the GIF). But there's no way that's possible; I know that there's something wrong with the decompressor, even though I followed the documentation to the letter.
Here's the code for the decompressor:
|
Code:
|
void readImageData(InputStream fio, int[] table) throws Exception {
println("Reading image data...");
curBlock = readBlock(fio); //reads the raster data.
dictionary = new LinkedList<List<Integer>>();
for (int i : table) {
List<Integer> l = new LinkedList<Integer>();
l.add(i);
dictionary.add(l);
}
dictionary.add(new LinkedList()); //placeholder for clr
dictionary.add(new LinkedList()); //placeholder for end
println("Starting code size: " + codeSize);
int oldCode = readBits(codeSize);
output(dictionary.get(oldCode));
int curCode;
println();
while (index < curBlock.length) {
curCode = readBits(codeSize);
List<Integer> string;
int c = 0;
if(curCode < dictionary.size()) {
print("Code exists -- ");
string = dictionary.get(curCode);
}
else {
print("Code doesn't exist -- ");
string = dictionary.get(oldCode); //line 177
string.add(c); //this shouldn't happen on the first run.
}
output(string);
c = string.get(0);
List<Integer> entry = new LinkedList(dictionary.get(oldCode));
entry.add(c);
addToDictionary(entry);
oldCode = curCode;
println();
}
loop();
}
...
...
byte[] curBlock;
List<List<Integer>> dictionary;
int codeSize;
void addToDictionary(List<Integer> str) {
dictionary.add(str);
if(dictionary.size() > pow(2, codeSize)) codeSize++;
} |
Here's the output of the program:
Reading image data...
Reading block of size 11
Starting code size: 9
Read code 0(9) -- output (0, 0, 0) --
Read code 327(9) -- Code doesn't exist -- output (0, 0, 0) -- output (0, 0, 0) --
Read code 480(9) -- Code doesn't exist --
java.lang.IndexOutOfBoundsException: Index: 327, Size: 259
at java.util.LinkedList.entry(LinkedList.java:365)
at java.util.LinkedList.get(LinkedList.java:315)
at gifparser.Main.readImageData(Main.java:177)
at gifparser.Main.open(Main.java:108)
I know this is all quite convoluted, but I'm thoroughly confused at this point. What is going on?!