  1. #1
    johann_p is offline Member
    Join Date
    Jun 2007
    Posts
    19
    Rep Power
    0

    Create GZIP compressed data using DeflaterInputStream?

    I want to provide a client with an input stream that delivers the content of a string in GZIP compressed format.

    However, the java.util.zip.GZIPInputStream class can only be used to decompress data, not to compress it.

    So I tried to use java.util.zip.DeflaterInputStream instead. However this does not seem to generate GZIP format - when I try to read the generated data back using a GZIPInputStream I get a "Not in GZIP format" error.

    DeflaterInputStream can take a Deflater instance as its second argument, and Deflater can be constructed with a second argument, "nowrap", set to true. According to the docs, "if true then use GZIP compatible compression", so I thought the DeflaterInputStream would then generate GZIP-compatible output, but I still get "Not in GZIP format".

    Here is the code I use:
    Java Code:
          InputStream is = IOUtils.toInputStream(theString, theEncoding);
          InputStream iscomp =
            new DeflaterInputStream(is, new Deflater(6, true));
          // pass iscomp to the client to read the content of theString
          // in GZIP compressed format
    Is there a way to create an input stream that will provide a GZIP compressed version of theString?

  2. #2
    Norm is offline Moderator
    Join Date
    Jun 2008
    Location
    Eastern Florida
    Posts
    17,902
    Rep Power
    25

    If you want to compress data, why not use GZIPOutputStream?

  3. #3
    johann_p is offline Member

    Well GZIPOutputStream works but I cannot use it here, because the client wants to read from an input stream and GZIPOutputStream is an output stream.

  4. #4
    Norm is offline Moderator

    Please explain again the data flow.
    What format is the input data in and what format does the client want to receive?

    Have you looked at pipes? The output of one stream becomes the input of another.
    Last edited by Norm; 05-16-2011 at 03:35 PM.

  5. #5
    johann_p is offline Member

    theString is some String containing textual data. The client wants to read the content of theString in GZIP compressed format from an InputStream object. What I tried to do is create such a stream that would allow the client to read the GZIP compressed version of what is stored uncompressed as plain text in theString.

    The client actually gets some binary data. At a later time, I can tell the client to hand me back an input stream so I can read back the data. I wrap that input stream into a GZIPInputStream and try to read the compressed data back. However, at that point I get the "Not in GZIP format" error.

    The client is actually the JDBC driver: the PreparedStatement.setBinaryStream method lets JDBC fill a large field with data read from a stream. This is what I use in my attempt to store the compressed version of theString in the database field.
    See PreparedStatement (Java Platform SE 6)

    At a later time I use ResultSet.getBinaryStream to get the stream which I then wrap into a GZIPInputStream.
    See ResultSet (Java Platform SE 6)

    The compressed data is stored in a blob. I can store the content of that blob in a file (it looks binary) and when I try to decompress using the gzip command I get the exact same message that the content is not in GZIP format.
    Last edited by johann_p; 05-16-2011 at 03:44 PM.

  6. #6
    Toll is offline Senior Member
    Join Date
    May 2011
    Location
    Sweden
    Posts
    393
    Rep Power
    4

    Have a look at PipedInputStream (Java Platform SE 6) and PipedOutputStream (Java Platform SE 6). Perhaps you can use those?

  7. #7
    Norm is offline Moderator

    Can you write a small, simple executable program that demonstrates the problem?
    When you get the techniques down, you can copy the logic to your big program.

  8. #8
    johann_p is offline Member

    It looks a lot like there is no way to make the output of DeflaterInputStream compatible with what GZIPInputStream expects. I quickly looked at the source code of GZIPInputStream, and the GZIP format seems to include a header and a trailer that are not part of what DeflaterInputStream creates.
    When I just replace both compression and decompression with DeflaterInputStream and InflaterInputStream, things work. The downside is that what is stored in the database fields is no longer compatible with GZIP format.
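
    The difference described above can be checked directly. This is a minimal sketch, not code from the thread: the class name GzipFormatProbe and its helper methods are hypothetical. It compresses the same bytes once with DeflaterInputStream (nowrap set to true, as in the first post) and once with GZIPOutputStream, then looks for the two GZIP magic bytes 0x1f 0x8b at the start of each result.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.DeflaterInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, for illustration only.
class GzipFormatProbe {

    /** True if the data starts with the two GZIP magic bytes 0x1f 0x8b. */
    static boolean hasGzipMagic(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xff) == 0x1f
                && (data[1] & 0xff) == 0x8b;
    }

    /** Raw deflate output, produced the same way as in the first post. */
    static byte[] rawDeflate(byte[] plain) throws IOException {
        Deflater deflater = new Deflater(6, true); // nowrap = true: no ZLIB wrapper
        try (InputStream in =
                new DeflaterInputStream(new ByteArrayInputStream(plain), deflater)) {
            return readAll(in);
        } finally {
            deflater.end(); // a caller-supplied Deflater is not released by the stream
        }
    }

    /** Real GZIP output, for comparison. */
    static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(plain);
        }
        return buffer.toByteArray();
    }

    /** Drains a stream into a byte array. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        for (int n; (n = in.read(chunk)) != -1; ) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] plain = "some textual data, some textual data"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println("raw deflate has GZIP magic: "
                + hasGzipMagic(rawDeflate(plain)));
        System.out.println("GZIPOutputStream has GZIP magic: "
                + hasGzipMagic(gzip(plain)));
    }
}
```

    Only the GZIPOutputStream result should report the magic bytes, which matches the "Not in GZIP format" error seen when the raw deflate data is handed to GZIPInputStream.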

  9. #9
    JosAH is offline Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,783
    Blog Entries
    7
    Rep Power
    21

    Quote Originally Posted by johann_p View Post
    It looks a lot like there is no way to make the output of DeflaterInputStream compatible with what GZIPInputStream expects. I quickly looked at the source code of GZIPInputStream, and the GZIP format seems to include a header and a trailer that are not part of what DeflaterInputStream creates.
    When I just replace both compression and decompression with DeflaterInputStream and InflaterInputStream, things work. The downside is that what is stored in the database fields is not compatible with GZIP format any more.
    I suspect you're mixing things up: the Deflater streams (both input and output) compress what they read or write in the 'deflate' format. The GZIP streams only compress on the OutputStream side and only decompress on the InputStream side. You can't mix Deflater and GZIP streams, nor can you mix the Inflater streams with other streams. If you want GZIP compressed data, use GZIPOutputStream for compression and GZIPInputStream for decompression.


    kind regards.
    cenosillicaphobia: the fear for an empty beer glass

  10. #10
    DarrylBurke is offline Forum Police
    Join Date
    Sep 2008
    Location
    Madgaon, Goa, India
    Posts
    11,457
    Rep Power
    20

  11. #11
    johann_p is offline Member

    Quote Originally Posted by JosAH View Post
    I suspect you're mixing things up: the Deflater streams (both input and output) compress what they read or write in the 'deflate' format.
    Yes, I know that now. However, the documentation for the two-parameter Deflater constructor states explicitly: "If 'nowrap' is true then the ZLIB header and checksum fields will not be used in order to support the compression format used in both GZIP and PKZIP." This led me to assume that a compatible format could be created, but even when such a Deflater object is passed to the constructor of DeflaterInputStream, the compressed data is NOT compatible with GZIP.

    The GZip streams only compress for the OutputStream and decompress for the InputStream.
    That is exactly the limitation I am facing. There are both input and output streams for compression and decompression in deflate format, but for some reason no input stream for GZIP compression and no output stream for GZIP decompression. This is a weird restriction.

    You can't mix Deflater and GZIP streams, nor can you mix the Inflater streams with other streams. If you want GZIP compressed data, use GZIPOutputStream for compression and GZIPInputStream for decompression.
    The problem is exactly that the library which needs the compressed data requires an input stream and not an output stream. This is because the JDBC library only has a method that accepts an input stream for writing a large binary field. It does not have a method that could hand back an output stream that I could use to write that data.
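
    One way to get the missing "input stream for GZIP compression" with only JDK classes is to put a GZIP header in front of a raw DeflaterInputStream and a CRC-32/length trailer behind it. This is a minimal sketch under stated assumptions, not a method anyone in the thread used: the class name GzipCompressingStreams and its gzip method are hypothetical, the header is the fixed 10-byte form with no optional fields, and the Deflater is only released once the stream has been read to the end or closed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Enumeration;
import java.util.NoSuchElementException;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;
import java.util.zip.Deflater;
import java.util.zip.DeflaterInputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper class, for illustration only.
class GzipCompressingStreams {

    /** Returns a stream that yields the GZIP-compressed form of {@code plain}. */
    static InputStream gzip(InputStream plain) {
        // Tracks the CRC-32 of the uncompressed bytes as they are consumed.
        final CheckedInputStream checked = new CheckedInputStream(plain, new CRC32());
        // nowrap = true: raw deflate, which is what a GZIP member contains.
        final Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        final InputStream body = new DeflaterInputStream(checked, deflater);
        // Fixed header: magic 0x1f 0x8b, method 8 (deflate), no flags, no mtime.
        final byte[] header = {0x1f, (byte) 0x8b, 8, 0, 0, 0, 0, 0, 0, 0};

        Enumeration<InputStream> parts = new Enumeration<InputStream>() {
            private int next = 0;

            @Override
            public boolean hasMoreElements() {
                return next < 3;
            }

            @Override
            public InputStream nextElement() {
                switch (next++) {
                    case 0:
                        return new ByteArrayInputStream(header);
                    case 1:
                        return body;
                    case 2:
                        // SequenceInputStream asks for this part only after the
                        // body reached EOF, so CRC and size are final by now.
                        byte[] trailer = new byte[8];
                        putIntLE(trailer, 0, (int) checked.getChecksum().getValue());
                        putIntLE(trailer, 4, (int) deflater.getBytesRead());
                        deflater.end();
                        return new ByteArrayInputStream(trailer);
                    default:
                        throw new NoSuchElementException();
                }
            }
        };
        return new SequenceInputStream(parts);
    }

    /** Writes a 32-bit value in little-endian order, as the GZIP trailer requires. */
    private static void putIntLE(byte[] dst, int off, int value) {
        dst[off] = (byte) value;
        dst[off + 1] = (byte) (value >>> 8);
        dst[off + 2] = (byte) (value >>> 16);
        dst[off + 3] = (byte) (value >>> 24);
    }

    public static void main(String[] args) throws IOException {
        byte[] plain = "some textual data".getBytes(StandardCharsets.UTF_8);
        // Round trip: compress via the input stream, decompress with GZIPInputStream.
        try (InputStream in = new GZIPInputStream(gzip(new ByteArrayInputStream(plain)))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            for (int n; (n = in.read(chunk)) != -1; ) {
                out.write(chunk, 0, n);
            }
            System.out.println(new String(out.toByteArray(), StandardCharsets.UTF_8));
        }
    }
}
```

    Everything is pulled by the reader, so no second thread and no full-size buffer is needed. Whether a particular JDBC driver accepts a stream of unknown length this way is a separate question that the sketch does not answer.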

  12. #12
    JosAH is offline Moderator

    Quote Originally Posted by johann_p View Post
    That is exactly the limitation I am facing. There are both input and output streams for compression and decompression in deflate format, but for some reason no input stream for GZIP compression and no output stream for GZIP decompression. This is a weird restriction.


    The problem is exactly that the library which needs the compressed data requires an input stream and not an output stream. This is because the JDBC library only has a method that accepts an input stream for writing a large binary field. It does not have a method that could hand back an output stream that I could use to write that data.
    Would it be an option to supply an ordinary InputStream that reads the GZIP compressed data (and leaves it compressed)? Hand that input stream over to your blob so the compressed data is stored in your database. If you want to decompress it, use a GZIPInputStream on the (compressed) data ...

    kind regards,

    Jos
    cenosillicaphobia: the fear for an empty beer glass

  13. #13
    johann_p is offline Member

    Would it be an option to supply an ordinary InputStream that reads the gzip compressed data (and leaves it compressed)?
    That would require that I first have the gzip compressed data. What I have is the plain text string. I could of course create a byte array that contains the compressed string and then hand over an input stream that reads from the byte array. With that solution, however, I would have to allocate space for the whole compressed data in my program, which I originally wanted to avoid (since I thought there had to be some purely stream-based solution).
    However, it seems this will actually end up being the only possible solution here.
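
    The two-step approach described above could look roughly like this. It is a minimal sketch: the class name BufferedGzip and its methods are hypothetical, and UTF-8 stands in for theEncoding from the first post. The string is compressed into a byte array with GZIPOutputStream, and a ByteArrayInputStream over that array is the stream that would be passed to setBinaryStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, for illustration only.
class BufferedGzip {

    /** Compresses the whole string into an in-memory GZIP byte array. */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8)); // assumed encoding
        } // closing the GZIPOutputStream writes the GZIP trailer
        return buffer.toByteArray();
    }

    /** Reverses gzip(): reads a GZIP stream back into a string. */
    static String gunzip(InputStream compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(compressed)) {
            byte[] chunk = new byte[4096];
            for (int n; (n = gz.read(chunk)) != -1; ) {
                out.write(chunk, 0, n);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = gzip("some textual data");
        // This is the stream that would be handed to setBinaryStream.
        InputStream forJdbc = new ByteArrayInputStream(compressed);
        // Stand-in for reading the blob back via getBinaryStream.
        System.out.println(gunzip(forJdbc));
    }
}
```

    The cost is the one already named in the post: the complete compressed data sits in memory before the driver reads any of it. In exchange, the length is known up front and no second thread is involved.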

  14. #14
    JosAH is offline Moderator

    Quote Originally Posted by johann_p View Post
    That would require that I first have the gzip compressed data. What I have is the plain text string. I could of course create a byte array that contains the compressed string and then hand over an input stream that reads from the byte array. With that solution, however, I would have to allocate space for the whole compressed data in my program, which I originally wanted to avoid (since I thought there had to be some purely stream-based solution).
    However, it seems this will actually end up being the only possible solution here.
    Maybe PipedInputStream and PipedOutputStream can do the job. You wrap the PipedOutputStream in a GZIPOutputStream so the data is compressed as it is written. The PipedInputStream (handed to your blob) then reads the compressed data.

    kind regards,

    Jos
    cenosillicaphobia: the fear for an empty beer glass

  15. #15
    johann_p is offline Member

    Yes, using PipedInputStream etc. has already been suggested by Toll above too.
    The problem with this approach in the context of JDBC is that one neither knows nor controls whether the JDBC library will use a different thread to read from the stream. So I am not totally sure how best to ensure that no deadlocks or other troubles can occur.
    Given all I have learned from your helpful comments here (and in the Oracle forums), I tend towards the two-step solution of first creating a buffer that contains the compressed data and then passing a stream that reads from that buffer.

  16. #16
    JosAH is offline Moderator

    Quote Originally Posted by johann_p View Post
    Yes, using PipedInputStream etc. has already been suggested by Toll above too.
    The problem with this approach in the context of JDBC is that one neither knows nor controls whether the JDBC library will use a different thread to read from the stream. So I am not totally sure how best to ensure that no deadlocks or other troubles can occur.
    Given all I have learned from your helpful comments here (and in the Oracle forums), I tend towards the two-step solution of first creating a buffer that contains the compressed data and then passing a stream that reads from that buffer.
    Simply write to your PipedOutputStream in a different Thread; that way you can be sure that the PipedInputStream is read in another Thread (whichever that may be).
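
    A minimal sketch of that suggestion, assuming a dedicated writer thread: the class name PipedGzip and the method gzipViaPipe are hypothetical, and UTF-8 is an assumed encoding. The writer thread pushes the string through a GZIPOutputStream into the pipe, and whoever holds the returned PipedInputStream reads the compressed bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, for illustration only.
class PipedGzip {

    /** Returns a stream of GZIP data that a background thread produces on demand. */
    static InputStream gzipViaPipe(final String text) throws IOException {
        PipedInputStream readEnd = new PipedInputStream();
        final PipedOutputStream writeEnd = new PipedOutputStream(readEnd);

        Thread writer = new Thread(() -> {
            // Closing the GZIPOutputStream writes the trailer and closes the
            // pipe, which is what signals end-of-stream to the reader.
            try (GZIPOutputStream gz = new GZIPOutputStream(writeEnd)) {
                gz.write(text.getBytes(StandardCharsets.UTF_8)); // assumed encoding
            } catch (IOException e) {
                // The reader went away or the pipe broke; this sketch just stops.
            }
        }, "gzip-pipe-writer");
        writer.setDaemon(true); // do not keep the JVM alive for an abandoned pipe
        writer.start();
        return readEnd;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the JDBC driver: read the compressed stream and decode it.
        try (InputStream in = new GZIPInputStream(gzipViaPipe("some textual data"))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            for (int n; (n = in.read(chunk)) != -1; ) {
                out.write(chunk, 0, n);
            }
            System.out.println(new String(out.toByteArray(), StandardCharsets.UTF_8));
        }
    }
}
```

    Because writing always happens in its own thread, it does not matter which thread the reader uses. An error in the writer only shows up on the reading side as a truncated or broken stream, so real code would need a way to report it.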

    kind regards,

    Jos
    cenosillicaphobia: the fear for an empty beer glass
