  1. #1
    andresfr84 is offline Member

    DOCX gets corrupted after several opening and closing

    Hello,
    First of all, sorry about my English. I hope someone can tell me whether there is a problem with my code or with the POI library. I'm using POI 3.9.

    Here is what happens:

    1.- I read a file with XWPFDocument:

    Java Code:
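    import java.io.File;
    import java.io.FileInputStream;
    import org.apache.poi.xwpf.usermodel.XWPFDocument;

    File docFile = new File(fileUrl);
    XWPFDocument doc;
    // try-with-resources closes the input stream once the document is loaded
    try (FileInputStream fis = new FileInputStream(docFile)) {
        doc = new XWPFDocument(fis);
    }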
    2.- I do some operations on the document, but the problem also occurs when I do nothing at all here, so I'll skip this step.

    3.- I write the file into disk:

    Java Code:
    // try-with-resources closes the output stream even if write() throws
    try (FileOutputStream out = new FileOutputStream(outFile)) {
        doc.write(out);
    }
    The size of the document changes, but that alone doesn't worry me. The problem is that the size increases with every modification. After about 14 or 15 "open-close" iterations, the program hangs on this line:

    Java Code:
    XWPFDocument doc = new XWPFDocument(fis);
    It takes 100% CPU and all of the available Java memory until an out-of-memory error is thrown. I think POI modifies the internal structure of the file, adding something to it on each save. If I open the file in Word and save it, it returns to its initial state and everything is OK, but I need to modify it automatically many times.

    Has anyone faced something similar?

  2. #2
    gimbal2 is offline Just a guy

    Re: DOCX gets corrupted after several opening and closing

    Indeed, high memory usage is a known drawback of POI; I also suffered from it a lot when using the Excel part of the API. Of course, Office documents are complex beasts.

    The document growing in size on each save is a clear indicator that something is up, though. Exactly how does the document grow? Does it double in size each time? The first idea (sometimes also known as a guess) that popped into my head is that this has to do with the versioning of the document going wrong. If you use Word, you know it tracks changes to the document, which makes the document grow over time even when you only remove something from it; perhaps this is related. Of course, I wouldn't know how or what: I personally only use POI indirectly, to generate new documents through JasperReports, not to modify existing ones (at least I think it uses POI; I'm not sure). I have always found the development of the .doc support too weak to really use it for production-worthy solutions.
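
    One way to answer the "how does it grow?" question without POI: a .docx file is a ZIP archive, so the size of each part inside it can be compared between two saves using only the JDK. The following is a minimal sketch; the class name DocxPartSizes and its methods are made up for illustration and are not part of POI.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper (not a POI class): compares the parts of two .docx files.
final class DocxPartSizes {

    /** Maps each part (ZIP entry) of the .docx to its uncompressed size in bytes. */
    static Map<String, Long> partSizes(Path docx) throws IOException {
        Map<String, Long> sizes = new TreeMap<>();
        try (ZipFile zip = new ZipFile(docx.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                sizes.put(entry.getName(), entry.getSize());
            }
        }
        return sizes;
    }

    /**
     * Returns, for every part present in "after", how many bytes it gained
     * (or lost, if negative) compared with "before". Unchanged parts are
     * omitted; parts that exist only in "before" are not reported.
     */
    static Map<String, Long> sizeChanges(Path before, Path after) throws IOException {
        Map<String, Long> old = partSizes(before);
        Map<String, Long> changes = new TreeMap<>();
        for (Map.Entry<String, Long> e : partSizes(after).entrySet()) {
            long delta = e.getValue() - old.getOrDefault(e.getKey(), 0L);
            if (delta != 0) {
                changes.put(e.getKey(), delta);
            }
        }
        return changes;
    }
}
```

    Calling sizeChanges on a copy taken before a save and on the file written after it shows which part is responsible for the growth, which narrows the search considerably before any POI code is touched.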
    Last edited by gimbal2; 01-14-2014 at 08:02 PM.
    "Syntactic sugar causes cancer of the semicolon." -- Alan Perlis

  3. #3
    andresfr84 is offline Member

    Re: DOCX gets corrupted after several opening and closing

    I've just talked to a POI developer, and he helped me. The issue was the footnotes: they were being duplicated on each save, so for the first 10 iterations or so it didn't have much effect. I've upgraded to 3.10-beta3 and it works OK.
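
    For anyone who wants to check whether a file is affected, the footnote duplication can be observed with the JDK alone by counting the footnote elements inside the archive. This is only a sketch under two assumptions: that the footnotes live in the conventional part word/footnotes.xml, and that counting w:footnote elements is a good enough signal. FootnoteCounter is a made-up name, not a POI class. The count includes the separator entries Word stores in the same part, so what matters is whether the number climbs after each save, not its absolute value.

```java
import java.io.InputStream;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper (not a POI class): counts footnote elements in a .docx.
final class FootnoteCounter {

    // WordprocessingML main namespace, bound to the "w" prefix by convention.
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Assumption: the footnotes part uses its conventional name.
    static final String FOOTNOTES_PART = "word/footnotes.xml";

    /** Returns the number of w:footnote elements, or 0 if the part is absent. */
    static int countFootnotes(Path docx) throws Exception {
        try (ZipFile zip = new ZipFile(docx.toFile())) {
            ZipEntry part = zip.getEntry(FOOTNOTES_PART);
            if (part == null) {
                return 0;
            }
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getElementsByTagNameNS
            try (InputStream in = zip.getInputStream(part)) {
                Document xml = factory.newDocumentBuilder().parse(in);
                return xml.getElementsByTagNameNS(W_NS, "footnote").getLength();
            }
        }
    }
}
```

    Running countFootnotes on the file after each open-save cycle should give a constant number on a fixed POI version; a number that grows on every cycle matches the symptom described above.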

    Thank you anyway :)
