  1. #1
    hangdtt

    Help me! Convert Word document to PDF

    I'm working on a project that converts Word documents to PDF. So far I have used Apache POI (the org.apache.poi.hwpf.extractor package) to get the text from the Word document:
    Java Code:
    import com.itextpdf.text.Document;
    import com.itextpdf.text.DocumentException;
    import com.itextpdf.text.Paragraph;
    import com.itextpdf.text.pdf.PdfWriter;
    import org.apache.poi.hwpf.HWPFDocument;
    import org.apache.poi.hwpf.extractor.WordExtractor;
    import org.apache.poi.poifs.filesystem.POIFSFileSystem;

    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;

    public class WordToPdf {
        public static void main(String[] args) throws IOException, DocumentException {
            String filename = args[0]; // path to the input .doc file (assumed to come from the command line)

            // Open the binary .doc file and extract its plain text with POI
            try (FileInputStream in = new FileInputStream(filename)) {
                HWPFDocument doc = new HWPFDocument(new POIFSFileSystem(in));
                WordExtractor we = new WordExtractor(doc);

                // Create the output PDF with iText
                Document document = new Document();
                PdfWriter.getInstance(document, new FileOutputStream("/Program Files/NCMSCT/HopDong.pdf"));
                document.open();

                String[] paragraphs = we.getParagraphText();
                for (int i = 0; i < paragraphs.length; i++) {
                    // Strip the trailing line ending from each paragraph
                    paragraphs[i] = paragraphs[i].replaceAll("\\cM?\r?\n", "");
                    System.out.println("Paragraph  " + i + ":  " + paragraphs[i]);
                    System.out.println("Length:" + paragraphs[i].length());
                    // Plain text only: formatting, tables, images and hyperlinks are not carried over
                    document.add(new Paragraph(paragraphs[i]));
                }
                document.close();
            }
        }
    }
    But I can't get objects such as hyperlinks, tables and images, or the formatting of the Word document. I also tried other libraries (jdoctopdf-0.9-beta.jar and tika-parsers-0.9-jdk14.jar), but they don't preserve all of the formatting either. If anyone knows a way to do this, please reply soon. Thanks, all!
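
    For anyone who wants to check the line-ending cleanup from the loop above without POI or iText on the classpath, here is a minimal standalone sketch. The class name ParagraphCleaner, the method name stripLineEnding and the sample strings are made up for illustration; only the regex is taken from the post. In a Java regex, \cM is the control character Ctrl-M, which is a carriage return, so the pattern removes "\n", "\r\n" and "\r\r\n" but leaves a lone "\r" untouched.

```java
public class ParagraphCleaner {

    // Same pattern as in the post: optional CR (written as \cM),
    // another optional CR, then a mandatory LF.
    public static String stripLineEnding(String paragraph) {
        return paragraph.replaceAll("\\cM?\r?\n", "");
    }

    public static void main(String[] args) {
        // Hypothetical sample paragraphs, not taken from a real document
        String[] samples = {"First paragraph\r\n", "Second paragraph\n", "Third\r"};
        for (String sample : samples) {
            String cleaned = stripLineEnding(sample);
            // Make any remaining carriage return visible in the console output
            System.out.println("[" + cleaned.replace("\r", "\\r") + "] length=" + cleaned.length());
        }
    }
}
```

    Because the pattern requires a line feed, a paragraph that ends in a bare carriage return keeps it, which may matter if the text is later passed to a PDF library.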

  2. #2

    Re: Help me! Convert Word document to PDF

    I am also looking for the same thing. Did you find a solution?

    Thanks

