Thread: Converting pdf to csv

  1. #1
    asai

    Here's my code:
    Java Code:
    package Classes;
    
    import com.itextpdf.text.Document;
    import com.itextpdf.text.DocumentException;
    import com.itextpdf.text.Paragraph;
    import com.itextpdf.text.pdf.AcroFields;
    import com.itextpdf.text.pdf.PRIndirectReference;
    import com.itextpdf.text.pdf.PRStream;
    import com.itextpdf.text.pdf.PRTokeniser;
    import com.itextpdf.text.pdf.PdfDictionary;
    import com.itextpdf.text.pdf.PdfName;
    import com.itextpdf.text.pdf.PdfReader;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.StringReader;
    
    public class Pdf2CsvConvert {
    
        public Pdf2CsvConvert() throws IOException, DocumentException {
            Document document = new Document();
            document.open();
            PdfReader reader = new PdfReader("C:\\Indiaops-projects\\PREMIUM_PAID_ACKNOWLEDGEMENT.pdf");
            PdfDictionary dictionary = reader.getPageN(1);
            AcroFields fields = reader.getAcroFields();
            PRIndirectReference reference = (PRIndirectReference) dictionary.get(PdfName.CONTENTS);
            PRStream stream = (PRStream) PdfReader.getPdfObject(reference);
            byte[] bytes = PdfReader.getStreamBytes(stream);
            PRTokeniser tokenizer = new PRTokeniser(bytes);
            FileOutputStream fos = new FileOutputStream("C:\\Indiaops-projects\\pdf.csv");
            StringBuffer buffer = new StringBuffer();
            StringBuffer data = new StringBuffer();
            int i = 0;
            while (tokenizer.nextToken()) {
                if (tokenizer.getTokenType() == PRTokeniser.TokenType.STRING) {
                    String value = tokenizer.getStringValue();

                    if ("x-none".equals(value)) {
                        String datastr = data.toString();
                        if (!"".equals(datastr)) {
                            buffer.append("\"" + datastr + "\",");
                            data = new StringBuffer();
                        }
                    } else {
                        data.append(value);
                    }
                }
            }
            String test = buffer.toString();
            StringReader stReader = new StringReader(test);
            int t;
            while ((t = stReader.read()) > 0)
                fos.write(t);
            document.add(new Paragraph(".."));
            document.close();
        }
    }
    But I get an error in the line:
    Java Code:
    PRTokeniser tokenizer = new PRTokeniser(bytes);
    The problem is:

    constructor PRTokeniser in class PRTokeniser cannot be applied to given types;
    required: RandomAccessFileOrArray
    found: byte[]
    reason: actual argument byte[] cannot be converted to RandomAccessFileOrArray by method invocation conversion
    ----
    I am a bit confused about why I am getting this error. Any suggestions?

  2. #2
    masijade

    You're feeding it a byte[], and it wants an instance of RandomAccessFileOrArray (which I assume is a class in that lib). Java will not convert between the two for you, so the constructor call does not compile.
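
    To illustrate the point above, here is a minimal sketch of the same situation using only the JDK. ByteSource, Tokeniser and WrapDemo are hypothetical stand-ins invented for this example; they mirror the shape of the problem but are not the iText classes. The fix is to build an instance of the declared parameter type around the raw array and pass that instead. With iText 5.4.x the equivalent wrapping is probably something like new RandomAccessFileOrArray(new RandomAccessSourceFactory().createSource(bytes)), but treat that as an assumption and check it against the javadoc for your version.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the wrapper type the constructor asks for
// (the role RandomAccessFileOrArray plays in the question). Not an iText class.
final class ByteSource {
    private final byte[] data;
    private int pos = 0;

    ByteSource(byte[] data) {
        this.data = data.clone();
    }

    /** Returns the next byte as 0..255, or -1 at end of input. */
    int read() {
        return pos < data.length ? (data[pos++] & 0xFF) : -1;
    }
}

// Hypothetical stand-in for the tokeniser (the role PRTokeniser plays). Not an iText class.
final class Tokeniser {
    private final ByteSource source;

    // This is the only constructor, so `new Tokeniser(someByteArray)` does not compile:
    // a byte[] is not a ByteSource and Java performs no implicit conversion.
    Tokeniser(ByteSource source) {
        this.source = source;
    }

    /** Counts the remaining bytes, just to show the wrapped data is reachable. */
    int countRemaining() {
        int n = 0;
        while (source.read() != -1) {
            n++;
        }
        return n;
    }
}

final class WrapDemo {
    static int tokeniseLength(byte[] bytes) {
        // Wrap the raw array in the declared parameter type, then pass the wrapper.
        Tokeniser tokeniser = new Tokeniser(new ByteSource(bytes));
        return tokeniser.countRemaining();
    }

    public static void main(String[] args) {
        byte[] bytes = "(Hello) Tj".getBytes(StandardCharsets.US_ASCII);
        System.out.println(tokeniseLength(bytes));
    }
}
```

    The compiler message in post #1 says exactly this: it lists the required type, the type it found, and notes that no conversion between them applies.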

  3. #3
    gimbal2

    Process that should have been followed before going to this forum:

    1) google "itext javadoc". Result: iText: API documentation
    2) go to the "core" javadoc: iText, a Free Java-PDF library 5.4.2 API
    3) go to the PRTokeniser class: PRTokeniser (iText, a Free Java-PDF library 5.4.2 API)
    4) go to the constructor summary
    5) done.

    And from there you can click on the RandomAccessFileOrArray class and see what that thing is all about. You won't learn much from the javadoc, so that is a dead end for now. So on to the next phase of research:

    6) google "itext PRTokeniser example". Result: iText in Action: example part4.chapter15.ParsingHelloWorld
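
    Separately from the constructor error, the CSV-writing part of the code in post #1 has a few problems that will show up once it compiles: every field is followed by a comma (so the row ends with a trailing one), quotes inside a value are not escaped, the read loop tests > 0 although StringReader.read() signals end of input with -1, FileOutputStream.write(int) writes only the low 8 bits of each char, and the stream is never closed. Below is a rough sketch of one way to write the row using only the JDK. CsvRowWriter and its method names are made up for this example, and UTF-8 with a CRLF line ending is an assumption about what the consumer of the file expects.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for this example; not part of iText or the original code.
final class CsvRowWriter {

    /** Wraps one field in double quotes and doubles any embedded quotes. */
    static String quote(String field) {
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    /** Joins the quoted fields with commas, without a trailing comma. */
    static String toRow(List<String> fields) {
        return fields.stream()
                .map(CsvRowWriter::quote)
                .collect(Collectors.joining(","));
    }

    /** Writes a single row to the target file; the writer is closed automatically. */
    static void writeRow(Path target, List<String> fields) throws IOException {
        try (Writer out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            out.write(toRow(fields));
            out.write("\r\n");
        }
    }
}
```

    In the original loop, each completed datastr would be added to a List<String> instead of being appended to the buffer, and writeRow would be called once after the loop.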