Reading text using PDFBox
Hi Ranchers
I am using PDFBox to read the text from a PDF file and display the x-y coordinates of each character. For a simple file it works fine. But when the PDF contains text in different fonts, tables, graphs, etc., the output is jumbled: it reads the first two paragraphs, then the last paragraphs, then the third paragraph. Also, text written vertically is not read properly. For example, "market" is read as "mark", and then "et" is printed on the next line. Is there any solution? Kindly help.
Thanks
Umadas