Hi,

I'm trying to use PDFBox to load a large PDF document (> 1 GB):
Java Code:
File inputPdf = new File("c:\\DSHW-2011-007704.pdf");
PDFTextStripper stop = new PDFTextStripper();
//Pattern p=Pattern.compile("^(?!\\d{3}-?\\d{2}-?\\d-?\\d{3}$).*");
Pattern p = Pattern.compile("(\\d{3}-\\d{2}-\\d{4})((?:.(?!\\d{3}-\\d{2}-\\d{4}))*)", Pattern.DOTALL);

// Load the document from a stream; force=true keeps parsing past corrupt objects
FileInputStream fis = new FileInputStream(inputPdf);
PDDocument pd = PDDocument.load(fis, true);
This code works fine for smaller PDFs, but on larger ones I get:

org.apache.pdfbox.exceptions.WrappedIOException
at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:245)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1192)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1159)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1130)
at PDFRedact.main(PDFRedact.java:19)
Caused by: java.lang.IndexOutOfBoundsException: Index: 15625, Size: 15625
at java.util.ArrayList.RangeCheck(Unknown Source)
at java.util.ArrayList.get(Unknown Source)
at org.apache.pdfbox.io.RandomAccessBuffer.seek(RandomAccessBuffer.java:84)
at org.apache.pdfbox.io.RandomAccessFileOutputStream.write(RandomAccessFileOutputStream.java:106)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
at java.io.FilterOutputStream.close(Unknown Source)
at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:610)
at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:568)
at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:188)
... 4 more

Any ideas or help would be appreciated.