Using Apache POI HWPF to read a .doc file
Hi...!
I downloaded Apache HWPF. I want to use it to read a .doc file, get its text, and write that text into another file, such as a .txt file.
I don't know HWPF very well.
My very simple program is at the end of this post.
I have 3 problems now:
1-Some of the packages have errors (they can't find Apache hdf). I also downloaded Apache Commons and Apache model, but now it needs these:
- org.apache.log4j
- org.apache.avalon
- junit.framework.Test
- junit.framework.TestSuite
- org.apache.log
- org.slf4j
2-How can I use the methods of HWPF to find and extract the images?
3-Some pieces of my program are incomplete and incorrect, so please help me to complete it.
I have to complete this program in 2 days.
Once again, I repeat: please help me to complete this.
Thank you guys a lot for your help!!!
This is my elementary code:
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.hwpf.model.PicturesTable;
import org.apache.poi.hwpf.usermodel.Picture;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
public void m1() throws FileNotFoundException, IOException {
    String filesname = "Hello.doc";
    POIFSFileSystem fs = null;
    fs = new POIFSFileSystem(new FileInputStream(filesname));
    HWPFDocument doc = new HWPFDocument(fs);
    WordExtractor we = new WordExtractor(doc);
    String str = we.getText();
    String[] paragraphs = we.getParagraphText();
    // The parts below are the ones I don't know how to fill in.
    Picture pic = new Picture(. . .);
    pic.writeImageContent(. . .);
    PicturesTable picTable = new PicturesTable(. . .);
    if (picTable.hasPicture(. . .)) {
        picTable.extractPicture(..., ...);
        picTable.getAllPictures();
    }
}
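For reference, here is a minimal sketch of what the finished program could look like. It makes several assumptions: a reasonably recent Apache POI release (one where HWPFDocument can be used in try-with-resources), with the main poi jar and the poi-scratchpad jar (where HWPF lives) on the classpath. The class name DocToText, the helper names writeText, pictureFileName and convert, and the output names Hello.txt and pictures are made up for illustration. Instead of constructing Picture or PicturesTable by hand, the sketch asks the document for its pictures table with doc.getPicturesTable().

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.hwpf.usermodel.Picture;

// Hypothetical class name; not part of Apache POI.
class DocToText {

    // Writes the extracted text to a UTF-8 encoded file.
    static void writeText(Path target, String text) throws IOException {
        Files.write(target, text.getBytes(StandardCharsets.UTF_8));
    }

    // Builds an output file name for the picture at the given index.
    // Falls back to "bin" when no extension is suggested.
    static String pictureFileName(int index, String extension) {
        String ext = (extension == null || extension.isEmpty()) ? "bin" : extension;
        return "picture-" + index + "." + ext;
    }

    // Reads a .doc file, saves its text, and saves every embedded picture.
    static void convert(Path docFile, Path textFile, Path pictureDir) throws IOException {
        try (InputStream in = Files.newInputStream(docFile);
             HWPFDocument doc = new HWPFDocument(in)) {

            // Text extraction: getText() returns the whole document as one String.
            WordExtractor extractor = new WordExtractor(doc);
            writeText(textFile, extractor.getText());

            // Picture extraction: the document supplies its own pictures table.
            Files.createDirectories(pictureDir);
            List<Picture> pictures = doc.getPicturesTable().getAllPictures();
            int index = 0;
            for (Picture picture : pictures) {
                String name = pictureFileName(index++, picture.suggestFileExtension());
                try (OutputStream os = Files.newOutputStream(pictureDir.resolve(name))) {
                    picture.writeImageContent(os);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // "Hello.doc" comes from the code above; the other two names are assumptions.
        convert(Paths.get("Hello.doc"), Paths.get("Hello.txt"), Paths.get("pictures"));
    }
}
```

Run from the folder that contains Hello.doc, this should produce Hello.txt plus one file per embedded image in a pictures folder. If you need the text paragraph by paragraph, getParagraphText() on the same WordExtractor returns a String[] that you could write out line by line instead.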