Thread: Help with an IndexWriter for n-grams
- 03-26-2012, 04:27 PM #1
I want to create an IndexWriter that indexes character n-grams. For example, given the text "Lucene is a great language", I want to apply n-grams with n=3 to get: Luc uce cen ene is a gre rea eat ...
My code:
IndexWriter writer = new IndexWriter(INDEX_DIR, new PositionalPorterStopAnalyzer(), true, IndexWriter.MaxFieldLength.UNLIMITED);
writer.setMaxFieldLength(100000);
// Reader that feeds the n-gram tokenizer
Reader reader = new FileReader(f);
Document doc = new Document();
// minGram = 3 and maxGram = 3, so only 3-character grams are produced
NGramTokenizer token = new NGramTokenizer(reader, 3, 3);
// The "contents" field gets its own reader over the same file
doc.add(new Field("contents", new FileReader(f)));
doc.add(new Field("vector", token, Field.TermVector.YES));
With the code above, the index only contains tokens of 3 extracted characters, but they are not the n-grams I expect.
Can anyone help me with this issue? Why does the NGramTokenizer above only extract 3 characters instead of producing the 3-character n-grams shown in my example?
Thanks very much in advance for your help.
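To make the expected output concrete, here is a minimal sketch in plain Java, without Lucene, of the per-word 3-grams described above. The class name WordNGrams and the method ngrams are hypothetical names chosen for illustration. Keeping words shorter than n whole (such as "is" and "a") is an assumption taken from the example output, not something Lucene's NGramTokenizer is claimed to do.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Lucene.
class WordNGrams {

    // Returns the character n-grams of each whitespace-separated word.
    // Assumption: words with at most n characters are kept whole,
    // matching "is" and "a" in the example from the post.
    static List<String> ngrams(String text, int n) {
        List<String> grams = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // input was empty or only whitespace
            }
            if (word.length() <= n) {
                grams.add(word);
                continue;
            }
            // Slide a window of n characters across the word
            for (int i = 0; i + n <= word.length(); i++) {
                grams.add(word.substring(i, i + n));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // Prints: [Luc, uce, cen, ene, is, a, gre, rea, eat, lan, ang, ngu, gua, uag, age]
        System.out.println(ngrams("Lucene is a great language", 3));
    }
}
```

The grams never cross word boundaries here, which is the behaviour the example in the post implies.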