- 09-06-2009, 11:42 AM #1
Does Lucene allow splitting a document's tokens across several indexes?
Hi,
(1)
I'm trying to make some changes to Lucene Java to provide better support for Farsi. The problem I'm working on now is searching for both the simplified (accent-removed) form and the original form of a phrase at the same time. To do this, I plan to store the original form of the file content in a primary index, and store only the accent-removed form of accented words in a small secondary index. That way, I can use a simple index searcher for the exact-form search and a multiple-index searcher for the accent-removed search.
Now my question is: does Lucene allow a document's tokens to be split across several indexes? And if so, can it load the different parts from those indexes into one internal document object?
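To make the split in (1) concrete, here is a minimal plain-Java sketch that uses no Lucene classes. It assumes that "accent" means the Arabic-script short-vowel marks in the range U+064B to U+0652, which may be narrower than what a real Farsi analyzer needs. The names `AccentSplitSketch`, `fold` and `secondaryTerms` are hypothetical and only illustrate which words would end up in the secondary index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class AccentSplitSketch {
    // Assumption: the "accents" are the short-vowel marks U+064B..U+0652.
    private static final Pattern MARKS = Pattern.compile("[\\u064B-\\u0652]");

    /** Returns the word with the assumed accent marks removed. */
    static String fold(String word) {
        return MARKS.matcher(word).replaceAll("");
    }

    /**
     * Returns the folded form of every word that actually carries an accent.
     * Unaccented words are skipped, so only these terms would be sent to the
     * small secondary index; the primary index keeps the full original text.
     */
    static List<String> secondaryTerms(String text) {
        List<String> out = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            String folded = fold(word);
            if (!folded.equals(word)) {
                out.add(folded);
            }
        }
        return out;
    }
}
```

Note that this sketch drops the word positions, which is exactly the problem raised in (2).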
(2)
I need to make some changes to the index creation process, too. I think changing the whole hierarchy of calls initiated by the IndexWriter.addDocument method would be error-prone. It would be much easier if I could just ask the secondary index writer to store the required tokens (the simplified forms of accented words).
To do this, I first created extensions of the Token, Tokenizer and Filter classes that let me hold two string values for each token. However, since tokenizing takes place deep inside the IndexWriter.addDocument call hierarchy, this seems useless. Also, if I preprocess the extracted text and send only the desired words to the secondary index writer, the position information (and similar data) will be wrong. Any ideas on how to do this?
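One way to look at the position problem in (2): Lucene tokens carry a position increment, and an increment of 0 places a token at the same position as the previous one. The sketch below is plain Java, not Lucene code, and only models that idea. Each original word is emitted with increment 1, and the folded form of an accented word follows with increment 0, so both forms share a position. `DualFormTokens`, `Tok`, `expand` and `positions` are hypothetical names, and `fold` assumes the accents are the marks U+064B to U+0652.

```java
import java.util.ArrayList;
import java.util.List;

final class DualFormTokens {
    /** A term plus its position increment relative to the previous token. */
    static final class Tok {
        final String term;
        final int positionIncrement;

        Tok(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    // Assumption: the "accents" are the short-vowel marks U+064B..U+0652.
    static String fold(String word) {
        return word.replaceAll("[\\u064B-\\u0652]", "");
    }

    /** Emits each word, followed by its folded form at the same position when it differs. */
    static List<Tok> expand(List<String> words) {
        List<Tok> out = new ArrayList<>();
        for (String word : words) {
            out.add(new Tok(word, 1));
            String folded = fold(word);
            if (!folded.equals(word)) {
                out.add(new Tok(folded, 0)); // increment 0: same position as the original
            }
        }
        return out;
    }

    /** Converts increments to absolute positions, starting at 0. */
    static List<Integer> positions(List<Tok> tokens) {
        List<Integer> out = new ArrayList<>();
        int position = -1;
        for (Tok token : tokens) {
            position += token.positionIncrement;
            out.add(position);
        }
        return out;
    }
}
```

Because the folded form never advances the position, phrase and proximity information stays consistent between the two forms. Whether this fits inside a custom filter in a given Lucene version would need to be checked against that version's API.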
Regards,
Zakeri
- 11-04-2009, 05:58 PM #2
1. I am not quite clear why you would store the accent-removed form in a separate index. If it's stored in the same index under a different field name, your job will be easier.
2. I did not understand this question.
Have fun....
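The suggestion in point 1 can be illustrated with a tiny in-memory inverted index in plain Java. This is only a sketch of the idea, not Lucene's API: one index holds two fields per document, one with the original words and one with the accent-removed words. The field names `content` and `content_folded`, the class `TwoFieldIndexSketch` and its methods are all hypothetical, and `fold` assumes the accents are the marks U+064B to U+0652.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class TwoFieldIndexSketch {
    // field name -> term -> ids of the documents containing that term
    private final Map<String, Map<String, Set<Integer>>> postings = new HashMap<>();

    // Assumption: the "accents" are the short-vowel marks U+064B..U+0652.
    static String fold(String word) {
        return word.replaceAll("[\\u064B-\\u0652]", "");
    }

    /** Indexes every word twice: as written, and with accents removed. */
    void add(int docId, String text) {
        for (String word : text.trim().split("\\s+")) {
            post("content", word, docId);
            post("content_folded", fold(word), docId);
        }
    }

    private void post(String field, String term, int docId) {
        postings.computeIfAbsent(field, f -> new HashMap<>())
                .computeIfAbsent(term, t -> new TreeSet<>())
                .add(docId);
    }

    /** Exact-form search: the query must match the stored accents. */
    Set<Integer> searchExact(String word) {
        return lookup("content", word);
    }

    /** Accent-insensitive search: the query is folded before lookup. */
    Set<Integer> searchFolded(String word) {
        return lookup("content_folded", fold(word));
    }

    private Set<Integer> lookup(String field, String term) {
        Map<String, Set<Integer>> byTerm = postings.get(field);
        if (byTerm == null) {
            return Collections.<Integer>emptySet();
        }
        Set<Integer> docs = byTerm.get(term);
        return docs == null ? Collections.<Integer>emptySet() : docs;
    }
}
```

With this layout both searches run against a single index, and each field is tokenized from the same text, so positions line up. The cost is that unaccented words are stored twice, which is the trade-off against the smaller secondary index described in the question.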