  1. #1

    Does Lucene allow splitting a document's tokens between several indexes?

    Hi,

    (1)
    I'm trying to make some changes to Lucene Java to provide better support for Farsi. The problem I'm working on now is searching for both the simplified (accent-removed) form and the original form of a phrase at the same time. To do this, I plan to store the original form of the file content in a primary index, and store only the accent-removed form of the accented words in a small secondary index. That way, I can use a simple index searcher for the exact-form search and a multiple-index searcher for the accent-removed search.

    My question is: does Lucene allow splitting a document's tokens between several indexes? And if so, can it load the different parts from those indexes into one internal document object?

    (2)
    I need to make some changes to the index creation process, too. I think it would be error-prone to change the whole hierarchy of calls initiated by the IndexWriter.addDocument method. It would be much easier if I could just ask the secondary index writer to store the required tokens (the simplified forms of the accented words).
    To do this, I first created extensions of the Token, Tokenizer and Filter classes that let me hold two string values for each token. However, since tokenizing takes place deep inside the IndexWriter.addDocument call hierarchy, this seems useless. Also, if I preprocess the extracted text and send only the desired words to the secondary index writer, the token positions and similar data will be wrong. Any ideas on how to do this?

    Regards,
    Zakeri
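
One way to picture the position concern in (2) is to emit the simplified form at the same position as the original word, instead of sending it to a second writer. The sketch below is plain Java with no Lucene dependency, so every name in it (`StackedTokenSketch`, `Tok`, `stripMarks`, `analyze`) is hypothetical. It also assumes that "accent" means the Arabic-script short-vowel marks U+064B to U+0652 and that words are separated by whitespace. Lucene tokens do carry a position increment, and an increment of 0 places a token on top of the previous one, so a custom filter inside the analyzer could plausibly do the same without touching the IndexWriter.addDocument call hierarchy.

```java
import java.util.ArrayList;
import java.util.List;

class StackedTokenSketch {
    /** A term plus its position increment (0 = same position as the previous token). */
    static final class Tok {
        final String term;
        final int positionIncrement;

        Tok(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    /** Assumption: "accent removal" means dropping the marks U+064B..U+0652. */
    static String stripMarks(String s) {
        return s.replaceAll("[\\u064B-\\u0652]", "");
    }

    /**
     * Splits on whitespace. Every word is emitted with increment 1; if the word
     * carries marks, its simplified form follows with increment 0, so both forms
     * share one position and phrase offsets stay consistent.
     */
    static List<Tok> analyze(String text) {
        List<Tok> out = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            out.add(new Tok(word, 1));
            String folded = stripMarks(word);
            if (!folded.equals(word)) {
                out.add(new Tok(folded, 0));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // First word carries a kasra (U+0650); the second word has no marks.
        String text = "\u06A9\u0650\u062A\u0627\u0628 \u062E\u0648\u0628";
        int position = 0;
        for (Tok t : analyze(text)) {
            position += t.positionIncrement;
            System.out.println(position + "\t" + t.term);
        }
    }
}
```

With this layout only the accented words produce an extra token, which keeps the overhead close to that of the small secondary index described in (1).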

  2. #2
    kzvi.kzvi.1

    1. It is not clear to me why you would store the accent-removed forms in a separate index. If they are stored in the same index under a different field name, your job will be easier.

    2. I did not understand this question.
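
The same-index suggestion in point 1 can be sketched as follows. This is a plain-Java model rather than Lucene code: the `Map` stands in for a document, the field names `content` and `content_folded` are made up for illustration, and substring matching stands in for a real query. As in the original question, "accent" is assumed here to mean the Arabic-script marks U+064B to U+0652. In Lucene itself the equivalent would presumably be two fields on one Document, possibly with a different analyzer per field (for example through PerFieldAnalyzerWrapper).

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TwoFieldSketch {
    static final String EXACT = "content";          // hypothetical field name
    static final String FOLDED = "content_folded";  // hypothetical field name

    /** Assumption: "accent removal" means dropping the marks U+064B..U+0652. */
    static String stripMarks(String s) {
        return s.replaceAll("[\\u064B-\\u0652]", "");
    }

    /** One document, two fields: the original text and its simplified form. */
    static Map<String, String> buildDocument(String text) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put(EXACT, text);
        doc.put(FOLDED, stripMarks(text));
        return doc;
    }

    /**
     * Exact search looks only at the original field. Accent-insensitive search
     * simplifies the query too and looks only at the folded field.
     */
    static boolean matches(Map<String, String> doc, String query, boolean exact) {
        if (exact) {
            return doc.get(EXACT).contains(query);
        }
        return doc.get(FOLDED).contains(stripMarks(query));
    }

    public static void main(String[] args) {
        Map<String, String> doc = buildDocument("\u06A9\u0650\u062A\u0627\u0628 \u062E\u0648\u0628");
        String plainQuery = "\u06A9\u062A\u0627\u0628"; // the first word without its kasra
        System.out.println("exact:  " + matches(doc, plainQuery, true));
        System.out.println("folded: " + matches(doc, plainQuery, false));
    }
}
```

Because both forms live in one document, there is no need to reassemble a document from several indexes at search time; the searcher only has to pick the field that fits the kind of search requested.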
