Thread: IndexSearcher
- 03-30-2011, 01:23 PM #1
Member
- Join Date
- Mar 2011
- Posts
- 18
- Rep Power
- 0
IndexSearcher
Hello, I am new to Lucene and have run into a problem.
I am creating an index and adding two .rtf documents to it. I assume the indexing part is correct, because once the index is created, numDocs() returns 2, which is right.
But when I search, I get 0 hits. I am using Lucene 3.0.3, and this is my code.
Any help would be useful. Thanks.
Java Code:
/*
 * To change this template, choose Tools | Templates
 * and open the template in the editor.
 */
package myThesis;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriter.MaxFieldLength;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.LockObtainFailedException;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

/**
 *
 * @author Andreas
 */
public class Lucene1{

    private Analyzer analyzer;
    private Directory directory;
    private IndexWriter iwriter;
    private Document doc;
    private IndexSearcher isearcher;
    private File dir;
    private String path;
    private FileInputStream is;
    private MaxFieldLength mlf;
    private Boolean b;
    private int originalNumDocs;

    public Lucene1(String search) throws FileNotFoundException, CorruptIndexException, LockObtainFailedException, IOException, ParseException{
        // Store the index in memory:
        // To store an index on disk, use this instead:
        //Directory directory = FSDirectory.open("/tmp/testindex");
        directory = new RAMDirectory();
        mlf = IndexWriter.MaxFieldLength.UNLIMITED;
        path = "C:/Users/Andreas/Documents/NetBeansProjects/docs/";
        dir = new File(path);
        StoreIndex(search);
    }

    public void StoreIndex(String searchString) throws CorruptIndexException, LockObtainFailedException, IOException, ParseException{
        analyzer = new StandardAnalyzer(Version.LUCENE_30);
        if (iwriter == null)
            b = true;
        else
            b = false;
        iwriter = new IndexWriter(directory, analyzer, b, mlf);
        System.out.println("Creating index with the following files ...");
        File[] files = dir.listFiles();
        originalNumDocs = iwriter.numDocs();
        for (File file : files) {
            //System.out.println(file);
            is = new FileInputStream(file);
            doc = new Document();
            path = file.getCanonicalPath();
            doc.add(new Field("path", path, Field.Store.YES, Field.Index.ANALYZED));
            Reader reader = new FileReader(file);
            doc.add(new Field("contents", reader,Field.TermVector.WITH_POSITIONS_OFFSETS));
            iwriter.addDocument(doc);
            System.out.println("Added: " + file);
            //System.out.println(iwriter.numDocs());
            //System.out.println(iwriter.numRamDocs());
        }
        iwriter.optimize();
        iwriter.close();
        System.out.println("Index has been created.");
        System.out.println();
        System.out.println((iwriter.numDocs() - originalNumDocs) + " documents added.");
        SearchIndex(searchString);
    }

    public void SearchIndex(String searchString) throws IOException, ParseException{
        System.out.println("Searching for '" + searchString + "'");
        IndexReader ireader = IndexReader.open(directory, true);
        isearcher = new IndexSearcher(ireader);
        analyzer = new StandardAnalyzer(Version.LUCENE_30);
        // Parse a simple query that searches for "text":
        QueryParser parser = new QueryParser(Version.LUCENE_30, "content", analyzer);
        // Search for documents that contain the word searchString
        Query query = parser.parse(searchString);
        TopScoreDocCollector collector = TopScoreDocCollector.create(1000, true);
        isearcher.search(query, collector);
        /* First parameter is the query to be executed and second parameter indicates the no of search results to fetch */
        //TopDocs topDocs = isearcher.search(query,1000);
        // Get an array of references to matched documents
        ScoreDoc[] hits = collector.topDocs().scoreDocs;
        System.out.println("Total hits: " + collector.getTotalHits());
        for (ScoreDoc scoredoc : hits) {
            //Retrieve the matched document and show relevant details
            Document hitDoc = isearcher.doc(scoredoc.doc);
            String path2 = hitDoc.get("path");
            System.out.println("Hit: " + path2);
            String path3 = hitDoc.get("content");
            System.out.println(path3);
        }
        isearcher.close();
        directory.close();
    }
}
Last edited by axenos; 03-30-2011 at 09:26 PM.
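One detail in the code above may explain the 0 hits: the text is indexed under the field name "contents", while the QueryParser is created with the default field "content". Lucene keeps terms per field, so a query against a field that was never indexed has nothing to match. The sketch below is not Lucene; it is a toy in-memory index (the class and method names are made up for illustration) that only shows how a one-letter difference in the field name is enough to return nothing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT Lucene. Terms are kept per field name,
// which is the one property this sketch is meant to illustrate.
class FieldNameDemo {
    // key = "field:term", value = ids of the documents containing that term
    private final Map<String, List<Integer>> postings = new HashMap<String, List<Integer>>();

    void add(int docId, String field, String text) {
        for (String term : text.toLowerCase().split("\\W+")) {
            if (term.isEmpty()) continue;
            String key = field + ":" + term;
            List<Integer> docs = postings.get(key);
            if (docs == null) {
                docs = new ArrayList<Integer>();
                postings.put(key, docs);
            }
            if (!docs.contains(docId)) docs.add(docId);
        }
    }

    List<Integer> search(String field, String term) {
        List<Integer> docs = postings.get(field + ":" + term.toLowerCase());
        return docs == null ? new ArrayList<Integer>() : docs;
    }

    public static void main(String[] args) {
        FieldNameDemo index = new FieldNameDemo();
        index.add(0, "contents", "public class User implements Serializable");
        index.add(1, "contents", "protected String username");
        System.out.println(index.search("content", "user"));  // field name used at search time
        System.out.println(index.search("contents", "user")); // field name used at index time
    }
}
```

If this is indeed the cause, using the same field name in the Field constructor and in the QueryParser should be enough to get hits.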
- 03-31-2011, 07:49 AM #2
Hi. Can you show an example of a file that you index, and a search word you use?
Skype: petrarsentev
http://TrackStudio.com
- 03-31-2011, 08:49 AM #3
Hi,
the document I want to parse was a .java file, and I changed its format to .txt.
These are the document's contents:
Java Code:
package katastimaperifereiakwn;

import java.io.Serializable;

/**
 * A class that legates its variables and functions to the
 * classes Admin, Employee, Manager. Its functions
 * can be used to create and initialize an object, get and set its
 * values.
 *
 * @author Xenos Andreas-1391, Neroutsos Efthimis-1515
 */
public class User implements Serializable {

    protected String firstName;
    protected String lastName;
    protected String username;
    protected String password;
    protected String post;
    protected String phoneNum;
    protected boolean k;

    /**
     * This function allows us to set the value of the variable k
     * which is used to control and check the flow of the user
     * (Admin, Employee, Manager) functions in the main class.
     * @param aK The value of the k (true/false)
     */
    public void setK(boolean aK) {
        k = aK;
    }

    /**
     * This function allows us to set the value of the variable k
     * which is used to control and check the flow of the user
     * (Admin, Employee, Manager) functions in the main class.
     * @return The value of the k.
     */
    public boolean getK() {
        return k;
    }

    /**
     * This is the constructor of the class User
     * @param first The first name of the user
     * @param last The last name of the user
     * @param user The username of the user
     * @param pass The password of the user
     * @param phone The phone number of the user
     */
    public User (String first,String last,String user,String pass,String phone) {
        firstName = first;
        lastName = last;
        username = user;
        password = pass;
        phoneNum = phone;
    }

    public User(){};

    /**
     * Use this function to set the first name of the user
     * @param first The first name of the user
     */
    public void setFirstName(String first) {
        firstName = first;
    }

    /**
     * Use this function to get the first name of the user
     * @return The first name of the user
     */
    public String getFirstName() {
        return firstName;
    }

    /**
     * Use this function to set the last name of the user
     * @param last The last name of the user
     */
    public void setLastName(String last) {
        lastName = last;
    }

    /**
     * Use this function to get the last name of the user
     * @return The last name of the user
     */
    public String getLastName() {
        return lastName;
    }

    /**
     * Use this function to set the username of the user
     * @param user The username of the user
     */
    public void setUsername(String user) {
        username = user;
    }

    /**
     * Use this function to get the username of the user
     * @return The username of the user
     */
    public String getUsername() {
        return username;
    }

    /**
     * Use this function to set the password of the user
     * @param pass The password of the user
     */
    public void setPassword(String pass) {
        password = pass;
    }

    /**
     * Use this function to get the password of the user
     * @return The password of the user
     */
    public String getPassword() {
        return password;
    }

    /**
     * Use this function to set the phone number of the user
     * @param phone The phone number of the user
     */
    public void setPhoneNum(String phone) {
        phoneNum = phone;
    }

    /**
     * Use this function to get the phone number of the user
     * @return The phone number of the user
     */
    public String getPhoneNum() {
        return phoneNum;
    }

    /**
     * Use this function to set the post of the user
     * @param pos The post of the user
     */
    public void setIdiotita(String pos) {
        post = pos;
    }

    /**
     * Use this function to get the post of the user
     * @return The post of the user
     */
    public String getIdiotita() {
        return post;
    }
}
Anyway, this document is just an example. I want to be able to parse any document, especially .java files.
The search word I am using is 'user', and it gives me 0 hits.
I am also adding another file to the index, similar to this one.
Have a look if you can, and if you need anything else, please let me know.
OK, thank you again. Bye.
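For the search word 'user', it may help to keep in mind that an analyzer turns text into whole terms, and a plain term query matches whole terms only. The sketch below is a rough stand-in for that behaviour and not StandardAnalyzer itself, whose tokenization rules are more involved: it simply lowercases the text and splits on anything that is not a letter or digit. Under that assumption, 'user' matches the class name User but not the word username; a wildcard query such as user* would be needed for the latter.

```java
import java.util.ArrayList;
import java.util.List;

// Rough approximation of text analysis: lowercase, then split on anything
// that is not a letter or digit. This is an assumption made for illustration,
// not the actual grammar used by StandardAnalyzer.
class TermMatchDemo {
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<String>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty()) terms.add(t);
        }
        return terms;
    }

    // A term query matches only when a whole term is equal to the query.
    static boolean matchesTerm(String text, String query) {
        return tokenize(text).contains(query.toLowerCase());
    }

    public static void main(String[] args) {
        System.out.println(tokenize("public class User implements Serializable"));
        System.out.println(matchesTerm("public class User", "user"));
        System.out.println(matchesTerm("protected String username;", "user"));
    }
}
```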
- 03-31-2011, 09:24 AM #4
I tried writing the code differently. I can't really see what the difference between this version and the original is.
This code finds some hits, but not all of them, and the line with d.get("content") prints null.
Java Code:
package myThesis;

import java.io.File;
import java.io.FileReader;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.*;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import java.io.IOException;
import java.io.Reader;
import java.util.Scanner;

public class Lucene2 {

    private File dir;
    private String path;

    public void RunMe() throws IOException, ParseException {
        // 0. Specify the analyzer for tokenizing text.
        // The same analyzer should be used for indexing and searching
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);

        // 1. create the index
        Directory index = new RAMDirectory();
        path = "C:/Users/Andreas/Documents/NetBeansProjects/docs/";
        dir = new File(path);
        // the boolean arg in the IndexWriter ctor means to
        // create a new index, overwriting any existing index
        IndexWriter iwriter = new IndexWriter(index, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED);
        File[] files = dir.listFiles();
        for (File file : files) {
            Reader reader = new FileReader(file);
            addDoc(iwriter, reader);
        }
        iwriter.optimize();
        iwriter.close();

        // 2. query
        System.out.print("Query: ");
        Scanner input = new Scanner(System.in);
        String query = input.next();
        Query q = new QueryParser(Version.LUCENE_30, "content", analyzer).parse(query);

        // 3. search
        int hitsPerPage = 1000;
        IndexSearcher searcher = new IndexSearcher(index, true);
        TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
        searcher.search(q, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;

        // 4. display results
        System.out.println("Found " + hits.length + " hits.");
        for(int i=0;i<hits.length;++i) {
            Document d = searcher.doc(hits[i].doc);
            System.out.println((i + 1) + ". " + d.get("content"));
        }

        // searcher can only be closed when there
        // is no need to access the documents any more.
        searcher.close();
    }

    private void addDoc(IndexWriter w, Reader reader) throws IOException {
        Document doc = new Document();
        doc.add(new Field("content", reader,Field.TermVector.WITH_POSITIONS_OFFSETS));
        w.addDocument(doc);
    }
}
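The null printed by d.get("content") is probably expected behaviour rather than a bug in the search. A Field created from a Reader is indexed and tokenized but not stored, and Document.get only returns stored values. The following toy class is not Lucene (all names in it are made up); it just separates "searchable" from "retrievable" to show how a document can match a query and still return null for the matching field.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: NOT Lucene. A document keeps its searchable terms and
// its retrievable (stored) values in two separate places.
class StoredVsIndexedDemo {
    private final Set<String> indexedTerms = new HashSet<String>();
    private final Map<String, String> storedValues = new HashMap<String, String>();

    // store == false mimics a field that is searchable but not retrievable.
    void addField(String name, String value, boolean store) {
        for (String term : value.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) indexedTerms.add(name + ":" + term);
        }
        if (store) storedValues.put(name, value);
    }

    boolean matches(String field, String term) {
        return indexedTerms.contains(field + ":" + term.toLowerCase());
    }

    // Returns null when the field was never stored.
    String get(String field) {
        return storedValues.get(field);
    }

    public static void main(String[] args) {
        StoredVsIndexedDemo doc = new StoredVsIndexedDemo();
        doc.addField("id", "User.txt", true);
        doc.addField("content", "public class User", false);
        System.out.println(doc.matches("content", "user")); // the field is searchable
        System.out.println(doc.get("content"));             // but it has no stored value
        System.out.println(doc.get("id"));                  // stored fields come back
    }
}
```

As for "some hits, not all of them": hits are counted per matching document, not per occurrence of the word, so with two files in the index the count cannot exceed 2.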
- 03-31-2011, 09:57 AM #5
That's cool. You posted three code examples, but none of them has a main method. :) Can you show an example that I can run?
- 03-31-2011, 06:46 PM #6
OK, you're right. For now, I have a main class that only creates objects of these two (attempted) Lucene classes. So, for the two classes, I have this main:
Java Code:
package myThesis;

import java.io.IOException;
import java.sql.SQLException;
import java.util.Scanner;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.queryParser.ParseException;

/**
 *
 * @author Xenos Andreas
 */
public class Main {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) throws IOException, SQLException, CorruptIndexException, ParseException {
        // System.out.print("Query: ");
        //query = input.next();
        //System.out.println();
        //Lucene1 luc = new Lucene1(query);
        Lucene2 l = new Lucene2();
        l.RunMe();
    }
}
I had some more code in the main class, but I removed it here because it had to do with JDBC and Apache BCEL. Here, I am just running Lucene.
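A side note on how the query is read in Lucene2.RunMe(): Scanner.next() returns only the next whitespace-delimited token, so a query typed as two words reaches the QueryParser as the first word alone. The small sketch below shows the difference between next() and nextLine(); it scans a fixed string instead of System.in so that it can be run as-is, and the class name is only for illustration.

```java
import java.util.Scanner;

class ScannerQueryDemo {
    // Reads a single token, like input.next() in the posted code.
    static String readToken(String typed) {
        return new Scanner(typed).next();
    }

    // Reads the whole line, which keeps a multi-word query intact.
    static String readLine(String typed) {
        return new Scanner(typed).nextLine();
    }

    public static void main(String[] args) {
        String typed = "first name";
        System.out.println(readToken(typed));
        System.out.println(readLine(typed));
    }
}
```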
- 04-01-2011, 09:51 AM #7
Oh, I confused myself. I don't understand why it happens, but the full file content cannot be retrieved through the IndexSearcher.
I changed your code a little. This is class Lucene2:
Java Code:
for (File file : files) {
    addDoc(iwriter, file);
}
...
for(int i=0;i<hits.length;++i) {
    Document d = searcher.doc(hits[i].doc);
    System.out.println((i + 1) + ". " + d.get("id"));
}
...
private void addDoc(IndexWriter w, File file) throws IOException {
    Document doc = new Document();
    doc.add(new Field("id", file.getName(), Field.Store.YES, Field.Index.NOT_ANALYZED));
    doc.add(new Field("content",new FileReader(file), Field.TermVector.WITH_POSITIONS_OFFSETS));
    w.addDocument(doc);
}
Now it works correctly, but I am still confused about why this statement does not work:
Java Code:
System.out.println("Hit: " + document.get(FIELD_CONTENTS));
Hope this helps.
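A possible explanation for the statement that prints null: the "content" field is built from a Reader, and Reader-valued fields are indexed but not stored, so Document.get has nothing to return for them. If the text should come back together with the hit, one option is to read the file into a String first and add it with the String-based Field constructor and Field.Store.YES, the same constructor already used for "id". The helper below sketches only the reading step; readAll is a made-up name, and the Lucene call appears only in a comment.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ReadAllDemo {
    // Drains a Reader into a String so that the text can be stored, e.g.
    //   doc.add(new Field("content", text, Field.Store.YES, Field.Index.ANALYZED));
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        try {
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } finally {
            reader.close(); // always release the underlying file handle
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // StringReader stands in for new FileReader(file)
        System.out.println(readAll(new StringReader("public class User {}")));
    }
}
```

Storing the whole text makes the index larger, so keeping only the path or file name as a stored field and reopening the file on a hit, as in the "id" approach, remains a reasonable alternative.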
- 04-01-2011, 10:23 PM #8