- 01-11-2011, 11:46 AM #1
How to search Russian text in a Lucene index?
Hello.
I cannot understand where I went wrong. In my code below, "/home/test/03m8894---20070213134234.txt" is a file with English text, and "/home/test/01---20061121103506.txt" is a file with Russian text. Both files are encoded in UTF-8. The output of the program is:
1
0
That is, the program finds only the English text, and the Russian text is ignored. However, if I print the partnum field as shown here, the text is displayed correctly, with no encoding errors in the output on the screen.
Java Code:
for (int m = 0; m < totalDocs; m++) {
    Document thisDoc = reader.document(m);
    System.out.print(thisDoc.get("partnum"));
}
Java Code:
RAMDirectory directory = new RAMDirectory();
IndexWriter writer =
    //new IndexWriter(directory, new SimpleAnalyzer(), true, IndexWriter.MaxFieldLength.UNLIMITED);
    new IndexWriter(directory, new RussianAnalyzer(Version.LUCENE_30), true, IndexWriter.MaxFieldLength.UNLIMITED);
File f1[] = {new File("/home/test/03m8894---20070213134234.txt"),
             new File("/home/test/01---20061121103506.txt")};
String strLine1 = "";
for (int x = 0; x < f1.length; x++) {
    Document doc = new Document();
    int length = (int) f1[x].length();   // file size in bytes, not characters
    if (length != 0) {
        char[] cbuf = new char[length];
        // No charset is passed here, so the platform default encoding is used.
        InputStreamReader isr = new InputStreamReader(new FileInputStream(f1[x]));
        // A single read() call; it is not guaranteed to fill the buffer.
        final int read = isr.read(cbuf);
        strLine1 = new String(cbuf, 0, read);
        isr.close();
        // The whole file content is indexed as one un-analyzed term.
        doc.add(new Field("partnum", strLine1, Field.Store.YES, Field.Index.NOT_ANALYZED));
        //doc.add(new Field("description", "Illidium Space Modulator", Field.Store.YES, Field.Index.ANALYZED));
        writer.addDocument(doc);
    }
}
writer.close();

IndexSearcher searcher = new IndexSearcher(directory);
IndexReader reader = searcher.getIndexReader();
int totalDocs = reader.numDocs();
for (int m = 0; m < totalDocs; m++) {
    Document thisDoc = reader.document(m);
    String tmp_str = thisDoc.get("partnum");
    // Exact-match query using the stored field value as a single term.
    Query query = new TermQuery(new Term("partnum", tmp_str));
    TopDocs rs = searcher.search(query, null, 10);
    System.out.println(rs.totalHits);   // number of hits for this document's own text
}
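One thing that may be worth ruling out is the way the files are read, since the code above relies on the platform default encoding and on a single read() call. The sketch below is only an illustration and is not claimed to be the cause of the missing hit: it decodes a stream explicitly as UTF-8 and loops until the end of the stream. It uses only the standard library, and the names Utf8ReadSketch and readAllUtf8 are hypothetical. It also shows that Russian text takes more bytes than characters in UTF-8, so a file's byte length is not its character count.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Lucene or of the code in the post.
class Utf8ReadSketch {

    // Reads the whole stream as UTF-8 text, looping until end of stream
    // instead of trusting one read() call to return everything.
    static String readAllUtf8(InputStream in) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "Privet" in Cyrillic, written with Unicode escapes to keep the source ASCII-only.
        String russian = "\u041f\u0440\u0438\u0432\u0435\u0442";
        byte[] bytes = russian.getBytes(StandardCharsets.UTF_8);
        String decoded = readAllUtf8(new ByteArrayInputStream(bytes));
        // Each Cyrillic letter here takes two bytes in UTF-8.
        System.out.println(bytes.length + " bytes, " + decoded.length() + " chars");
        System.out.println(decoded.equals(russian));
    }
}
```

For a file, the same method could be called with new FileInputStream(f1[x]) in place of the in-memory stream, and the returned string passed to the Field constructor as before.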