saurabh01, 07-01-2009, 08:53 AM
Unicode string search problem
Hi,
I am new to Java and have a problem with a Unicode string. I have a Unicode file in which I am searching for a few words or sentences. The search string is defined as
String StringToBeSearch = "Because the Agent software is already";
When I open the file and compare this string with the file content, the search does not succeed, even though the contents are there when I open that Unicode text file.
When I print every line of the txt file, I see that every letter is followed by an extra space; for example, the word “Because” is printed as “B e c a u s e”. Hence the search fails.

public static boolean ReadFileAndSearchString()
{
String FileName = "Install.txt";
String StringToBeSearch = "Because the Agent software is already";

boolean found = false;

File file = new File(FileName);
FileInputStream fis = null;
BufferedInputStream bis = null;
DataInputStream dis = null;

try {
fis = new FileInputStream(file);

// Here BufferedInputStream is added for fast reading.
bis = new BufferedInputStream(fis);
dis = new DataInputStream(bis);

// dis.available() returns 0 if the file does not have more lines.
while (dis.available() != 0) {

// this statement reads the line from the file and print it to
// the console.
String line = dis.readLine();
System.out.println(line);

int ret = line.indexOf(StringToBeSearch);
if ( ret >= 0 )
{
System.out.println("Got The text");
found = true;
break;
}
}

// dispose all the resources after using them.
fis.close();
bis.close();
dis.close();

} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}

return found;
}

I also tried the following code:

FileInputStream fis = new FileInputStream(FileName);
InputStreamReader isr = new InputStreamReader(fis, "UTF8");

but it did not work.

I am attaching my txt file.
Attached Files: install.txt (3.4 KB)
OrangeDog, 07-01-2009, 07:32 PM
Probably because you're using a DataInputStream. Just stick with the BufferedInputStream.

Also, you only need to close the outer-wrapping stream.
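
For anyone wondering where the extra "spaces" come from, here is a minimal sketch. It assumes the file is UTF-16, which the “B e c a u s e” output suggests but which cannot be confirmed without the attachment. DataInputStream.readLine() turns each byte into one char, so the zero byte that UTF-16 stores next to every ASCII letter ends up in the string. A plain byte stream does not decode text either, so a Reader with the right charset is still needed. The class and method names below are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class SpacedOutDemo {
    // Mimics DataInputStream.readLine(): every byte is widened to one char.
    static String bytePerChar(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append((char) (b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String needle = "Because";
        // In UTF-16LE each ASCII letter is followed by a zero byte.
        byte[] utf16 = needle.getBytes(StandardCharsets.UTF_16LE);

        String naive = bytePerChar(utf16);                         // B\0e\0c\0a\0u\0s\0e\0
        String asUtf8 = new String(utf16, StandardCharsets.UTF_8); // zero bytes survive as U+0000
        String decoded = new String(utf16, StandardCharsets.UTF_16LE);

        System.out.println(naive.length());           // 14
        System.out.println(naive.contains(needle));   // false
        System.out.println(asUtf8.contains(needle));  // false
        System.out.println(decoded.contains(needle)); // true
    }
}
```

The U+0000 characters are usually rendered as blanks on a console, which would explain both the printed output and why decoding as UTF8 did not help.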
RamyaSivakanth, 07-02-2009, 11:22 AM
Hi,
Use UTF16 instead of UTF8 with InputStreamReader and try again.

FileInputStream fis = new FileInputStream(FileName);
InputStreamReader isr = new InputStreamReader(fis, "UTF16");
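
If it is unclear which encoding a file uses, one option is to look at its first bytes for a byte order mark (BOM). The sketch below is only an illustration: the class and method names are hypothetical, it covers only the UTF-8 and UTF-16 marks, and a file without a BOM cannot be identified this way.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BomSniffer {
    // Returns the charset suggested by a leading byte order mark,
    // or null when no UTF-8 or UTF-16 mark is present.
    // Note: FF FE 00 00 (UTF-32LE) is not distinguished here.
    static Charset charsetFromBom(byte[] head) {
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (head.length >= 2) {
            int b0 = head[0] & 0xFF;
            int b1 = head[1] & 0xFF;
            if (b0 == 0xFF && b1 == 0xFE) {
                return StandardCharsets.UTF_16LE;
            }
            if (b0 == 0xFE && b1 == 0xFF) {
                return StandardCharsets.UTF_16BE;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // First bytes of a little-endian UTF-16 file that starts with "B".
        byte[] head = {(byte) 0xFF, (byte) 0xFE, 0x42, 0x00};
        System.out.println(charsetFromBom(head));
    }
}
```

When a BOM is present, the generic "UTF-16" charset reads it and picks the byte order itself, so the check is mostly useful for diagnosing a file before choosing a charset.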



-Regards
Ramya

Go through this sample code:
Code:
import java.io.*;

class Test
{
	public static void main(String[] args) throws Exception
	{
		String FileName = "trial.txt";
		String StringToBeSearch = "Because the Agent software is already";

		boolean found = false;

		File file = new File(FileName);
		StringBuilder buffer = new StringBuilder();

		// Decode the file as UTF-16; the decoder reads the byte order mark
		// to pick the byte order. Closing the outermost reader also closes
		// the streams it wraps, so a single close is enough.
		try (Reader in = new BufferedReader(
				new InputStreamReader(new FileInputStream(file), "UTF-16"))) {
			int ch;
			while ((ch = in.read()) > -1) {
				buffer.append((char) ch);
			}//while
		}

		// Search the decoded text, not the raw bytes.
		if (buffer.indexOf(StringToBeSearch) >= 0)
		{
			System.out.println("Got The text");
			found = true;
		}

		System.out.println("found = " + found);
	}//main
}//class
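
A line-by-line variant, closer to the structure of the original question, might look like the sketch below. It stops at the first match instead of buffering the whole file. The class name, the helper name and the sample text after "already" are invented for the demo, and the in-memory byte array stands in for the file so the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class LineSearch {
    // Returns true as soon as one decoded line contains the search text.
    static boolean containsText(InputStream in, Charset charset, String text) {
        // Closing the outermost reader also closes the streams it wraps.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(text)) {
                    return true;
                }
            }
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample content, encoded as UTF-16 with a byte order mark.
        String content = "Setup started.\nBecause the Agent software is already installed, setup stops.\n";
        byte[] utf16 = content.getBytes(StandardCharsets.UTF_16);
        String needle = "Because the Agent software is already";

        System.out.println(containsText(new ByteArrayInputStream(utf16), StandardCharsets.UTF_16, needle)); // true
        System.out.println(containsText(new ByteArrayInputStream(utf16), StandardCharsets.UTF_8, needle));  // false
    }
}
```

For a real file, pass new FileInputStream("Install.txt") as the first argument. Note that a search text spanning a line break would not be found this way.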
All times are GMT +2.



Copyright ©2006 - 2007, www.java-forums.org