Java Forums

  #1 (permalink)  
Old 06-25-2008, 01:08 PM
Member
 
Join Date: May 2008
Posts: 12
vaskarbasak is on a distinguished road
Identify the language type from a given String.
Hi all,

Does anyone have sample source code, or any idea how I can identify the language of a given String?

For example:

“林悦旻” - Chinese
“ABC” - English
etc.

Thanks!
vaskar
  #2 (permalink)  
Old 06-25-2008, 05:46 PM
Member
 
Join Date: Jun 2008
Posts: 9
kurenai is on a distinguished road
Where are you getting the string from?
  #3 (permalink)  
Old 06-25-2008, 06:17 PM
Member
 
Join Date: Jun 2008
Posts: 9
kurenai is on a distinguished road
Maybe this will help you: jchardet.sourceforge.net

I don't know how accurate it could possibly be, though.
  #4 (permalink)  
Old 06-25-2008, 07:33 PM
Niveditha's Avatar
Senior Member
 
Join Date: May 2008
Posts: 282
Niveditha is on a distinguished road
Hi,
The code at this link is for Japanese, and it seems it could be adapted for Chinese as well. As I don't know Chinese, I can't help you with the code itself.
Java - Chinese Language Processing and Chinese Computing
__________________
To finish sooner, take your own time....
Nivedithaaaa
  #5 (permalink)  
Old 06-27-2008, 05:33 PM
Norm's Avatar
Senior Member
 
Join Date: Jun 2008
Location: SW MO, USA
Posts: 975
Norm is on a distinguished road
Since a String is made up of Unicode characters, convert one of the characters in the string to an int value and see where it falls within the 16-bit range of values a Java char can hold. For example, ASCII characters range from 0 to 127, and Latin-1, which covers English and most Western European text, from 0 to 255. To guess the language or alphabet a char came from, you need a table that maps ranges of Unicode values to each language.
Something like: English 0-255, Japanese 1200-1400, etc. (placeholder numbers, not the real ranges) for the full range of char values, 0-64K.
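A minimal sketch of the range-table idea described above, in Java. The class name ScriptRangeTable and the method scriptOf are made up for illustration, and the table lists only a handful of ranges from the Unicode code charts; a real table would need many more entries. Note that a range identifies a script rather than a language: CJK ideographs, for instance, are shared by Chinese and Japanese, so this can only narrow the guess.

```java
// Hypothetical helper: maps a few Unicode ranges to script names.
// The table is deliberately incomplete; it only illustrates the lookup.
class ScriptRangeTable {

    // Inclusive {first, last} code point of each range.
    private static final int[][] RANGES = {
        {0x0000, 0x007F},
        {0x0080, 0x00FF},
        {0x3040, 0x309F},
        {0x30A0, 0x30FF},
        {0x4E00, 0x9FFF},
        {0xAC00, 0xD7A3},
    };

    // Name for the range at the same index in RANGES.
    private static final String[] NAMES = {
        "Basic Latin (ASCII)",
        "Latin-1 Supplement",
        "Hiragana",
        "Katakana",
        "CJK Unified Ideographs",
        "Hangul Syllables",
    };

    // Returns the name of the first range containing codePoint, or "Unknown".
    static String scriptOf(int codePoint) {
        for (int i = 0; i < RANGES.length; i++) {
            if (codePoint >= RANGES[i][0] && codePoint <= RANGES[i][1]) {
                return NAMES[i];
            }
        }
        return "Unknown";
    }

    public static void main(String[] args) {
        // LATIN A, HIRAGANA A, U+6797 (a Han character), U+D55C (a Hangul syllable)
        String text = "A\u3042\u6797\uD55C";
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            System.out.println("U+" + Integer.toHexString(cp).toUpperCase()
                    + " -> " + scriptOf(cp));
            i += Character.charCount(cp);
        }
    }
}
```

Iterating by code point rather than by char keeps the loop correct for characters outside the 16-bit range, even though every range in this small table fits in a single char.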
  #6 (permalink)  
Old 06-28-2008, 08:51 AM
Member
 
Join Date: May 2008
Posts: 12
vaskarbasak is on a distinguished road
Can you give me some sample code? Where do I get the Unicode range table? Can you give me a URL?

Thanks!
vaskar
  #7 (permalink)  
Old 06-28-2008, 08:56 AM
Eranga's Avatar
Moderator
 
Join Date: Jul 2007
Location: Colombo, Sri Lanka
Posts: 3,065
Eranga has a spectacular aura about
This page may help you.
__________________
Use an appropriate Subject. "Help, urgent!" isn't one.
  #8 (permalink)  
Old 06-28-2008, 11:52 AM
Member
 
Join Date: May 2008
Posts: 12
vaskarbasak is on a distinguished road
Where do I get this information:

English 0-255, Japanese 1200-1400, etc.?

Please help me.
  #9 (permalink)  
Old 06-28-2008, 12:07 PM
Eranga's Avatar
Moderator
 
Join Date: Jul 2007
Location: Colombo, Sri Lanka
Posts: 3,065
Eranga has a spectacular aura about
You have to use the Unicode tables.
  #10 (permalink)  
Old 06-28-2008, 06:18 PM
Nicholas Jordan's Avatar
Senior Member
 
Join Date: Jun 2008
Location: Southwest
Posts: 422
Nicholas Jordan is on a distinguished road
Use Norm's suggestion.
vaskarbasak, this is a remarkably involved subject. I found some work on the subject that runs to literally millions of lines of text, and there are issues that are not apparent to native ISO Latin-1 speakers.

Just digging through the information available would require writing specialized Java programs. It would be better if you told us what you are trying to achieve. Java has a remarkable ability to handle a String as a String without the coder having to disentangle the Unicode Consortium's work.

Have you ever read an RFC?
__________________
Cybercartography: A new theoretical construct proposed by D.R. Fraser Taylor
  #11 (permalink)  
Old 07-24-2008, 06:58 AM
Member
 
Join Date: Jul 2008
Posts: 4
sravan_tel is on a distinguished road
Chinese character issue
Hey Vaskarbasak,
I now need exactly the same thing. Please help me, as I hope you have solved it by now.
If anybody has any idea, please tell me.
  #12 (permalink)  
Old 07-24-2008, 07:00 AM
Eranga's Avatar
Moderator
 
Join Date: Jul 2007
Location: Colombo, Sri Lanka
Posts: 3,065
Eranga has a spectacular aura about
Did you go through all the replies in this thread? There are lots of hints for you.
  #13 (permalink)  
Old 07-24-2008, 07:11 AM
Member
 
Join Date: Jul 2008
Posts: 4
sravan_tel is on a distinguished road
I found Niveditha's answer useful for my problem with Japanese. I have the same requirement for Chinese and Korean as well.
As for the character ranges mentioned above, I'm not confident enough to decide the language of a String just from a range of values.
  #14 (permalink)  
Old 07-24-2008, 07:16 AM
Eranga's Avatar
Moderator
 
Join Date: Jul 2007
Location: Colombo, Sri Lanka
Posts: 3,065
Eranga has a spectacular aura about
Niveditha's link talks about Unicode. So what you have to do is find the correct range of Unicode values for each language.
  #15 (permalink)  
Old 07-29-2008, 11:49 AM
Member
 
Join Date: Jul 2008
Posts: 4
sravan_tel is on a distinguished road
Hey Eranga,
Thank you so much. I got it, and it's working fine.
  #16 (permalink)  
Old 07-29-2008, 12:13 PM
Eranga's Avatar
Moderator
 
Join Date: Jul 2007
Location: Colombo, Sri Lanka
Posts: 3,065
Eranga has a spectacular aura about
Nice to hear that. If you can, I think it would be good to briefly explain here how you did it, so that later another member can follow the approach you took to solve the problem.
  #17 (permalink)  
Old 07-29-2008, 12:38 PM
Niveditha's Avatar
Senior Member
 
Join Date: May 2008
Posts: 282
Niveditha is on a distinguished road
Yes, Eranga's suggestion is right. At least no one else will have to spend another month finding the same solution.
  #18 (permalink)  
Old 07-31-2008, 11:24 AM
Member
 
Join Date: Jul 2008
Posts: 4
sravan_tel is on a distinguished road
Here's a brief explanation of my problem and solution.
I wanted to recognize Chinese (both Traditional and Simplified), Japanese, and Korean. In our code we can't recognize these characters directly the way we do English characters or strings, so this is done with the help of the Unicode character set. Each language has its own range of values for its characters. For example,
for Korean the range is '\uAC00' to '\uD7A3', which means every Korean syllable has a value within this range. That is how we conclude that a given letter belongs to Korean.
Please note that the range above is the Hangul Syllables block, which is only one of the Unicode blocks used for Korean; I saw others, but in practice they did not make much difference.
Please make sure your Java source file is saved in a Unicode (UTF-8) encoding.
More questions? Mail me.
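The approach described above might look roughly like the following Java sketch. It is not the poster's actual code: the class CjkScriptGuess and its methods are hypothetical names, and instead of hand-written ranges for every script it leans on the JDK's own Character.UnicodeBlock lookup, keeping the quoted Hangul range as an explicit check. The string literals use \u escapes, so the result does not depend on the source file's encoding.

```java
// Hypothetical helper class; the names are illustrative, not from the thread.
class CjkScriptGuess {

    // The Hangul Syllables range quoted in the post above.
    static boolean isHangulSyllable(char c) {
        return c >= '\uAC00' && c <= '\uD7A3';
    }

    // Scans every code point, records which CJK blocks were seen,
    // then makes a rough guess. Strings mixing scripts are not handled.
    static String guess(String text) {
        boolean hangul = false;
        boolean kana = false;
        boolean han = false;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            Character.UnicodeBlock block = Character.UnicodeBlock.of(cp);
            if (block == Character.UnicodeBlock.HANGUL_SYLLABLES) {
                hangul = true;
            } else if (block == Character.UnicodeBlock.HIRAGANA
                    || block == Character.UnicodeBlock.KATAKANA) {
                kana = true;
            } else if (block == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS) {
                han = true;
            }
            i += Character.charCount(cp);
        }
        if (hangul) return "Korean";
        if (kana) return "Japanese";
        // Han characters alone cannot separate Chinese from kanji-only Japanese.
        if (han) return "Chinese or Japanese (Han characters only)";
        return "Other";
    }

    public static void main(String[] args) {
        System.out.println(guess("\uD55C\uAE00"));             // two Hangul syllables
        System.out.println(guess("\u65E5\u672C\u8A9E\u3067\u3059")); // Han plus hiragana
        System.out.println(guess("\u6797\u60A6\u65FB"));       // Han characters only
        System.out.println(guess("ABC"));
    }
}
```

Traditional and Simplified Chinese share the same ideograph block, so a range check alone does not tell them apart; that distinction would need something beyond this sketch.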
  #19 (permalink)  
Old 07-31-2008, 11:29 AM
Eranga's Avatar
Moderator
 
Join Date: Jul 2007
Location: Colombo, Sri Lanka
Posts: 3,065
Eranga has a spectacular aura about
That's fine. Now anyone who refers to this thread can get a brief idea of what he or she has to do.
All times are GMT +3.


VBulletin, Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.
Copyright ©2006 - 2007, www.java-forums.org