Thread: Java UTF-16 Encoding
- 05-03-2011, 07:48 PM #1
Member
- Join Date
- May 2011
- Posts
- 3
- Rep Power
- 0
I'm doing some research on Java and Unicode, and I need the algorithm that encodes a Unicode code point into bytes (exactly as Java implements it). I found that this function is in Character.java, but unfortunately it only shows the following cast to get the char(s) from a code point value:
Java Code:
public static char[] toChars(int codePoint) {
    ....
    return new char[] { (char) codePoint };
}
I tried to find elsewhere how this conversion is done, but no luck so far.
Does anyone know where I can find how the conversion from a code point value to bytes is actually done?
Thanks in advance for any help provided :)
- 05-03-2011, 08:02 PM #2
- Join Date
- Sep 2008
- Location
- Voorschoten, the Netherlands
- Posts
- 11,385
- Blog Entries
- 7
- Rep Power
- 17
Strange, because this is what my copy of the source code says:
Java Code:
public static char[] toChars(int codePoint) {
    if (codePoint < 0 || codePoint > MAX_CODE_POINT) {
        throw new IllegalArgumentException();
    }
    if (codePoint < MIN_SUPPLEMENTARY_CODE_POINT) {
        return new char[] { (char) codePoint };
    }
    char[] result = new char[2];
    toSurrogates(codePoint, result, 0);
    return result;
}
Pay special attention to the toSurrogates( ... ) method.
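For reference, here is a minimal sketch of the arithmetic a toSurrogates-style method has to perform, assuming the standard UTF-16 rules for supplementary code points. It is not copied from the JDK source; the class name Utf16SurrogateSketch and the method toSurrogatePair are hypothetical names used only for illustration.

```java
class Utf16SurrogateSketch {

    // Hypothetical helper (not the JDK's toSurrogates): splits a supplementary
    // code point (U+10000..U+10FFFF) into a UTF-16 high/low surrogate pair.
    static char[] toSurrogatePair(int codePoint) {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        int offset = codePoint - 0x10000;                // 20-bit value
        char high = (char) (0xD800 + (offset >>> 10));   // upper 10 bits
        char low = (char) (0xDC00 + (offset & 0x3FF));   // lower 10 bits
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        int gClef = 0x1D11E; // MUSICAL SYMBOL G CLEF
        char[] pair = toSurrogatePair(gClef);
        System.out.printf("%04X %04X%n", (int) pair[0], (int) pair[1]); // prints "D834 DD1E"
        // Cross-check against the library method quoted in this thread.
        System.out.println(java.util.Arrays.equals(pair, Character.toChars(gClef)));
    }
}
```

Comparing the result with Character.toChars, as main does, is a quick way to check the sketch against whatever your own copy of the library actually does.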
kind regards,
Jos
Last edited by JosAH; 05-03-2011 at 08:04 PM.
When people rob a bank they get a penalty; when banks rob people they get a bonus.
- 05-03-2011, 08:27 PM #3
Member
- Join Date
- May 2011
- Posts
- 3
- Rep Power
- 0
The problem remains: your code still ends up returning:
Java Code:
return new char[] { (char) codePoint };
That doesn't explain how that particular conversion is done. The surrogate pair distinction only decides whether 2 chars are returned instead of 1; the conversion itself is done the same way. What I need to know is how it casts an integer value (codePoint) to a char (bytes).
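As a point of reference for the question about the (char) codePoint cast: in Java, casting an int to a char is a narrowing primitive conversion that keeps only the low 16 bits, so no bytes are produced at that step. Bytes only appear when the chars are encoded with a charset. The sketch below illustrates both steps; the class name CharCastSketch and its two helper methods are hypothetical, and UTF-16BE is chosen here only as an example encoding.

```java
import java.nio.charset.StandardCharsets;

class CharCastSketch {

    // The cast discussed in the thread: an int-to-char narrowing conversion
    // that keeps the low 16 bits and discards the rest.
    static char castToChar(int codePoint) {
        return (char) codePoint;
    }

    // Hypothetical helper: bytes only exist once the chars are encoded.
    // UTF-16BE is an example choice; it writes no byte order mark.
    static byte[] utf16BigEndianBytes(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        // BMP code point: the cast preserves the value.
        System.out.printf("%04X%n", (int) castToChar(0x20AC));   // prints "20AC"
        // Supplementary code point: the bits above the low 16 are lost,
        // which is why toChars takes the surrogate branch for such values.
        System.out.printf("%04X%n", (int) castToChar(0x1D11E));  // prints "D11E"
        for (byte b : utf16BigEndianBytes(0x20AC)) {
            System.out.printf("%02X ", b & 0xFF);                // prints "20 AC "
        }
        System.out.println();
    }
}
```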
- 05-04-2011, 07:57 AM #4
- Join Date
- Sep 2008
- Location
- Voorschoten, the Netherlands
- Posts
- 11,385
- Blog Entries
- 7
- Rep Power
- 17
- 05-04-2011, 06:40 PM #5
Member
- Join Date
- May 2011
- Posts
- 3
- Rep Power
- 0