Thread: [SOLVED] Codepage conversion
- 02-08-2009, 10:49 PM #1
Hello
I'm working on a little program that retrieves an HTML page using URL and BufferedReader.
But I'm having problems with special characters such as 'ê' on some pages.
My guess is that an ISO-8859-1 page is being read through a UTF-8 stream. Is it possible to convert the retrieved string?
Live long and prosper...
Last edited by flywheel; 02-09-2009 at 05:56 PM. Reason: Sufficient input to fix problem achieved
- 02-09-2009, 12:14 AM #2
Look at CharsetDecoder and ByteBuffer. The Java network I/O classes make use of these to handle conversions from code page data to Unicode. Of course, you are expected to know which code page the bytes are encoded in. CharsetDecoder also lets you choose what happens to malformed or unmappable input: report it, ignore it, or substitute a replacement character.
This means you will have to read the data at the byte level, not as characters. ByteBuffer is a thin wrapper class, so once you have the bytes in an array, creating a ByteBuffer around the array is simple and low overhead.
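Here is a minimal sketch of that idea, assuming the bytes are already in an array. The class name CodepageDemo, the helper decode and the sample bytes are made up for illustration; only the java.nio calls are real. It decodes the same bytes once as ISO-8859-1 and once as UTF-8, so you can see why a lone 0xEA byte ('ê' in ISO-8859-1) breaks when it is treated as UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

public class CodepageDemo {

    // Hypothetical helper: decode raw bytes with the named charset.
    // Malformed or unmappable input is replaced (U+FFFD by default)
    // instead of aborting the conversion.
    static String decode(byte[] raw, String charsetName) {
        CharsetDecoder decoder = Charset.forName(charsetName)
                .newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        try {
            // ByteBuffer.wrap does not copy the array, it only wraps it.
            CharBuffer chars = decoder.decode(ByteBuffer.wrap(raw));
            return chars.toString();
        } catch (CharacterCodingException e) {
            // Not expected with REPLACE, but decode() declares it.
            throw new IllegalArgumentException("Could not decode as " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // Sample data (an assumption): "fête" as ISO-8859-1 bytes, where 'ê' is 0xEA.
        byte[] raw = { 'f', (byte) 0xEA, 't', 'e' };

        // Decoded with the charset the bytes were written in.
        System.out.println(decode(raw, "ISO-8859-1"));

        // Decoded with the wrong charset: 0xEA is not valid UTF-8 here,
        // so a replacement character shows up instead of 'ê'.
        System.out.println(decode(raw, "UTF-8"));
    }
}
```

In a real program the byte array would come from the connection's InputStream rather than a literal, and the charset name would come from the page's Content-Type header or meta tag when one is given.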
This is obviously a different approach. I hope this helps...
- 02-09-2009, 05:54 PM #3