  1. #1
    sagarsway

    Getting problem in UTF-8 Encode/Decode with Java

    Hi all,

    I am facing a problem with encoding/decoding UTF-8 characters. I have a string containing the special characters "ю", which I want to decode using UTF-8.

    I checked it with an Adobe utility, the UTF-8 Encoder/Decoder, which gives me the following output after decoding the above string: "€Á‚…‰‹œÝ›‘ŽŸåã”"

    Now I need a Java program that will encode "ю" and give me the output "€Á‚…‰‹œÝ›‘ŽŸåã”", and vice versa for decoding.

    I have already tried java.net.URLDecoder and java.nio.charset.CharsetDecoder, but did not get the expected output shown above.

  2. #2
    Steve11235

    Encoding/Decoding

    You are right in using Charset, CharsetDecoder, and CharsetEncoder to turn UTF-8 into Unicode. However, that will not provide the output Adobe is providing. Performing decode, followed by encode, should give you back exactly the same characters you started with. Adobe is mapping some of the input characters to different output characters; I'm not sure why.
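
To illustrate the round-trip point above, here is a minimal sketch using CharsetDecoder and CharsetEncoder. The class name Utf8RoundTrip, the method roundTrip, and the sample string are made up for the illustration, and StandardCharsets assumes Java 7 or later. It only shows that decoding and then re-encoding valid UTF-8 returns the original bytes; it does not attempt to reproduce the Adobe tool's output.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8RoundTrip {
    // Decodes UTF-8 bytes to chars, then encodes those chars back to UTF-8 bytes.
    // Malformed input is reported as an IllegalArgumentException (an assumption
    // made for this sketch so callers need not handle a checked exception).
    static byte[] roundTrip(byte[] utf8) {
        try {
            CharBuffer chars = StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(utf8));
            ByteBuffer encoded = StandardCharsets.UTF_8.newEncoder().encode(chars);
            byte[] out = new byte[encoded.remaining()];
            encoded.get(out);
            return out;
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("input is not valid UTF-8", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample input containing non-ASCII characters.
        byte[] input = "caf\u00e9 \u044e".getBytes(StandardCharsets.UTF_8);
        byte[] output = roundTrip(input);
        System.out.println(Arrays.equals(input, output)); // prints true
    }
}
```

If a tool returns different characters after a decode and encode pair, it is applying some extra mapping beyond plain UTF-8 conversion.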

    Also, it sounds like you are dealing with 8-bit character sets, not Unicode, perhaps for a Web application. Make sure you know what the source and destination character sets are.

    If your input is 8-bit, and you simply want to transform certain characters, don't decode your input to Unicode at all. Leave it in a byte array, and scan through the bytes, changing the ones you want to map by examining their numeric values. Remember that byte is signed in Java, so byte values above 127 will show up as negative numbers. Just replace the values in the array, then output the array.

    If you want a String, then do the above and then decode it.
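
The byte-scanning approach described above might look roughly like the sketch below. The class name ByteRemap, the method remap, the ISO-8859-1 input, and the mapping from 0xE9 to 0x65 are all hypothetical; substitute whatever 8-bit character set and mapping you actually have.

```java
import java.nio.charset.StandardCharsets;

class ByteRemap {
    // Replaces every occurrence of one byte value with another, in place.
    // "from" and "to" are given as unsigned values in the range 0..255.
    static void remap(byte[] data, int from, int to) {
        for (int i = 0; i < data.length; i++) {
            // byte is signed in Java: mask with 0xFF to compare as 0..255.
            int unsigned = data[i] & 0xFF;
            if (unsigned == from) {
                data[i] = (byte) to;
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical input: the ISO-8859-1 bytes for "é" (0xE9) and "A" (0x41).
        byte[] data = { (byte) 0xE9, 0x41 };
        System.out.println(data[0]); // prints -23, since 0xE9 is negative as a signed byte
        remap(data, 0xE9, 0x65);     // hypothetical mapping: é -> e
        // Decode only after the bytes have been rewritten.
        System.out.println(new String(data, StandardCharsets.ISO_8859_1)); // prints eA
    }
}
```

Masking with 0xFF keeps the comparison in the 0..255 range, which avoids mistakes when checking values above 127.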

    I hope this helps...

    -Steve

  3. #3
    neilcoffey is offline Senior Member
    Join Date
    Nov 2008
    Posts
    286
    Rep Power
    7

    The exact answer to your problem depends on where exactly you want the encoded characters to end up. If you want the "raw" bytes encoded in UTF-8, then use getBytes() on a String:

    Java Code:
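    String str = "StringThatINeedToEncode";
    // Naming the charset avoids the platform default; this overload throws UnsupportedEncodingException.
    byte[] utf8bytes = str.getBytes("UTF-8");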
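
For the reverse direction asked about in the first post, a minimal sketch of both steps at the String level follows. The class name Utf8Bytes and its encode/decode helpers are made up for the example, the sample character is arbitrary, and StandardCharsets assumes Java 7 or later (on older versions, pass "UTF-8" as in the snippet above).

```java
import java.nio.charset.StandardCharsets;

class Utf8Bytes {
    // Encodes a String to its raw UTF-8 bytes.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes raw UTF-8 bytes back into a String.
    static String decode(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String str = "\u044e"; // Cyrillic small letter yu, chosen only as sample input
        byte[] utf8bytes = encode(str);
        System.out.println(utf8bytes.length);              // prints 2: U+044E takes two bytes in UTF-8
        System.out.println(decode(utf8bytes).equals(str)); // prints true
    }
}
```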

    If you need the characters written to a file (or indeed, any output stream), then wrap the output stream with an OutputStreamWriter, constructing the latter to encode as UTF-8:

    Java Code:
    Writer w = new OutputStreamWriter(new FileOutputStream(f), "UTF-8");
    // ... write chars to w

    Usually, you'd want to insert a BufferedOutputStream between the OutputStreamWriter and the FileOutputStream.
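
Putting the buffering suggestion together with the writer, a sketch could look like this. The class name Utf8FileWrite, the write helper, and the temporary file are assumptions for the example; try-with-resources and StandardCharsets need Java 7 or later.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class Utf8FileWrite {
    // Writes text to the file as UTF-8 through a buffered stream.
    static void write(File f, String text) throws IOException {
        try (Writer w = new OutputStreamWriter(
                new BufferedOutputStream(new FileOutputStream(f)),
                StandardCharsets.UTF_8)) {
            w.write(text);
        } // closing the writer flushes and closes the wrapped streams
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("utf8-demo", ".txt"); // hypothetical temp file
        f.deleteOnExit();
        write(f, "caf\u00e9");
        byte[] onDisk = Files.readAllBytes(f.toPath());
        System.out.println(onDisk.length); // prints 5: "caf" is 3 bytes, "é" is 2
    }
}
```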
