  1. #1
    Genom is offline Member
    Join Date
    Feb 2011
    Posts
    10
    Rep Power
    0

    A simple question about converting a byte array to a Unicode string

    Hi,

    I have a byte array that encodes a Unicode text ("BlaBla"). The array is 12 bytes long: there are 6 characters in the text and 2 bytes for each character.

    And here is the byte array:
    66, 0, 108, 0, 97, 0, 66, 0, 108, 0, 97, 0

    Everything seems normal, but I can't convert this byte array back to "BlaBla" and show it in a text area. Instead of "BlaBla" I get Chinese-looking characters (䈀氀愀䈀氀愀BB䈀氀愀䈀氀愀).

    I have tried so far:
    String Message=new String(packageBytes,0,12, "UTF-16");
    String Message=new String(packageBytes,0,12, "US-ASCII");
    String Message=new String(packageBytes,0,12, "UTF-16BE");
    String Message=new String(packageBytes,0,12, "UTF-16LE");

    I have tried a lot of things and googled a lot, but I can't find a solution.

  2. #2
    javaforum$ is offline Member
    Join Date
    Feb 2011
    Posts
    7
    Rep Power
    0

    Default Character Set Name for UniCode -->> "US-ASCII"

    Try this -

    String msg = new String(new byte[]{66, 0, 108, 0, 97, 0, 66, 0, 108, 0, 97, 0},0,12,"US-ASCII");
    System.out.println(msg);

  3. #3
    JosAH is offline Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    14,051
    Blog Entries
    7
    Rep Power
    23


    Quote Originally Posted by javaforum$
    Try this -

    String msg = new String(new byte[]{66, 0, 108, 0, 97, 0, 66, 0, 108, 0, 97, 0},0,12,"US-ASCII");
    System.out.println(msg);

    That is incorrect; print this instead:

    Java Code:
    System.out.println(Arrays.toString(msg.toCharArray())); 
    System.out.println(msg.length());
    ... and you'll see; the encoding is UTF-16, low byte first.
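    The check above can be sketched as a small standalone program. The class and method names (CharsetCheck, decodeAscii, decodeUtf16Le) are made up for illustration; only the twelve bytes come from the thread. It compares what US-ASCII and UTF-16LE make of the same bytes:

```java
import java.nio.charset.StandardCharsets;

public class CharsetCheck {
    // The twelve bytes from the first post.
    static final byte[] BYTES = {66, 0, 108, 0, 97, 0, 66, 0, 108, 0, 97, 0};

    // US-ASCII maps every byte to one char, so each 0 byte becomes its own NUL char.
    static String decodeAscii() {
        return new String(BYTES, StandardCharsets.US_ASCII);
    }

    // UTF-16LE combines each (low byte, high byte) pair into one char.
    static String decodeUtf16Le() {
        return new String(BYTES, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        System.out.println(decodeAscii().length());   // 12
        System.out.println(decodeUtf16Le().length()); // 6
        System.out.println(decodeUtf16Le());          // BlaBla
    }
}
```

    The US-ASCII string only looks right on some consoles because the NUL characters are invisible there; its length of 12 gives it away.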

    kind regards,

    Jos
    The only person who got everything done by Friday was Robinson Crusoe.

  4. #4
    Genom is offline Member

    I can't paste the exact output here because the 0 bytes show up as squares: each 0 is being treated as a character of its own, whereas "66" and "0" together should form a single Unicode character.
    The output looks something like this:

    [B, *, l, *, a, * ...]
    12

    (Each * is actually a square.) Maybe I should explain the setup better:

    I am writing a Java client that connects over a socket to a server written in VB.NET. Programs written in .NET can exchange strings with each other over the network without a problem, but the Java client has no readString() method like .NET does. So I decided to send the data as a byte array and convert it to a string in the Java client. That caused trouble with non-ASCII characters: "ş" is replaced by two other symbols on the Java side. So I decided to send the data as Unicode. Each message is delivered with two header fields followed by the data:

    1 byte, "1" or "2", telling the other side what type of data is coming (the same connection is also going to be used for file transfer: 1 for text, 2 for file chunks).

    1 byte giving the length x of the data to be read, in bytes.

    x bytes of data.

    Java Code:
    InputStream is = socket.getInputStream();
    int packageType;
    int packageLen;

    packageType = is.read();

    if (packageType == 1) {
        packageLen = is.read();
        System.out.println("Package Length: " + packageLen);
        byte[] packageBytes = new byte[packageLen];
        is.read(packageBytes, 0, packageLen);
        int i;
        for (i = 0; i < packageLen; i++) {
            System.out.println(packageBytes[i]); // just to verify bytes are stored in my byte array
        }
        String Message = new String(packageBytes, 0, 12, "UTF-16");
        System.out.println(Message);
        // ta is a text area. Result is "B" (byte=66), so only the first character is written.
        // I think that is because of the character with byte=0.
        ta.append(Message);
    }
    I really don't get why new String(packageBytes, 0, 12, "UTF-16") can't create the right string from my byte array.

  5. #5
    JosAH is offline Moderator

    Check the Charset class (java.nio.charset.Charset) for the available encodings; your data is in a low byte first (little endian) UTF-16 encoding.
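    Applied to the packet layout described in post #4 (one type byte, one length byte, then the data), a reader could look roughly like the sketch below. PacketReader, TYPE_TEXT and readTextPacket are hypothetical names, and the sketch assumes both packet types share the same two-byte header. It uses DataInputStream.readFully because a single InputStream.read call on a socket may return fewer bytes than requested, and it decodes the full payload length rather than a fixed 12:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class PacketReader {
    static final int TYPE_TEXT = 1; // assumption: 1 = text, 2 = file chunk, as in post #4

    // Reads one packet; returns the decoded text, or null for a non-text packet.
    static String readTextPacket(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int type = data.readUnsignedByte();   // header byte 1: packet type
        int length = data.readUnsignedByte(); // header byte 2: payload length (0..255)
        byte[] payload = new byte[length];
        data.readFully(payload);              // blocks until all bytes have arrived
        if (type != TYPE_TEXT) {
            return null;
        }
        // The server sends little-endian UTF-16 without a byte order mark.
        return new String(payload, StandardCharsets.UTF_16LE);
    }

    // Convenience wrapper for trying the reader on an in-memory packet.
    static String readTextPacket(byte[] wire) {
        try {
            return readTextPacket(new ByteArrayInputStream(wire));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = {1, 12, 66, 0, 108, 0, 97, 0, 66, 0, 108, 0, 97, 0};
        System.out.println(readTextPacket(wire)); // BlaBla
    }
}
```

    With a one-byte length field a text packet can carry at most 255 bytes, so at most 127 UTF-16 code units; longer messages would need a wider length field on both sides.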

    kind regards,

    Jos

  6. #6
    Genom is offline Member

    "UTF-16LE" as the charset worked, thanks a lot! I had actually tried that before as well, but interestingly it didn't work then; I think I forgot something. Thanks again!

  7. #7
    JosAH is offline Moderator

    Quote Originally Posted by Genom
    "UTF-16LE" as the charset worked, thanks a lot! I had actually tried that before as well, but interestingly it didn't work then; I think I forgot something. Thanks again!

    Good to hear that your problem is solved now; I recognize UTF-16 low byte first when I see it ;-) Plain "UTF-16" relies on a two-byte 'byte order' mark at the start of the byte sequence (0xFF 0xFE for low byte first, 0xFE 0xFF for high byte first). When there is none, Java's decoder assumes high byte first (big endian), which is why your bytes came out as Chinese-looking characters.
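    The effect of the byte order mark can be seen with Java's "UTF-16" decoder in a short sketch. BomDemo and its fields are illustrative names; the payload is the first three characters from the thread:

```java
import java.nio.charset.StandardCharsets;

public class BomDemo {
    // "Bla" as little-endian UTF-16, without and with a leading byte order mark.
    static final byte[] NO_BOM = {66, 0, 108, 0, 97, 0};
    static final byte[] WITH_BOM = {(byte) 0xFF, (byte) 0xFE, 66, 0, 108, 0, 97, 0};

    static String decodeUtf16(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        // The 0xFF 0xFE mark selects little endian and is consumed by the decoder.
        System.out.println(decodeUtf16(WITH_BOM)); // Bla

        // Without a mark the decoder reads big endian: 0x4200, 0x6C00, 0x6100.
        System.out.println(decodeUtf16(NO_BOM).equals("\u4200\u6C00\u6100")); // true
    }
}
```

    Naming the byte order explicitly ("UTF-16LE" or "UTF-16BE") avoids depending on the mark at all, which suits a protocol where the sender never writes one.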

    kind regards,

    Jos
