Java, `cmd.exe', and UTF-8
To the best of my knowledge, printing UTF-8 to a Windows console has never worked.
Let's take the following example string: "‘’ “” – — »« ›‹ © … ← → ↑ ↓ 奥" (the last character being a troublesome Chinese one).
As I understand it, `chcp 65001' switches the code page of the `cmd.exe' console to UTF-8.
Now I have this class:
public class Foo {
    public static void main(String[] args) {
        System.out.println("file.encoding=" + System.getProperty("file.encoding"));
        System.out.println("‘’ “” – — »« ›‹ © … ← → ↑ ↓ 奥");
    }
}
Compile and run it like this:
> javac -encoding utf8 Foo.java
> chcp 65001
> java -Dfile.encoding=UTF-8 Foo
I expected this to work, but it doesn't. As it turns out, the characters are displayed correctly, but they are then followed by clutter. The output I get is the following:
‘’ “” – — »« ›‹ © … ← → ↑ ↓ 奥�‹ © … ← → ↑ ↓ 奥→ ↑ ↓ 奥 ↓ 奥 奥��
This is not quite what I wanted.
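To rule out my own program as the culprit, here is a minimal sketch that takes `file.encoding' out of the picture: it wraps the output stream in a `PrintStream' that always encodes as UTF-8, and it can also encode into memory so the bytes can be counted. The class name `Utf8Out' and its helper methods are made up for this post, and the sample string uses Unicode escapes so the `-encoding' flag of `javac' does not matter. I am not claiming this fixes the console output; it only shows that the bytes Java produces are valid UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper class for this post, not part of the JDK.
public class Utf8Out {

    // Wraps any stream in a PrintStream that always encodes as UTF-8,
    // independent of file.encoding and the platform default.
    static PrintStream utf8(OutputStream out) {
        try {
            return new PrintStream(out, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    // Encodes the text through the wrapper into memory so the bytes can be inspected.
    static byte[] encode(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream ps = utf8(buffer);
        ps.print(text);
        ps.flush();
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // Left arrow, right arrow, and the Chinese character from the example,
        // written as escapes so the source file encoding is irrelevant.
        String sample = "\u2190 \u2192 \u5965";
        System.out.println("chars=" + sample.length()
                + " utf8 bytes=" + encode(sample).length);
        // Same text, written to the console as explicit UTF-8.
        utf8(System.out).println(sample);
    }
}
```

Run it the same way as `Foo' (after `chcp 65001'), with or without `-Dfile.encoding=UTF-8'. If the clutter still shows up here, the program's default encoding is at least not the cause.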
I don't know what you think, but it's 2011 AD and I find it quite embarrassing that Java still isn't able to write UTF-8 to a Windows console.
Now who is to blame? Is it `cmd.exe' (which is, as I have proven, UTF-8-compliant), is it me, or is this really a JDK/JRE bug?
I filed a bug report today, but maybe someone out there knows how I can print UTF-8 to a Windows console. I can't figure it out and give up.
And yeah, I'm new to this forum, so hello everybody.
Philip