Hello everyone,
I am trying to do the following:
- I have an XML document located at some place on the web
- I want to get the XML's content (source) exactly as it is at its location
- The XML file is UTF-8 encoded
I can do all of the above, except for one odd thing I cannot seem to fix. I can retrieve the XML's source, but whenever it contains special characters such as ö or é, each one gets garbled into a sequence of two other characters. I suspect this is because the XML file is UTF-8 encoded while I am probably reading it with an ISO encoding. I have been trying to read it as UTF-8 instead, but without success.
Does anyone know how to do this?
My current code is:
Code:
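import java.net.URL;
import java.util.Scanner;

public String retrieveSource(String link) {
    String htmlCode = "";
    try {
        URL url = new URL(link);
        // No charset is given here, so Scanner decodes with the platform default
        Scanner reader = new Scanner(url.openStream());
        StringBuilder builder = new StringBuilder();
        // Check for a remaining line, since nextLine() is what gets called
        while (reader.hasNextLine()) {
            builder.append(reader.nextLine()).append("\n");
        }
        reader.close();
        htmlCode = builder.toString();
    } catch (Exception e) {
        // Report the failure instead of silently returning an empty string
        e.printStackTrace();
    }
    return htmlCode;
}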
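For reference, here is a minimal sketch of the direction I think this needs to go: name the charset explicitly when decoding the stream, rather than relying on the platform default. The class name Utf8SourceReader and the helper readUtf8 are made up for illustration, and I am assuming the server really does send UTF-8 bytes.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class Utf8SourceReader {

    // Hypothetical helper: decodes the stream as UTF-8 explicitly
    // instead of using the platform default charset.
    public static String readUtf8(InputStream in) throws IOException {
        StringBuilder builder = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                builder.append(line).append('\n');
            }
        }
        return builder.toString();
    }

    // Same idea as retrieveSource above, but with the charset named.
    public static String retrieveSource(String link) throws IOException {
        try (InputStream in = new URL(link).openStream()) {
            return readUtf8(in);
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate a UTF-8 document containing "ö" and "é" without
        // touching the network (escapes avoid source-encoding issues).
        byte[] bytes = "<a>\u00f6\u00e9</a>".getBytes(StandardCharsets.UTF_8);
        String result = readUtf8(new ByteArrayInputStream(bytes));
        System.out.println(result.equals("<a>\u00f6\u00e9</a>\n"));
    }
}
```

If keeping Scanner is preferred, it also has a constructor that takes a charset name, so new Scanner(url.openStream(), "UTF-8") should have the same effect. Note that even when the string is decoded correctly, printing it to a console that uses a different encoding can still make the characters look wrong.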
Thanks.