  1. #1
    Kaizah (Member, joined Nov 2009, 2 posts)

    XML with special characters

    Hello everyone,

    I am trying to do the following:
    - I have an XML document located somewhere on the web
    - I want to get the XML's content (source) exactly as it is at that location
    - The XML file is UTF-8 encoded

    I can do all of the above, except for one odd thing I cannot seem to fix. I can retrieve the XML's source, but whenever it contains special characters, each one gets garbled into a pair of other characters. I know this has to do with the fact that the XML file is UTF-8 encoded and that I am probably reading it with an ISO encoding. I have been trying to read it as UTF-8, but without success.
    Does anyone know how to do this?

    My current code is:
    Java Code:
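    import java.net.URL;
    import java.util.Scanner;

    public String retrieveSource(String link) {
        String htmlCode = "";
        try {
            URL url = new URL(link);
            // No charset is given here, so Scanner decodes with the platform default
            Scanner reader = new Scanner(url.openStream());
            StringBuilder builder = new StringBuilder();
            // hasNextLine() pairs with nextLine(); hasNext() would skip trailing blank lines
            while (reader.hasNextLine()) {
                builder.append(reader.nextLine()).append("\n");
            }
            reader.close();
            htmlCode = builder.toString();
        } catch (Exception e) {
            // Report the failure instead of silently swallowing it
            e.printStackTrace();
        }
        return htmlCode;
    }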
    Thanks.
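
The symptom described above (one special character turning into two) can be reproduced without any network access. The following is a minimal sketch, assuming the mismatch is UTF-8 bytes being decoded as ISO-8859-1; the class name `MojibakeDemo` and its helper `misdecode` are hypothetical and not part of the original code.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {

    // Encodes the text as UTF-8, then decodes those bytes with the wrong charset.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "\u00e9"; // a single accented letter (e with acute)
        String garbled = misdecode(original);

        // U+00E9 takes two bytes in UTF-8 (0xC3 0xA9); ISO-8859-1 maps every
        // byte to exactly one character, so two characters come out.
        System.out.println(original.length()); // prints 1
        System.out.println(garbled.length());  // prints 2
    }
}
```

Any character outside the ASCII range takes at least two bytes in UTF-8, so it is split into several characters when decoded one byte at a time.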

  2. #2
    Kaizah is offline Member
    Join Date
    Nov 2009
    Posts
    2
    Rep Power
    0

    Never mind, simply changing the line
    Java Code:
    reader = new Scanner(url.openStream());
    to
    Java Code:
    reader = new Scanner(url.openStream(), "UTF-8");
    did the trick. :)
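
For anyone landing here later, the one-line fix above can be sketched as a complete method. This is only an illustration under a few assumptions: the class name `Utf8SourceReader` and the helper `readUtf8` are hypothetical, and the helper is separated from the URL handling so the decoding can be tried on an in-memory stream. The only change that matters is the `"UTF-8"` argument passed to the `Scanner` constructor.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

class Utf8SourceReader {

    // Reads the whole stream as UTF-8 text, ending every line with "\n".
    static String readUtf8(InputStream in) {
        StringBuilder builder = new StringBuilder();
        // The second argument names the charset; without it Scanner uses the platform default.
        try (Scanner reader = new Scanner(in, "UTF-8")) {
            while (reader.hasNextLine()) {
                builder.append(reader.nextLine()).append('\n');
            }
        }
        return builder.toString();
    }

    // Fetches the document at the given link and decodes it as UTF-8.
    static String retrieveSource(String link) throws IOException {
        try (InputStream in = new URL(link).openStream()) {
            return readUtf8(in);
        }
    }

    public static void main(String[] args) {
        // In-memory stand-in for a remote UTF-8 encoded XML file.
        byte[] xml = "<name>caf\u00e9</name>".getBytes(StandardCharsets.UTF_8);
        System.out.println(readUtf8(new ByteArrayInputStream(xml)));
    }
}
```

The try-with-resources blocks close the scanner and the stream even if reading fails. Note that `Scanner` does not throw on read errors; it stops reading and exposes the error through `ioException()`, so a caller that needs to detect a truncated download would have to check that separately.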
