  1. #1
    Davizuco is offline Member
    Join Date
    Jan 2017
    Posts
    3
    Rep Power
    0

    XML special character escaping

    Hi all,

    I am having problems editing some XML files in Java. The parser is unescaping some special characters that were escaped in the source file.

    An example. Source file:
    Java Code:
    <?xml version="1.0" encoding="UTF-8"?>
    <data>
      <description>reposicionamiento &amp; N&#243;tas .</description>
      .............................   more data ..................................
      ...
      ..
    </data>
    Final file (with UTF-8 encoding):
    Java Code:
    <?xml version="1.0" encoding="UTF-8"?>
    <data>
      <description>reposicionamiento &amp; Nótas .</description>
      .............................   more data ..................................
      ...
      ..
    </data>
    I think it's an encoding problem, but I cannot find the right encoding. My goal is an XML file escaped like the source one.

    Decimal code 243 is o-acute (ó) in the windows-1250 charset, but the parser doesn't escape the character as expected.

    I have tried the following encodings with no success:
    • UTF-8
    • cp1250
    • windows-1250
    • UTF-16
    • ISO-10646


    And my code (XML_ENCODING is a global constant with the encoding for the file):
    Java Code:
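    import java.io.File;
    import java.io.FileInputStream;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.io.Reader;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.transform.OutputKeys;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.stream.StreamResult;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.xml.sax.InputSource;

    // Fragment of a method that declares "throws Exception".
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    File file = new File(fileName);
    if (file.exists()) {
        Document doc;
        // Close the input before the same file is overwritten below.
        try (InputStream inputStream = new FileInputStream(file);
             Reader reader = new InputStreamReader(inputStream, XML_ENCODING)) {
            doc = db.parse(new InputSource(reader));
        }
        doc.setXmlStandalone(true);
        Element docEle = doc.getDocumentElement();

        // Reading / modifying code ...

        // Serialize the DOM back to the same file.
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
        transformer.setOutputProperty(OutputKeys.INDENT, "no");
        transformer.setOutputProperty(OutputKeys.ENCODING, XML_ENCODING);

        DOMSource source = new DOMSource(doc);
        StreamResult result = new StreamResult(file);
        transformer.transform(source, result);
    }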

    Any idea what I'm doing wrong?

    Thank you in advance.
    Last edited by Davizuco; 01-17-2017 at 09:14 AM.

  2. #2
    Tolls is offline Moderator
    Join Date
    Apr 2009
    Posts
    13,541
    Rep Power
    26

    Re: XML special character escaping

    Why do you need to escape it?
    It's not one of the (5?) XML special characters.
    Please do not ask for code as refusal often offends.

    ** This space for rent **

  3. #3
    Davizuco is offline Member

    Re: XML special character escaping

    The client app needs these characters escaped to work properly.

    It's a mandatory requirement for my software.

  4. #4
    Tolls is offline Moderator

    Re: XML special character escaping

    The only thing I can think of is to read in the whole document and then replace every '&' with '&amp;', so escaping the escapes.
    The Document is doing what it is expected to do in XML, which is reading those Unicode character references as characters.

    Someone somewhere just created pain because they didn't understand how XML works.
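
To illustrate the point about character references, here is a minimal sketch using the JDK's standard DOM parser. The class and method names (`CharRefDemo`, `rootText`) are made up for this example and do not come from the thread. It shows that `&#243;` and a literal `ó` produce the same DOM text, so the parser keeps no record of which spelling the source file used.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; the name is an assumption for illustration only.
final class CharRefDemo {

    /** Parses an XML string and returns the text content of its root element. */
    static String rootText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String escaped = "<description>N&#243;tas &amp; more</description>";
        String literal = "<description>N\u00F3tas &amp; more</description>";
        // Both spellings resolve to the same characters in the DOM.
        System.out.println(rootText(escaped).equals(rootText(literal))); // prints true
    }
}
```

Because the two inputs are indistinguishable after parsing, any re-escaping has to happen either before parsing or at serialization time.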

  5. #5
    Davizuco is offline Member

    Re: XML special character escaping

    Thank you for your answers, Tolls. It's a good idea and it works, but it's not very elegant (I hope that's the right word for the Spanish "elegante"; sorry for my English).

    It works by:
    • Reading the XML file into a String and calling .replaceAll("&","&amp;") before parsing.
    • Writing a method that double-escapes (or decimal-escapes) all modified and inserted text. (With the risk that in some cases the result may differ from what the source application would produce.)
    • Unescaping after the Java processing and before writing the file, with something like .replaceAll("&amp;amp;","&amp;").replaceAll("&amp;#","&#");
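
The steps above can be sketched roughly as follows. This is only an illustration under assumptions: the class and method names (`EscapePreservingRoundTrip`, `protect`, `restore`, `roundTrip`) are invented here, the XML is handled as an in-memory String, and `restore` reverses every `&amp;` rather than only the two patterns listed in the post.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper; all names are assumptions for illustration only.
final class EscapePreservingRoundTrip {

    /** Escapes every '&' so existing references survive parsing as literal text. */
    static String protect(String rawXml) {
        return rawXml.replace("&", "&amp;");
    }

    /** Undoes protect() on serialized output. */
    static String restore(String serializedXml) {
        return serializedXml.replace("&amp;", "&");
    }

    /** Parses protected XML, serializes it again, and restores the original escapes. */
    static String roundTrip(String rawXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(protect(rawXml))));

            // DOM modifications would go here. Any text added at this point must
            // already be escaped by the caller, because restore() turns each
            // serialized "&amp;" back into a bare '&'.

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return restore(out.toString());
        } catch (Exception e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        String source = "<description>reposicionamiento &amp; N&#243;tas .</description>";
        System.out.println(roundTrip(source));
    }
}
```

The comment inside `roundTrip` is the weak point mentioned in the second bullet: text inserted through the DOM is not protected automatically, so it has to be escaped by hand in the same style as the source file.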


    The client app is old software that is widely used in cross-platform environments. I think its escaping policy does not only respond to XML requirements but is meant to work at all levels. I don't know; as usual, we have no access to the docs or the developers.

    There should be an encoding/charset combination that produces the required XML format, but for the moment this solution works. Thank you again.
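
One combination that may be worth trying, offered as a sketch rather than a confirmed fix: with the JDK's built-in transformer, setting the output encoding to US-ASCII makes the serializer write every character that ASCII cannot represent as a decimal character reference. The class and method names below (`AsciiOutputDemo`, `serializeAsAscii`) are assumptions for this example. Note two caveats: this escapes all non-ASCII characters, not only those that were escaped in the source file, and other transformer implementations may format the references differently.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; the name is an assumption for illustration only.
final class AsciiOutputDemo {

    /** Parses the XML and serializes it with US-ASCII as the output encoding. */
    static String serializeAsAscii(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // Characters outside ASCII cannot be written directly in this
            // encoding, so the serializer falls back to character references.
            transformer.setOutputProperty(OutputKeys.ENCODING, "US-ASCII");
            // The declaration is omitted here; otherwise it would name US-ASCII.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return new String(out.toByteArray(), StandardCharsets.US_ASCII);
        } catch (Exception e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serializeAsAscii(
                "<description>reposicionamiento &amp; N\u00F3tas .</description>"));
    }
}
```

Since ASCII is a subset of UTF-8, the resulting bytes are also valid UTF-8, so a `<?xml version="1.0" encoding="UTF-8"?>` declaration could be written by hand in front of the output if the client app expects one.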
