I have an XML file encoded in UTF that contains some Unicode characters; SC Unipad recognizes the format properly. I run it through an Ant script that applies a replace task to the file. The resulting file is written in some other encoding, and Unicode characters such as the registered sign, plus-minus, and nbsp all come out as ??

I've checked the Linux system and the locale is set to en_US.UTF-8, so the defaults should be working. I did go ahead and add an encoding attribute, thinking this was the problem, but neither UTF-8 nor utf-8 produces the correct output.

Note that on another Linux box the script runs as is and produces the correct output. Any idea what is going on? I know that the following is where the problem is introduced, because when I comment it out the output encoding is correct.

<replace dir="${wkdy.export.dir}" value="" encoding="utf-8">
<include name="**/*.htm"/>
<replacetoken><![CDATA[
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">]]></replacetoken>
</replace>
<replace dir="${wkdy.export.dir}" value="">
<include name="**/*.htm"/>
<replacetoken><![CDATA[
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">]]></replacetoken>
</replace>
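
For comparing the two Linux boxes, a small diagnostic build file may help. This is only a sketch: the project and target names (encoding-check, show-encoding) are made up for illustration and are not part of the original script. It relies on Ant exposing JVM system properties as ordinary properties, so it prints the default encoding that the JVM running Ant actually uses, which can differ from what the shell locale suggests.

```xml
<project name="encoding-check" default="show-encoding">
  <!-- Hypothetical diagnostic target; the names here are illustrative only -->
  <target name="show-encoding">
    <!-- JVM system properties are readable as Ant properties -->
    <echo message="file.encoding = ${file.encoding}"/>
    <echo message="java.version  = ${java.version}"/>
    <!-- ant.version is a built-in Ant property -->
    <echo message="ant.version   = ${ant.version}"/>
  </target>
</project>
```

Saved under an assumed name such as encoding-check.xml, it can be run with `ant -f encoding-check.xml` on both machines. If the reported file.encoding differs, that would point at the JVM default rather than the task itself. This matters here because the second replace above has no encoding attribute, so it presumably falls back to that default. Setting `ANT_OPTS="-Dfile.encoding=UTF-8"` before invoking Ant is one way to test that theory.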