  1. #1
    ragz-82, Member

    Unicode File movement from Windows to Unix adding Special Characters

    Hi,
    In our application, a user submits a Unicode file containing German and Japanese characters, and the MultipartRequest Java API moves it to a Unix directory. Oracle PL/SQL later processes the file and makes entries in the database.

    We have observed that this load fails because the file contains some special characters after it is transferred to Unix. The file is untouched if it contains only English characters. To confirm this, we created a file containing German/Japanese characters directly on Unix and called the Oracle stored procedure, and it worked fine. When this same file was moved back to Windows using WinSCP, the file was different again.

    So overall it looks like moving a Unicode file between Windows and Unix changes the file in some way. Please let me know if any Java API can avoid this issue.

    I scanned the Net for close to a week but couldn't find anything related. Any help will be greatly appreciated.

    If we can't find a solution, we are considering using POI so that Java can update the database directly.

    Rgds,
    Raghu
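
One cheap check before blaming the upload API is to look at the first bytes of the file on both machines. Some Windows editors, for example, save "Unicode" files as UTF-16 with a byte order mark (BOM), while a file typed on Unix is often UTF-8 without one, so two files that look identical in an editor can differ on disk. The sketch below is only an illustration: the class name `BomSniffer` and its methods are hypothetical, not part of MultipartRequest or any library mentioned in this thread, and it recognises only the three most common BOMs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical diagnostic helper; not part of any upload library.
class BomSniffer {

    /** Names the encoding suggested by a leading BOM, or "none" if there is no match. */
    public static String detectBom(byte[] head) {
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        // Note: a UTF-32LE BOM (FF FE 00 00) would also match the next line.
        if (startsWith(head, 0xFF, 0xFE)) return "UTF-16LE";
        if (startsWith(head, 0xFE, 0xFF)) return "UTF-16BE";
        return "none";
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    /** Formats bytes as space-separated upper-case hex, e.g. "EF BB BF". */
    public static String hex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Reads the whole file for simplicity; fine for uploads of a few MB.
        byte[] all = Files.readAllBytes(Paths.get(args[0]));
        byte[] head = Arrays.copyOf(all, Math.min(16, all.length));
        System.out.println("BOM:   " + detectBom(head));
        System.out.println("Bytes: " + hex(head));
    }
}
```

Running it against the original on Windows and the uploaded copy on Unix shows whether the two files start with the same bytes. If they do, the transfer is probably not the culprit and the encoding the PL/SQL side expects is worth a look.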

  2. #2
    toadaly, Senior Member

    My guess as to what's going on is that Unix uses a line feed (LF) as the end-of-line indicator, but Windows uses CRLF (carriage return, line feed). Possibly the MultipartRequest API automagically makes the conversion?

    I'd probably try ftp instead.
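
To test the line-ending theory rather than guess, the saved file can be scanned for CRLF pairs and for bare LF or CR bytes, once on Windows and once on Unix. This is a minimal sketch that assumes an ASCII-compatible encoding such as UTF-8; on a UTF-16 file every CR and LF is separated by a zero byte, so the counts would be misleading. `LineEndingCounter` is a hypothetical helper, not an existing API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical diagnostic helper for comparing line endings on both systems.
class LineEndingCounter {

    /** Returns {crlfCount, bareLfCount, bareCrCount} for the given bytes. */
    public static int[] count(byte[] data) {
        int crlf = 0, lf = 0, cr = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == '\r') {
                if (i + 1 < data.length && data[i + 1] == '\n') {
                    crlf++;
                    i++; // skip the LF that belongs to this CRLF pair
                } else {
                    cr++;
                }
            } else if (data[i] == '\n') {
                lf++;
            }
        }
        return new int[]{crlf, lf, cr};
    }

    public static void main(String[] args) throws IOException {
        int[] c = count(Files.readAllBytes(Paths.get(args[0])));
        System.out.println("CRLF=" + c[0] + " LF=" + c[1] + " CR=" + c[2]);
    }
}
```

If the counts differ between the two sides, something in the transfer is translating line endings. FTP and WinSCP both offer a binary transfer mode that avoids that translation.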

  3. #3
    ragz-82, Member

    I read that MultipartRequest and Apache FileUpload are the two common APIs used to perform file uploads in Java. So I also tried the Apache API, but the result was exactly the same.

    Hence I suspect that some encoding-related setting is missing in either the JSP or the servlet code. Below is my code snippet. Please suggest.

    -- JSP
    <meta http-equiv="Content-Language" content="en-us">
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <meta http-equiv="Expires" content="Tue, 20 Aug 1996 14:25:27 GMT">
    <meta http-equiv="Cache-Control" content="no-cache">

    <script language="javascript" src="../js/stylesheet.js"></script>
    <script language="JavaScript" src="../js/datePicker.js"></script>
    <script language="JavaScript" src="../js/validate.js"></script>

    </head>

    <body>

    <form ENCTYPE="multipart/form-data" name="frmUpload" method="POST" action="<%= request.getContextPath() %>/servlet/TestServlet">

    Select File: <input type="file" name="file" size="30">
    <input value="Upload" name="cmdUpload" type="submit">

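    -- Servlet
    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import com.oreilly.servlet.MultipartRequest; // assumed: the O'Reilly COS library

    public void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException
    {
        // Set the request encoding before any parameter is read.
        request.setCharacterEncoding("UTF-8");

        // Parses the multipart body and saves uploaded files under /tmp.
        MultipartRequest multi = new MultipartRequest(request,
                "/tmp",       // save directory
                20000000,     // maximum POST size in bytes
                "UTF-8");     // encoding
    }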
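
Whatever API handles the upload, the file part should reach disk byte-for-byte with no charset applied. Character encodings matter only when bytes are decoded into text. One way to verify this is to compare a digest of the file on the client with a digest of the copy that landed in /tmp. The sketch below uses only the JDK. `RawUploadCheck` and its method names are hypothetical, and how the servlet obtains the InputStream (for example from `Part.getInputStream()` in Servlet 3.0 and later) is an assumption outside this snippet.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: saves a stream unchanged and fingerprints the result.
class RawUploadCheck {

    /** Copies the stream to the target file without decoding or re-encoding. */
    public static void saveRaw(InputStream in, Path target) {
        try {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the SHA-256 digest of the bytes as lower-case hex. */
    public static String sha256Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b & 0xFF));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the SHA-256 digest of a file's contents as lower-case hex. */
    public static String sha256OfFile(Path file) {
        try {
            return sha256Hex(Files.readAllBytes(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Run on both machines and compare the printed values.
        System.out.println(sha256OfFile(Paths.get(args[0])));
    }
}
```

If the digests match on Windows and Unix, the upload left the file intact and the difference lies in how the file is read afterwards. If they differ, the byte and line-ending checks narrow down what changed.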
