  1. #1
    trivektor is offline Member
    Join Date
    Sep 2008
    Posts
    6
    Rep Power
    0

    Remove control characters in txt file

    Hi,

    I have a txt file that contains invisible control characters, and I want to remove them. I've been considering two options:

    1/ Read the content of the file into a string, then go through each character and keep only alphanumerics, new lines, and the Alt+Enter character (the character Excel inserts to break a line within a cell, which ends up in the exported txt file). With this approach, I'm stuck on finding the character code for Alt+Enter, so if anyone could point it out, that would help a great deal.

    2/ Use some pattern matching, {cntrl} or something similar, to remove all control characters. I've tried this approach and it didn't work for me.

    Please help me with this problem. Any help or suggestion is greatly appreciated.

  2. #2
    Norm is offline Moderator
    Join Date
    Jun 2008
    Location
    SW Missouri
    Posts
    17,266
    Rep Power
    25

    If you can get a list of the characters, you could use String's replace method.
    "it didn't work for me."
    If you'd post your code, someone could help you with it.
    If the text is ASCII and you can define the range of characters you want to delete or those that you want to keep, write a program to read the file byte by byte and write it byte by byte. Test each byte to see if you want it in the output file.
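
    The byte-by-byte idea above could be sketched roughly as follows. This is only an illustration: the class name ByteFilter and the choice of what to keep (printable ASCII plus tab, line feed and carriage return) are assumptions, so adjust the keep() test to whatever range you actually want.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper class, not from the thread.
class ByteFilter {
    // Assumption: keep printable ASCII (0x20-0x7E) plus tab, line feed and carriage return.
    static boolean keep(int b) {
        return (b >= 0x20 && b <= 0x7E) || b == '\t' || b == '\n' || b == '\r';
    }

    // Reads the input one byte at a time and writes only the bytes that pass keep().
    static void copyFiltered(InputStream in, OutputStream out) throws IOException {
        int b;
        while ((b = in.read()) != -1) {
            if (keep(b)) {
                out.write(b);
            }
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // In-memory sample containing two control bytes (0x01 and 0x7F).
        byte[] sample = {'a', 0x01, 'b', '\n', 0x7F, 'c'};
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copyFiltered(new ByteArrayInputStream(sample), out);
        System.out.println(out.toString("US-ASCII"));
    }
}
```

    For real files, the same method should work with a FileInputStream and FileOutputStream, ideally wrapped in BufferedInputStream and BufferedOutputStream so that the single-byte reads stay cheap.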

  3. #3
    trivektor is offline Member
    Join Date
    Sep 2008
    Posts
    6
    Rep Power
    0

    Thanks for the reply

    I tried this code but after that, my application doesn't work any more

    Pattern p = Pattern.compile("{cntrl}");
    Matcher m = p.matcher("");
    m.reset(input);
    String result = m.replaceAll("");

    I think the first approach is more doable, but then I don't know the character code for Alt+Enter in Java. Do you know it?

  4. #4
    GenkiSudo is offline Member
    Join Date
    Sep 2008
    Posts
    6
    Rep Power
    0

    I could be wrong, as I'm a beginner myself, but it looks like you're missing m.find() before calling m.replaceAll.

  5. #5
    Norm is offline Moderator
    Join Date
    Jun 2008
    Location
    SW Missouri
    Posts
    17,266
    Rep Power
    25

    "my application doesn't work any more"
    Could you explain?

    Write a small program with a test string containing control characters to work on getting the correct pattern. Post it here to get help with the regex.
    What OS are you on? I don't know about Alt+Enter generating a character on Windows.
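
    A small test program along those lines might look like the sketch below. It assumes the intended pattern was the POSIX control-character class, which java.util.regex.Pattern spells \p{Cntrl} (written "\\p{Cntrl}" in a Java string literal); "{cntrl}" on its own is not valid regex syntax in Java, and Pattern.compile is likely to reject it with a PatternSyntaxException. The class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical test class, not from the thread.
class CntrlPatternTest {
    // \p{Cntrl} is the POSIX control-character class: [\x00-\x1F\x7F].
    private static final Pattern CONTROL = Pattern.compile("\\p{Cntrl}");

    static String stripControl(String input) {
        Matcher m = CONTROL.matcher(input);
        // replaceAll resets the matcher itself, so no prior find() call is needed.
        return m.replaceAll("");
    }

    public static void main(String[] args) {
        // Test string with SOH (0x01), BEL (0x07) and a trailing line feed.
        String test = "ab\u0001c\u0007d\n";
        String result = stripControl(test);
        System.out.println("[" + result + "] length=" + result.length());
    }
}
```

    Note that line feed, carriage return and tab are control characters too, so this pattern removes them along with everything else in the class.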

  6. #6
    trivektor is offline Member
    Join Date
    Sep 2008
    Posts
    6
    Rep Power
    0

    In Excel, if you want a string to appear on two lines within a cell, you press Alt+Enter instead of Enter. And I think when you save the file as TXT, the Alt+Enter character is supposed to be in the file.

  7. #7
    Norm is offline Moderator
    Join Date
    Jun 2008
    Location
    SW Missouri
    Posts
    17,266
    Rep Power
    25

    If you have a text file with that char, look at it with a hexeditor to see what it is.
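
    If no hex editor is at hand, a few lines of Java can print the character codes instead. The sketch below uses a hypothetical CharDump class and a made-up sample string; in practice you would pass in a line read from the file. Excel's Alt+Enter is generally stored as a line feed (character code 10, 0x0A), but a dump of the actual file is the way to confirm it.

```java
// Hypothetical helper class, not from the thread.
class CharDump {
    // Formats each character as a four-digit hex code, separated by spaces.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample with an embedded line feed between two letters.
        System.out.println(dump("A\nB")); // prints "0041 000A 0042"
    }
}
```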

  8. #8
    Nicholas Jordan is offline Senior Member
    Join Date
    Jun 2008
    Location
    Southwest
    Posts
    1,018
    Rep Power
    8

    suggest pattern.control

    There are three approaches. Pattern has a control-character class, but the syntax has to be exact: POSIX-style tools write it with colons and square brackets, [[:cntrl:]], while Java's Pattern uses the \p{Cntrl} form. If this is a student project, I suggest trying everything you can think of. If this is for professional or commercial use, it would help to know what package you are using. In general, use the String methods, such as replaceAll, that accept a regular expression for the operations needed.
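
    As a rough sketch of the String-method route, String.replaceAll takes the regex directly. The example assumes you want to keep tabs and line breaks (which would include a line feed left behind by Alt+Enter) while dropping every other control character; the class name and the exact keep list are assumptions for illustration. It relies on the character-class intersection syntax (&&) that java.util.regex supports.

```java
// Hypothetical helper class, not from the thread.
class KeepLineBreaks {
    // Removes control characters except carriage return, line feed and tab.
    // [\p{Cntrl}&&[^\r\n\t]] means "a control character that is not CR, LF or TAB".
    static String clean(String input) {
        return input.replaceAll("[\\p{Cntrl}&&[^\\r\\n\\t]]", "");
    }

    public static void main(String[] args) {
        // SOH (0x01) and DEL (0x7F) are dropped; CR, LF and TAB survive.
        String cleaned = clean("a\u0001b\r\nc\td\u007F");
        System.out.println(cleaned.length());
    }
}
```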
