  1. #1
    Jovaras is offline Member
    Join Date
    May 2011
    Posts
    7
    Rep Power
    0

    Way to detect whether a file is plain text

    Hello,
    I'm filtering files and looking for a way to detect whether a file is plain text (i.e. not a binary format).
    Of course I could list all the extensions, but is there a quick way to tell whether a file is plain text (e.g. *.php) or binary (e.g. *.exe)?
    Thanks in advance!

  2. #2
    Dark's Avatar
    Dark is offline Senior Member
    Join Date
    Apr 2011
    Location
    Camp Lejeune, North Carolina
    Posts
    643
    Rep Power
    4

    I believe I did something similar to what you're asking about earlier today. I did mine in Swing, but I'm pretty sure the relevant classes belong to java.io, so you should be able to use them in non-Swing applications.

    I opened a selected file with a BufferedReader and then used the File object to check its name:
    Java Code:
    if (file.getName().toLowerCase().endsWith(".txt")) { /* handle the text file */ }
    You could try something similar. Reading through a folder, directory or other collection of files may require a little more than this, since my program asked the user to enter the file location.
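    To extend the extension check to more than one file type, a minimal sketch could look like the following. The class name ExtensionFilter and the extension list are made up for illustration, and Set.of assumes Java 9 or later.

```java
import java.io.File;
import java.util.Locale;
import java.util.Set;

final class ExtensionFilter {
    // Hypothetical whitelist; adjust it to whatever counts as "text" for you.
    private static final Set<String> TEXT_EXTENSIONS =
            Set.of("txt", "php", "java", "html", "css", "xml");

    /** Returns true if the name ends in one of the whitelisted extensions. */
    static boolean hasTextExtension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension at all, or a trailing dot
        }
        // Compare case-insensitively so "INDEX.PHP" matches "php".
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return TEXT_EXTENSIONS.contains(ext);
    }

    /** Convenience overload that looks only at the last path element. */
    static boolean hasTextExtension(File file) {
        return hasTextExtension(file.getName());
    }
}
```

    Only the last extension is considered, so a name like notes.txt.exe is not treated as text. The check says nothing about the actual contents of the file.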
    • Use [code][/code] tags when posting code. That way people don't want to stab their eyes out when trying to help you.
    • +Rep people for helpful posts.

  3. #3
    Norm's Avatar
    Norm is online now Moderator
    Join Date
    Jun 2008
    Location
    Eastern Florida
    Posts
    17,559
    Rep Power
    25

    "way to detect if a file is simple text based"
    Read the file and check that every byte/char read is a text character or a control character such as tab or newline.
    Not all files have extensions.
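    A minimal sketch of that idea, assuming "text" means printable ASCII plus tab, line feed and carriage return. The class name AsciiTextCheck and the exact set of accepted bytes are assumptions for illustration, not something fixed by this thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class AsciiTextCheck {
    /** True for printable ASCII (0x20..0x7E) plus tab, line feed and carriage return. */
    static boolean isTextByte(int b) {
        return (b >= 0x20 && b <= 0x7E) || b == '\t' || b == '\n' || b == '\r';
    }

    /** Checks the first 'length' bytes of the array. */
    static boolean isAsciiText(byte[] data, int length) {
        for (int i = 0; i < length; i++) {
            if (!isTextByte(data[i] & 0xFF)) { // mask to get the unsigned value
                return false;
            }
        }
        return true;
    }

    /** Streams the whole file in chunks and stops at the first non-text byte. */
    static boolean isAsciiText(Path file) throws IOException {
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                if (!isAsciiText(buffer, n)) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

    An empty file passes this test, and any byte above 0x7E fails it, so text containing accented or non-Latin characters is rejected.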

  4. #4
    Dark's Avatar
    Dark is offline Senior Member
    Join Date
    Apr 2011
    Location
    Camp Lejeune, North Carolina
    Posts
    643
    Rep Power
    4

    I think he's trying to distinguish between files that do have extensions; at least that's what I got from it.

  5. #5
    JosAH's Avatar
    JosAH is offline Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,651
    Blog Entries
    7
    Rep Power
    21

    Originally posted by Norm:
    "Read it and check that all bytes/chars read are text or controls like tab and newline
    Not all files have extensions."
    That would only catch English/American text, i.e. text representable in ASCII. Any text outside that (limited) range would fail the test. No need to worry though, because the good old Unix 'file' utility would fail as well ;-) There is no cheap one-size-fits-all test to classify a file based on its contents, its name, or its name extension.
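    One way to widen the test beyond ASCII is to check whether the bytes decode cleanly in a charset you choose. The sketch below uses the standard java.nio.charset decoder in strict mode; the class and method names are made up for illustration, and picking the charset is still left to the caller.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

final class CharsetCheck {
    /** True if the bytes decode in the given charset without malformed or unmappable input. */
    static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)      // throw instead of replacing
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

    This only answers "is this valid in charset X?". A single-byte charset such as ISO-8859-1 accepts every byte sequence, so the result is meaningful mainly for stricter encodings like UTF-8, which fits the point that no single cheap test covers all cases.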

    kind regards,

    Jos
    cenosillicaphobia: the fear of an empty beer glass

  6. #6
    Jovaras is offline Member
    Join Date
    May 2011
    Posts
    7
    Rep Power
    0

    Hmm, so how do well-known programs (such as search-in-files tools) handle reading files while excluding executables?
    Or can it simply not be done in Java?

  7. #7
    Norm's Avatar
    Norm is online now Moderator
    Join Date
    Jun 2008
    Location
    Eastern Florida
    Posts
    17,559
    Rep Power
    25

    Any programming language that can read the bytes of a file can check them with an algorithm to determine whether the whole file is made of text. Define what "text" is and I can write a program to test for it. The problem, as JosAH pointed out, is that there are lots of different "text" character sets.
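    As one possible definition, some search and version-control tools use a heuristic along these lines: sample the start of the file and treat it as binary if a NUL byte shows up. Below is a minimal sketch of that heuristic. The class name BinarySniffer and the 8000-byte sample size are assumptions, and readNBytes(int) requires Java 11 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class BinarySniffer {
    // Assumed sample size; only the start of the file is inspected.
    static final int SAMPLE_SIZE = 8000;

    /** True if any of the first 'length' bytes is NUL (0x00). */
    static boolean looksBinary(byte[] data, int length) {
        for (int i = 0; i < length; i++) {
            if (data[i] == 0) {
                return true;
            }
        }
        return false;
    }

    /** Reads at most SAMPLE_SIZE bytes from the file and applies the NUL test. */
    static boolean looksBinary(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] sample = in.readNBytes(SAMPLE_SIZE);
            return looksBinary(sample, sample.length);
        }
    }
}
```

    This is a heuristic, not a guarantee: UTF-16 text usually contains NUL bytes and would be reported as binary, while a binary file with no NUL in the sampled range would slip through.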
