  1. #1
    Join Date
    Feb 2011
    Location
    Florida
    Posts
    60
    Rep Power
    0

    Default age old problem - best way to check if file is text

    I've searched and searched for a definitive answer to the problem of how to tell whether an opened file is a text file or not, and I still have not found one. I'm using a BufferedReader to read a text file, as was suggested to me because of its speed and efficiency. It is fast and efficient IF you give it a text file. The problem is that it hangs if you give it anything else. So I'm asking you guys, the experts: how do you handle the case where you tell the user to select a text file but they select an executable or some other binary file instead? What is your bombproof solution?

  2. #2
    JosAH's Avatar
    JosAH is online now Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,371
    Blog Entries
    7
    Rep Power
    20

    Default

    Quote Originally Posted by madroadbiker View Post
    I've searched and searched for a definitive answer to the problem of how to tell whether an opened file is a text file or not, and I still have not found one. I'm using a BufferedReader to read a text file, as was suggested to me because of its speed and efficiency. It is fast and efficient IF you give it a text file. The problem is that it hangs if you give it anything else. So I'm asking you guys, the experts: how do you handle the case where you tell the user to select a text file but they select an executable or some other binary file instead? What is your bombproof solution?
    A Unix text file or a MS Windows one or a Mac text file? An ASCII file or a Unicode file? If the latter, what's the encoding used? Your question can't be answered.

    kind regards,

    Jos
    cenosillicaphobia: the fear for an empty beer glass

  3. #3
    Join Date
    Feb 2011
    Location
    Florida
    Posts
    60
    Rep Power
    0

    Default

    Well just give me an answer for mac and/or windows. It's java. It will run on either.

  4. #4
    JosAH's Avatar
    JosAH is online now Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,371
    Blog Entries
    7
    Rep Power
    20

    Default

    Quote Originally Posted by madroadbiker View Post
    Well just give me an answer for mac and/or windows. It's java. It will run on either.
    Copied from my previous reply: An ASCII file or a Unicode file? If the latter, what's the encoding used? Your question can't be answered.

    kind regards,

    Jos

  5. #5
    couling is offline Member
    Join Date
    Nov 2010
    Posts
    54
    Rep Power
    0

    Default

    As I'm working with character encoding at the moment, I'm interested (concerned) that your program just hangs. Is no exception thrown?

    As JosAH says, this is sadly not entirely possible.

    Even assuming that your file is "plain text", there are several different encodings for characters, and ultimately they are all just binary.

    ASCII is the most common form of text, and pure ASCII contains no characters above 127 (0x7F). But there are many different forms of "extended" ASCII which use characters 128 to 255 (0x80 to 0xFF).

    Unicode files may have a "Byte Order Mark" (BOM) at the start of the file. It is intended more to distinguish between Unicode's various encodings, and there is no guarantee that a file will have it:
    UTF-8, UTF-16LE, UTF-16BE, UTF-32LE, UTF-32BE. Being British, I'm particularly aware of '£', which is always represented by a code point higher than 127.
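For what it's worth, checking for a BOM could look roughly like the sketch below. The class name BomSniffer and the method detect are made up for illustration. It only recognises the five marks listed above and returns null when none is found, which, as noted, says nothing either way about whether the file is text.

```java
final class BomSniffer {
    /** Returns the name of the encoding indicated by a leading BOM, or null if none is present. */
    static String detect(byte[] head) {
        // Check the 4-byte UTF-32 marks before the 2-byte UTF-16 ones,
        // because FF FE 00 00 (UTF-32LE) also starts with FF FE (UTF-16LE).
        // That prefix is genuinely ambiguous; this sketch prefers UTF-32LE.
        if (startsWith(head, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32BE";
        if (startsWith(head, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32LE";
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(head, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(head, 0xFF, 0xFE)) return "UTF-16LE";
        return null; // no BOM: the file may still be text in any encoding
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            // Mask to compare as unsigned, since Java bytes are signed.
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }
}
```

Only the first four bytes of the file need to be read for this, so it is cheap to run before anything else.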

    Sadly the best thing you can really do is try to decode it and succeed or fail. This is why I'm surprised there is no exception thrown for attempting to decode something which is not valid for your character encoding.
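A possible explanation for the missing exception: as far as I know, the decoder behind FileReader / InputStreamReader silently replaces bytes it cannot decode (typically with U+FFFD) rather than failing. To get the "succeed or fail" behaviour you can ask a CharsetDecoder to report errors instead. A minimal sketch, where the class name StrictDecode and the method decodeOrNull are invented for the example:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

final class StrictDecode {
    /** Returns the decoded text, or null if the bytes are not valid in the given charset. */
    static String decodeOrNull(byte[] bytes, Charset charset) {
        try {
            return charset.newDecoder()
                    // REPORT makes decode() throw instead of substituting a replacement character.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return null; // not valid text in this charset
        }
    }
}
```

Bear in mind this only works for encodings that have invalid byte sequences. ISO-8859-1, for example, maps every byte to a character, so a .zip file decodes "successfully" with it.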
    Last edited by couling; 05-26-2011 at 07:08 PM.

  6. #6
    Join Date
    Feb 2011
    Location
    Florida
    Posts
    60
    Rep Power
    0

    Default

    Thanks for your detailed response. I am trying to develop a Java app that will run on Mac and Windows. The program asks the user to select a text file, but of course they can select any file type. I need to be able to detect that error without the program hanging. Here is the method I am currently using to ask for and input a text file. In the code snippet below, "propositionIndexWindow" is just a JTextArea.

    Java Code:
            // Open a file
            JFileChooser fc = new JFileChooser();
            fc.setCurrentDirectory(new File(directory.getAbsolutePath()));
            fc.setDialogTitle("File Chooser");
    
            // Ask for a file and try to open it
            if (fc.showOpenDialog(this) == JFileChooser.APPROVE_OPTION)
            {
                if ((inFile = fc.getSelectedFile()) != null)
                {
                    if (inFile.isFile() && inFile.canRead())
                    {
                        // Save the current directory location
                        directory = fc.getCurrentDirectory();
                        statusBar.setText("File Opened:  " + inFile.getName());
                        propositionIndexWindow.setText(""); // Clear the window to start
                        try {
                            BufferedReader bir = new BufferedReader(new FileReader(inFile));
                            while (bir.ready()) {
                                propositionIndexWindow.append(bir.readLine() + System.getProperty("line.separator"));
                            }
                            // Free up system resources
                            bir.close();
                        } catch (FileNotFoundException e) {
                            statusBar.setText("FILE NOT FOUND:  " + inFile.getAbsolutePath());
                            inFile = null;
                        } catch (IOException e) {
                            statusBar.setText("FILE I/O ERROR:  " + inFile.getAbsolutePath());
                            inFile = null;
                        }
                    } else {
                        statusBar.setText("Can not read file");
                        inFile = null;
                    }
                } else
                    statusBar.setText("Can not open file");
            }
    I think the culprit is the "append" to the JTextArea. I think that is where it is hanging. It works fine, of course, for text files, but if I select a .zip file, for example, it hangs so badly that I have to force quit to get out of it. What am I doing wrong here?

  7. #7
    Norm's Avatar
    Norm is offline Moderator
    Join Date
    Jun 2008
    Location
    SW Missouri
    Posts
    17,306
    Rep Power
    25

    Default

    propositionIndexWindow.append(bir.readLine() ..
    You should run a filter over the line read before doing the append.

    By filter I mean do something to detect if the String is displayable.
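One way such a filter might look, as a rough sketch only. The class name LineFilter and the method looksDisplayable are hypothetical, and the rule it applies (reject control characters other than tab, and reject the U+FFFD replacement character) is just one possible definition of "displayable":

```java
final class LineFilter {
    /** Heuristic: true if the line contains nothing that suggests binary data. */
    static boolean looksDisplayable(String line) {
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            // U+FFFD usually means the decoder hit bytes it could not decode.
            if (c == '\uFFFD') return false;
            // Control characters such as NUL or ESC do not belong in plain text.
            // Tab is allowed; readLine() has already stripped CR and LF.
            if (Character.isISOControl(c) && c != '\t') return false;
        }
        return true;
    }
}
```

In the posted loop this would mean reading the line into a variable first, then giving up with a status bar message as soon as looksDisplayable returns false, instead of appending it.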

  8. #8
    Join Date
    Feb 2011
    Location
    Florida
    Posts
    60
    Rep Power
    0

    Default

    Quote Originally Posted by Norm View Post
    You should run a filter over the line read before doing the append.

    By filter I mean do something to detect if the String is displayable.
    You mean like test each and every byte to ensure it is printable? That would be very costly in time. Even so, that doesn't explain why there is no exception thrown. There must be a simpler solution.
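On the cost concern: the check does not have to cover the whole file. A common heuristic is to sample only the first few kilobytes and look for NUL bytes, which rarely occur in 8-bit text but are common in executables and archives. A sketch under those assumptions follows; the class name BinarySniffer and the 4096-byte sample size are arbitrary choices for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class BinarySniffer {
    static final int SAMPLE_SIZE = 4096; // arbitrary; an assumption, not a standard value

    /** Reads at most SAMPLE_SIZE bytes from the file and applies the NUL-byte heuristic. */
    static boolean looksBinary(Path file) throws IOException {
        byte[] buf = new byte[SAMPLE_SIZE];
        int n = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int r;
            // read() may return fewer bytes than requested, so loop until full or EOF (-1).
            while (n < buf.length && (r = in.read(buf, n, buf.length - n)) > 0) {
                n += r;
            }
        }
        return looksBinary(buf, n);
    }

    /** True if any of the first 'length' bytes is NUL. */
    static boolean looksBinary(byte[] sample, int length) {
        for (int i = 0; i < length; i++) {
            if (sample[i] == 0) return true;
        }
        return false;
    }
}
```

This is a heuristic, not a proof: UTF-16 and UTF-32 text contains NUL bytes and would be misclassified, and a binary file with no NUL in its first 4 KB would slip through.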

  9. #9
    Norm's Avatar
    Norm is offline Moderator
    Join Date
    Jun 2008
    Location
    SW Missouri
    Posts
    17,306
    Rep Power
    25

    Default

    That would be very time costly
    I wonder how it would compare cost-wise to the append() you are using for each line.

    Have you tried running it through a profiler to see where the code spends its time?

  10. #10
    Join Date
    Feb 2011
    Location
    Florida
    Posts
    60
    Rep Power
    0

    Default

    Well things are getting even more bizarre. When I run with the profiler, it doesn't hang. It prints garbage which is what I would expect when trying to open a .zip file as text. What the heck is going on?

  11. #11
    DarrylBurke's Avatar
    DarrylBurke is offline Member
    Join Date
    Sep 2008
    Location
    Madgaon, Goa, India
    Posts
    11,184
    Rep Power
    19

    Default

    Sounds rather like you may have neglected Swing's single threaded rule and ended up with a deadlock.
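To illustrate the rule: code in a Swing event handler runs on the event dispatch thread (EDT), so reading a file there blocks all painting and input until it finishes, even in a program that never creates a thread itself. Whether that is the actual cause here is a guess. A sketch of moving the read to a background thread with SwingWorker, where the class name FileLoader and the helper joinLines are invented for the example:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import javax.swing.JTextArea;
import javax.swing.SwingWorker;

final class FileLoader extends SwingWorker<Void, String> {
    private final Path file;
    private final Charset charset;
    private final JTextArea target;

    FileLoader(Path file, Charset charset, JTextArea target) {
        this.file = file;
        this.charset = charset;
        this.target = target;
    }

    @Override
    protected Void doInBackground() throws IOException {
        // Runs on a worker thread, so the GUI stays responsive.
        // Files.newBufferedReader throws an IOException on malformed input
        // rather than substituting replacement characters.
        try (BufferedReader reader = Files.newBufferedReader(file, charset)) {
            String line;
            while ((line = reader.readLine()) != null) {
                publish(line); // hand the line over to process()
            }
        }
        return null;
    }

    @Override
    protected void process(List<String> lines) {
        // Runs on the EDT; one append per batch instead of one per line.
        target.append(joinLines(lines));
    }

    @Override
    protected void done() {
        // Runs on the EDT; get() rethrows anything doInBackground() threw.
        try {
            get();
        } catch (Exception e) {
            target.append("[read failed: " + e.getCause() + "]\n");
        }
    }

    static String joinLines(List<String> lines) {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }
}
```

From the file chooser handler it would be started with something like new FileLoader(inFile.toPath(), StandardCharsets.UTF_8, propositionIndexWindow).execute(); where UTF-8 is an assumed encoding.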

    db

  12. #12
    Join Date
    Feb 2011
    Location
    Florida
    Posts
    60
    Rep Power
    0

    Default

    I don't know what you mean. My program doesn't spawn any other tasks. It's a simple single-task program. I'm not using a SwingWorker or other similar mechanism while fetching the file contents. I call the file chooser and just wait.
