  1. #1
    _max_ is offline Member
    Join Date
    Jul 2011
    Posts
    9
    Rep Power
    0

    Regex for validation of an input file

    Dear,

    I'm quite new to Java, so please be patient with me.

    A text file must be validated before being loaded by a program. It contains a list of atom names, with a distance and an affinity for each pair of atoms.

    The format of each line must be exactly as follows:


    atom1 atom2 dist12:aff12 atom3 dist13:aff13 atom4 dist14:aff14
    atom3 atom5 dist35:aff35
    atom5 atom6 dist56:aff56 atom7 dist57:aff57


    For instance, a few lines of that file should contain:

    Al B 10:2 Cu 15:7 Dy 25:20 C 2:40
    Cu I 120:23
    I Cl 8:29 Si 123:100


    I have tried to implement a class patternMatch using the java.util.regex package, but without success:

    Java Code:
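    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class patternMatch {

        public static void main(String[] args) {
            // The attempted regex, kept as posted; it finds no match in the test inputs.
            String regex = "[A-Za-z]+[a-zA-Z0-9]{2,3}+:+[a-zA-Z0-9]{2,3}";

            String input = "Al B 10:2 Cu 15:7 Dy 25:20 C 2:40";
            // String input = "Cu I 120:23";
            // String input = "I Cl 8:29 Si 123:100 ";

            Pattern pattern = Pattern.compile(regex);
            Matcher matcher = pattern.matcher(input);

            // Print every substring that the regex matches.
            while (matcher.find()) {
                System.out.println(matcher.group());
            }
        }
    }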
    This is meant to extract the atom names from one string at a time (which is already a problem, since the file should be read through an I/O stream), but the result is empty.

    Unfortunately I need to validate EACH line of the input file before processing it, and I have no idea how to build such a complicated regex in Java.

    Any help for me?

    Thanks a lot
    M

  2. #2
    JosAH is offline Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,763
    Blog Entries
    7
    Rep Power
    21


    Why do people always try to do things with one big ugly moloch of an RE? Why not use the split( ... ) method instead (read the String class API documentation)? The split( ... ) method gives you an array of Strings, say s[], where s[0] is the first atom name, and s[1] and s[2] give you the second atom name and the dist:affinity pair respectively, and so on. That pair needs a bit more processing, which can also be done with the split( ... ) method; it'll return another array, say da[], where da[0] contains the distance and da[1] contains the affinity. Converting those String values to numbers isn't rocket science either ...

    kind regards,

    Jos
    cenosillicaphobia: the fear for an empty beer glass
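
The two-stage split described in this reply could be sketched roughly as follows. This is only an illustration: the class name SplitSketch, the method describe and the output format are made up here, and it assumes that distances and affinities are plain integers, as in the sample lines.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names are illustrative and not from the thread.
final class SplitSketch {

    // Splits a line on whitespace, then splits each dist:aff token on the colon.
    // Returns one readable entry per (first atom, other atom) pair.
    static List<String> describe(String line) {
        String[] s = line.trim().split("\\s+");   // s[0] is the first atom name
        List<String> out = new ArrayList<>();
        // s[1], s[2] are the second atom and its pair; s[3], s[4] the next one, etc.
        for (int i = 1; i + 1 < s.length; i += 2) {
            String[] da = s[i + 1].split(":");    // da[0] = distance, da[1] = affinity
            int distance = Integer.parseInt(da[0]); // assumption: integer values
            int affinity = Integer.parseInt(da[1]);
            out.add(s[0] + "-" + s[i] + " dist=" + distance + " aff=" + affinity);
        }
        return out;
    }

    public static void main(String[] args) {
        for (String entry : describe("I Cl 8:29 Si 123:100")) {
            System.out.println(entry);
        }
    }
}
```

Note that this sketch does no validation of its own: a malformed token makes Integer.parseInt throw a NumberFormatException, or da[1] fail with an ArrayIndexOutOfBoundsException.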

  3. #3
    _max_ is offline Member

    Dear JosAH,

    As I wrote before, I am not very confident with Java.

    Can you clarify your suggestion with a piece of code?

    Thanks
    Last edited by _max_; 07-10-2011 at 08:02 PM.

  4. #4
    JosAH is offline Moderator

    Here's an even simpler suggestion:

    Java Code:
    String line = ...; // a line from your input file
    String[] parts = line.split("[ :]+"); // split it on runs of spaces and colons
    Print out every element from that parts array and see for yourself.

    kind regards,

    Jos
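
For readers who want to try this without a file at hand, here is a small runnable version of the same idea. The class name SplitDemo and the helper parts are invented for this sketch; the sample line is one of those from the first post.

```java
import java.util.Arrays;

// Hypothetical demo class; only String.split and Arrays.toString are library calls.
final class SplitDemo {

    // Splits a line on runs of spaces and colons, as suggested above.
    static String[] parts(String line) {
        return line.trim().split("[ :]+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parts("Cu I 120:23")));
        // prints [Cu, I, 120, 23]
    }
}
```

Because spaces and colons are treated alike, a malformed line such as "Cu I 120 23" produces exactly the same array, so this split extracts the components but does not check the format.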

  5. #5
    _max_ is offline Member

    Thank you Jos,

    My main problem is validating each input line, not reading out its components.

    I've tried the split() method you suggested and it works, but unfortunately I don't know how to implement the line check. This is because I am too much of a noob, I know :(

    Max

  6. #6
    JosAH is offline Moderator

    Are you saying that you don't want to do anything with the lines and just check whether or not the lines are syntactically valid? If so, a regular expression can do the job easily: compose an RE starting from the small parts, i.e. define the RE for an atom name and define the RE for two integral numbers separated by a colon; glue those together to form the RE for an entire line.
    Think of the following details: what characters can an atom name have? How many of them? Integral numbers are (sort of) easy: \d+ does the job. Give it a try. The top level of the RE should look like this: <atom> (<atom> <pair>)+ where <pair> looks like this: \d+:\d+ ... you do the <atom> part ...

    kind regards,

    Jos
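
One way to glue the parts together, shown here only as a sketch. The <atom> definition is an assumption (a letter followed by further letters or dots), since the thread leaves that part open; the class name LineValidator and the method isValid are likewise invented. Fields are assumed to be separated by exactly one space.

```java
import java.util.regex.Pattern;

// Hypothetical validator built from small named parts.
final class LineValidator {

    // Assumption: an atom name starts with a letter, followed by letters or dots.
    static final String ATOM = "[A-Za-z][A-Za-z.]*";

    // Two integral numbers separated by a colon, as described above.
    static final String PAIR = "\\d+:\\d+";

    // Top level: <atom> (<atom> <pair>)+ with single spaces between the fields.
    static final Pattern LINE =
            Pattern.compile(ATOM + "(?: " + ATOM + " " + PAIR + ")+");

    // matches() succeeds only if the whole line fits the pattern.
    static boolean isValid(String line) {
        return LINE.matcher(line).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "Al B 10:2 Cu 15:7 Dy 25:20 C 2:40",
            "Cu I 120:23",
            "Cu I 120"          // missing the affinity
        };
        for (String s : samples) {
            System.out.println(isValid(s) + "  " + s);
        }
    }
}
```

The sketch uses matches() rather than find(), because find() also reports a match when only part of the line fits. As written it rejects leading or trailing whitespace; calling trim() on the line first would relax that.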

  7. #7
    _max_ is offline Member

    This is the last RE I've tried, without success:

    Java Code:
    String regex = "(([A-Za-z\\.]+\\s){2}\\d+(\\s[A-Za-z\\.]+\\s\\d+)*[\\n\\r]*)+";
    String input = "Aluminium Borum 10:2 Cuprum 15:7 Dys.Prosium 25:20 Carbon 2:40"; // test input

    No luck: the simple patternMatch class described before prints:

    Java Code:
    Aluminium Borum 10
    while I was expecting the whole line to be validated (I mean: printed).
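
A likely reason for the partial match is that \d+ stops at the colon, so nothing in the RE above can consume ":2", and find() is satisfied with whatever prefix does match. The sketch below keeps the posted atom class, writes each number as a dist:aff pair, and checks whole lines with matches(). The class name FileCheckSketch and the method invalidLines are invented for this example, and a StringReader stands in for the real input file, which could be opened with Files.newBufferedReader instead.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical line-by-line checker; not the original poster's final code.
final class FileCheckSketch {

    // The posted atom class [A-Za-z.]+ is kept; each number is now a dist:aff pair.
    static final Pattern LINE =
            Pattern.compile("[A-Za-z.]+(\\s[A-Za-z.]+\\s\\d+:\\d+)+");

    // Returns the 1-based numbers of the lines that do not match as a whole.
    static List<Integer> invalidLines(List<String> lines) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (!LINE.matcher(lines.get(i)).matches()) {
                bad.add(i + 1);
            }
        }
        return bad;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the input file; the second line lacks its affinity.
        String text = "Aluminium Borum 10:2 Cuprum 15:7 Dys.Prosium 25:20 Carbon 2:40\n"
                    + "Cu I 120\n"
                    + "I Cl 8:29 Si 123:100\n";
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            List<String> lines = in.lines().collect(Collectors.toList());
            System.out.println(invalidLines(lines)); // prints [2]
        }
    }
}
```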
