- 07-10-2011, 06:15 PM #1
Member
- Join Date
- Jul 2011
- Posts
- 9
- Rep Power
- 0
Regex for validation of an input file from file
Dear,
I'm quite new to Java, so please be patient with me.
A text file must be validated before being loaded by a program. It contains a list of atom names, with a distance and an affinity for each pair of atoms.
The format of each line must be exactly as follows:
atom1 atom2 dist12:aff12 atom3 dist13:aff13 atom4 dist14:aff14
atom3 atom5 dist35:aff35
atom5 atom6 dist56:aff56 atom7 dist57:aff57
For instance, a few lines of that file should contain:
Al B 10:2 Cu 15:7 Dy 25:20 C 2:40
Cu I 120:23
I Cl 8:29 Si 123:100
I have tried to implement a class patternMatch using the java.util.regex package, but I had no success:
Java Code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class patternMatch {
    public static void main(String[] args) {
        String regex = "[A-Za-z]+[a-zA-Z0-9]{2,3}+:+[a-zA-Z0-9]{2,3}";
        String input = "Al B 10:2 Cu 15:7 Dy 25:20 C 2:40";
        // String input = "Cu I 120:23";
        // String input = "I Cl 8:29 Si 123:100 ";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(input);
        // Print every substring of the input that matches the pattern
        while (matcher.find())
            System.out.println(matcher.group());
    }
}
This is for getting the atom names, string by string (which is already a problem, since the file should be parsed through an I/O stream...), but the result is empty.
Unfortunately I need to validate EACH line from the input file before processing it. I have no idea how to create this complicated regex in Java.
Any help for me?
Thanks a lot
M
- 07-10-2011, 06:38 PM #2
- Join Date
- Sep 2008
- Location
- Voorschoten, the Netherlands
- Posts
- 11,375
- Blog Entries
- 7
- Rep Power
- 17
Why do people always try to do things with one big ugly moloch of a RE? Why not use the split( ... ) method instead? (Read the String class API documentation.) The split( ... ) method gives you an array of Strings, say s[], where s[0] is the first atom name, and s[1] and s[2] give you the second atom name and the dist:affinity pair respectively, etc. That pair needs a bit more processing, which can also be done with the split( ... ) method; it'll return another array, say da[], where da[0] contains the distance and da[1] contains the affinity. Converting those String values to numbers isn't rocket science either ...
kind regards,
Jos
When people rob a bank they get a penalty; when banks rob people they get a bonus.
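The split( ... ) approach described above could look roughly like the sketch below. The class name SplitDemo, the method parseLine and the output format are made up for illustration and are not from the thread; the sketch also assumes tokens are separated by whitespace and that distances and affinities are plain integers.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitDemo {

    // Splits one line into the first atom and its (atom, dist:aff) entries.
    // Returns strings such as "Al-B dist=10 aff=2" (a made-up format) and
    // throws IllegalArgumentException if the line has an unexpected shape.
    public static List<String> parseLine(String line) {
        String[] s = line.trim().split("\\s+");
        // s[0] is the first atom; after it come (atom, dist:aff) pairs,
        // so a well-formed line has an odd number of tokens, at least 3.
        if (s.length < 3 || s.length % 2 == 0) {
            throw new IllegalArgumentException("Unexpected token count: " + line);
        }
        List<String> result = new ArrayList<>();
        for (int i = 1; i < s.length; i += 2) {
            // The limit -1 keeps trailing empty strings, so "10:2:" is rejected.
            String[] da = s[i + 1].split(":", -1);
            if (da.length != 2) {
                throw new IllegalArgumentException("Bad dist:aff pair: " + s[i + 1]);
            }
            // parseInt throws NumberFormatException (a subclass of
            // IllegalArgumentException) when the text is not an integer.
            int distance = Integer.parseInt(da[0]);
            int affinity = Integer.parseInt(da[1]);
            result.add(s[0] + "-" + s[i] + " dist=" + distance + " aff=" + affinity);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String entry : parseLine("Al B 10:2 Cu 15:7 Dy 25:20 C 2:40")) {
            System.out.println(entry);
        }
    }
}
```

Note that this only checks the shape of a line as a side effect of parsing it; it does not check what an atom name may look like, and Integer.parseInt also accepts a leading sign.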
- 07-10-2011, 06:55 PM #3
Dear JosAH,
As I wrote before, I am not very confident with Java.
Can you clarify your suggestion with a piece of code?
Thanks
Last edited by _max_; 07-10-2011 at 07:02 PM.
- 07-10-2011, 07:34 PM #4
When people rob a bank they get a penalty; when banks rob people they get a bonus.
- 07-11-2011, 07:38 AM #5
Thank you Jos,
the main question arises from the need to validate each input line, not from reading out its components.
I've tried the split() method you suggested and it works, but unfortunately I don't know how to implement the line validation. That is because I am too much of a noob, I know :(
Max
- 07-11-2011, 09:31 AM #6
Are you saying that you don't want to do anything with the lines and just check whether or not the lines are syntactically valid? If so, a regular expression can do the job easily: compose a RE starting from the small parts: i.e. define the RE for an atom name and define the RE for two integral numbers separated by a colon; glue those together to form the RE for an entire line.
Think of the following details: what characters can an atom name have? How many of them? Integral numbers are (sort of) easy: \d+ does the job. Give it a try. The top level of the RE should look like this: <atom> (<atom> <pair>)+, where <pair> looks like this: \d+:\d+. You do the <atom> part ...
kind regards,
Jos
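As a rough sketch of that composition, the RE can be glued together from small named parts. The <atom> part is left open in the thread, so the definition below (one uppercase letter optionally followed by one lowercase letter, as in Al, B, Cu) is only an assumption, as are the class name LineValidator, the method isValid and the single-space separators.

```java
import java.util.regex.Pattern;

public class LineValidator {

    // Assumption (not from the thread): an atom name is an element symbol,
    // one uppercase letter optionally followed by one lowercase letter.
    static final String ATOM = "[A-Z][a-z]?";

    // Two integral numbers separated by a colon.
    static final String PAIR = "\\d+:\\d+";

    // Top level: <atom> (<atom> <pair>)+ with single spaces in between.
    static final Pattern LINE =
            Pattern.compile(ATOM + "( " + ATOM + " " + PAIR + ")+");

    public static boolean isValid(String line) {
        // matches() requires the whole line to fit the pattern,
        // unlike find(), which accepts any matching substring.
        return LINE.matcher(line).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "Al B 10:2 Cu 15:7 Dy 25:20 C 2:40",
            "Cu I 120:23",
            "I Cl 8:29 Si 123:100",
            "Cu I 120",    // missing affinity
            "Cu 120:23"    // missing second atom
        };
        for (String s : samples) {
            System.out.println(isValid(s) + "  " + s);
        }
    }
}
```

With these definitions the first three sample lines are accepted and the last two are rejected. A line with leading or trailing spaces is rejected as well, so it may be worth calling trim() first if such lines should count as valid.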
- 07-11-2011, 11:40 PM #7
This is the last RE I've tried, without success:
Java Code:
String regex = "(([A-Za-z\\.]+\\s){2}\\d+(\\s[A-Za-z\\.]+\\s\\d+)*[\\n\\r]*)+";
As a test I used:
Java Code:
String input = "Aluminium Borum 10:2 Cuprum 15:7 Dys.Prosium 25:20 Carbon 2:40";
No luck: the simple class patternMatch described before prints:
Java Code:
Aluminium Borum 10
while I am expecting the whole line to be validated (I mean: printed).
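One likely reason for the truncated output is that nothing in this RE can match the colon, so find() stops right before the first ':'. The sketch below compares the posted pattern with a variant that adds :\d+ after each distance. The variant is a suggestion rather than something taken from the thread, and the names ColonFixDemo, firstMatch and wholeLineMatches are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ColonFixDemo {

    // The pattern from the post, unchanged: nothing in it can match ':'.
    static final String ORIGINAL =
            "(([A-Za-z\\.]+\\s){2}\\d+(\\s[A-Za-z\\.]+\\s\\d+)*[\\n\\r]*)+";

    // Same structure with ":\\d+" added after each distance
    // (a suggested fix, not taken from the thread).
    static final String FIXED =
            "([A-Za-z.]+\\s){2}\\d+:\\d+(\\s[A-Za-z.]+\\s\\d+:\\d+)*";

    // Returns the first substring that find() locates, or null if none.
    public static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    // True only if the entire input fits the pattern.
    public static boolean wholeLineMatches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "Aluminium Borum 10:2 Cuprum 15:7 Dys.Prosium 25:20 Carbon 2:40";
        System.out.println(firstMatch(ORIGINAL, input));       // stops before the first ':'
        System.out.println(wholeLineMatches(ORIGINAL, input)); // false
        System.out.println(wholeLineMatches(FIXED, input));    // true
    }
}
```

For validation, matches() is probably the better fit than a find() loop, since find() reports any substring that matches while matches() accepts a line only when all of it fits.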