- 12-02-2008, 03:56 AM #1
Member
- Join Date
- Dec 2008
- Posts
- 6
- Rep Power
- 0
using Delimiter with metacharacters
Hi, I'm trying to parse a file in which the information is delimited by brackets, i.e. [] and {}.
Example segment of the file:
{
[program]
[statement] NL . NL ;
[statement] NL [program] ;
[statement] NL [program] ;
[statement] NL [program] ;
[statement] NL [program] ;
[statement] NL [program] ;
[statement] NL [program] ;
[statement] NL [program] ;
[statement] NL [program] ;
[statement] NL [program] ;
[statement] NL [program] ;
}
I'm trying to parse each token within [] (e.g. [statement]) into an object. However, I can't call useDelimiter("[") to split up the tokens and then use the next() method, because [ is a regex metacharacter.
How can I go about parsing my tokens? Is there a workaround to force "[" to be treated as a pattern in itself, like any other pattern you can use as a delimiter, or is there something completely different that I can do?
Thanks in advance,
wntdaliv
-
Have you tried back-slashing it? i.e., instead of "[" use "\["
caveat: I'm no expert in the field of regex.
- 12-02-2008, 04:02 AM #3
Yeah, I did. Unfortunately "\[" is an invalid escape sequence.
I think it's because "\" is itself a metacharacter that has to be followed by something, as in "\n" for a new line, etc.
- 12-02-2008, 04:15 AM #4
Update
OK, so I've found on the Java pages that you can supposedly force a metacharacter to be treated like a regular character:
"There are two ways to force a metacharacter to be treated as an ordinary character:
    precede the metacharacter with a backslash, or
    enclose it within \Q (which starts the quote) and \E (which ends it).
When using this technique, the \Q and \E can be placed at any location within the expression, provided that the \Q comes first."
However, when I try to do this, my compiler (Eclipse) gives me the error:
Invalid escape sequence (valid ones are \b \t \n \f \r \" \' \\ )
Any ideas about getting around this?
-
Let's see your code.
Also, are you only interested in the text between the brackets and not interested in the other text? Exactly what is your goal here?
- 12-02-2008, 04:37 AM #6
Ok, so my goal is to get the text between the brackets and turn it into one type of object, and to take the text that isn't between brackets and turn it into a different type of object.
Here's some code:
Java Code:
public Grammar parseFile(File file) throws IOException {
    Grammar g = new Grammar();
    Scanner scanner = new Scanner(file);
    Pattern pattern = Pattern.compile("["); // This is the trouble
    scanner.useDelimiter(pattern);
    startVariable = scanner.next();
    startVariable = startVariable.substring(1, startVariable.length() - 2);
    scanner.useDelimiter("{");
    while(scanner.hasNext()) {
        g.addRule(parseRule(scanner.next()));
    }
    return g;
}
I'm trying to take all of the information in the file and turn it into objects (using other classes and such).
-
You do know of course that to use back slashes here, you have to double them, right?
For instance, I think that a regex String like this will match anything inside square brackets:
Java Code:
String regex = "(?<=\\[)([^\\]]*)(?=\\])";
but again, I'm still very new to regexes so I can't guarantee how well this would work.
Last edited by Fubarable; 12-02-2008 at 05:12 AM.
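To make the doubling concrete, here is a minimal sketch of a Scanner splitting one line of the sample file on a literal "[". It assumes the input is a String rather than a File, and the class name DelimiterDemo and the helper split are made up for illustration. As far as I know, the three spellings of the delimiter (doubled backslash, \Q...\E, and Pattern.quote) should all behave the same way here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

public class DelimiterDemo {
    // Hypothetical helper: splits the input on the given delimiter regex,
    // trims each token and drops empty ones.
    static List<String> split(String input, String delimiterRegex) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(delimiterRegex);
            while (scanner.hasNext()) {
                String token = scanner.next().trim();
                if (!token.isEmpty()) {
                    tokens.add(token);
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "[statement] NL [program] ;";

        // "\\[" in source code is the two-character regex \[ at run time.
        System.out.println(split(line, "\\["));

        // \Q starts a quoted section and \E ends it; both need doubling too.
        System.out.println(split(line, "\\Q[\\E"));

        // Pattern.quote builds the quoted form, so no manual escaping is needed.
        System.out.println(split(line, Pattern.quote("[")));
    }
}
```

Note that the tokens still carry the closing "]", so splitting on "[" alone only gets part of the way to clean token names.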
- 12-02-2008, 05:17 AM #8
haha, all that trouble and I just had to double it. Thanks so much!
"\\{" worked just fine as a delimiter
-
For example, given this input file:
parsefile.txt
Java Code:
{
[program]
[statement] NL . NL ;
[statement] NL [program] ;
[statement] NL [program] ;
[statement] NL [program] ;
[statement] NL [program] ;
[statement] NL [program] ;
[statement] NL [program] ;
[statement] NL [program] ;
[statement] NL [program] ;
[statement] NL [program] ;
[statement] NL [program] ;
}
and this code:
MyParse.java
Java Code:
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MyParse {
    private static final String PATH = "src/dy08/m12/a/";
    private static final String FILE = "parsefile.txt";

    public static void main(String[] args) throws FileNotFoundException {
        File toParse = new File(PATH + FILE);
        Scanner scanner = new Scanner(toParse);
        String regex = "(?<=\\[)([^\\]]*)(?=\\])";
        Pattern p = Pattern.compile(regex);
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine();
            Matcher matcher = p.matcher(line);
            int index = 0;
            while (matcher.find(index)) {
                System.out.print(matcher.group() + ", ");
                index = matcher.start() + 1;
            }
            System.out.println();
        }
    }
}
I get this result:
Java Code:
program,
statement,
statement, program,
statement, program,
statement, program,
statement, program,
statement, program,
statement, program,
statement, program,
statement, program,
statement, program,
statement, program,
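For the stated goal of treating bracketed and unbracketed text differently, one possible variation is a single pattern with two alternatives, using the capture groups to tell which kind of token matched. This is only a sketch: the class name TokenSketch, the method tokenize, and the "VAR:"/"LIT:" tags are invented here as stand-ins for the two object types, and whitespace is assumed to separate the unbracketed tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenSketch {
    // Group 1: the text inside [ ].
    // Group 2: a run of characters that are neither whitespace nor brackets.
    private static final Pattern TOKEN =
            Pattern.compile("\\[([^\\]]*)\\]|([^\\[\\]\\s]+)");

    // Hypothetical helper: tags bracketed tokens with "VAR:" and
    // everything else with "LIT:", in the order they appear.
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {
            if (m.group(1) != null) {
                tokens.add("VAR:" + m.group(1));
            } else {
                tokens.add("LIT:" + m.group(2));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("[statement] NL [program] ;"));
    }
}
```

In real code the two branches would construct whatever classes the grammar uses instead of tagged Strings.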
-
- 12-02-2008, 06:42 AM #11
Just to expand on the reason for doubling backslashes for regex Strings:
-- the backslash is the quote character for a String literal
-- to include a single backslash in the value of a String variable assigned from a String literal, you have to quote it by preceding it with another backslash
A simple test that helps this sink in:
Java Code:
System.out.println("\\".length()); // prints 1
This is important to understand especially when a \ character is required to be matched by regex. Since the \ is also the quoting character for a regex pattern, you now need 4 backslashes in the String literal:
Java Code:
String regex = "\\\\";
results in the value of the variable regex being "\\" which results in the regex matching a single "\"
If you were reading a regex String from a text file or a JOptionPane#showInputDialog, you would use single, not double, backslashes to quote any regex metacharacter. Two backslashes from such a source would match the backslash character itself.
db
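The two levels of quoting described above can be checked with a short program. This is a sketch only: the class name BackslashLevels and the helper fromChars are made up, and fromChars builds a String character by character to stand in for text that arrives from a file or dialog without passing through the compiler.

```java
import java.util.regex.Pattern;

public class BackslashLevels {
    // Hypothetical helper: builds a String from raw characters, the way
    // text read from a file or dialog would arrive (no literal escaping).
    static String fromChars(char... chars) {
        return new String(chars);
    }

    public static void main(String[] args) {
        String escapedBracket = "\\[";     // run-time value is \[
        String escapedBackslash = "\\\\";  // run-time value is \\

        System.out.println(escapedBracket.length());   // prints 2
        System.out.println(escapedBackslash.length()); // prints 2

        // The regex \[ matches a single literal "[".
        System.out.println(Pattern.matches(escapedBracket, "["));    // prints true
        // The regex \\ matches a single literal backslash.
        System.out.println(Pattern.matches(escapedBackslash, "\\")); // prints true

        // A backslash followed by "[" typed into a text file is the same
        // two characters as the literal "\\[" in source code.
        System.out.println(fromChars('\\', '[').equals(escapedBracket)); // prints true
    }
}
```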