- 11-08-2010, 01:17 PM #1
finding regex patterns in text file
Hi,
I have a text file that contains information in this format:
[ -LRB-/-LRB- Applied/NNP Materials/NNPS Inc/NNP ] [ said/VBD [ it/PRP ] [ and/CC [ Belgium/NNP ] [ 's/POS IMEC/NNP ] [ had/VBD entered/VBN into/IN [ a/DT three-year/JJ collaboration/NN agreement/NN ] [ to/TO develop/VB [ advanced/JJ metallization/NN process/NN technology/NN ./.In ] [ a/DT statement/NN ] [ late/RB on/IN [ Monday/NNP ]
I need to list only the words that are between the square brackets. The output should look something like this:
-LRB-/-LRB- Applied/NNP Materials/NNPS Inc/NNP
said/VBD
it/PRP
Belgium/NNP
and so on.
Is it possible to do this with regex patterns? I tried, but nothing worked. Please help.
Thanks,
jessie
- 11-08-2010, 03:43 PM #2
Have you made a mistake?
[ said/VBD [ it/PRP ] : here you wrote that you want to extract both said/VBD and it/PRP
[ and/CC [ Belgium/NNP ] : here you only want the second one?
Is your expected output wrong, or is it the format you posted?
- 11-08-2010, 04:26 PM #3
Hi,
It's corrected now. Sorry, I posted the wrong sample.
I have a text file that contains information in this format:
[ -LRB-/-LRB- Applied/NNP Materials/NNPS Inc/NNP ] said/VBD [ it/PRP ] [ and/CC Belgium/NNP ] [ 's/POS IMEC/NNP ] [ had/VBD entered/VBN into/IN [ a/DT three-year/JJ collaboration/NN agreement/NN ] to/TO develop/VB [ advanced/JJ metallization/NN process/NN technology/NN ./.In ] [ a/DT statement/NN ] late/RB on/IN [ Monday/NNP ]
I need to list only the words that are between the square brackets. The output should look something like this:
-LRB-/-LRB- Applied/NNP Materials/NNPS Inc/NNP
it/PRP
and/CC Belgium/NNP
and so on.
Is it possible to do this with regex patterns? I tried, but nothing worked. Please help.
Thanks,
jessie
- 11-08-2010, 04:37 PM #4
OK, to summarize: you have a long text with some bracketed sections that you want to extract, and there are no nested brackets? Sorry to ask again, but your corrected post still contains this:
"[ had/VBD entered/VBN into/IN [ a/DT three-year/JJ collaboration/NN agreement/NN ]"
If your text follows a pattern like [...] ... [...] ... [..] [..] ... (no nesting),
you could use the following:
s is your posted string.
Java Code:
Pattern p = Pattern.compile("\\[(.+?)\\]");
Matcher m = p.matcher(s);
while (m.find()) {
    System.out.println(m.group(1).trim());
}
The output:
Java Code:
-LRB-/-LRB- Applied/NNP Materials/NNPS Inc/NNP
it/PRP
and/CC Belgium/NNP
's/POS IMEC/NNP
a/DT three-year/JJ collaboration/NN agreement/NN
advanced/JJ metallization/NN process/NN technology/NN ./.In
a/DT statement/NN
Monday/NNP
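One caveat with the lazy `.+?` pattern: if the input really does contain an unclosed "[" such as the one before had/VBD, the match starts at that outer bracket and runs to the first "]", so the captured text would include the inner "[". A possible variation, sketched below, uses a negated character class so that only bracket pairs with no bracket inside them match. The class name `BracketChunks` and the method `extract` are made up for this illustration, and the sample string is a shortened version of the posted text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketChunks {
    // "[" followed by one or more characters that are neither "[" nor "]", then "]".
    // This skips an opening bracket that has another "[" before its first "]".
    private static final Pattern INNERMOST = Pattern.compile("\\[([^\\[\\]]+)\\]");

    // Returns the trimmed text of every innermost bracket pair, in order of appearance.
    static List<String> extract(String s) {
        List<String> chunks = new ArrayList<>();
        Matcher m = INNERMOST.matcher(s);
        while (m.find()) {
            chunks.add(m.group(1).trim());
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Shortened sample, including the unclosed "[" before had/VBD.
        String s = "[ it/PRP ] [ had/VBD entered/VBN into/IN "
                 + "[ a/DT three-year/JJ collaboration/NN agreement/NN ] "
                 + "to/TO develop/VB [ Monday/NNP ]";
        for (String chunk : extract(s)) {
            System.out.println(chunk);
        }
    }
}
```

On the shortened sample this prints it/PRP, then a/DT three-year/JJ collaboration/NN agreement/NN, then Monday/NNP, each on its own line. To run it against the text file, each line read from the file could be passed to `extract` in turn.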
- 11-08-2010, 08:42 PM #5
What's the output you would be expecting?
Here's basically what you currently have:
[ -LRB-/-LRB- Applied/NNP Materials/NNPS Inc/NNP ]
said/VBD
[ it/PRP ]
[ and/CC Belgium/NNP ]
[ 's/POS IMEC/NNP ]
[ had/VBD entered/VBN into/IN [ a/DT three-year/JJ collaboration/NN agreement/NN ]
to/TO develop/VB
[ advanced/JJ metallization/NN process/NN technology/NN ./.In ]
[ a/DT statement/NN ] late/RB on/IN [ Monday/NNP ]
How would you handle the following?
[ had/VBD entered/VBN into/IN [ a/DT three-year/JJ collaboration/NN agreement/NN ]
Would you just return the "a/DT three-year/JJ collaboration/NN agreement/NN"?
If there was a matching end bracket for the "had/VBD entered/VBN into/IN", so you had...
[ had/VBD entered/VBN into/IN [ a/DT three-year/JJ collaboration/NN agreement/NN ]]
...how would you want to handle that?
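To see what each choice would look like in practice, here is a rough sketch of one possible policy, not a recommendation: a small scanner that keeps a stack of "[" positions and reports a span each time a "]" closes one. Under this policy an opening bracket that never gets closed is silently ignored, and a properly nested pair is reported twice (inner span first, then the outer span with the inner brackets still in it). The names `NestedBrackets` and `spans` are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class NestedBrackets {
    // Reports the trimmed text of every bracket pair that is actually closed.
    // Inner spans are reported before the spans that enclose them.
    static List<String> spans(String s) {
        List<String> out = new ArrayList<>();
        Deque<Integer> open = new ArrayDeque<>(); // positions of "[" not yet closed
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '[') {
                open.push(i);
            } else if (c == ']' && !open.isEmpty()) {
                int start = open.pop();
                out.add(s.substring(start + 1, i).trim());
            }
            // A "]" with no pending "[" is ignored (assumed policy).
        }
        // Any positions left in "open" belong to unclosed brackets and are dropped.
        return out;
    }

    public static void main(String[] args) {
        String inner = "a/DT three-year/JJ collaboration/NN agreement/NN";
        // Unclosed outer bracket: only the inner span is reported.
        System.out.println(spans("[ had/VBD entered/VBN into/IN [ " + inner + " ]"));
        // Closed outer bracket: the inner span, then the whole outer span.
        System.out.println(spans("[ had/VBD entered/VBN into/IN [ " + inner + " ]]"));
    }
}
```

If only the outermost or only the innermost spans are wanted, the same loop could be adapted by checking the stack depth before adding to the list.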