  1. #1
    jessie

    Finding regex patterns in text file

    Hi,

    I have a text file that contains information in this format:

    [ -LRB-/-LRB- Applied/NNP Materials/NNPS Inc/NNP ] [ said/VBD [ it/PRP ] [ and/CC [ Belgium/NNP ] [ 's/POS IMEC/NNP ] [ had/VBD entered/VBN into/IN [ a/DT three-year/JJ collaboration/NN agreement/NN ] [ to/TO develop/VB [ advanced/JJ metallization/NN process/NN technology/NN ./.In ] [ a/DT statement/NN ] [ late/RB on/IN [ Monday/NNP ]

    I need to list only the words that are between the square brackets. The output should be something like:

    -LRB-/-LRB- Applied/NNP Materials/NNPS Inc/NNP
    said/VBD
    it/PRP
    Belgium/NNP
    and so on.

    Is it possible to do this with regex patterns? I tried, but nothing worked. Please help.

    Thanks,
    jessie

  2. #2
    eRaaaa

    Have you made a mistake?

    [ said/VBD [ it/PRP ] : here you wrote that you want both said/VBD and it/PRP.

    [ and/CC [ Belgium/NNP ] : here you want only the second one?

    Is it your expected output that is wrong, or the format you posted?

  3. #3
    jessie

    Hi,

    It's corrected now. Sorry, I posted the wrong one.

    I have a text file that contains information in this format:

    [ -LRB-/-LRB- Applied/NNP Materials/NNPS Inc/NNP ] said/VBD [ it/PRP ] [ and/CC Belgium/NNP ] [ 's/POS IMEC/NNP ] [ had/VBD entered/VBN into/IN [ a/DT three-year/JJ collaboration/NN agreement/NN ] to/TO develop/VB [ advanced/JJ metallization/NN process/NN technology/NN ./.In ] [ a/DT statement/NN ] late/RB on/IN [ Monday/NNP ]

    I need to list only the words that are between the square brackets. The output should be something like:

    -LRB-/-LRB- Applied/NNP Materials/NNPS Inc/NNP
    it/PRP
    and/CC Belgium/NNP
    and so on.

    Is it possible to do this with regex patterns? I tried, but nothing worked. Please help.

    Thanks,
    jessie

  4. #4
    eRaaaa

    OK, to summarize: you have a long text with some parts in brackets, and you want to extract those parts? No nested brackets? Sorry to ask again, but you posted this again:

    "[ had/VBD entered/VBN into/IN [ a/DT three-year/JJ collaboration/NN agreement/NN ]"

    If your input looks like [...] ... [...] ... [..] [..] ... (no nesting), you could use the following:

    Java Code:
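    		// needs: import java.util.regex.Matcher; import java.util.regex.Pattern;
    		// [^\[\]]+ instead of .+? : "." also matches "[", so .+? would capture
    		// "had/VBD entered/VBN into/IN [ a/DT ..." from the unmatched "[ had/VBD".
    		Pattern p = Pattern.compile("\\[([^\\[\\]]+)\\]");
    		Matcher m = p.matcher(s);
    		while (m.find()) {
    			// group(1) is the text between the brackets; trim() drops the padding spaces
    			System.out.println(m.group(1).trim());
    		}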
    Here s is the string you posted.

    The output:
    -LRB-/-LRB- Applied/NNP Materials/NNPS Inc/NNP
    it/PRP
    and/CC Belgium/NNP
    's/POS IMEC/NNP
    a/DT three-year/JJ collaboration/NN agreement/NN
    advanced/JJ metallization/NN process/NN technology/NN ./.In
    a/DT statement/NN
    Monday/NNP
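
    To see why the unmatched "[" in front of had/VBD matters, here is a small self-contained sketch. The class name BracketChunks and the helper extract are made up for illustration and are not from the posts above. It runs a reluctant pattern and a negated-character-class pattern over just that fragment; only the second one keeps a stray "[" out of the result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BracketChunks {
    // Reluctant version: "." also matches "[", so a stray opening bracket ends up inside the capture.
    static final Pattern RELUCTANT = Pattern.compile("\\[(.+?)\\]");
    // Negated class: the capture may contain neither "[" nor "]", so only innermost pairs match.
    static final Pattern INNERMOST = Pattern.compile("\\[([^\\[\\]]+)\\]");

    // Hypothetical helper: collects the trimmed text of every match of p in s.
    static List<String> extract(Pattern p, String s) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(s);
        while (m.find()) {
            out.add(m.group(1).trim());
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "[ had/VBD entered/VBN into/IN [ a/DT three-year/JJ collaboration/NN agreement/NN ] to/TO develop/VB";
        // prints [had/VBD entered/VBN into/IN [ a/DT three-year/JJ collaboration/NN agreement/NN]
        System.out.println(extract(RELUCTANT, s));
        // prints [a/DT three-year/JJ collaboration/NN agreement/NN]
        System.out.println(extract(INNERMOST, s));
    }
}
```

    If the input really never has a stray or nested "[", both patterns give the same result, so the choice only matters for lines like the one quoted above.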

  5. #5
    StormyWaters

    What output are you expecting?

    Here's basically what you currently have:
    [ -LRB-/-LRB- Applied/NNP Materials/NNPS Inc/NNP ]
    said/VBD
    [ it/PRP ]
    [ and/CC Belgium/NNP ]
    [ 's/POS IMEC/NNP ]
    [ had/VBD entered/VBN into/IN [ a/DT three-year/JJ collaboration/NN agreement/NN ]
    to/TO develop/VB
    [ advanced/JJ metallization/NN process/NN technology/NN ./.In ]
    [ a/DT statement/NN ]
    late/RB on/IN
    [ Monday/NNP ]
    How would you handle the following?
    [ had/VBD entered/VBN into/IN [ a/DT three-year/JJ collaboration/NN agreement/NN ]
    Would you return just "a/DT three-year/JJ collaboration/NN agreement/NN"?

    If there were a matching closing bracket for "had/VBD entered/VBN into/IN", so that you had...
    [ had/VBD entered/VBN into/IN [ a/DT three-year/JJ collaboration/NN agreement/NN ]]
    ...how would you want to handle that?
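
    One way to make that choice concrete is to stop using a regex and scan the brackets with a stack. This is only a sketch under my own assumptions: the class name NestedBrackets and the method balancedChunks are made up, an opening bracket with no partner is silently skipped, and a nested pair is reported both on its own and as part of the pair that encloses it. Any of those decisions could reasonably go the other way.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class NestedBrackets {
    /** Returns the trimmed text of every balanced [...] pair, inner pairs before the pair that encloses them. */
    static List<String> balancedChunks(String s) {
        List<String> chunks = new ArrayList<>();
        Deque<Integer> open = new ArrayDeque<>(); // positions of "[" still waiting for a "]"
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '[') {
                open.push(i);
            } else if (c == ']' && !open.isEmpty()) {
                // Pair this "]" with the most recent unmatched "[".
                chunks.add(s.substring(open.pop() + 1, i).trim());
            }
        }
        // Anything left in "open" is an opening bracket with no partner; it is skipped here.
        return chunks;
    }

    public static void main(String[] args) {
        // Unmatched outer "[": only the inner chunk is returned.
        System.out.println(balancedChunks(
            "[ had/VBD entered/VBN into/IN [ a/DT three-year/JJ collaboration/NN agreement/NN ]"));
        // Properly nested: the inner chunk first, then the outer chunk including the inner brackets.
        System.out.println(balancedChunks(
            "[ had/VBD entered/VBN into/IN [ a/DT three-year/JJ collaboration/NN agreement/NN ]]"));
    }
}
```

    With the unbalanced line the scanner returns only the inner chunk; with the balanced one it returns the inner chunk followed by the whole outer chunk, so the caller can keep whichever level is wanted.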
