Results 1 to 4 of 4
  1. #1
    Alessio is offline Member
    Join Date
    Oct 2010
    Posts
    14
    Rep Power
    0

    Question Regex: find all possible matches

    Hi,

    this is a problem that has been bothering me a while now:

    imagine you have a string (e.g. "___a___b___c___") and you have a regex (e.g. "([abc]).*([abc])") and you want to find all possible combinations of matching groups (e.g. a-b, a-c, b-c).

    You can play around with greedy and non greedy quantifiers (e.g. "([abc]).*([abc])" and "([abc]).*?([abc])") which will find some of this combinations (e.g. a-b and a-c) - but not all (in this case not b-c).

    What I've been doing till now is basically this:

    Java Code:
    String haystack = "___a___b___c___";
    for (int j = 0; j < 100; j++) {
    	for (int i = 0; i < 100; i++) {
    		String pattern = "^.{" + j + "}([abc]).{" + i + "}([abc])";
    		Pattern p = Pattern.compile(pattern);
    		Matcher m = p.matcher(haystack);
    		while (m.find()) {
    			System.out.printf("%2d,%2d : %s - %s\n",
    					j, i, 
    					m.group(1),
    					m.group(2));
    		}
    	}
    }
    output:
    Java Code:
     3, 3 : a - b
     3, 7 : a - c
     7, 3 : b - c
    this code runs through all possible string lengths and finds all combinations that match the given pattern.

    However it does this by trying all possible expressions (in this case 100*100 = 10000 times - yes you could use less than 100, but you get the point).

    Is there a more elegant (i.e. faster) solution to this?

    Regards,
    Alessio

  2. #2
    Alessio is offline Member
    Join Date
    Oct 2010
    Posts
    14
    Rep Power
    0

    Default

    *bump* (sorry, about that)

  3. #3
    Zack's Avatar
    Zack is offline Senior Member
    Join Date
    Jun 2010
    Location
    Destiny Islands
    Posts
    692
    Rep Power
    4

    Default

    This is probably how I'd do it... it's not much "faster" but it prevents hardcoding values (like 100).
    Java Code:
    String haystack = "___a___b___c___";
    for (int i = 0; i < haystack.length() - 1; i++) {
    	for (int j = i + 1; j < haystack.length(); j++) {
    		String pattern = "^([abc]).*([abc])$";
    		Pattern p = Pattern.compile(pattern);
    		Matcher m = p.matcher(haystack.substring(i,j));
    		while (m.find()) {
    			System.out.printf("%2d,%2d : %s - %s\n",
    					i,j,
    					m.group(1),
    					m.group(2));
    		}
    	}
    }
    This code loops through the string starting at some point then uses all the subsets of the string after that point. Using the substring method, it pulls a section out of the string. For example, you'd get something like this for the value of haystack.substring(i,j):
    Java Code:
    _
    __
    ___
    ___a
    ___a_
    ___a__
    ___a___
    ___a___b
    ___a___b_
    ...
    _
    _c
    _c_
    _c__
    c
    c_
    c__
    _
    __
    _
    Then the regex compares that pattern (using string start/end ^$) to the substring taken out. If it matches, it records the start/end indices.

    I'm kind of convinced there's a faster way to do this yet, so I'm still looking. This was worth posting I thought though.

  4. #4
    Alessio is offline Member
    Join Date
    Oct 2010
    Posts
    14
    Rep Power
    0

    Default

    Yes, thats definitely a possibility, but basically it does the same thing as my code: just try every possible sub string / string length.

    I'm beginning to think that thats just the only possible solution with the Java Regex Parser...

Similar Threads

  1. help: matches don`t work
    By asabdo in forum New To Java
    Replies: 6
    Last Post: 01-16-2010, 07:06 AM
  2. Replies: 0
    Last Post: 11-17-2009, 01:58 PM
  3. How "Pattern.matches(regex, input)" function works?
    By kishan in forum Advanced Java
    Replies: 2
    Last Post: 04-26-2009, 12:46 AM
  4. Regex that matches whole word
    By moaxjlou in forum New To Java
    Replies: 13
    Last Post: 11-04-2008, 05:48 PM
  5. Help with password matches
    By Albert in forum AWT / Swing
    Replies: 1
    Last Post: 07-10-2007, 04:17 PM

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •