  #1 (permalink)  
Old 03-08-2010, 04:38 PM
Member
 
Join Date: Feb 2010
Posts: 20
Rep Power: 0
masterrs.mind is on a distinguished road
StackOverflow Exception with Regexes
Hi All,

I am using regular expressions to replace and strip certain HTML tags from the body of an HTML document.
The regex patterns are stored as key-value pairs in a properties file.
Every time the regular expressions are executed, I get the following error:

Exception in thread "main" java.lang.StackOverflowError
at java.util.regex.Pattern$LazyLoop.match(Unknown Source)
at java.util.regex.Pattern$GroupTail.match(Unknown Source)
at java.util.regex.Pattern$BranchConn.match(Unknown Source)
at java.util.regex.Pattern$CharProperty.match(Unknown Source)
at java.util.regex.Pattern$Branch.match(Unknown Source)
at java.util.regex.Pattern$GroupHead.match(Unknown Source)

Here are the properties.
0 T:\\(.*?\\): :
1 R:<P>&nbsp;</P>: :
2 R:&nbsp;: :
3 R:<SPAN(.|\n)*?>: :
4 R:</SPAN>: :
5 R:<TD(.|\n)*?>:<td>:
6 R:<P><IMG(.|\n)*?></P>:<hr/>:
7 C:StrongPattern:<P>(.|\n)*?<STRONG>(.|\n)*?/P>:<h1>:</h1>:<h2>:</h2>:
8 C:NestedTablePattern:(?s)<td>(.*?)</td>:<h1>:</h1>:
9 R:<p><br/>[\\r\\n]+<h1>:<h1>:
10 R:</h1></p>:</h1>:
11 R:<P><IMG(.|\n)*?>:<hr/>:

Is something wrong with my regexes? How can I eliminate the errors?

Thanks in advance.
  #2 (permalink)  
Old 03-09-2010, 01:52 AM
Eranga's Avatar
Moderator
 
Join Date: Jul 2007
Location: Colombo, Sri Lanka
Posts: 9,394
Rep Power: 14
A StackOverflowError usually means that you have infinite recursion happening somewhere in your code. Did you check your code for that?

Is that the complete error message you get?
  #3 (permalink)  
Old 03-09-2010, 04:19 PM
Member
 
Join Date: Feb 2010
Posts: 20
Rep Power: 0
masterrs.mind is on a distinguished road
Hi,
Yes, I got it resolved. I changed my regex definitions and now it is working fine.
And yes, it was due to infinite recursion.

Thanks
  #4 (permalink)  
Old 03-09-2010, 05:59 PM
Steve11235's Avatar
Senior Member
 
Join Date: Dec 2008
Posts: 1,028
Rep Power: 3
Steve11235 is on a distinguished road
The recursion may not be infinite, just too deep. Regexes can do crazy things, even when they are valid.
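
To illustrate the point, here is a minimal sketch that runs one of the patterns from the first post against a short and a very long tag. The class name, the helper methods and the padding lengths are made up for the illustration. Whether the long input actually overflows depends on the JVM version and the thread stack size, so the sketch only reports what happened rather than promising a particular result.

```java
import java.util.regex.Pattern;

class DeepRegexDemo {
    // Pattern style from the first post: an alternation inside a lazy loop.
    static final Pattern ALTERNATION = Pattern.compile("<SPAN(.|\n)*?>");

    /** Returns true if matching finished, false if the regex engine overflowed the stack. */
    static boolean completes(Pattern p, String input) {
        try {
            p.matcher(input).find();
            return true;
        } catch (StackOverflowError e) {
            return false;
        }
    }

    /** Builds a <SPAN ...> tag with `length` filler characters before the closing '>'. */
    static String spanWithPadding(int length) {
        StringBuilder sb = new StringBuilder("<SPAN ");
        for (int i = 0; i < length; i++) {
            sb.append('x');
        }
        return sb.append('>').toString();
    }

    public static void main(String[] args) {
        // A short tag needs only a few nested calls inside the matcher.
        System.out.println("short input completes: "
                + completes(ALTERNATION, spanWithPadding(50)));
        // A very long tag may exhaust the stack; the result depends on the JVM and stack size.
        System.out.println("long input completes:  "
                + completes(ALTERNATION, spanWithPadding(200_000)));
    }
}
```

If the second line prints false on your machine, the recursion was finite but deeper than the stack allows, which is a different situation from a loop that never ends.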
  #5 (permalink)  
Old 03-12-2010, 03:34 PM
Senior Member
 
Join Date: Nov 2008
Posts: 278
Rep Power: 2
neilcoffey is on a distinguished road
Maybe you could post the changes you made?
__________________
Neil Coffey
Javamex - Java tutorials and performance info
  #6 (permalink)  
Old 03-12-2010, 04:25 PM
Member
 
Join Date: Feb 2010
Posts: 20
Rep Power: 0
masterrs.mind is on a distinguished road
Actually, I was getting the error when running the transformations for the IMG tag and the <P><STRONG> patterns. I don't need to match across line breaks in those cases (previously the patterns looked like <P><IMG(.|\n)*?></P>).

All I did was remove the | (or) alternative from the regexes.
So my regexes turned out to be:

6 R:<P><IMG(.)*?></P>:<HR/>:
7 R:<IMG(.)*?>: :
8 C:StrongPattern:<P>(.)*?<STRONG>(.)*?/P>:<H1>:</H1>:<H2>:</H2>:

That solved my problem.
Hope this will be helpful for someone.
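
One caveat for anyone reusing this fix: without the \n alternative (and without the DOTALL flag), the dot does not match line terminators, so a tag that is split across lines is no longer matched. The sketch below uses the pattern from rule 7 above; the class and method names are invented for the illustration.

```java
import java.util.regex.Pattern;

class DotNewlineDemo {
    // Pattern from rule 7 above: '.' alone, so line breaks inside the tag are not matched.
    static final Pattern IMG = Pattern.compile("<IMG(.)*?>");

    /** Replaces every matched IMG tag with a single space, as rule 7 does. */
    static String stripImages(String html) {
        return IMG.matcher(html).replaceAll(" ");
    }

    public static void main(String[] args) {
        // Tag on one line: it is replaced.
        System.out.println(stripImages("<IMG src=\"a\">x"));
        // Tag split across two lines: it is left untouched.
        System.out.println(stripImages("<IMG\nsrc=\"a\">x"));
    }
}
```

That is fine if, as here, the tags never contain line breaks, but it is worth checking against real input.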
  #7 (permalink)  
Old 03-12-2010, 04:32 PM
Senior Member
 
Join Date: Nov 2008
Posts: 278
Rep Power: 2
neilcoffey is on a distinguished road
Other simplifications
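In general, it may well be more efficient to use a character class [] instead of the pipe where the two are equivalent. Note that a dot inside a character class is a literal dot, so [.\n] would not do the job; a class such as [\s\S] matches any character, line breaks included. So you could write (with the backslashes doubled for the properties file):

<P><IMG[\\s\\S]*?></P>:<hr/>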

You can also set the Pattern.DOTALL flag when you compile the pattern (or prefix the pattern with (?s)) instead of listing both dot and newline (unless you really want to include, say, \n but not \r, I guess).
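
A small sketch of the DOTALL suggestion, applied to the IMG rule from the first post. Pattern.DOTALL and the inline (?s) form are standard java.util.regex features; the class name, method name and sample HTML are made up for the example.

```java
import java.util.regex.Pattern;

class DotAllDemo {
    // With DOTALL, '.' also matches line terminators, so no (.|\n) alternation is needed.
    static final Pattern IMG_PARAGRAPH =
            Pattern.compile("<P><IMG.*?></P>", Pattern.DOTALL);

    // The same flag written inside the pattern text, which is handy when
    // the patterns are loaded from a properties file.
    static final Pattern IMG_PARAGRAPH_INLINE =
            Pattern.compile("(?s)<P><IMG.*?></P>");

    /** Replaces each image paragraph with a horizontal rule. */
    static String replaceImageParagraphs(Pattern p, String html) {
        return p.matcher(html).replaceAll("<hr/>");
    }

    public static void main(String[] args) {
        String html = "<P><IMG src=\"a.png\"\n alt=\"a\"></P><P>text</P>";
        System.out.println(replaceImageParagraphs(IMG_PARAGRAPH, html));        // <hr/><P>text</P>
        System.out.println(replaceImageParagraphs(IMG_PARAGRAPH_INLINE, html)); // <hr/><P>text</P>
    }
}
```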

I personally think it's a bit odd to have a capturing group modified by a star, (....)*; perhaps you meant a non-capturing group, (?:.....)*, instead.
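
To show what that means in practice, here is a rough sketch based on the TD rule from the first post. A starred capturing group only keeps its last repetition, so the capture is rarely useful. The third pattern, a negated character class, is my own addition rather than something from the thread, and it assumes that attribute values never contain '>'. All class and method names are invented. I would not assume that the non-capturing form on its own avoids the deep recursion, since the alternation is still there.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupStyleDemo {
    // Capturing group under a lazy star, as in the original rules.
    static final Pattern CAPTURING = Pattern.compile("<TD(.|\n)*?>");
    // Same text is matched, but nothing is captured.
    static final Pattern NON_CAPTURING = Pattern.compile("<TD(?:.|\n)*?>");
    // Assumption: attributes never contain '>'. No group or alternation is needed.
    static final Pattern NEGATED_CLASS = Pattern.compile("<TD[^>]*>");

    /** Returns what group 1 holds after a match: only the last repetition. */
    static String lastCaptured(String html) {
        Matcher m = CAPTURING.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    /** Replaces every opening TD tag with a bare <td>. */
    static String normalizeCells(Pattern p, String html) {
        return p.matcher(html).replaceAll("<td>");
    }

    public static void main(String[] args) {
        System.out.println(lastCaptured("<TD ab>"));   // b
        String html = "<TD width=\"3\"\n class=\"x\">cell</TD>";
        System.out.println(normalizeCells(NON_CAPTURING, html));   // <td>cell</TD>
        System.out.println(normalizeCells(NEGATED_CLASS, html));   // <td>cell</TD>
    }
}
```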