  1. #1
    coltx is offline Member
    Join Date
    Jan 2011
    Posts
    2
    Rep Power
    0

    Pattern: illegal escape character

    Hi everyone,

    I have been trying to create a regular expression that effectively encompasses the following rules:

    starts with \( or \[
    ends with \) or \]

    I came up with the following:

    Java Code:
    Pattern latex = Pattern.compile(".\\[\[(](.+\\[)\[]");
    However, this produces an illegal escape character error. Prefixing the bracket with two backslashes or with none has the same effect: the "[" is treated as the opening of a unity statement (probably not the official name) and the "]" as its closing.

    Help would be appreciated.

    Thanks.

  2. #2
    JosAH is offline Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,773
    Blog Entries
    7
    Rep Power
    21

    Note that your String literal has to pass through two compilers: first javac.exe parses the literal (and changes it), and then the regular expression compiler compiles the (changed) String. If you write:

    Java Code:
    ".\\[\[(](.+\\[)\[]"
    javac.exe parses it and tries to change the String to:

    Java Code:
    ".\[\[(](.+\[)\[]"
    ... and complains about it: in your source literal, the second and last \[ are illegal escape sequences, because a single backslash followed by [ means nothing to javac.exe. As a rule of thumb: you have to type two backslashes to 'send' a single backslash to the regular expression compiler (because javac.exe changes \\ to a single \).

    The regular expression compiler treats \[ as an escaped left square bracket that has 'lost' its special meaning.
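    The two levels can be checked directly. This is only a minimal sketch; the class and method names (EscapeLevels, matchesLeftBracket) are made up for illustration and are not from the original post.

```java
import java.util.regex.Pattern;

final class EscapeLevels {
    // Source literal "\\[": javac stores two characters, a backslash and a '['.
    static final String REGEX = "\\[";

    // The regex compiler then reads \[ as "a literal left square bracket".
    static boolean matchesLeftBracket(String s) {
        return Pattern.matches(REGEX, s);
    }

    public static void main(String[] args) {
        System.out.println(REGEX.length());          // prints 2
        System.out.println(REGEX);                   // prints \[
        System.out.println(matchesLeftBracket("[")); // prints true
    }
}
```

    Writing "\[" instead would not even reach the regular expression compiler, because javac rejects it first.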

    kind regards,

    Jos
    Last edited by JosAH; 01-10-2011 at 02:58 PM.
    cenosillicaphobia: the fear for an empty beer glass

  3. #3
    coltx is offline Member

    Thanks for the reply

    I am still unsuccessful though. Does the \\[ then go to \[ and escape the first bracket?

    Because if I double-backslash the backslash (".\\\[\[(](.+\\[)\[]") it produces the same exception. Also, if I prefix each [ that I want taken literally with a double \\, I get a runtime exception saying group not closed at index 16 (the end of the regex): ".\\[\\[(](.+\\[)\\[]"

    ".\\[[(](.+\\[)[]" yields the same result as the last one,

    and ".\\\[[(](.+\\\[)[]" won't compile, with an illegal escape character error.

  4. #4
    JosAH is offline Moderator

    I always work backwards with these awful regular expression Strings. For example: I want to escape a left square bracket, so I want to send \[ to the regular expression compiler. Javac.exe treats the backslash as a special character, so I have to make it keep its mouth shut by sending \\[ to the Java compiler. The result is the literal String "\\[".
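    The same backwards reasoning can be applied to the whole pattern. The sketch below assumes the rules in the first post mean the LaTeX delimiters, i.e. a literal backslash followed by ( or [ at the start and a literal backslash followed by ) or ] at the end; that reading is an assumption, and the names LatexDelimiters and isDelimited are made up for illustration.

```java
import java.util.regex.Pattern;

final class LatexDelimiters {
    // Working backwards:
    //   text to match:  \( or \[ , then anything, then \) or \]
    //   regex needed:   \\[(\[].+\\[)\]]
    //                   (\\ is a literal backslash; [ and ] are escaped inside the classes)
    //   Java literal:   every backslash of the regex doubled once more
    static final Pattern LATEX = Pattern.compile("\\\\[(\\[].+\\\\[)\\]]");

    // matches() requires the whole input to fit, so no ^ or $ is needed here.
    static boolean isDelimited(String s) {
        return LATEX.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(LATEX.pattern());          // prints \\[(\[].+\\[)\]]
        System.out.println(isDelimited("\\(x^2\\)")); // prints true
        System.out.println(isDelimited("\\[a+b\\]")); // prints true
        System.out.println(isDelimited("(x^2)"));     // prints false
    }
}
```

    Note that .+ does not cross line terminators by default, so this sketch only covers single-line input.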

    kind regards,

    Jos

  5. #5
    charleyjoyce is offline Member
    Join Date
    Jan 2011
    Posts
    18
    Rep Power
    0

    Quote Originally Posted by coltx View Post
    starts with \( or \[
    Here's an example for your first rule. You don't have to use a character class; here's how:
    Java Code:
    Pattern.compile("^(\\[|\\()")
    Use a capturing group, then use the alternation "|" (logical OR). The caret ^, as you probably already know, matches at the beginning of the input. Similarly, to check the ending character, use $.
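    Putting both anchors together might look like the sketch below. It follows the pattern in this post, so it checks for a plain ( or [ at the start and a plain ) or ] at the end; the class name BracketAnchors and the .* in the middle are assumptions added for illustration.

```java
import java.util.regex.Pattern;

final class BracketAnchors {
    // ^(\[|\()  : input starts with [ or (
    // .*        : anything in between (not line terminators, by default)
    // (\]|\))$  : input ends with ] or )
    static final Pattern BRACKETED = Pattern.compile("^(\\[|\\().*(\\]|\\))$");

    static boolean isBracketed(String s) {
        return BRACKETED.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isBracketed("(abc)")); // prints true
        System.out.println(isBracketed("[abc]")); // prints true
        System.out.println(isBracketed("abc)"));  // prints false
    }
}
```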
    Last edited by charleyjoyce; 01-10-2011 at 03:39 PM.

  6. #6
    JosAH is offline Moderator

    On second thought: if you want to check whether or not a String starts with ( or [ and ends with ) or ] there is no need to fire up the entire regular expression moloch: check if the length of the String is at least 2 and check if the first character equals a ( or a [ and do the same with the last character for ) or ].

    It'll be a bit more code on your side but under the hood the regular expression classes won't be loaded.
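    A minimal sketch of that check, without any regular expressions. The names BracketCheck and isBracketed are hypothetical, and the null check is an extra assumption not mentioned above.

```java
final class BracketCheck {
    // True if s has at least 2 characters, starts with ( or [ and ends with ) or ].
    static boolean isBracketed(String s) {
        if (s == null || s.length() < 2) {
            return false;
        }
        char first = s.charAt(0);
        char last = s.charAt(s.length() - 1);
        return (first == '(' || first == '[') && (last == ')' || last == ']');
    }

    public static void main(String[] args) {
        System.out.println(isBracketed("(abc)")); // prints true
        System.out.println(isBracketed("[abc"));  // prints false
        System.out.println(isBracketed("("));     // prints false
    }
}
```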

    kind regards,

    Jos

  7. #7
    quad64bit is offline Moderator
    Join Date
    Jul 2009
    Location
    VA
    Posts
    1,323
    Rep Power
    7

    I just wanted to mention that I have found it useful to read regex patterns from a file or from a text box. Hard-coding them requires you to double escape, as other users have mentioned, which is VERY confusing to look at. If you write your regex pattern into a text file and just read the text file, no double escaping is needed: the text never passes through javac, so only the regex-level escapes remain. Good luck!
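    A rough sketch of that idea, assuming the file holds exactly one pattern. The class PatternFromFile, its methods, and the temporary file are all hypothetical and exist only to keep the example self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;

final class PatternFromFile {
    // Reads the whole file as one regex. trim() drops the trailing newline
    // (and any other leading or trailing whitespace).
    static Pattern load(Path file) {
        try {
            String regex = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
            return Pattern.compile(regex);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Self-contained demo: writes a hypothetical pattern file, then loads it.
    static boolean demo() {
        try {
            Path file = Files.createTempFile("pattern", ".txt");
            // The file ends up containing exactly  ^(\[|\()  with single backslashes.
            // The doubling below is needed only because this demo writes the
            // file from Java source; a hand-edited file would not need it.
            Files.write(file, "^(\\[|\\()\n".getBytes(StandardCharsets.UTF_8));
            Pattern p = load(file);
            Files.delete(file);
            return p.matcher("[x]").find() && !p.matcher("x]").find();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints true
    }
}
```

    The regex-level backslashes are still required in the file; only the extra layer added for javac goes away.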
