
Thread: Java and regular expressions

  1. #1
    kosmos890

    I want to get the favicon of a web page using jsoup.
    Java Code:
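    // Imports this snippet needs (Exceptions is assumed to be NetBeans' org.openide.util.Exceptions):
    // import java.io.IOException;
    // import org.jsoup.Jsoup;
    // import org.jsoup.nodes.Document;
    Document doc = null;
    String str = "http://www.java-forums.org";

    try {
        // Fetch and parse the page
        doc = Jsoup.connect(str).get();
    } catch (IOException ex) {
        Exceptions.printStackTrace(ex);
    }

    // Select <link> elements whose rel attribute matches the regex, then read the absolute href
    String urlToIcon = doc.select("link[rel~=(?i)(shortcut)?(\\s)+icon]").attr("abs:href");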
    The problem seems to be the regular expression (probably the whitespace part, \\s).
    But when I test the regex with this code, it works.
    Java Code:
    String attributeValues = "icon shortcut icon ICON SHORTCUT ICON";

    String regex = "(?i)(shortcut)?(\\s)+icon";
    // split() removes every match of the regex and returns what is left over
    String[] split = attributeValues.split(regex);

    for (String s : split) {
        System.out.println(s);
    }
    I also tested the regex here, and it works there too.
    Last edited by kosmos890; 01-18-2013 at 11:49 AM.
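
    A split() over one long string makes it hard to see which individual values the regex accepts. A minimal sketch of a more direct check follows, assuming the pattern is applied to each rel value separately with find(); the class name RelRegexCheck and the method found are made up for illustration.

```java
import java.util.regex.Pattern;

class RelRegexCheck {
    // The pattern from the post, with \s escaped once for a Java string literal.
    static final Pattern ORIGINAL = Pattern.compile("(?i)(shortcut)?(\\s)+icon");

    // True if the pattern is found anywhere inside the given rel value.
    static boolean found(String rel) {
        return ORIGINAL.matcher(rel).find();
    }

    public static void main(String[] args) {
        // Check each rel value on its own instead of splitting one long string.
        for (String rel : new String[] {"shortcut icon", "icon", "ICON"}) {
            System.out.println(rel + " -> " + found(rel));
        }
    }
}
```

    Run this way, only "shortcut icon" is accepted. The pattern requires at least one whitespace character before "icon", so a bare "icon" or "ICON" has nothing to satisfy (\\s)+.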

  2. #2
    KevinWorkman

    So, what's the problem?

  3. #3
    kosmos890

    From an HTML document, I want to extract tags like
    <link rel="shortcut icon" href="/favicon.ico">
    <link rel="icon" href="/favicon.ico">
    <link rel="ICON" href="/favicon.ico">
    but with my regex I can't.
    I am a beginner and I can't understand why my regex doesn't work.

    If I simplify my regex to (?i)icon then I lose tags like
    <link rel="shortcut icon" href="/favicon.ico">

    If I change my regex to (?i).*icon then I get tags like
    <link rel="apple-touch-icon" href="icon.png">
    I don't want these tags.
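
    One way to accept the three wanted values and reject "apple-touch-icon" is to anchor the pattern at both ends and make "shortcut" plus its whitespace optional as a single group. This is a sketch under assumptions, not a confirmed fix: the class name FaviconRel and the method isFaviconRel are hypothetical, and the check runs on plain strings rather than through jsoup.

```java
import java.util.regex.Pattern;

class FaviconRel {
    // ^ and $ pin the match to the whole value, so "apple-touch-icon" is rejected.
    // The whitespace belongs to the optional "shortcut" group, so a bare "icon" still matches.
    static final Pattern ICON_REL = Pattern.compile("(?i)^(shortcut\\s+)?icon$");

    // True if the rel value is "icon" or "shortcut icon", ignoring case.
    static boolean isFaviconRel(String rel) {
        return ICON_REL.matcher(rel).find();
    }

    public static void main(String[] args) {
        String[] samples = {"shortcut icon", "icon", "ICON", "apple-touch-icon"};
        for (String rel : samples) {
            System.out.println(rel + " -> " + isFaviconRel(rel));
        }
    }
}
```

    If jsoup's [attr~=regex] selector applies the pattern with find(), as is assumed here, the same expression could be tried as link[rel~=(?i)^(shortcut\\s+)?icon$]. That has not been verified against jsoup's selector parser. Note that a rel value with leading or trailing spaces would not match the anchored pattern.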

  4. #4
    DarrylBurke

    If you want to parse HTML, use an HTML parser, not regex.

    db
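
    Once an HTML parser has handed back the rel attribute value, one regex-free option is to treat it as a list of whitespace-separated tokens, which is how HTML defines rel. A minimal sketch, assuming the value has already been read from the link element; the class name RelTokens and the method hasIconToken are hypothetical.

```java
import java.util.Arrays;

class RelTokens {
    // True if any whitespace-separated token of the rel value equals "icon", ignoring case.
    static boolean hasIconToken(String rel) {
        return Arrays.stream(rel.trim().split("\\s+"))
                     .anyMatch(token -> token.equalsIgnoreCase("icon"));
    }

    public static void main(String[] args) {
        String[] samples = {"shortcut icon", "icon", "ICON", "apple-touch-icon"};
        for (String rel : samples) {
            System.out.println(rel + " -> " + hasIconToken(rel));
        }
    }
}
```

    Because "apple-touch-icon" is a single token, it is not mistaken for "icon". This check is looser than the anchored regex idea: any rel value that contains an "icon" token is accepted, whatever other tokens accompany it.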
