Thread: Java and regular expressions

  1. #1
    kosmos890, Member

    I want to get the favicon of a web page using jsoup.
    Java Code:
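     // Needs org.jsoup.Jsoup, org.jsoup.nodes.Document and java.io.IOException
     Document doc = null;
     String str = "http://www.java-forums.org";

     try {
         doc = Jsoup.connect(str).get();
     } catch (IOException ex) {
         Exceptions.printStackTrace(ex);
     }

     // Select <link> elements whose rel value matches the regex, then read the absolute href
     String urlToIcon = doc.select("link[rel~=(?i)(shortcut)?(\\s)+icon]").attr("abs:href");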
    I think the problem is the regular expression (probably the whitespace part, \\s).
    But when I test the regex with this code, it works.
    Java Code:
    String attributeValues = "icon shortcut icon ICON SHORTCUT ICON";

    String regex = "(?i)(shortcut)?(\\s)+icon";
    // Split the sample string on every match of the regex
    String[] split = attributeValues.split(regex);

    for (String s : split) {
        System.out.println(s);
    }
    I also tested the regex here, and it works there too.
    Last edited by kosmos890; 01-18-2013 at 11:49 AM.

  2. #2
    KevinWorkman

    So, what's the problem?

  3. #3
    kosmos890, Member

    From an HTML document, I want to extract tags like
    <link rel="shortcut icon" href="/favicon.ico">
    <link rel="icon" href="/favicon.ico">
    <link rel="ICON" href="/favicon.ico">
    but with my regex I can't.
    I am a beginner and I can't understand why my regex doesn't work.

    If I simplify my regex to (?i)icon then I lose tags like
    <link rel="shortcut icon" href="/favicon.ico">

    If I change my regex to (?i).*icon then I get tags like
    <link rel="apple-touch-icon" href="icon.png">
    I don't want these tags.
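
    One possible explanation, sketched here under assumptions rather than confirmed in the thread: (\\s)+ demands at least one whitespace character before "icon", so a bare rel="icon" can never match, while rel="shortcut icon" can. The split() test hides this, because split() only prints what is left between matches. The class name FaviconRelRegex and the REVISED pattern below are hypothetical, and the check uses Matcher.find() on the assumption that jsoup applies [attr~=regex] the same way.

```java
import java.util.regex.Pattern;

final class FaviconRelRegex {
    // Pattern from the original post: whitespace is mandatory before "icon".
    static final Pattern ORIGINAL = Pattern.compile("(?i)(shortcut)?(\\s)+icon");

    // Hypothetical revision: the whitespace belongs to the optional "shortcut" part,
    // and the anchors keep values such as "apple-touch-icon" from matching.
    static final Pattern REVISED = Pattern.compile("(?i)^\\s*(shortcut\\s+)?icon\\s*$");

    // True when the pattern is found anywhere in the rel value.
    static boolean matches(Pattern pattern, String rel) {
        return pattern.matcher(rel).find();
    }

    public static void main(String[] args) {
        String[] rels = {"shortcut icon", "icon", "ICON", "apple-touch-icon"};
        for (String rel : rels) {
            System.out.printf("%-18s original=%b revised=%b%n",
                    rel, matches(ORIGINAL, rel), matches(REVISED, rel));
        }
    }
}
```

    If that assumption holds, the revised pattern could be tried inside the selector in place of the original one; whether jsoup's selector parser accepts it unchanged would still need to be checked against the jsoup version in use.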

  4. #4
    DarrylBurke, Moderator

    If you want to parse HTML, use an HTML parser, not regex.

    db
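
    Since jsoup is already doing the HTML parsing here, one way to follow this advice is to leave the regex out of the rel check as well. In HTML, rel holds a space-separated list of case-insensitive tokens, so it may be enough to test whether "icon" is one of them. The following is a minimal sketch under that assumption; FaviconRelTokens and isFaviconRel are hypothetical names, not part of jsoup or of the thread.

```java
final class FaviconRelTokens {
    // True when "icon" appears as a whole token in the rel value (case-insensitive).
    static boolean isFaviconRel(String rel) {
        if (rel == null) {
            return false;
        }
        // rel is a whitespace-separated token list, e.g. "shortcut icon"
        for (String token : rel.trim().split("\\s+")) {
            if (token.equalsIgnoreCase("icon")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] rels = {"shortcut icon", "icon", "ICON", "apple-touch-icon", "stylesheet"};
        for (String rel : rels) {
            System.out.println(rel + " -> " + isFaviconRel(rel));
        }
    }
}
```

    With jsoup, a check like this could be applied to each element returned by doc.select("link[rel]"), reading the token list with attr("rel") and the icon URL with attr("abs:href").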
