  1. #1
    gvs048 is offline Member
    Join Date
    Apr 2013
    Posts
    16
    Rep Power
    0

    scraping using Jsoup

    I am scraping an e-commerce website using Jsoup. I want to get elements such as the product name, image and price, but after Jsoup.parse() I am unable to find them.

    <div id="ctl00_ContentPlaceHolder1_ctl00_ctl03_Showcase ">
    <div class="controlcontent_r">
    <div class="bucketgroup">
    <div class="prod_viewsparent">
    <div class="bucket" style="width: 175px; height: 280px;">
    <div class="bucket_left">
    <a href="/Products/Buy-Online-Electronics-Cameras-Digital-Cameras/Nikon/Nikon-Coolpix-L27-Point--Shoot/pid-2849731.aspx">
    <img class="mtb-img" style="width: 150px; height: 150px;" src="http://resources-images.martjackhosting.com/s3/martjack-resources/5d4b3aa1-119a-4d82-b9bb-1b6bdbd62002/Images/ProductImages/Source/NikonL27-BLK.jpg;width=150;height=150;scale=canvas" alt="Nikon Coolpix L27 Point & Shoot" title="Digital Cameras, Nikon, Nikon Coolpix L27 Point & Shoot"></a>
    <div id="2849731" class="btn_quick_view" style="display:none">
    <a rel="2849731,0,2466375,5d4b3aa1-119a-4d82-b9bb-1b6bdbd62002" href="#">Quick View</a></div>
    <h4 class="mtb-title">Nikon Coolpix L27 Point & Shoot</h4>
    <div class="mtb-desc">
    <span class="mtb-price">
    <label class="mtb-mrp">
    <b class="lb1"> MRP </b>
    <span class="WebRupee">Rs. </span>
    4,990
    </label>
    <label class="mtb-ofr">
    <b class="lb2"> Now At </b>
    <span class="WebRupee">Rs. </span>
    4,700
    </label>
    </span>
    <span class="offer_block">
    <a class="mtb-more" href="/Products/Buy-Online-Electronics-Cameras-Digital-Cameras/Nikon/Nikon-Coolpix-L27-Point--Shoot/pid-2849731.aspx" title="Click for more details"></div>

    After parsing, I cannot see the <div class="controlcontent_r"> tag.

    How can I handle this?

  2. #2
    MR bruto is offline Senior Member
    Join Date
    May 2013
    Location
    The Netherlands
    Posts
    130
    Rep Power
    0

    this is not java, this is javascript...

  3. #3
    SurfMan is offline Godlike
    Join Date
    Nov 2012
    Location
    The Netherlands
    Posts
    903
    Rep Power
    2

    Quote, originally posted by MR bruto:
    "this is not java, this is javascript..."

    Jsoup is Java, not JavaScript, so this is a valid question. Just because you see a block of HTML doesn't automatically mean the question is about JavaScript.

    On topic: what selector have you used? Do you have an SSCCE?

  4. #4
    JosAH is offline Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,336
    Blog Entries
    7
    Rep Power
    20

    @Mr Bruto: please stop trying to answer questions you know nothing about; it only pollutes the threads.

    Jos
    cenosillicaphobia: the fear for an empty beer glass

  5. #5
    gvs048 is offline Member

    Thanks for the discussion. This is about Java with Jsoup.

    In case my question was not clear: I am scraping product names, prices and images from this URL:

    Jeans Men Clothing

    My code looks like this:

    Document d=Jsoup.parse(url);
    Element ele=d.select("bucket");

    But I am not able to find some tags. After parsing, some of the markup is missing, such as <div class="bucket">.

    Can anyone tell me how to do this?

    Thanks.

  6. #6
    SurfMan is offline Godlike

    The selector says bucket so it is looking for tags called bucket. If you want tags with class bucket, use .bucket (Note the dot in front).

    More specifically, you could use "div.bucket" to get all the divs with class="bucket". See http://jsoup.org/cookbook/extracting...elector-syntax
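
A minimal sketch of the selector difference described above, assuming the jsoup library is on the classpath. The class name `BucketSelectors` and its helper methods are made up for illustration, and the markup is a trimmed copy of the snippet from the first post. Two side notes on the code in post #5: `Jsoup.parse(String)` treats its argument as HTML text rather than fetching it, so a live page would normally be loaded with `Jsoup.connect(url).get()`, and `select` returns an `Elements` collection rather than a single `Element`.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

public class BucketSelectors {
    // Trimmed copy of the markup from the first post ('&' written as '&amp;').
    static final String HTML =
          "<div class=\"bucket\">"
        + "<h4 class=\"mtb-title\">Nikon Coolpix L27 Point &amp; Shoot</h4>"
        + "<label class=\"mtb-ofr\"><b class=\"lb2\"> Now At </b>"
        + "<span class=\"WebRupee\">Rs. </span>4,700</label>"
        + "</div>";

    // Number of elements the given CSS selector matches in the sample markup.
    static int count(String selector) {
        return Jsoup.parse(HTML).select(selector).size();
    }

    // Title of the first bucket, or null if no bucket is found.
    static String firstTitle() {
        Element bucket = Jsoup.parse(HTML).select("div.bucket").first();
        return bucket == null ? null : bucket.select("h4.mtb-title").text();
    }

    // Offer price: the label's own text, without the "Now At" and "Rs." children.
    static String firstOfferPrice() {
        Element label = Jsoup.parse(HTML).select("div.bucket label.mtb-ofr").first();
        return label == null ? null : label.ownText().trim();
    }

    public static void main(String[] args) {
        System.out.println("bucket     -> " + count("bucket"));     // tag selector
        System.out.println(".bucket    -> " + count(".bucket"));    // class selector
        System.out.println("div.bucket -> " + count("div.bucket")); // tag plus class
        System.out.println("title      -> " + firstTitle());
        System.out.println("offer      -> " + firstOfferPrice());
    }
}
```

With this sample, the tag selector `bucket` should match nothing, while `.bucket` and `div.bucket` should each match the single div.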

  7. #7
    gvs048 is offline Member

    Thanks SurfMan, but the <div class="bucket"> is still missing after parsing.

  8. #8
    JosAH is offline Moderator

    Does 'd.select("div[class=bucket]")' help you out? (note: I know nothing about JSoup but found this in a blog somewhere ...)

    kind regards,

    Jos

  9. #9
    SurfMan is offline Godlike

    Any chance we get to see an SSCCE i.e. a working example?

  10. #10
    SurfMan is offline Godlike

    I tried to do this as well, and it appears that the details, i.e. the buckets, are populated through an Ajax sub-request.
    Try this URL:
    http://www.jabraat.com/Handler/ProductShowcaseHandler.ashx?ProductShowcaseInput={%22PgControlId%22:1076996,%22IsConfigured%22:true,%22ConfigurationType%22:%22%22,%22CombiIds%22:%22%22,%22PageNo%22:1,%22DivClientId%22:%22ctl00_ContentPlaceHolder1_ctl00_ctl03_Showcase%22,%22SortingValues%22:%22%22,%22ShowViewType%22:%22%22,%22PropertyBag%22:null,%22IsRefineExsists%22:true,%22CID%22:%22CU00084422%22,%22CT%22:0,%22TabId%22:0}&_=1370529705400

    Having said that, screenscraping might be against the terms of use of the website. You might want to check with the webstore to see if they are ok with this.
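
The handler URL above carries a URL-encoded JSON object in its `ProductShowcaseInput` parameter, including a `PageNo` field. As a rough sketch using only the Java standard library (Java 10 or later), such a URL could be assembled per page as below. The class and method names are hypothetical, the field values are copied from the URL in this post, and it is only an assumption that the server accepts other `PageNo` values or tolerates the stricter encoding that `URLEncoder` produces (it also escapes braces, colons and commas). The trailing `_` parameter, which looks like a cache-busting timestamp, is left out.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class ShowcaseUrl {
    // Endpoint taken from the URL quoted in the post.
    private static final String HANDLER =
        "http://www.jabraat.com/Handler/ProductShowcaseHandler.ashx";

    // JSON payload copied from the post, with only PageNo made variable.
    static String showcaseJson(int pageNo) {
        return "{\"PgControlId\":1076996,\"IsConfigured\":true,"
             + "\"ConfigurationType\":\"\",\"CombiIds\":\"\","
             + "\"PageNo\":" + pageNo + ","
             + "\"DivClientId\":\"ctl00_ContentPlaceHolder1_ctl00_ctl03_Showcase\","
             + "\"SortingValues\":\"\",\"ShowViewType\":\"\",\"PropertyBag\":null,"
             + "\"IsRefineExsists\":true,\"CID\":\"CU00084422\",\"CT\":0,\"TabId\":0}";
    }

    // Full request URL with the JSON payload percent-encoded as a query value.
    static String showcaseUrl(int pageNo) {
        String encoded = URLEncoder.encode(showcaseJson(pageNo), StandardCharsets.UTF_8);
        return HANDLER + "?ProductShowcaseInput=" + encoded;
    }

    public static void main(String[] args) {
        // Print the URLs for the first two pages (assumes paging works this way).
        System.out.println(showcaseUrl(1));
        System.out.println(showcaseUrl(2));
    }
}
```

The resulting string could then be passed to `Jsoup.connect(...)` in place of the product page URL, subject to the terms-of-use caveat above.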

  11. #11
    gvs048 is offline Member

    Thank you. May I ask what this URL is, and how you found it?

    What is the problem with the direct URL?

  12. #12
    SurfMan is offline Godlike

    I use Firefox with the Firebug add-on (shameless plug here). It allows you to see all the sub-requests that your browser is sending out (check out its Net panel). The direct URL only contains the skeleton webpage; the actual content is added later by those sub-requests. It's a design decision by the devs.
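
To illustrate the skeleton-versus-content point, here is a small offline sketch, again assuming jsoup is on the classpath. Both HTML strings are hypothetical stand-ins: `SKELETON` imitates what the direct URL might return (an empty showcase container), and `FRAGMENT` imitates what the Ajax handler might return. The real handler's response format is not shown in this thread, so treating it as an HTML fragment is an assumption.

```java
import org.jsoup.Jsoup;

public class SkeletonCheck {
    // Hypothetical skeleton page: the showcase container exists but is empty.
    static final String SKELETON =
          "<html><body>"
        + "<div id=\"ctl00_ContentPlaceHolder1_ctl00_ctl03_Showcase\"></div>"
        + "</body></html>";

    // Hypothetical handler response: a fragment that carries the buckets.
    static final String FRAGMENT =
          "<div class=\"controlcontent_r\"><div class=\"bucket\">"
        + "<h4 class=\"mtb-title\">Nikon Coolpix L27 Point &amp; Shoot</h4>"
        + "</div></div>";

    // Count buckets in a complete HTML page.
    static int bucketsInPage(String html) {
        return Jsoup.parse(html).select("div.bucket").size();
    }

    // Count buckets in an HTML fragment (no <html>/<body> wrapper needed).
    static int bucketsInFragment(String html) {
        return Jsoup.parseBodyFragment(html).select("div.bucket").size();
    }

    public static void main(String[] args) {
        // An empty result on the page is a hint that the content is loaded later.
        System.out.println("skeleton buckets: " + bucketsInPage(SKELETON));
        System.out.println("fragment buckets: " + bucketsInFragment(FRAGMENT));
    }
}
```

jsoup only parses the HTML it is given and does not run JavaScript, which is why a correct selector can still come back empty on the direct URL.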

  13. #13
    gvs048 is offline Member

    Thanks for your help and response... @SurfMan and all others who replied to my post
