Thread: Site Grabber

  1. #1
    makpandian is offline Senior Member
    Join Date
    Dec 2008
    Location
    Chennai
    Posts
    448
    Rep Power
    6

    Site Grabber

    Hi all,
    I want to grab all the files that belong to a website.
    I think it is possible with the help of the URL and URI classes.
    I can download a file from a site using the URL class,
    but I couldn't find out which files exist under the site.

    If anyone knows how to extract the file names under a web directory, please share it with me.
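
    For reference, the download step described above could look roughly like the sketch below. It is only an illustration: the class name FileDownloader, the method name download, and the example.com address are made up for this post, and error handling is left to the caller.

    ```java
    import java.io.IOException;
    import java.io.InputStream;
    import java.net.URI;
    import java.net.URL;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardCopyOption;

    // Hypothetical helper, not part of any library.
    class FileDownloader {

        // Copies the resource behind a URL to a local file and returns the number of bytes written.
        static long download(URL source, Path target) throws IOException {
            try (InputStream in = source.openStream()) {
                return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            }
        }

        public static void main(String[] args) throws IOException {
            // Placeholder address (an assumption); use a file you are allowed to fetch.
            URL source = URI.create("http://example.com/index.html").toURL();
            long bytes = download(source, Path.of("index.html"));
            System.out.println("Saved " + bytes + " bytes");
        }
    }
    ```

    This only fetches a file whose address is already known; it does not discover which other files exist.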


    Thank you.
    Mak
    (Living @ Virtual World)

  2. #2
    Supamagier is offline Senior Member
    Join Date
    Aug 2008
    Posts
    384
    Rep Power
    7

    Every page is a file... xx.html
    I die a little on the inside...
    Every time I get shot.

  3. #3
    makpandian is offline Senior Member

    I know the site name, but I don't know which files are on that site. If I knew all the file paths, I could download the entire site.
    Mak
    (Living @ Virtual World)

  4. #4
    tommosimmo is offline Member
    Join Date
    Mar 2009
    Posts
    6
    Rep Power
    0

    For security reasons, the files on a web server aren't accessible to the general public unless the webmaster makes them so. An easy way to display all the files in a directory is simply not to have an index page there (provided the server has directory listing enabled).

  5. #5
    makpandian is offline Senior Member

    Thanks a lot, tommosimmo.
    Earlier, I tried to copy the site by reading the file names from the server. From your comment, I accept that this is not possible.

    Now I am trying to get all the file names via index.html.

    Do you know how to parse HTML to find the links available on a site?
    Mak
    (Living @ Virtual World)

  6. #6
    bubbless is offline Member
    Join Date
    Mar 2009
    Posts
    81
    Rep Power
    0

    Search for regular expressions.
    You can match "http://" or "www" to find the links.
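
    A minimal sketch of that idea, assuming the page has already been read into a String. The class name AbsoluteLinkFinder and the exact pattern are illustrative choices, not a standard recipe, and the pattern only handles the "http://" / "https://" case.

    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Hypothetical helper, not part of any library.
    class AbsoluteLinkFinder {

        // Matches text starting with http:// or https:// and running up to
        // the next whitespace, quote, or angle bracket.
        private static final Pattern ABSOLUTE_URL =
                Pattern.compile("https?://[^\\s\"'<>]+");

        // Returns every absolute URL found in the given HTML text, in order of appearance.
        static List<String> find(String html) {
            List<String> urls = new ArrayList<>();
            Matcher m = ABSOLUTE_URL.matcher(html);
            while (m.find()) {
                urls.add(m.group());
            }
            return urls;
        }
    }
    ```

    Note that this finds only links written with a full address; a link such as href="page2.html" is not matched.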

  7. #7
    makpandian is offline Senior Member

    bubbless,
    one thing to note: index files contain links as relative paths, so it is difficult to find them with a regular expression that looks for "www" or "http:".
    Mak
    (Living @ Virtual World)

  8. #8
    bubbless is offline Member

    You can also match on <a href=" instead; that will work for relative links too.
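
    A rough sketch of that approach, assuming the page text and its own address are already known. The class name HrefExtractor and the pattern are illustrative only; the pattern expects quoted href values, and a real HTML parser would cope better with messy markup. URI.resolve turns each relative path into a full address.

    ```java
    import java.net.URI;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Hypothetical helper, not part of any library.
    class HrefExtractor {

        // Captures the quoted value of an href attribute inside an <a ...> tag.
        private static final Pattern HREF = Pattern.compile(
                "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])([^\"'>]*)\\1",
                Pattern.CASE_INSENSITIVE);

        // Returns every link on the page, resolved against the address of the page itself.
        static List<URI> extract(String html, URI base) {
            List<URI> links = new ArrayList<>();
            Matcher m = HREF.matcher(html);
            while (m.find()) {
                String href = m.group(2).trim();
                if (href.isEmpty() || href.startsWith("#")) {
                    continue; // skip empty values and same-page anchors
                }
                try {
                    links.add(base.resolve(href)); // relative paths become absolute
                } catch (IllegalArgumentException e) {
                    // href is not a valid URI; ignore it in this sketch
                }
            }
            return links;
        }
    }
    ```

    For example, with a base of http://example.com/docs/index.html, the value page2.html resolves to http://example.com/docs/page2.html. Feeding each resolved link back into the same method, while remembering which ones were already visited, would walk the rest of the site.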

  9. #9
    makpandian is offline Senior Member

    bubbless,
    I am now trying it the way you suggested...
    Mak
    (Living @ Virtual World)
