  1. #1
    Edinson is offline Member
    Join Date
    Sep 2012
    Posts
    6
    Rep Power
    0

    Default How to fetch HTML documents from Sockets

    Hi, I am new to sockets and I would like to fetch an HTML document using sockets.
    I can read the text from the file on the host, but I cannot display any images that are in that file.

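    // Needs: import java.io.*; import java.net.Socket;
    byte[] b = new byte[1024];
    Socket socket = new Socket(hostname, 80);

    // HTTP lines must end in CRLF; println() would use the platform line separator instead.
    PrintWriter out = new PrintWriter(new BufferedWriter(new OutputStreamWriter(socket.getOutputStream())));
    out.print("GET /" + file + " HTTP/1.1\r\n");
    out.print("Host: " + hostname + "\r\n");
    out.print("Connection: close\r\n"); // ask the server to close the connection so read() returns -1
    out.print("\r\n");
    out.flush();

    DataInputStream reader = new DataInputStream(socket.getInputStream());

    int i = 0;
    File outputfile = new File(outFileName);
    FileOutputStream outfile = new FileOutputStream(outputfile);

    // Copies the raw response to the file, status line and headers included.
    while ((i = reader.read(b)) != -1) {
        outfile.write(b, 0, i);
    }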

    The above is my code. It gives me the correct result if the HTML document contains only text, but not if there are images.
    I get a bunch of weird data.

    Can someone help?
    Last edited by Edinson; 09-19-2013 at 09:13 AM.

  2. #2
    masijade is offline Senior Member
    Join Date
    Jun 2008
    Posts
    2,571
    Rep Power
    9

    Default Re: How to fetch HTML documents from Sockets

    Because there ARE no "images in that file". There are img tags "in that file".

  3. #3
    gimbal2 is offline Just a guy
    Join Date
    Jun 2013
    Location
    Netherlands
    Posts
    4,030
    Rep Power
    6

    Default Re: How to fetch HTML documents from Sockets

    Those tags refer to other image files that need to be downloaded separately.
    "Syntactic sugar causes cancer of the semicolon." -- Alan Perlis

  4. #4
    Edinson is offline Member
    Join Date
    Sep 2012
    Posts
    6
    Rep Power
    0

    Default Re: How to fetch HTML documents from Sockets

    I am not very good with English, but let me rephrase my question.
    I am supposed to retrieve the content of a file on a server.
    The file could be an HTML file, a text file, or an image file; I do not know which in advance.
    My code above fetches the contents of HTML and text files with no problem.

    But I am not able to get an image. I know the code for retrieving an image, but using it would mean I could no longer handle the file if it is an HTML or text file.

    So I was wondering: is there any way to know whether the file is text, HTML, or an image before I fetch it?
    I tried reading them all as bytes into b[], but an image would still have to be converted into an image, which I do not need to do for text and HTML files.
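
    One way to tell the types apart, as a rough sketch: an HTTP response normally announces its type in a Content-Type header, so the header lines can be inspected before the body is touched (a HEAD request returns the same headers without any body). The class and method names below are made up for illustration, and the header lines are assumed to have been read into a list already.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the code posted in this thread.
final class ContentTypeSketch {
    /** Returns the media type from a Content-Type header, lower-cased, or null if absent. */
    static String contentType(List<String> headerLines) {
        for (String line : headerLines) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // status line ("HTTP/1.1 200 OK") or a malformed line
            }
            String name = line.substring(0, colon).trim();
            if (name.equalsIgnoreCase("Content-Type")) {
                String value = line.substring(colon + 1).trim();
                int semi = value.indexOf(';'); // drop parameters such as "; charset=UTF-8"
                if (semi >= 0) {
                    value = value.substring(0, semi).trim();
                }
                return value.toLowerCase(Locale.ROOT);
            }
        }
        return null;
    }

    /** True for text/html, text/plain and other text/* types. */
    static boolean isText(String mediaType) {
        return mediaType != null && mediaType.startsWith("text/");
    }
}
```

    With this, a caller could branch on isText(contentType(headers)) and treat everything else, image/png for example, as opaque bytes.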

  5. #5
    gimbal2 is offline Just a guy
    Join Date
    Jun 2013
    Location
    Netherlands
    Posts
    4,030
    Rep Power
    6

    Default Re: How to fetch HTML documents from Sockets

    Yes, you can't use a Reader to fetch binary data; you need to use a variation of a regular InputStream. The Reader will try to interpret the data as text, but an image is not text.

    And this is the moment where you figure out that programming is hard. But lucky for you, there is a way out. See, you can't read a binary file as text (characters instead of bytes)... but you can read a text file as binary (bytes instead of characters). So don't use a Reader for your data; use an InputStream for all files you fetch.

    Byte Streams (The Java™ Tutorials > Essential Classes > Basic I/O)
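
    The "bytes first" idea can be sketched like this: read everything as bytes, and decode to a String afterwards only when the content is known to be text. The names are hypothetical, and UTF-8 is an assumption; the real charset depends on what the server sends.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the code posted in this thread.
final class BytesFirstSketch {
    /** Reads the stream to its end without interpreting the bytes in any way. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    /** Decodes bytes already known to be text; UTF-8 is assumed here. */
    static String asText(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }
}
```

    An image comes out of readAll() byte for byte as it went in, while an HTML or text file can be passed through asText() afterwards.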
    "Syntactic sugar causes cancer of the semicolon." -- Alan Perlis

  6. #6
    Edinson is offline Member
    Join Date
    Sep 2012
    Posts
    6
    Rep Power
    0

    Default Re: How to fetch HTML documents from Sockets

    I did not use any Reader. I thought I had already fetched the byte data using an InputStream (see the code above), but the image still does not display. Is my code wrong?

  7. #7
    gimbal2 is offline Just a guy
    Join Date
    Jun 2013
    Location
    Netherlands
    Posts
    4,030
    Rep Power
    6

    Default Re: How to fetch HTML documents from Sockets

    Crap, I saw the word 'reader' and made a huge assumption. I am going to stand in the corner for about 15 minutes after this.

    But you don't need a DataInputStream either, just use the socket InputStream directly. And you are actually closing all those streams, right? The reading and writing part looks good.
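
    On closing the streams: a minimal sketch, assuming Java 7 or later, is to let try-with-resources do it. SaveSketch and save() are invented names; Files.copy is the standard library call and writes the whole stream to the target file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the code posted in this thread.
final class SaveSketch {
    /** Writes every byte of source to target and closes source, even if copying fails. */
    static long save(InputStream source, Path target) throws IOException {
        try (InputStream in = source) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

    The same pattern works for the socket itself, for example try (Socket socket = new Socket(hostname, 80)) { ... }, since Socket is also closeable.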
    "Syntactic sugar causes cancer of the semicolon." -- Alan Perlis

  8. #8
    Tolls is offline Moderator
    Join Date
    Apr 2009
    Posts
    12,015
    Rep Power
    20

    Default Re: How to fetch HTML documents from Sockets

    Since this is HTTP, aren't there headers to get through first?
    In fact, the entire response header needs to be read through before you get to the part where the actual image is held.
    Please do not ask for code as refusal often offends.

    ** This space for rent **

  9. #9
    Edinson is offline Member
    Join Date
    Sep 2012
    Posts
    6
    Rep Power
    0

    Default Re: How to fetch HTML documents from Sockets

    Yup, I close all the streams after writing. In addition, the output file contains all the HTTP response headers, which I don't want. But I cannot omit them: since I am reading everything as bytes, I do not know where the HTTP response headers end. If I read as strings, at least I would know where they end and could start writing just the content to the output file. This is driving me crazy.
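
    It is possible to get the best of both, sketched below under some assumptions: read the header part line by line straight from the byte stream, stop at the empty line, and carry on reading the rest as raw bytes. The names are hypothetical. The important detail is that no BufferedReader is wrapped around the stream, since it would read ahead and swallow part of the body.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the code posted in this thread.
final class HeaderLinesSketch {
    /** Reads one line ending in CRLF (or bare LF); returns null at end of stream. */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int c;
        while ((c = in.read()) != -1) {
            if (c == '\n') {
                break;
            }
            if (c != '\r') {
                line.write(c);
            }
        }
        if (c == -1 && line.size() == 0) {
            return null;
        }
        return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    /** Consumes the status line and headers; the stream is left at the first body byte. */
    static List<String> readHeaders(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = readLine(in)) != null && !line.isEmpty()) {
            lines.add(line);
        }
        return lines;
    }
}
```

    After readHeaders() returns, the usual read(b) loop on the same stream writes only the content. Reading a socket one byte at a time is slow, so wrapping it once in a BufferedInputStream and using that same object for both headers and body is a reasonable choice.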

  10. #10
    Edinson is offline Member
    Join Date
    Sep 2012
    Posts
    6
    Rep Power
    0

    Default Re: How to fetch HTML documents from Sockets

    @Tolls, yup, there are the HTTP response headers, which I find hard to omit. How do I know when the InputStream starts reading the content?

  11. #11
    Tolls is offline Moderator
    Join Date
    Apr 2009
    Posts
    12,015
    Rep Power
    20

    Default Re: How to fetch HTML documents from Sockets

    Read up on HTTP, then read the input stream until you get to the marker that indicates the start of the data you are interested in.

    Why you're doing this manually I'm not sure, but I can only assume it's a learning exercise.
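
    For reference, the marker in HTTP/1.x is the empty line that ends the header block, that is, the byte sequence CR LF CR LF. A minimal sketch of finding it in bytes that have already been read (class and method names are hypothetical):

```java
// Hypothetical helper, not part of the code posted in this thread.
final class HeaderEndSketch {
    /**
     * Returns the index of the first body byte within data[0..length),
     * or -1 if the CR LF CR LF separator has not been received yet.
     */
    static int bodyStart(byte[] data, int length) {
        for (int i = 0; i + 3 < length; i++) {
            if (data[i] == '\r' && data[i + 1] == '\n'
                    && data[i + 2] == '\r' && data[i + 3] == '\n') {
                return i + 4;
            }
        }
        return -1;
    }
}
```

    The separator can straddle two read() calls, so the search should run over everything accumulated so far, not over each chunk on its own. Once bodyStart() returns a non-negative index, the bytes from that index onward are the content.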
    Please do not ask for code as refusal often offends.

    ** This space for rent **

  12. #12
    johnson10001 is offline Member
    Join Date
    Jul 2014
    Location
    Chennai
    Posts
    1
    Rep Power
    0

    Default Re: How to fetch HTML documents from Sockets

    Thanks for sharing your ideas..
