  1. #1
    bubbless is offline Member
    Join Date
    Mar 2009
    Posts
    81
    Rep Power
    0

    Weird download file problem

    Hi all,

    I'm trying to download a PDF from a website, but it doesn't work.
    With any other website it works, but with this one it doesn't.

    Instead of the PDF, I get the HTML content of the home page.

    However, if I use a download manager and import the link, the download manager can get the pdf instead of the homepage.

    Here's the code:

    Java Code:
    URL localURL = new URL("http://krant.tijd.be/pdf/tijd/20100522/1/paper.pdf");
    InputStream localInputStream = localURL.openStream();
    InputStreamReader localInputStreamReader = new InputStreamReader(localInputStream);
    FileOutputStream localFileOutputStream = new FileOutputStream("C:/Users/Glenn/4.pdf");
    OutputStreamWriter localOutputStreamWriter = new OutputStreamWriter(localFileOutputStream);
    int i;
    while ((i = localInputStreamReader.read()) >= 0) {
        localOutputStreamWriter.write(i);
    }
    localOutputStreamWriter.flush();
    localOutputStreamWriter.close();
    localFileOutputStream.close();
    localInputStreamReader.close();
    localInputStream.close();
    Any help is appreciated, because I've searched a lot but didn't find anything.

  2. #2
    JosAH is offline Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,023
    Blog Entries
    7
    Rep Power
    20

    It won't solve your problem, but a .pdf file isn't just text, so you can't use a Reader for it. Use an InputStream to read all the bytes and save those.

    kind regards,

    Jos
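
    To illustrate the point about Readers, here is a minimal sketch that pushes the same bytes through a Reader/Writer pair and through plain streams. The class and method names are made up for this example, and UTF-8 is an assumed charset; the code in the first post uses the platform default, which on many systems also cannot round-trip arbitrary bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

final class BinaryCopyDemo {

    // Copies via Reader/Writer, like the code in the first post: bytes are
    // decoded to chars and then encoded again. UTF-8 is an assumption here.
    static byte[] copyAsText(byte[] data) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Reader in = new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8);
             Writer out = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            int c;
            while ((c = in.read()) >= 0) {
                out.write(c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Copies raw bytes; nothing is decoded, so nothing can be altered.
    static byte[] copyAsBytes(byte[] data) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InputStream in = new ByteArrayInputStream(data)) {
            byte[] buf = new byte[1024];
            int len;
            while ((len = in.read(buf)) != -1) {
                sink.write(buf, 0, len);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }
}
```

    Feeding both methods a few bytes that are not valid UTF-8 text (for example 0xE2 0xE3 0xCF 0xD3) should leave the byte copy identical to the input, while the text copy comes back changed because the undecodable bytes are replaced during decoding.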

  3. #3
    bubbless is offline Member
    Join Date
    Mar 2009
    Posts
    81
    Rep Power
    0

    When I use this, without a reader, it doesn't work either.

    Java Code:
    URL url = new URL("http://krant.tijd.be/pdf/tijd/20100522/1/paper.pdf");
    InputStream is = url.openStream();
    OutputStream out = new FileOutputStream("C:/Users/Glenn/test.pdf");
    byte[] buf = new byte[1024];
    int len;
    while ((len = is.read(buf)) > 0) {
        out.write(buf, 0, len);
    }
    out.close();
    is.close();
    Last edited by bubbless; 05-27-2010 at 08:56 PM.

  4. #4
    Norm is online now Moderator
    Join Date
    Jun 2008
    Location
    Eastern Florida
    Posts
    16,618
    Rep Power
    23

    When I read from the URL given, I get an HTML page.
    Why do you think that the server will send a pdf file vs an html file?
    Browsers can be instructed to go to another site for another file without user intervention.

  5. #5
    bubbless is offline Member
    Join Date
    Mar 2009
    Posts
    81
    Rep Power
    0

    But then the read would also return the other file.
    And I tried 2 download managers and they both downloaded the pdf file.

  6. #6
    Norm is online now Moderator
    Join Date
    Jun 2008
    Location
    Eastern Florida
    Posts
    16,618
    Rep Power
    23

    "But then the read would also return the other file."
    How does your code do that?
    My code reads from the server at the URL. Here's the first part of what I get
    (hdr> prefixes the header records):
    XML Code:
    hdr> Content-Length: 80582
    hdr> Content-Type: text/html; charset=UTF-8
    hdr> Set-Cookie: auth=; expires=Thu, 01-Jan-1970 00:00:00 GMT; domain=.tijd.be; path=/
    hdr> Set-Cookie: user=; expires=Thu, 01-Jan-1970 00:00:00 GMT; domain=.tijd.be; path=/
    hdr> Set-Cookie: nickname=; expires=Thu, 01-Jan-1970 00:00:00 GMT; domain=.tijd.be; path=/
    hdr> Set-Cookie: username=; expires=Thu, 01-Jan-1970 00:00:00 GMT; domain=.tijd.be; path=/
    hdr> Set-Cookie: usertype=; expires=Thu, 01-Jan-1970 00:00:00 GMT; domain=.tijd.be; path=/
    hdr> Set-Cookie: bouncercount=; expires=Thu, 01-Jan-1970 00:00:00 GMT; domain=.tijd.be; path=/
    hdr> Set-Cookie: useremail=; expires=Thu, 01-Jan-1970 00:00:00 GMT; domain=.tijd.be; path=/
    hdr> Set-Cookie: port_info=; expires=Thu, 01-Jan-1970 00:00:00 GMT; domain=.tijd.be; path=/
    hdr> Set-Cookie: auth_cd=; expires=Thu, 01-Jan-1970 00:00:00 GMT; domain=.tijd.be; path=/
    hdr> Connection: Keep-Alive
    hdr> Keep-Alive: timeout=3, max=999
    hdr> Date: Thu, 27 May 2010 20:28:27 GMT
    hdr> Content-Location: http://www.tijd.be/tijd/WEB-INF/jsp/homepage.jsp
    hdr> X-UA-Compatible: IE=EmulateIE7
    hdr> Content-Language: nl-BE
                    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" lang="NL"> <head> <title>De Tijd: Homepage</title>      <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <meta name="built" content="891_COMPILED" /> <meta name="verify-v1" content="8RV13SpoAOI9QNmJWyUn4TLCI4Ugal6w5haVLxqZzPk=" /> <meta name="keywords" content="" /> <meta name="description" content="Homepage" /> <meta name="robots" content="noarchive, noimageindex" /> <meta name="googlebot" content="noarchive, noimageindex" />
    [... remainder of the homepage HTML, cut off in the original post ...]
    What does the download manager do?
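
    The code that produced the dump above isn't shown in the thread. As a rough sketch of how one might check what a server really sends back, the class below prints the status and headers and tests whether the body starts with the PDF signature. The names ResponseInspector, looksLikePdf and inspect are hypothetical, and the "status>" line is an addition not present in the output above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

final class ResponseInspector {

    // PDF files begin with the ASCII signature "%PDF-".
    static boolean looksLikePdf(byte[] head) {
        byte[] magic = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        return head != null
                && head.length >= magic.length
                && Arrays.equals(Arrays.copyOf(head, magic.length), magic);
    }

    // Prints status, headers, and whether the body starts like a PDF.
    static void inspect(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        try {
            System.out.println("status> " + conn.getResponseCode());
            for (Map.Entry<String, List<String>> header : conn.getHeaderFields().entrySet()) {
                if (header.getKey() == null) {
                    continue; // the status line is stored under a null key
                }
                for (String value : header.getValue()) {
                    System.out.println("hdr> " + header.getKey() + ": " + value);
                }
            }
            try (InputStream in = conn.getInputStream()) {
                // Read up to five bytes, enough to compare with "%PDF-".
                byte[] head = new byte[5];
                int filled = 0;
                while (filled < head.length) {
                    int n = in.read(head, filled, head.length - filled);
                    if (n < 0) {
                        break;
                    }
                    filled += n;
                }
                System.out.println("starts like a PDF: " + looksLikePdf(Arrays.copyOf(head, filled)));
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

    Calling inspect("http://krant.tijd.be/pdf/tijd/20100522/1/paper.pdf") would show a text/html Content-Type and a body that does not start like a PDF whenever the server answers with the home page instead of the file.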

  7. #7
    bubbless is offline Member
    Join Date
    Mar 2009
    Posts
    81
    Rep Power
    0

    That is exactly what I am getting.
    But both the download managers download the pdf, the same one you get when you put the link in a browser.

    I've found the exact location of the PDF on the website (no redirect), but I need to log in to see it.
    I'm going to try that now.

    If you find anything, let me know.

  8. #8
    bubbless is offline Member
    Join Date
    Mar 2009
    Posts
    81
    Rep Power
    0

    Still no results.
    Does anyone have a clue why this isn't working?

  9. #9
    Tolls is offline Moderator
    Join Date
    Apr 2009
    Posts
    11,450
    Rep Power
    19

    Presumably the download managers are successfully processing the script on the page that appears when you go to the PDF link, which is something your code would have to do as well.

  10. #10
    Tolls is offline Moderator
    Join Date
    Apr 2009
    Posts
    11,450
    Rep Power
    19

    In fact, now that I've gone to that site, they clearly want a login...or something.

  11. #11
    bubbless is offline Member
    Join Date
    Mar 2009
    Posts
    81
    Rep Power
    0

    When I use the direct PDF link, it opens in Firefox.

  12. #12
    Tolls is offline Moderator
    Join Date
    Apr 2009
    Posts
    11,450
    Rep Power
    19

    Have you logged into the site before?
    I haven't so it redirects me to the front page.

    Your code hasn't logged in either; that is, it hasn't got the relevant cookie info.

    What exactly the page is expecting I couldn't say, but that's almost certainly your problem.

  13. #13
    bubbless is offline Member
    Join Date
    Mar 2009
    Posts
    81
    Rep Power
    0

    You're right.
    It's a cookie that you get when you visit the homepage.
    Thank you very much!
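
    The final code isn't posted in the thread, so what follows is only a sketch of one way to pick up such a cookie in Java: install a java.net.CookieManager, request the home page once so its Set-Cookie headers are stored, then request the file. The class name, the method names and the homeAddress parameter are assumptions for this example; which cookie the site actually checks is not stated here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class SessionCookieSketch {

    // Installs a JVM-wide cookie store. HttpURLConnection, and therefore
    // URL.openStream(), consults it for every http request afterwards.
    static CookieManager installCookieManager() {
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        CookieHandler.setDefault(manager);
        return manager;
    }

    // Visits the home page first so its cookies are stored, then downloads
    // the file as raw bytes with those cookies attached to the request.
    static void download(String homeAddress, String fileAddress, Path target) throws IOException {
        installCookieManager();
        byte[] scratch = new byte[8192];
        try (InputStream home = new URL(homeAddress).openStream()) {
            while (home.read(scratch) != -1) {
                // The page content is not needed, only the cookies it sets.
            }
        }
        try (InputStream in = new URL(fileAddress).openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

    With the URLs from this thread, a call might look like download("http://www.tijd.be/", "http://krant.tijd.be/pdf/tijd/20100522/1/paper.pdf", Paths.get("C:/Users/Glenn/test.pdf")), where the home page address is a guess based on the headers shown earlier. ACCEPT_ALL is the most permissive policy and was chosen only to keep the sketch short.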
