- 06-07-2011, 01:23 PM #1
Member
- Join Date
- Jun 2011
- Posts
- 20
- Rep Power
- 0
How to best deal with large file uploads ?
Hello forum,
Although I'm still a bit of a Java newbie, I think the advanced section is probably better suited to this question, so here goes....
I've got the following code snippet that works beautifully, or at least it did until I threw a 2GB file at it and then it complained with the error Exception in thread "main" java.lang.OutOfMemoryError: Java heap space.
Should I be using an alternative technique if I'm going to be dealing with large files ? I am already calling my program with -Xms512m -Xmx1024m and don't really want to call it with more !
Java Code:
this.ssl.conn.connect();
bos = new BufferedOutputStream(this.ssl.conn.getOutputStream());
bis = new BufferedInputStream(new FileInputStream(this.fil));
int i;
while ((i = bis.read()) >= 0) {
    bos.write(i);
}
bos.close();
bis.close();
- 06-07-2011, 02:02 PM #2
Perhaps the buffering is reading too much of the file. Try it without the input file buffered.
- 06-07-2011, 02:22 PM #3
Senior Member
- Join Date
- Jun 2008
- Posts
- 2,366
- Rep Power
- 7
Flush your output stream every now and then. Your program would also be much more efficient using read(byte[] b, int off, int len) and write(byte[] b, int off, int len), so that you are making only two method calls for every b.length bytes rather than two method calls for every byte. For a few bytes, no problem; for 2 GB, good night.
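A minimal sketch of the block-copy approach described in this reply, assuming an 8 KB buffer and a flush after every 64 blocks. The class name `StreamCopy`, the method `copy`, and both constants are illustrative choices, not taken from the original poster's code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamCopy {
    // Assumed values for illustration; tune them for the real workload.
    private static final int BUFFER_SIZE = 8 * 1024;
    private static final int FLUSH_EVERY = 64;

    /** Copies everything from in to out in blocks and returns the byte count. */
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int blocks = 0;
        int n;
        // read(...) returns -1 at end of stream, otherwise the bytes actually read.
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            out.write(buffer, 0, n);
            total += n;
            if (++blocks % FLUSH_EVERY == 0) {
                out.flush(); // periodic flush, as suggested above
            }
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory streams stand in for the file and the connection.
        byte[] data = new byte[100000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink);
        System.out.println("copied " + copied + " bytes");
    }
}
```

Only one buffer's worth of data is held by this loop at any time. If memory use still grows with file size, the stream being written to is probably keeping the data itself.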
- 06-07-2011, 02:52 PM #4
Moderator
- Join Date
- Apr 2009
- Posts
- 10,481
- Rep Power
- 16
As Norm says, why buffer?
Just use a FileInputStream and the OutputStream you get from this.ssl.conn.getOutputStream().
No need for the buffering on either of them.
And what masijade says...though the read(byte[]) one is good enough for this really.
- 06-07-2011, 03:05 PM #5
Member
- Join Date
- Jun 2011
- Posts
- 20
- Rep Power
- 0
Thanks for all the super quick replies.
Tolls Re: "Why buffer ?" .... Can't really say it was a specific design decision, I'll put it down to my lack of experience.
Norm & masijade, good food for thought there. Will go back and try again without buffering first.
Thanks again
- 06-07-2011, 03:26 PM #6
Moderator
- Join Date
- Apr 2009
- Posts
- 10,481
- Rep Power
- 16
Generally, if all you're doing is reading from a stream and writing straight away to another stream, then stick with the thing closest to the interfaces (InputStream and OutputStream), since you don't care about the data in there; you just want to move it. Buffering is useful if you want to do something with the data.
- 06-07-2011, 04:26 PM #7
Member
- Join Date
- Jun 2011
- Posts
- 20
- Rep Power
- 0
Thanks for that, nice easy tip to remember !
I suppose then according to your motto about "just moving" the data, the stuff on the following website is not worth considering as a solution ?
Java tip: How to read files quickly | Nadeau Software
- 06-07-2011, 04:57 PM #8
Moderator
- Join Date
- Apr 2009
- Posts
- 10,481
- Rep Power
- 16
They essentially come to the same conclusion we have here.
Read using a byte array (coincidentally I use 8k by default, so I must have read something somewhere on that).
I've not used the nio stuff, but those graphs are far too busy for me to see what's actually going on....I can't see massive differences to be honest once you start using the byte array.
- 06-07-2011, 05:02 PM #9
Senior Member
- Join Date
- Jun 2008
- Posts
- 2,366
- Rep Power
- 7
Well, the smallest block size to use would be 512 bytes, as that is (was?) the standard disk sector size, but 4 KB or 8 KB is a much more efficient block size.
- 06-07-2011, 05:32 PM #10
- Join Date
- Sep 2008
- Location
- Voorschoten, the Netherlands
- Posts
- 11,406
- Blog Entries
- 7
- Rep Power
- 17
I don't think the buffering is to blame; no matter the size of the file, those buffered streams (or readers) only buffer 8 KB in total. Buffering may be useless here, but I doubt you can blame it for the OOME ...
kind regards,
Jos
When people rob a bank they get a penalty; when banks rob people they get a bonus.
- 06-07-2011, 05:50 PM #11
Member
- Join Date
- Jun 2011
- Posts
- 20
- Rep Power
- 0
Thanks Tolls.
I tried without any buffering, but that broke too. So array is my next test....
- 06-07-2011, 06:00 PM #12
Moderator
- Join Date
- Apr 2009
- Posts
- 10,481
- Rep Power
- 16
Well, no...but it is pointless in any case.
In fact...:
You're storing something somewhere that doesn't need storing...
I'm going to hazard a guess it's that OutputStream, since the FIS isn't going to read ahead...at least not up to 2 GB.
So...are you flushing the output stream (as suggested)?
Failing that, take a heap dump and see what's taking up the space.
- 06-07-2011, 06:23 PM #13
Member
- Join Date
- Jun 2011
- Posts
- 20
- Rep Power
- 0
Am I just having a bad day or something.....
The code below is still giving me grief. Is it because 8024 is too big a size or am I just coding clumsily today ?
Java Code:
private byte[] barray = new byte[8024];
this.inStream = new FileInputStream(this.fil);
this.outStream = this.ssl.conn.getOutputStream();
int r = 0;
while ((r = this.inStream.read(this.barray)) > 0) {
    this.outStream.write(this.barray, 0, r);
}
- 06-07-2011, 06:29 PM #14
Member
- Join Date
- Jun 2011
- Posts
- 20
- Rep Power
- 0
Tolls,
I think my reply crossed in cyberspace with yours.
"So...are you flushing the output stream (as suggested)?"
As you can now see from my later post, I've been naughty and haven't tried that yet .... but I've given myself a slap on the wrist and am going back now....
- 06-07-2011, 06:30 PM #15
Moderator
- Join Date
- Apr 2009
- Posts
- 10,481
- Rep Power
- 16
As I said, what is that output stream?
What is conn?
And should that not give any pointers, have you taken a dump (stop snickering at the back there!) and analysed it in something like Eclipse MAT?
ETA: And we crossed again.
Bad things happen when you cross streams...
- 06-07-2011, 06:46 PM #16
Member
- Join Date
- Jun 2011
- Posts
- 20
- Rep Power
- 0
Aah..... Mr Tolls....we may well have been crossing recently, but I think I can see convergence at the end of the tunnel.
I've been doing a little bit of digging and it seems URLConnection (conn = HttpsURLConnection) doesn't play ball with flushing and so aims to cache the whole lot in memory.
Now, the problem I've got is although the common answer suggested by Mr Google is to run URLConnection as Transfer-Encoding=chunked, I can't do that because I need to send an ETag header with the MD5 hash of the file and a Content-Length header with its size.
So that's where I'm at right now.
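One option that keeps both headers is fixed-length streaming mode. `HttpURLConnection.setFixedLengthStreamingMode` tells the connection the body size up front, so it sends a Content-Length header and streams the body without buffering it internally. The sketch below rests on assumptions: the names `FixedLengthUpload`, `md5Hex` and `upload`, the PUT method, and passing the hash as an `ETag` request header are illustrative only. The `long` overload needs Java 7 or later; the older `int` overload tops out just under 2 GB. The MD5 is computed in its own streaming pass, so the file is never held in memory whole.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class FixedLengthUpload {

    /** Streams the file through MD5 and returns the lowercase hex digest. */
    public static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                md.update(buffer, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    /** Uploads the file with a known length; returns the HTTP status code. */
    public static int upload(URL target, Path file) throws IOException, NoSuchAlgorithmException {
        HttpURLConnection conn = (HttpURLConnection) target.openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("PUT");                        // assumption: server expects PUT
        conn.setRequestProperty("ETag", md5Hex(file));       // assumption: hash goes in ETag
        conn.setFixedLengthStreamingMode(Files.size(file));  // must be set before connecting
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file);
             OutputStream out = conn.getOutputStream()) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return conn.getResponseCode();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("usage: FixedLengthUpload <url> <file>");
            return;
        }
        System.out.println(upload(new URL(args[0]), Paths.get(args[1])));
    }
}
```

In this mode the connection rejects a body that does not match the declared length, so the file must not change between measuring its size and sending it.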
- 06-07-2011, 06:58 PM #17
Member
- Join Date
- Jun 2011
- Posts
- 5
- Rep Power
- 0
Hi, I'm also a newbie, and I came across this post because it seems related to my issue. I'm currently working on a web service for a national company in my country. The web service calls a method, and the query goes to the server and brings back a little more than 50,000 records in a single response. It runs asynchronously, so I'm pretty sure no new data will be loaded while I'm showing the 50,000 records. Of course, the browser freezes when it tries to load that many records. My main question is: what can I do to page all those records and show only 300 per page, with the four usual buttons (Start, Back, Forward, Last)? Any help would be appreciated. Some people have already told me that I must load the records into temporary memory, but I don't know how to do that. I'm working with Java 1.5 and my IDE is JDeveloper 10.1.3.5.
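A minimal sketch of the paging idea, assuming the full result is already held in a `List` on the server side and each request hands back one 300-record slice. `Pager` and its method names are hypothetical, and the code avoids anything newer than Java 1.5.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class Pager<T> {
    private final List<T> records;
    private final int pageSize;

    public Pager(List<T> records, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.records = records;
        this.pageSize = pageSize;
    }

    /** Number of pages; the last page may be only partly filled. */
    public int pageCount() {
        return (records.size() + pageSize - 1) / pageSize;
    }

    /** Copy of the records on a zero-based page, or an empty list if out of range. */
    public List<T> page(int index) {
        int from = index * pageSize;
        if (index < 0 || from >= records.size()) {
            return Collections.emptyList();
        }
        int to = Math.min(from + pageSize, records.size());
        return new ArrayList<T>(records.subList(from, to));
    }

    public static void main(String[] args) {
        List<Integer> all = new ArrayList<Integer>();
        for (int i = 0; i < 50000; i++) {
            all.add(i);
        }
        Pager<Integer> pager = new Pager<Integer>(all, 300);
        System.out.println(pager.pageCount());      // 167
        System.out.println(pager.page(166).size()); // 200 (the short last page)
    }
}
```

The four buttons then map onto page indexes: Start is `0`, Back is `current - 1`, Forward is `current + 1`, and Last is `pageCount() - 1`.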
- 06-07-2011, 08:02 PM #18
"URLConnection (conn = HttpsURLConnection) doesn't play ball with flushing and so aims to cache the whole lot in memory."
Can this be detected by looking at the output from the program?
Or can you see the number of writes to the internet your system is making? My Local Connection Status has a packet count. Would that increase in proportion to what is written? Would the class cache stuff in case of retry?
- 06-08-2011, 09:10 AM #19
Member
- Join Date
- Jun 2011
- Posts
- 20
- Rep Power
- 0
My plan for today is to try with Apache HttpClient as that seems to be how others have resolved their issue. But if I get a chance I'll try a tcpdump to see if I can get an answer to Norm's question (my present assumption about URLConnection caching comes from a quick speed-read of descriptions such as Bug ID: 4212479 Data(or Buffered)OutputStream from a URLConnection does not flush writes and Bug ID: 5026745 Cannot flush output stream when writing to an HttpUrlConnection).
- 06-08-2011, 09:26 AM #20
Moderator
- Join Date
- Apr 2009
- Posts
- 10,481
- Rep Power
- 16