Java Heap Space Problem
I am actually not a Java user but an R (statistical programming language) user. I use an R package which writes output
to Excel sheets. The package uses Java to link R and Excel. When we used the package to write our data to Excel,
R recently started to throw errors which, after several tests, we traced back to the Java heap space.
Our question is now whether it is possible to clear/empty/reset the Java heap space using Java on the command line or by
starting an external programme? It is possible to start external programmes within R. So if it were possible to
clean up the Java heap space, we could basically write a little script to start from R to fix our problem.
Any help is very much appreciated.
How is R running the Java program?
You can't tell Java to clear its heap.
It does that itself, so if it's hitting an out of memory (OOM) error then it's simply crunching too much data or, possibly, holding onto too much data.
You could give Java more memory to work with, but that comes down to how it gets launched.
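To check what limit the JVM actually ended up with (useful for confirming whether a memory setting reached the JVM that R launched), a tiny Java program can print it — a minimal sketch, the class name is made up:

```java
// Sketch: print the heap ceiling (-Xmx) the running JVM was given.
public class HeapCheck {
    public static void main(String[] args) {
        long maxBytes = Runtime.getRuntime().maxMemory(); // effective -Xmx, in bytes
        System.out.println("Max heap: " + (maxBytes / (1024 * 1024)) + " MB");
    }
}
```

If the printed number is small (e.g. a few hundred MB), the launch settings are the place to look.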
Yes, I think Java is crunching way too much data. Unfortunately we cannot allocate more memory to Java...
any other ideas though?
None I'm afraid.
Java uses what it needs up to the point it hits the limit it's been given.
You either have to allocate more memory or rewrite the Java code. Though I will say Excel work tends to be memory intensive (you can't stream the data, so you generally have to hold it all in memory).
If the package you use accesses Java through the rJava library, try setting the following option within R before the JVM is initialized by the library:

options(java.parameters = "-Xmx512m")

...or set the -Xmx amount to what you see fit for the operation.
I think that's definitely worth a try! Thanks for your comment!
Just tried your suggestion and set the heap space to -Xmx4000M, but still the same result :(
I just created a dataframe with 85000 rows and 6 columns. It's not a tiny one, but R, Excel, or Java should actually be able to handle that.
85000 rows is massive.
I don't think any of the Excel frameworks for Java would be able to handle that.
Why do you need an Excel report with so much data in it?
Would a CSV work?
If so you could then simply stream it.
I know the files are rather big. Yesterday, I also tried an R package which uses Python as the link between R and Excel. But even there I got an error. I think the dataframe which we want to save is simply too big. I looked for other solutions and found rather simple perl and python programmes which take several .CSV files as input and merge them together into one Excel file, with each .CSV file put into a new Excel tab. I think the advantage of exporting data as CSV is that it is much, much faster than exporting it directly to Excel.
The advantage of CSV is you can stream it, so you only need a single row's worth of data in memory at any one time.
With Excel frameworks you tend to have to hold the whole thing in memory.
If this is a data dump then Excel is a waste of time...you can read a csv into Excel afterwards in any case.
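To illustrate the streaming point, here is a minimal Java sketch (the file name, columns, and values are made up for the example) that writes a CSV one row at a time, so only the current row is ever in memory:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class CsvDump {
    public static void main(String[] args) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(Paths.get("dump.csv"))) {
            out.write("id,value");
            out.newLine();
            // 85000 rows, but heap usage stays flat: each row is written and discarded.
            for (int row = 0; row < 85000; row++) {
                out.write(row + "," + Math.sqrt(row));
                out.newLine();
            }
        }
    }
}
```

Contrast this with a typical Excel framework, which builds the whole workbook as an object graph in the heap before writing anything to disk.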
Yes, that is also true, but the thing is that R is, I think, much more limited compared to Java. With Java you can do pretty much everything, whereas R is basically software for statistical analysis and cannot really be compared to Java.
Another thing just came to my mind. Is it possible, or does it make sense, to write a little Java programme which simply calls the different types of garbage collectors? So after writing one output to Excel we could run the little Java programme and then write the next output to Excel?
Java garbage collects prior to throwing the OOM anyway.
If it could have collected memory it would have done.
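To make that concrete, here is a hypothetical Java sketch of why a manual GC call would not help: the collector can never free objects that are still reachable, such as a workbook that is still being built:

```java
// Sketch: System.gc() is only a request, and it cannot reclaim reachable objects.
public class GcDemo {
    static byte[] workbookLikeData; // still referenced, so never collectible

    public static void main(String[] args) {
        workbookLikeData = new byte[50 * 1024 * 1024]; // stands in for an in-memory workbook
        System.gc(); // the 50 MB above stays allocated regardless
        long used = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
        System.out.println("Still holding ~" + (used / (1024 * 1024)) + " MB");
    }
}
```

The same applies between two Excel writes: whatever the Excel framework still references survives any number of GC runs.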
Why do you think you have to use Excel?
That's the core of your problem...