  1. #1
    javameanslife (Member, joined Jan 2010)

    Out of memory workaround for a Java application (please help!)

    I am working on a Java application (a batch process) and we have to use some third-party APIs related to the database (no option to change them). The application runs out of memory after every 25,000 updates or so, and there are about 5 million records in total. I used JProfiler to find that the out-of-memory error is caused by the third-party API only. Since the application is already in production, we really cannot risk changing that third-party API at any cost, so we have to do some workaround.

    What is the best workaround in this scenario? I am thinking of creating a separate thread in the code below that does the first 20,000 updates; as soon as that thread dies, all the objects associated with it should die too, so a new, fresh thread can do the next 20,000 updates, and so on.

    If you have encountered this kind of problem in the past, please help me, and if possible write pseudo code or sample code for it.

    My current code, to which I am thinking of adding the threads, looks like this:

    Java Code:
    public static void main(String[] args) throws SQLException {
        ResultSet rs = ...; // rs holds all the records, e.g. 5 million
        long count = 0;
        while (rs.next()) {
            doUpdate();
            count++;
            // the code fails when count reaches 25,000
        }
    }

  2. #2
    travishein (Senior Member, joined Sep 2009, Canada)

    Hmm, are you able to just launch the Java VM with the argument that asks it to use more memory?
    Java Code:
     java -Xmx1500M
    That would allow the process to use up to 1.5 GB of RAM, assuming you have that much RAM in your system, of course, and assuming that is even a suitable amount; maybe you need more, or less. This kind of thing needs fiddling to find the amount at which you no longer get the out-of-memory exceptions.

    One thing to keep in mind when tuning memory parameters to work around [badly designed] fetch operations: fixing it today doesn't mean it is fixed for all time. It will likely break again when there is even more data to pull, and eventually the memory required to fetch everything at once could exceed what is practically available in a system.

    What is the starting point you have for this code? For example, do you just get a ResultSet back from it, or do you start with the DataSource and Connection and pass the result set into the update operation?

    Depending on your database, there may be proprietary (database-specific) result set classes that support a kind of row-handler operation, scrolling through results more efficiently without loading all of them into memory.
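
    For example, with MySQL's Connector/J driver (a sketch under that assumption; the connection URL, table, and column names here are made up), a statement created exactly this way streams rows one at a time instead of buffering the whole result set:
    Java Code:
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class StreamingRead {
        public static void main(String[] args) throws SQLException {
            Connection con = DriverManager.getConnection(
                    "jdbc:mysql://localhost/mydb", "user", "password");
            // Connector/J only streams with exactly this combination:
            // forward-only, read-only, and a fetch size of Integer.MIN_VALUE
            Statement stmt = con.createStatement(
                    ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
            stmt.setFetchSize(Integer.MIN_VALUE);
            ResultSet rs = stmt.executeQuery("select id, status from mytable");
            while (rs.next()) {
                long id = rs.getLong(1); // process one row at a time;
                                         // only the current row is in memory
            }
            rs.close();
            stmt.close();
            con.close();
        }
    }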

    But from your sample code in the post above, it looks like the query is executed and returns all the data, and doUpdate() likely updates rows in this same result set; in that case the modifications may pile up (in memory) until the transaction is committed.

    Would it be possible (assuming you are the one who first invokes the SQL query to fetch rows) to work with only large sections of the data at a time, such as with the LIMIT constraint in MySQL or the LIMIT ... OFFSET constraint in PostgreSQL? Virtually all databases (except Oracle, oddly enough) support this kind of LIMIT ... OFFSET behavior, which, used with an ORDER BY clause, can effectively drive the query to paginate the results into smaller, more manageable pieces. For example, in your problem, if it fails at 25,000 records, only do 20,000 at a time.

    Java Code:
     Connection con; // something that initializes this, of course
     PreparedStatement stmt = con.prepareStatement(
             "select * from mytable order by id limit 0,20000"); // assuming a MySQL database here
     ResultSet rset = stmt.executeQuery();
     while (rset.next()) {
         doUpdate();
     }
    where the next time this is invoked, the query would be
    Java Code:
    select * from mytable order by id limit 20000,20000
    to get the next 20,000 records, and so on.
    This block of code would be driven by an outer block of code that first computes the total record count, to figure out how many paginated invocations are required.
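
    A possible shape for that outer driver (just a sketch of the idea; it assumes con is an open java.sql.Connection with auto-commit turned off, and reuses mytable and the 20,000 page size from the example above):
    Java Code:
    static void processAllPages(Connection con) throws SQLException {
        // count the total rows first, then walk the table one page at a time
        PreparedStatement countStmt = con.prepareStatement("select count(*) from mytable");
        ResultSet countRs = countStmt.executeQuery();
        countRs.next();
        long total = countRs.getLong(1);
        countRs.close();
        countStmt.close();

        final int PAGE_SIZE = 20000;
        for (long offset = 0; offset < total; offset += PAGE_SIZE) {
            PreparedStatement stmt = con.prepareStatement(
                    "select * from mytable order by id limit " + offset + "," + PAGE_SIZE);
            ResultSet rset = stmt.executeQuery();
            while (rset.next()) {
                doUpdate();
            }
            rset.close();
            stmt.close();
            con.commit(); // each page is now its own transaction
        }
    }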

    The only downside to this approach is that we now break up what used to be a single transaction into as many individual transactions as there are 'pages' of this processing.

    It's also possible that somewhere in the code the result set or the statement is not being closed right away, e.g. it waits for the connection to close instead. It is good practice to close a result set or a statement as soon as you are done with it so its resources are freed up and you don't run out of memory like this. Not sure if that's something you have control over here, but check into it if you do.
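
    The usual pattern for that (a generic sketch, not specific to your third-party API) is to close each statement and result set in a finally block as soon as the work with it is done:
    Java Code:
    PreparedStatement stmt = null;
    ResultSet rs = null;
    try {
        stmt = con.prepareStatement("select * from mytable");
        rs = stmt.executeQuery();
        while (rs.next()) {
            doUpdate();
        }
    } finally {
        // release driver resources immediately, instead of waiting
        // for the connection itself to be closed
        if (rs != null) rs.close();
        if (stmt != null) stmt.close();
    }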

  3. #3
    javameanslife (Member, joined Jan 2010)

    Thanks for the reply. I have done all the things you mentioned, so I have no idea other than to kill the current JVM and continue the process once it reaches 20,000 records or so. By 'kill the JVM' I mean start a new process so that it can again update the next 20,000 records.

  4. #4
    Tolls (Moderator, joined Apr 2009)

    Are you sure it's not the way you're using the third-party software?

    For example, I worked somewhere that suffered a memory leak that eventually resulted in an OOM. Profiling indicated that the problem lay in Hibernate; however, it wasn't Hibernate that was the problem...it was the way the queries were being written that caused the eventual *BOOM*.

    Not knowing how either your code or the third-party code works, it's almost impossible to provide a solution for you.

  5. #5
    senorbum (Member, joined Aug 2009)

    Quote Originally Posted by Tolls:
    Are you sure it's not the way you're using the third-party software?

    For example, I worked somewhere that suffered a memory leak that eventually resulted in an OOM. Profiling indicated that the problem lay in Hibernate; however, it wasn't Hibernate that was the problem...it was the way the queries were being written that caused the eventual *BOOM*.

    Not knowing how either your code or the third-party code works, it's almost impossible to provide a solution for you.
    We had the same issue with Hibernate :P

    As for the OP, I'd first find a way to ensure you are using the third-party software correctly. Because if you are and it's still an issue, you can either buy a nice server with 64 GB of RAM and restart the process once in a while, or find a new way to do what you need to do.

    Also, how was this not caught until production? :/

  6. #6
    javameanslife (Member, joined Jan 2010)

    Thanks for your concern. I can be 99.9% sure it's not the way I am using the third-party tool. The same code does not produce the out-of-memory error in the previous version of the same software, nor in the present version of the code; it only occurs with the version that is in production right now. Previously nobody cared about the out-of-memory problem because there used to be only 10,000-15,000 records to process, but that has now reached a million or so, hence the problem. The threshold is 25,000 records.

    I think I have found a solution: call the same main class with ProcessBuilder to process only 15,000 records each time, and once that finishes, move on to the 15,000-30,000 range, and so on. I haven't tested it though. But again, this is not really a solution, it's a workaround.

    Still, I am not sure whether the solution I am thinking of will work or not :eek:

  7. #7
    Tolls (Moderator, joined Apr 2009)

    Why not either fall back to the previous version or move forward to the new one?
    It does sound, from that description, like a bug in that particular version, and the usual move is to fall back, not to attempt a workaround with a hack that may well not work properly anyway.

  8. #8
    senorbum (Member, joined Aug 2009)

    Quote Originally Posted by Tolls:
    Why not either fall back to the previous version or move forward to the new one?
    It does sound, from that description, like a bug in that particular version, and the usual move is to fall back, not to attempt a workaround with a hack that may well not work properly anyway.
    I would also agree with this.

  9. #9
    javameanslife (Member, joined Jan 2010)

    For some reason (corporate decisions) we have to go with whatever we have right now and just do a workaround.
    FYI: I have found the solution. I will use ProcessBuilder to call the main program and update records in ranges of 15,000 at a time. That way the process always consumes less memory and I will be good to go. I have spent almost a month trying to fix this issue, but I had no option other than a workaround now :eek:
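
    For what it's worth, a minimal sketch of the idea (the UpdateBatch class name, the record counts, and the argument format are placeholders; the real program would pass its own values):
    Java Code:
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;

    public class BatchDriver {
        public static void main(String[] args) throws IOException, InterruptedException {
            int total = 5000000; // total number of records to update
            int chunk = 15000;   // records per JVM run, below the 25,000 threshold
            for (int start = 0; start < total; start += chunk) {
                int end = Math.min(start + chunk, total);
                // launch a fresh JVM for this range, so every run starts
                // with a clean heap no matter what the third-party API leaks
                ProcessBuilder pb = new ProcessBuilder(
                        "java", "-cp", System.getProperty("java.class.path"),
                        "UpdateBatch", String.valueOf(start), String.valueOf(end));
                pb.redirectErrorStream(true);
                Process p = pb.start();
                // drain the child's output so its pipe buffer never fills and blocks it
                BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()));
                String line;
                while ((line = r.readLine()) != null) {
                    System.out.println(line);
                }
                if (p.waitFor() != 0) {
                    throw new RuntimeException("range " + start + "-" + end + " failed");
                }
            }
        }
    }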

    Thanks everybody for your help and support. I really appreciate your responses and suggestions.

  10. #10
    BSA (Member, joined Jan 2010)

    Would it be possible to do one of the following? (At one time or another I have had to do something similar.)
    - If the API supports cursors, you could try to use one.
    - Could this processing be moved to a database stored procedure?
    - Can you limit the amount of data returned by your select, i.e. only select the column(s) that need updating?
    - Can you select and store just the primary key of each record (smaller memory usage) and then follow up with subsequent queries for a small number of them at a time? (See the sketch below.)

    ... just some ideas.
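
    For that last idea, here is a rough sketch of the kind of thing I mean (the table, the id and status columns, the connection URL, and the batch size are all made-up examples): fetch just the keys, then update them in small batches.
    Java Code:
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;

    public class KeyBatchUpdate {
        public static void main(String[] args) throws SQLException {
            Connection con = DriverManager.getConnection(
                    "jdbc:mysql://localhost/mydb", "user", "password");

            // 1. pull only the primary keys; far smaller than whole rows
            List<Long> ids = new ArrayList<Long>();
            PreparedStatement keyStmt = con.prepareStatement("select id from mytable");
            ResultSet rs = keyStmt.executeQuery();
            while (rs.next()) {
                ids.add(rs.getLong(1));
            }
            rs.close();
            keyStmt.close();

            // 2. follow up with small batches, a few hundred keys at a time
            final int BATCH = 500;
            PreparedStatement upd = con.prepareStatement(
                    "update mytable set status = ? where id = ?");
            for (int i = 0; i < ids.size(); i++) {
                upd.setString(1, "DONE");
                upd.setLong(2, ids.get(i));
                upd.addBatch();
                if (i % BATCH == BATCH - 1 || i == ids.size() - 1) {
                    upd.executeBatch(); // flush this small batch to the database
                }
            }
            upd.close();
            con.close();
        }
    }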

