  1. #1
    RGupta (Member, joined Feb 2014, posts: 1)

    How to improve file reading efficiency and its data insertion in Java

    Hi All,

    We have an AutoSys job running in production on a daily basis. It calls a shell script which in turn calls a Java servlet. This servlet reads these files, inserts the data into two different tables, and then does some processing. The Java version is 1.6, the application server is WAS 7, and the database is Oracle 11g.

    We get several issues with this process: it takes a long time, it runs out of memory, etc. Below are the details of how we have coded this process. Please let me know if it can be improved.

    1. When we read the file using BufferedReader, do we really get a lot of strings created in memory, as returned by the readLine() method of BufferedReader? These files contain 4-5 lakh (400,000-500,000) lines. All the records are separated by newline characters. Is there a better way to read files in Java to achieve efficiency? I couldn't find any, given that the record lines in the file are of variable length.

    2. When we insert the data, we do a batch process with Statement/PreparedStatement. We make one batch containing all the records of the file. Does breaking up the batch size really matter for performance?

    3. If the tables have no indexes defined, nor any other constraints, and all the columns are of VARCHAR type, then which operation will be faster: inserting a new row, or updating an existing row based on some matching condition?

  2. #2
    notivago (Heavy Coffe Drinker, São Paulo, Brazil, joined Feb 2014, posts: 29)

    Re: How to improve file reading efficiency and its data insertion in Java

    Quote Originally Posted by RGupta
    Hi All,

    We have an AutoSys job running in production on a daily basis. It calls a shell script which in turn calls a Java servlet. This servlet reads these files, inserts the data into two different tables, and then does some processing. The Java version is 1.6, the application server is WAS 7, and the database is Oracle 11g.

    We get several issues with this process: it takes a long time, it runs out of memory, etc. Below are the details of how we have coded this process. Please let me know if it can be improved.
    My first advice is to run a profiler and see where things are dragging. Any attempt to optimize code without knowing where it is broken is an invitation to failure.
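
    If a full profiler (VisualVM, JProfiler and friends) is not at hand, even crude timing of each phase tells you whether the time goes to the reading or to the inserting. A minimal sketch; readFile() and insertAll() here are hypothetical stand-ins for your own methods:

    Java Code:
    import java.util.List;

    public class PhaseTiming {
        // Hypothetical stand-ins for the real read and insert phases.
        static List<String> readFile(String path) { return null; }
        static void insertAll(List<String> records) { }

        public static void main(String[] args) {
            // Poor man's profiling: time each phase separately before optimizing anything.
            long t0 = System.nanoTime();
            List<String> records = readFile(args[0]);
            long t1 = System.nanoTime();
            insertAll(records);
            long t2 = System.nanoTime();
            System.out.printf("read: %d ms, insert: %d ms%n",
                    (t1 - t0) / 1000000L, (t2 - t1) / 1000000L);
        }
    }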

    1. When we read the file using BufferedReader, do we really get a lot of strings created in memory, as returned by the readLine() method of BufferedReader?
    Yes.

    These files contain 4-5 lakh (400,000-500,000) lines. All the records are separated by newline characters. Is there a better way to read files in Java to achieve efficiency? I couldn't find any, given that the record lines in the file are of variable length.
    You must also define what efficiency means here. And not knowing how your code is written, it is hard to say whether it can be improved, don't you think? Why do you think it matters that the lines are of variable length?
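
    For what it is worth, readLine() does create one String per line, but if you process each line as you go instead of collecting them all, each String becomes garbage right after use and the heap stays small no matter how long the file is. A minimal sketch, assuming a hypothetical process() method standing in for your record handling:

    Java Code:
    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    public class LineByLine {
        public static void main(String[] args) throws IOException {
            // A bigger buffer cuts down on underlying read() calls;
            // lines of variable length are no problem for readLine().
            BufferedReader reader = new BufferedReader(new FileReader(args[0]), 64 * 1024);
            try {
                String line;
                while ((line = reader.readLine()) != null) {
                    process(line);   // handle one record, then drop the reference
                }
            } finally {
                reader.close();      // no try-with-resources on Java 1.6
            }
        }

        static void process(String line) {
            // placeholder: parse the record and hand it to the batch inserter
        }
    }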

    2. When we insert the data, we do a batch process with Statement/PreparedStatement. We make one batch containing all the records of the file. Does breaking up the batch size really matter for performance?
    Of course it does. You said you are running out of memory: if you hold large parts of the file before batch inserting, those parts of the file cannot be garbage collected, so they take up memory proportional to their size. Besides, as you fill up memory you leave less headroom for the JVM to work with; with little free memory it has to invoke frequent garbage collection cycles, and that costs the system.

    Depending on how much memory is allocated to the JVM and the size of the file, you may be going into swap, which slows things down considerably.
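
    To make that concrete, here is a sketch of flushing the batch every thousand rows instead of holding all 400,000+ at once. The table name, columns, and the pipe delimiter are made up for illustration:

    Java Code:
    import java.io.BufferedReader;
    import java.sql.Connection;
    import java.sql.PreparedStatement;

    public class ChunkedInsert {
        static final int BATCH_SIZE = 1000;

        static void load(Connection con, BufferedReader reader) throws Exception {
            con.setAutoCommit(false);
            PreparedStatement ps = con.prepareStatement(
                    "INSERT INTO staging_tbl (col1, col2) VALUES (?, ?)");
            try {
                int count = 0;
                String line;
                while ((line = reader.readLine()) != null) {
                    String[] f = line.split("\\|");   // assumed pipe-delimited records
                    ps.setString(1, f[0]);
                    ps.setString(2, f[1]);
                    ps.addBatch();
                    if (++count % BATCH_SIZE == 0) {
                        ps.executeBatch();            // send this chunk, free the memory
                    }
                }
                ps.executeBatch();                    // flush the remainder
                con.commit();
            } finally {
                ps.close();
            }
        }
    }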

    3. If the tables have no indexes defined, nor any other constraints, and all the columns are of VARCHAR type, then which operation will be faster: inserting a new row, or updating an existing row based on some matching condition?
    ...
    Breathe
    ...

    Start by creating indexes, and stop creating tables with all-VARCHAR columns. The question of whether insert or update is best should never be about performance but about meaning.

    What is the meaning of inserting a row that is already in the database? It does not make any sense to propose insert instead of update.

    And this last part of the question belongs in a database forum, not a Java one.
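
    That said, if the real requirement is "insert the row when it is new, otherwise update it", Oracle expresses that meaning directly with a MERGE statement, so you do not have to choose at all. A sketch with hypothetical table and column names:

    Java Code:
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class UpsertExample {
        // One statement that says what you mean: update if matched, insert if not.
        static void upsert(Connection con, String key, String value) throws SQLException {
            PreparedStatement ps = con.prepareStatement(
                    "MERGE INTO target_tbl t "
                    + "USING (SELECT ? AS key_col, ? AS val_col FROM dual) s "
                    + "ON (t.key_col = s.key_col) "
                    + "WHEN MATCHED THEN UPDATE SET t.val_col = s.val_col "
                    + "WHEN NOT MATCHED THEN INSERT (key_col, val_col) "
                    + "VALUES (s.key_col, s.val_col)");
            try {
                ps.setString(1, key);
                ps.setString(2, value);
                ps.executeUpdate();
            } finally {
                ps.close();
            }
        }
    }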

