  1. #1
    thursgun (Member, joined Apr 2010)

    I need ideas on how to read this file

    I want to read a flat file with the following info:

    There are about 30,000 Terms. Most Terms are linked to others, and every Term has an id in the format GO:number, e.g. GO:0006310.

    For every Term I need to get:
    - its id
    - the is_a id
    - the relationship id

    For instance:

    id: GO:0000019
    name: regulation of mitotic recombination
    namespace: biological_process
    def: "Any process that modulates the frequency, rate or extent of DNA recombination during mitosis." [GOC:go_curators]
    synonym: "regulation of recombination within rDNA repeats" NARROW []
    is_a: GO:0000018 ! regulation of DNA recombination
    relationship: regulates GO:0006312 ! mitotic recombination

    From this record I need:
    GO:0000019, GO:0000018, GO:0006312.

    Finally, I must ignore any Term where "is_obsolete: true" is present. (That Term is not relevant and I don't need its info.)

    I don't need any Java code (although any suggestion is greatly appreciated), but I need a way to get this done. My final goal is to build a matrix with each Term's id in the first column and the other ids found for that Term in the following columns. Any idea on how to do this?

    P.S: Please forgive me if this is not the right forum. Feel free to move it.

  2. #2
    Eranga (Moderator, joined Jul 2007, Colombo, Sri Lanka)


    Read the file line by line, and match each line against your patterns. Regular expressions make sense in that case.
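
    A minimal sketch of that idea in Java, assuming the three line shapes shown in the sample record above (id:, is_a:, relationship:). The class and method names (GoLineMatcher, extractId) are made up for illustration, and the patterns only assume that an id is "GO:" followed by digits:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GoLineMatcher {
    // Patterns for the three line types of interest; group(1) is the GO id.
    private static final Pattern ID = Pattern.compile("^id:\\s*(GO:\\d+)");
    private static final Pattern IS_A = Pattern.compile("^is_a:\\s*(GO:\\d+)");
    // A relationship line has a relation name (e.g. "regulates") before the id.
    private static final Pattern RELATIONSHIP =
            Pattern.compile("^relationship:\\s*\\S+\\s+(GO:\\d+)");

    /** Returns the GO id on an id, is_a or relationship line, or null for any other line. */
    static String extractId(String line) {
        String trimmed = line.trim();
        for (Pattern p : new Pattern[] {ID, IS_A, RELATIONSHIP}) {
            Matcher m = p.matcher(trimmed);
            if (m.find()) {
                return m.group(1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String[] sample = {
            "id: GO:0000019",
            "name: regulation of mitotic recombination",
            "is_a: GO:0000018 ! regulation of DNA recombination",
            "relationship: regulates GO:0006312 ! mitotic recombination"
        };
        for (String line : sample) {
            String id = extractId(line);
            if (id != null) {
                System.out.println(id);
            }
        }
    }
}
```

    Lines that match none of the patterns (name:, def:, synonym: and so on) simply return null and can be skipped.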

  3. #3
    gcalvin (Senior Member, joined Mar 2010)


    You will want a Record class and a Value class. The Record itself is a set of key/value pairs, so you will want to store it as an ArrayList<Map<String, Value>>. The Value is text (String) and possibly one or more links (ArrayList<String>). You'll read through the file, parsing Values and Records. When you have a complete Record, you will store it in a Map<String, Record> (probably using the HashMap implementation) using the id field as the key.

    I think you'll find the project simpler if you parse all the fields, including the ones you're not interested in. This will keep your code simple and easy to read, and you can always filter out the ones you don't want afterward.

    It sounds more complicated than it is. Take a swing at either the Record or Value class first, and show us what you come up with. We can keep you steered in the right direction.
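
    To give a rough idea of the shape, here is a simplified sketch rather than the full Record/Value design described above: the Record keeps only its id, its linked ids and an obsolete flag. It assumes that every record starts at an "id: GO:..." line and that each field sits on one "key: value" line, as in the sample. GoRecordParser, Record, parse and firstGoId are hypothetical names:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GoRecordParser {

    /** One term: its own id, the ids it links to, and whether it is obsolete. */
    static class Record {
        final String id;
        final List<String> links = new ArrayList<>();
        boolean obsolete;

        Record(String id) {
            this.id = id;
        }
    }

    /** Reads every record, then drops the obsolete ones; the result is keyed by id. */
    static Map<String, Record> parse(Reader source) throws IOException {
        Map<String, Record> records = new LinkedHashMap<>();
        Record current = null;
        BufferedReader in = new BufferedReader(source);
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank line or anything that is not "key: value"
            }
            String key = line.substring(0, colon);
            String value = line.substring(colon + 1).trim();
            if (key.equals("id")) {
                // Assumption: a new record begins at each "id:" line.
                current = value.startsWith("GO:") ? new Record(value) : null;
                if (current != null) {
                    records.put(value, current);
                }
            } else if (current == null) {
                continue; // lines before the first record
            } else if (key.equals("is_a") || key.equals("relationship")) {
                String linked = firstGoId(value);
                if (linked != null) {
                    current.links.add(linked);
                }
            } else if (key.equals("is_obsolete") && value.equals("true")) {
                current.obsolete = true;
            }
        }
        // Filter afterward, as suggested: parse everything, then discard.
        records.values().removeIf(r -> r.obsolete);
        return records;
    }

    /** Returns the first whitespace-separated token that starts with "GO:", or null. */
    static String firstGoId(String value) {
        for (String token : value.split("\\s+")) {
            if (token.startsWith("GO:")) {
                return token;
            }
        }
        return null;
    }
}
```

    Each row of the matrix described in the first post is then just a Record's id followed by its links.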


  4. #4
    Webuser (Senior Member, joined Dec 2008)


    To analyze the text you read, use this: Regular Expression Lessons

  5. #5
    thursgun (Member, joined Apr 2010)


    Thank you all for replying. Your help was very useful.

    I was able to create a flat file with the relations between the Terms (GO:XXXXXX).

    GO:0000001 is related to GO:0048308 and GO:0048311, GO:0000002 is related to GO:0007005, and so on.

    I also assigned an id number to each Term (GO:XXXXXX) with a HashMap, as Gary suggested.

    Now I face the final problem:

    I need to create a matrix that holds a '1' wherever one Term is related to another. That means something like a 30,000 x 30,000 matrix, which requires more memory than I have.

    What can I do?
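
    One possible direction, sketched under an assumption: if each Term is related to only a few others (the sample record has two links), almost every cell is 0. A dense 30,000 x 30,000 matrix has 900,000,000 cells, which is about 3.6 GB as int or 900 MB as byte, so it may be enough to store only the '1' cells and treat everything else as 0. SparseRelationMatrix and its methods are hypothetical names, and the row/column numbers are the per-Term id numbers mentioned above:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class SparseRelationMatrix {
    // For each row, the set of columns that hold a '1'. Absent means 0.
    private final Map<Integer, Set<Integer>> rows = new HashMap<>();

    /** Marks cell (row, col) as '1'. */
    void relate(int row, int col) {
        rows.computeIfAbsent(row, k -> new HashSet<>()).add(col);
    }

    /** Returns 1 if the cell was marked, 0 otherwise. */
    int get(int row, int col) {
        Set<Integer> cols = rows.get(row);
        return (cols != null && cols.contains(col)) ? 1 : 0;
    }

    /** Number of cells actually stored, i.e. the number of '1' entries. */
    int storedCells() {
        int total = 0;
        for (Set<Integer> cols : rows.values()) {
            total += cols.size();
        }
        return total;
    }
}
```

    Memory use then grows with the number of relations rather than with 30,000 squared. If the assumption does not hold and the matrix is mostly '1's, this would not help.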
