  1. #1
    akshayshah (Member, joined Dec 2009)

    Parsing of information on EMBL

    Hello,

    I need to write a Java program to do the following. I am completely lost. Please help.

    1. Develop an API for representing the information in an EMBL file (e.g. the one at the bottom). This must include the EMBL ID, the species information as a list of taxon classifications, and the DNA sequence. In terms of Java classes, I need to produce at least EMBL.java and Sequence.java, with the EMBL class having a Sequence object as a field.

    2. Develop a class with a main method that takes the path of an EMBL file as a command-line parameter. This class should use the file path passed to the main method to create a Scanner object, and then parse the EMBL file into the classes defined in 1.

    3. The Sequence class should implement the java.lang.CharSequence interface. It should store the sequence as a List of java.lang.Character objects. In particular, charAt(int) should extract the appropriate Character from the list and return the equivalent char.

    4. Finally, carry out one of the following:
       Write a method that searches the Sequence for a given DNA string. Write a test method that searches for a Shine-Dalgarno sequence (AGGAGGU) in the DNA sequence.
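The search step can be sketched as a plain left-to-right scan over any CharSequence. This is only an illustration: the class name MotifSearch and the choice to treat an RNA 'U' as a DNA 'T' (and to ignore case) are assumptions, made because the motif is written AGGAGGU while the file stores lowercase DNA letters.

```java
// Hypothetical helper; the name MotifSearch is not part of the assignment.
final class MotifSearch {

    /** Returns the index of the first occurrence of motif in sequence, or -1 if absent. */
    static int indexOf(CharSequence sequence, CharSequence motif) {
        int n = sequence.length();
        int m = motif.length();
        // Try every start position at which the whole motif still fits.
        for (int start = 0; start + m <= n; start++) {
            int k = 0;
            while (k < m && sameBase(sequence.charAt(start + k), motif.charAt(k))) {
                k++;
            }
            if (k == m) {
                return start;
            }
        }
        return -1;
    }

    // Assumption: compare case-insensitively and treat RNA 'u' as DNA 't'.
    private static boolean sameBase(char a, char b) {
        return normalise(a) == normalise(b);
    }

    private static char normalise(char c) {
        char lower = Character.toLowerCase(c);
        return lower == 'u' ? 't' : lower;
    }
}
```

Because the method accepts a CharSequence, it works on a String as well as on a Sequence class that implements that interface.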

    Example EMBL file:
    ID BB252375; SV 1; linear; mRNA; EST; MUS; 318 BP.
    XX
    AC BB252375;
    XX
    DT 01-JUL-2000 (Rel. 64, Created)
    DT 01-DEC-2005 (Rel. 86, Last updated, Version 4)
    XX
    DE Mus musculus 7 days neonate cerebellum cDNA, RIKEN full-length
    DE enriched library, clone:A730051B17, 3' end partial sequence, similar to
    DE refseq:NM_000280 Homo sapiens paired box gene 6 (PAX6), isoform a,
    DE mRNA.
    SQ Sequence 318 BP; 80 A; 129 C; 35 G; 73 T; 1 other;
    ttatctatcat ctccacccct cacctctcca tcctcacccc ccggccccca 50
    taaacacact tgagccatca ccaatcagca cagctgtncc ggctgcaccc 100

  2. #2
    Tolls (Moderator, joined Apr 2009)

    So, what have you done so far?

    If nothing, then I suggest writing the classes that represent the EMBL data, as stated in the first paragraph, before you do anything else. It's all there in the first paragraph.

    ETA: And third paragraph as well, come to think of it. Which tells you pretty much exactly what the Sequence class should look like.
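For readers following along, here is a minimal sketch of what the third paragraph seems to describe: a class implementing java.lang.CharSequence and backed by a List of Character objects. The constructor taking a CharSequence is an assumption; the assignment does not say how a Sequence should be created.

```java
import java.util.ArrayList;
import java.util.List;

final class Sequence implements CharSequence {

    // The bases, stored one Character per position as the assignment asks.
    private final List<Character> bases;

    // Assumed constructor: copies the characters of any text into the list.
    Sequence(CharSequence text) {
        bases = new ArrayList<>(text.length());
        for (int i = 0; i < text.length(); i++) {
            bases.add(text.charAt(i)); // char is autoboxed to Character
        }
    }

    @Override
    public int length() {
        return bases.size();
    }

    @Override
    public char charAt(int index) {
        return bases.get(index); // Character is unboxed to char
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        // Simple rather than fast: build the String, then cut it.
        return new Sequence(toString().substring(start, end));
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(bases.size());
        for (char c : bases) {
            sb.append(c);
        }
        return sb.toString();
    }
}
```

Overriding toString matters here because the CharSequence contract expects it to return the characters of the sequence in order.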

  3. #3
    akshayshah (Member, joined Dec 2009)

    Thanks for the reply.

    I understand that this is a very easy task.

    But I haven't done any programming before, so I have no idea where to begin.

  4. #4
    Tolls (Moderator, joined Apr 2009)

    Then learn how to program first?
    I presume you've had a course...

    Also there's always Sun's tutorial.

    But I would hope you'd done enough of a course to actually be able to write a simple class that contains the EMBL data, as described in para 1, even if the Sequence class is empty.

  5. #5
    akshayshah (Member, joined Dec 2009)

    I am from a biology background; we were taught Java for only 2 hours and then given this assignment.

    Anyway, I have done a bit and need help with the second part.

    I have written the class EMBL.java as follows:

    public class EMBL {

        private String identity;

        public String getIdentity() {
            return identity;
        }

        public void setIdentity(String identity) {
            this.identity = identity;
        }

        private Sequence sequence;

        public Sequence getSequence() {
            return sequence;
        }

        public void setSequence(Sequence sequence) {
            this.sequence = sequence;
        }

        private String taxonomy;

        public String getTaxonomy() {
            return taxonomy;
        }

        public void setTaxonomy(String taxonomy) {
            this.taxonomy = taxonomy;
        }
    }

    After that, in another class, I have done the following to load the EMBL text file:

    import java.io.File;
    import java.io.FileNotFoundException;
    import java.io.FileReader;
    import java.util.Scanner;

    /**
     *
     * @author a9907862
     */
    public class ScannerReadFile {

        public static void main(String[] args) throws FileNotFoundException {
            System.out.println("Please, give the file to read ");
            Scanner inPath = new Scanner(System.in);
            String path = inPath.next();
            Scanner inFile = new Scanner(new FileReader(path));

        }
    }

    Could you help me write a statement that prints out the line following ID or AC, and the sequence, and show me how to remove the whitespace and numbers?

    Thanks.

  6. #6
    Tolls (Moderator, joined Apr 2009)

    OK.
    That's a good start.
    In fact, you've pretty much (as far as I can tell) done the EMBL class.

    Now, that file is, I have to say, gibberish to me, so I'm not sure which bits you require.
    Anyway, it doesn't say "print out the ID", it says you need to read it in (along with the taxonomy and sequence) and create an EMBL object. So try doing that. I'd give you pointers, but I'm not terribly up on the Scanner class...I still use FileReaders...:)
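As a rough sketch of "read it in" with a Scanner: walk the file line by line, look at the two-letter code at the start of each line, and collect the pieces. Everything named here (EmblParser, Parsed, parse) is hypothetical. The OC branch assumes the standard EMBL classification lines, which the excerpt posted above does not include, and the "//" terminator is likewise taken from the EMBL format rather than from the excerpt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical parser; in the real program the values would go into the EMBL class.
final class EmblParser {

    /** Plain holder for the parsed values, used only by this sketch. */
    static final class Parsed {
        String id;
        final List<String> taxonomy = new ArrayList<>();
        final StringBuilder sequence = new StringBuilder();
    }

    static Parsed parse(Scanner in) {
        Parsed result = new Parsed();
        boolean inSequence = false;
        while (in.hasNextLine()) {
            String line = in.nextLine().trim();
            if (line.startsWith("//")) {
                break; // assumed end-of-entry marker
            } else if (inSequence) {
                // Keep only the letters; this drops the spaces and the position counter.
                for (int i = 0; i < line.length(); i++) {
                    char c = line.charAt(i);
                    if (Character.isLetter(c)) {
                        result.sequence.append(c);
                    }
                }
            } else if (line.startsWith("ID ")) {
                // The ID is the text between the line code and the first ';'.
                String rest = line.substring(2).trim();
                int semi = rest.indexOf(';');
                result.id = semi >= 0 ? rest.substring(0, semi) : rest;
            } else if (line.startsWith("OC ")) {
                // Assumed format: "OC name; name; name."
                for (String taxon : line.substring(2).split(";")) {
                    String name = taxon.trim();
                    if (name.endsWith(".")) {
                        name = name.substring(0, name.length() - 1);
                    }
                    if (!name.isEmpty()) {
                        result.taxonomy.add(name);
                    }
                }
            } else if (line.startsWith("SQ ")) {
                inSequence = true; // the lines after SQ hold the bases
            }
        }
        return result;
    }
}
```

A main method could call it with something like EmblParser.parse(new Scanner(new File(args[0]))) and then pass the results to setIdentity and setSequence on the EMBL object.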

    Have you done regular expressions at all? I'm guessing not if you've only done a two-hour lesson.
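For what it's worth, here is a small illustration of how regular expressions could answer the question about ID/AC lines and about stripping whitespace and numbers. The class and method names are made up for the example, and the pattern only picks up the first accession on a line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
final class EmblRegex {

    // Optional indentation, "ID" or "AC", whitespace, then everything up to the first ';'.
    private static final Pattern ID_OR_AC = Pattern.compile("^\\s*(ID|AC)\\s+([^;]+);");

    /** Returns the first accession on an ID or AC line, or null for any other line. */
    static String accession(String line) {
        Matcher m = ID_OR_AC.matcher(line);
        return m.find() ? m.group(2).trim() : null;
    }

    /** Removes all whitespace and digits, such as the spacing and counters on sequence lines. */
    static String stripSpacesAndDigits(String line) {
        return line.replaceAll("[\\s\\d]+", "");
    }
}
```

In the character class, \s matches whitespace and \d matches digits, so replaceAll deletes every run of either.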
