Thread: Parsing of information on EMBL
- 12-02-2009, 08:57 PM #1
Member
- Join Date
- Dec 2009
- Posts
- 3
- Rep Power
- 0
Parsing of information on EMBL
Hello,
I need to write a Java program to do the following, and I am completely lost. Please help.
1. Develop an API for representing the information in an EMBL file (e.g. the one at the bottom). This must include the EMBL ID, the species information as a list of taxon classifications, and the DNA sequence. In terms of Java classes, I need to produce at least EMBL.java and Sequence.java, with the EMBL class having a Sequence object as a field.
2. Develop a class with a main method that takes the path of an EMBL file as a command-line parameter. This class should use the file path passed to the main method to create a Scanner object, and then parse the EMBL file into the classes defined in 1.
3. The Sequence class should implement the java.lang.CharSequence interface. It should store the sequence as a List of java.lang.Character objects. In particular, charAt(int) should extract the appropriate Character from the list and return the equivalent char.
4. Finally, carry out one of the following:
Write a method that searches the Sequence for a given DNA string. Write a test method which searches for a Shine-Dalgarno sequence (AGGAGGU) in the DNA sequence.
Example EMBL file:
ID BB252375; SV 1; linear; mRNA; EST; MUS; 318 BP.
XX
AC BB252375;
XX
DT 01-JUL-2000 (Rel. 64, Created)
DT 01-DEC-2005 (Rel. 86, Last updated, Version 4)
XX
DE Mus musculus 7 days neonate cerebellum cDNA, RIKEN full-length
DE enriched library, clone:A730051B17, 3' end partial sequence, similar to
DE refseq:NM_000280 Homo sapiens paired box gene 6 (PAX6), isoform a,
DE mRNA.
SQ Sequence 318 BP; 80 A; 129 C; 35 G; 73 T; 1 other;
ttatctatcat ctccacccct cacctctcca tcctcacccc ccggccccca 50
taaacacact tgagccatca ccaatcagca cagctgtncc ggctgcaccc 100
- 12-03-2009, 09:15 AM #2
Moderator
- Join Date
- Apr 2009
- Posts
- 10,481
- Rep Power
- 16
So, what have you done so far?
If nothing, then I suggest writing the classes that represent the EMBL data, as stated in the first paragraph, before you do anything else. It's all there in the first paragraph.
ETA: And third paragraph as well, come to think of it. Which tells you pretty much exactly what the Sequence class should look like.
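To make the third paragraph concrete, here is a minimal sketch of what such a Sequence class might look like. The List<Character> storage and the charAt behaviour come from the assignment text; the constructor taking a CharSequence and the indexOf(String) search method are assumptions made for illustration, not something the assignment prescribes.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: the constructor and indexOf(String) are illustrative
// assumptions, not part of the assignment text.
class Sequence implements CharSequence {
    // The assignment asks for a List of Character objects as storage.
    private final List<Character> bases = new ArrayList<>();

    Sequence(CharSequence raw) {
        for (int i = 0; i < raw.length(); i++) {
            bases.add(raw.charAt(i)); // auto-boxed to Character
        }
    }

    @Override
    public int length() {
        return bases.size();
    }

    @Override
    public char charAt(int index) {
        // Auto-unboxing turns the stored Character into a char.
        return bases.get(index);
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        return new Sequence(toString().substring(start, end));
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(bases.size());
        for (char c : bases) {
            sb.append(c);
        }
        return sb.toString();
    }

    // Returns the index of the first occurrence of motif, or -1 if absent.
    // The comparison ignores case because EMBL sequence lines are lower case.
    int indexOf(String motif) {
        return toString().toLowerCase().indexOf(motif.toLowerCase());
    }
}
```

One thing to watch when testing: the file holds DNA letters (a, c, g, t, n), so a literal search for AGGAGGU can never match. Searching for AGGAGGT, the DNA spelling of the same motif, is probably what is intended, but that is worth checking with whoever set the assignment.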
- 12-03-2009, 12:31 PM #3
Member
- Join Date
- Dec 2009
- Posts
- 3
- Rep Power
- 0
Thanks for the reply.
I understand that this is a very easy task.
But I haven't done any programming before, so I have no idea where to begin.
- 12-03-2009, 12:46 PM #4
Moderator
- Join Date
- Apr 2009
- Posts
- 10,481
- Rep Power
- 16
Then learn how to program first?
I presume you've had a course...
Also there's always Sun's tutorial.
But I would hope you'd done enough of a course to actually be able to write a simple class that contains the EMBL data, as described in para 1, even if the Sequence class is empty.
- 12-03-2009, 05:02 PM #5
Member
- Join Date
- Dec 2009
- Posts
- 3
- Rep Power
- 0
I am from a biology background; we were taught Java for only 2 hours and then given this assignment.
Anyway, I have done a bit and need help with the second part.
I have written the class EMBL.java as follows:
public class EMBL {
    private String identity;
    public String getIdentity() {
        return identity;
    }
    public void setIdentity(String identity) {
        this.identity = identity;
    }
    private Sequence sequence;
    public Sequence getSequence() {
        return sequence;
    }
    public void setSequence(Sequence sequence) {
        this.sequence = sequence;
    }
    private String taxonomy;
    public String getTaxonomy() {
        return taxonomy;
    }
    public void setTaxonomy(String taxonomy) {
        this.taxonomy = taxonomy;
    }
}
After that, in another class, I have done the following to load the EMBL text file:
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.Scanner;
/**
 *
 * @author a9907862
 */
public class ScannerReadFile {
    public static void main(String[] args) throws FileNotFoundException {
        // Ask the user for the path of the file to read.
        System.out.println("Please, give the file to read ");
        Scanner inPath = new Scanner(System.in);
        String path = inPath.next();
        // Open the file; inFile is not read from yet.
        Scanner inFile = new Scanner(new FileReader(path));
    }
}
Could you help me write a statement that prints out the line following ID or AC and the sequence, and show me how to remove the whitespace and numbers?
Thanks.
- 12-04-2009, 09:14 AM #6
Moderator
- Join Date
- Apr 2009
- Posts
- 10,481
- Rep Power
- 16
OK.
That's a good start.
In fact, you've pretty much (as far as I can tell) done the EMBL class.
Now, that file is, I have to say, gibberish to me, so I'm not sure which bits you require.
Anyway, it doesn't say "print out the ID", it says you need to read it in (along with the taxonomy and sequence) and create an EMBL object. So try doing that. I'd give you pointers, but I'm not terribly up on the Scanner class...I still use FileReaders...:)
Have you done regular expressions at all? I'm guessing not if you've only done a two-hour lesson.
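For what it's worth, a rough sketch of the reading-in step with Scanner and one small regular expression might look like the following. EmblParser and its fields are hypothetical names made up for illustration. The sketch assumes the two-letter line codes of the EMBL flat-file format: ID for the identifier, OC for the taxon classification (not visible in the excerpt posted above), SQ to start the sequence block and // to end the entry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper, not from the thread: pulls the ID, the taxon
// classifications and the sequence out of EMBL-formatted text.
class EmblParser {
    String id;
    final List<String> taxa = new ArrayList<>();
    final StringBuilder dna = new StringBuilder();

    void parse(Scanner in) {
        boolean inSequence = false;
        while (in.hasNextLine()) {
            String line = in.nextLine();
            if (line.startsWith("//")) {
                break; // end of the entry
            } else if (inSequence) {
                // Strip the spacing and the running base count: the regex
                // matches any whitespace character or digit.
                dna.append(line.replaceAll("[\\s\\d]", ""));
            } else if (line.startsWith("ID")) {
                // The identifier is the first ';'-separated field.
                id = line.substring(2).trim().split(";")[0];
            } else if (line.startsWith("OC")) {
                // Assumed layout: taxa separated by ';', last one ends in '.'.
                for (String taxon : line.substring(2).split("[;.]")) {
                    if (!taxon.trim().isEmpty()) {
                        taxa.add(taxon.trim());
                    }
                }
            } else if (line.startsWith("SQ")) {
                inSequence = true; // the following lines hold the bases
            }
        }
    }
}
```

In a main method, the Scanner could be built from the command-line argument with new Scanner(new File(args[0])), and the parsed values handed to the EMBL object through its setters. Note that the taxa come out as a list here, whereas the EMBL class above stores the taxonomy as a single String, so one of the two would need adjusting.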