- 01-30-2008, 06:03 PM #1
Feature extraction from a text file in Java, used for scoring sentences
I have separated the individual sentences of a text document, read from a file, using several heuristics. Now I need to index the sentences by order of appearance (e.g. the 1st sentence as 1, then 2, 3, etc.). Then I need to extract features from these indexed sentences. The features are position (1 if the sentence is at the top of the document, 0 if it is at the bottom), length (the number of words in the sentence) and thematic words (the most frequent words). Based on these features I will score the sentences. Please help me out, friends.
- 02-04-2008, 08:26 PM #2
There are a few ways you could do this. The most efficient is to record each sentence's position while you extract it from the file, rather than working it out in a second pass. You could define a class Sentence with fields such as String sentence, int position and int appearanceOrder. The class could also contain methods that count the number of words in the sentence and find the most frequent words. Is this helpful, or were you looking for something more specific?
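To make the suggestion concrete, here is a minimal sketch of such a Sentence class with the three features from the question. Everything beyond the field names mentioned in the reply is an assumption made for illustration: the helper class SentenceFeatures, the method names, splitting words on non-letter/non-digit characters, and a position score that falls linearly from 1 (first sentence) to 0 (last sentence). The original post does not say how positions in between should be scored.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** One sentence plus the per-sentence features (hypothetical design). */
final class Sentence {
    final String text;
    final int appearanceOrder; // 1-based: first sentence is 1, then 2, 3, ...

    Sentence(String text, int appearanceOrder) {
        this.text = text;
        this.appearanceOrder = appearanceOrder;
    }

    /** Lower-cased words. Splitting on non-letter/digit runs is an assumption. */
    List<String> words() {
        List<String> result = new ArrayList<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!w.isEmpty()) {
                result.add(w);
            }
        }
        return result;
    }

    /** Length feature: number of words in the sentence. */
    int length() {
        return words().size();
    }

    /** Position feature: 1.0 at the top, 0.0 at the bottom, linear in between (assumed). */
    double positionScore(int totalSentences) {
        if (totalSentences <= 1) {
            return 1.0;
        }
        return (double) (totalSentences - appearanceOrder) / (totalSentences - 1);
    }
}

/** Document-level helpers (hypothetical names). */
final class SentenceFeatures {
    /** Wraps already-split sentences, numbering them in order of appearance. */
    static List<Sentence> index(List<String> rawSentences) {
        List<Sentence> indexed = new ArrayList<>();
        for (int i = 0; i < rawSentences.size(); i++) {
            indexed.add(new Sentence(rawSentences.get(i), i + 1));
        }
        return indexed;
    }

    /** Counts how often each word occurs across the whole document. */
    static Map<String, Integer> wordFrequencies(List<Sentence> sentences) {
        Map<String, Integer> freq = new HashMap<>();
        for (Sentence s : sentences) {
            for (String w : s.words()) {
                freq.merge(w, 1, Integer::sum);
            }
        }
        return freq;
    }

    /** The k most frequent words; ties are broken alphabetically. */
    static List<String> thematicWords(List<Sentence> sentences, int k) {
        Map<String, Integer> freq = wordFrequencies(sentences);
        List<String> words = new ArrayList<>(freq.keySet());
        words.sort((a, b) -> {
            int byCount = Integer.compare(freq.get(b), freq.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(words.subList(0, Math.min(k, words.size())));
    }

    /** Thematic feature: how many words of the sentence are thematic words. */
    static int thematicCount(Sentence s, Collection<String> thematic) {
        int count = 0;
        for (String w : s.words()) {
            if (thematic.contains(w)) {
                count++;
            }
        }
        return count;
    }
}
```

You would call SentenceFeatures.index(...) on the list of sentences you already extracted, then read length(), positionScore(...) and thematicCount(...) for each one. Note that this sketch does no stop-word filtering, so very common words such as "the" will tend to show up as thematic words; you may want to filter those out first. How the three features are weighted into a final score is left open, since the question does not specify it.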