Java Forums

#1 Gilgamesh (Member), 11-25-2007, 01:14 AM
tokens
I've used StringTokenizer to take the words from a text. I used the delimiters ",", "." etc.

Questions:

1) I tried to define a final String DELIMITERS = "!@#" (etc.), but when I type
Code:
StringTokenizer (line, DELIMITERS);
it recognizes the delimiters, yet it creates one token out of each whole line of the text document (without including the delimiters).
If you can't figure out the problem, can you please tell me whether there is a different way to set the delimiters, other than this:
Code:
StringTokenizer (line, ".", ",", "?");
?

2) Some texts use a hyphen at the end of a line to continue a word on the next line. How can I 'unify' the two tokens that make up one word?

Last edited by Gilgamesh : 11-25-2007 at 01:19 AM.
#2 hardwired (Senior Member), 11-25-2007, 01:44 AM
Quote: "a different way to set the delimiters"
Try including the space character in the delimiter string.
Quote: "Some texts use a hyphen at the end of a line to continue a word on the next line. How can I 'unify' the two tokens that make up one word?"
Do you mean removing the hyphen and concatenating the two parts so that they become a single token?
Code:
import java.util.StringTokenizer;

public class TokenDelims {
    public static void main(String[] args) {
        String s = "This is#a special test-string for testing " +
                   "deliminators in a StringTokenizer";
        String delims = " #-";
        StringTokenizer st = new StringTokenizer(s, delims);
        while(st.hasMoreTokens())
            System.out.println(st.nextToken());
    }
}
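A minimal sketch of why the space matters. StringTokenizer treats every character of its second argument as a separate delimiter, so a set like "!@#" never splits on spaces and a line without those characters comes back as one token. There is also no constructor that takes several delimiter strings; the delimiters all go into one String. The class name DelimiterSetDemo, the helper tokens and the sample line are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class DelimiterSetDemo {
    // Collects all tokens of 'line' using the given delimiter characters.
    static List<String> tokens(String line, String delims) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, delims);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "Hello world, how are you?";   // sample input (assumption)

        // No space in the delimiter set: the whole line is a single token.
        System.out.println(tokens(line, "!@#").size());

        // Space and punctuation included: the line is split into words.
        System.out.println(tokens(line, " ,.?!@#"));
    }
}
```

With the second delimiter set the words come back without the punctuation, which is probably what the original DELIMITERS constant was meant to do.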
#3 Gilgamesh (Member), 11-25-2007, 02:12 AM
Quote: "remove the hyphen and concatenate the two words together to become a single token"
I mean the use of hyphens to show that a word has been broken in order to fit onto a line.

But I am now thinking that hyphens are also used to join words together to make a compound, e.g. 'left-handed'.

That sounds difficult. Would things be easier if I used String.split() instead of StringTokenizer?
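For comparison, a rough sketch of the String.split() route. The delimiters go into a regular-expression character class, and the trailing + treats a run such as ", " as one separator. The class name SplitDemo, the helper words and the delimiter set are assumptions chosen for illustration.

```java
import java.util.Arrays;

class SplitDemo {
    // Splits on any run of whitespace or the listed punctuation characters.
    static String[] words(String line) {
        return line.split("[\\s,.?!@#]+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(words("Hello world, how are you?")));

        // Unlike StringTokenizer, split() returns an empty first element
        // when the input starts with a delimiter.
        System.out.println(words(",hello").length);
    }
}
```

Whether this is easier is a matter of taste: split() returns an array in one call, but the delimiter set is a regex, and the hyphen question stays exactly as hard as before.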

So how can I code this? 'If there is a hyphen, check whether the parts on either side of it (even across a line break) exist as a compound word in the dictionary (an ArrayList/Vector); if they do not, remove the hyphen and join the parts into one word.'
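One possible sketch of that rule, under stated assumptions: the dictionary holds known hyphenated compounds in lower case, and the two parts around the hyphen have already been extracted. A HashSet is swapped in for the ArrayList/Vector because only contains() lookups are needed; the names HyphenResolver, resolve and compounds are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class HyphenResolver {
    // Keeps the hyphen when "left-right" is a known compound,
    // otherwise joins the two parts into one word.
    static String resolve(String left, String right, Set<String> compounds) {
        String hyphenated = left + "-" + right;
        if (compounds.contains(hyphenated.toLowerCase(Locale.ROOT))) {
            return hyphenated;
        }
        return left + right;
    }

    public static void main(String[] args) {
        // Tiny stand-in dictionary (assumption); a real one would be loaded from a file.
        Set<String> compounds = new HashSet<>(Arrays.asList("left-handed", "well-known"));

        System.out.println(resolve("left", "handed", compounds));
        System.out.println(resolve("delimi", "nators", compounds));
    }
}
```

The result is only as good as the dictionary: a genuine compound that is missing from it gets joined into one word.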

pain in the neck lol

Oh, and there is no ignoreCase in StringTokenizer. :-|

Last edited by Gilgamesh : 11-25-2007 at 02:17 AM.
#4 hardwired (Senior Member), 11-25-2007, 04:39 AM
You can save the token in a String and convert it to lower case:
Code:
String token = st.nextToken();
token = token.toLowerCase();
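As a small illustration of where lower-casing helps, here is a sketch that counts words without regard to case. The class name CaseInsensitiveCount, the helper count, the delimiter set and the sample sentence are assumptions; Locale.ROOT is passed so the result does not depend on the machine's default locale.

```java
import java.util.Locale;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

class CaseInsensitiveCount {
    // Counts how often each word occurs, treating "The" and "the" as the same word.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        StringTokenizer st = new StringTokenizer(text, " ,.?!");
        while (st.hasMoreTokens()) {
            String token = st.nextToken().toLowerCase(Locale.ROOT);
            counts.merge(token, 1, Integer::sum);   // add 1, or start at 1
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("The cat saw the other cat. THE END."));
        // prints {cat=2, end=1, other=1, saw=1, the=3}
    }
}
```

If the original spelling has to be kept, comparing with String.equalsIgnoreCase() is an alternative to converting the token.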
Code:
import java.util.StringTokenizer;

public class TokenDelims {
    public static void main(String[] args) {
        String s = "This is a test-string for testing delimi-\n" +
                   "nators in a StringTokenizer";
        String delims = " ";
        StringTokenizer st = new StringTokenizer(s, delims);
        while(st.hasMoreTokens()) {
            String token = st.nextToken();
            int dash = token.indexOf("-");
            if(dash != -1) {
                int newLine = token.indexOf("\n");
                if(newLine != -1) {   // hyphen
                    int length = token.length();
                    token = token.substring(0, dash) +
                            token.substring(newLine+1, length);
                }
            }
            System.out.println(token);
        }
    }
}
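An alternative sketch, assuming the whole text is available as one String: remove every hyphen that is directly followed by a line break before tokenizing, so the tokenizer never sees the two halves separately. The class name LineBreakJoiner and the helper unwrap are hypothetical. Note that this also joins a real compound that happened to be broken at its hyphen ("left-" / "handed" becomes "lefthanded"), which is where a dictionary check would come in.

```java
import java.util.StringTokenizer;

class LineBreakJoiner {
    // Removes "hyphen + line break" pairs, handling both \r\n and \n endings.
    static String unwrap(String text) {
        return text.replace("-\r\n", "").replace("-\n", "");
    }

    public static void main(String[] args) {
        String s = "This is a test-string for testing delimi-\n" +
                   "nators in a StringTokenizer";

        // Line breaks are delimiters too, so words on different lines stay separate.
        StringTokenizer st = new StringTokenizer(unwrap(s), " \t\r\n");
        while (st.hasMoreTokens()) {
            System.out.println(st.nextToken());
        }
    }
}
```

Hyphens inside a line, as in "test-string", are left alone because they are not followed by a line break.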
Copyright ©2006 - 2007, www.java-forums.org