  1. #1
    Igor (Member, joined Dec 2007)

    Building a tokenizer

    Hey there,

    I'm trying to build a tokenizer that breaks sentences up into words. Splitting the string on whitespace alone isn't sufficient, so I'd like to give it a few more delimiters.
    Whitespace is one of them, but so are a dot followed by whitespace, a question mark followed by whitespace, etc. (but not a dot alone, since that can be part of an abbreviation, for example). So what I want to split on is actually a mix of single characters (whitespace) and strings (a punctuation mark followed by whitespace).

    So conceptually I'd figure it looks something like this:
    input.split("[", ", ". ", "? ", "! , " "]";
    but Eclipse doesn't really like that...

    I've read the docs for String and Pattern/Matcher, but they don't really help me.
    Is there anybody here who could point me in the right direction?

    Thanks.

  2. #2
    PhHein (Senior Member, joined Apr 2009, Germany)


  3. #3
    quad64bit (Moderator, joined Jul 2009, VA)

    I needed to do this when I wrote a compiler. I found that Java's regex package (java.util.regex) worked just fine, since my input mixed tokens separated by whitespace with tokens that had no whitespace between them, but I needed regular tokens either way.
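
    For the splitting rules described in the first post, a minimal sketch might look like the one below. The class name SimpleTokenizer, the method tokenize, and the exact punctuation set (. , ? !) are assumptions made for illustration, not something taken from the thread. The idea is that String.split (and Pattern.split) take a single regular expression rather than a list of delimiters, so "whitespace" and "punctuation followed by whitespace" can be combined into one pattern.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class; the name and the punctuation set are illustrative assumptions.
class SimpleTokenizer {
    // Delimiter is either:
    //   an optional punctuation mark followed by one or more whitespace characters, or
    //   a punctuation mark at the very end of the input.
    // A dot that is NOT followed by whitespace (as in "e.g" or "U.S.A") is left alone.
    private static final Pattern DELIMITER = Pattern.compile("[.,?!]?\\s+|[.,?!]$");

    static List<String> tokenize(String input) {
        // Trim first so leading whitespace does not produce an empty first token.
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(DELIMITER.split(trimmed));
    }

    public static void main(String[] args) {
        // Prints: [Is, this, a, test, Yes, e.g, a, U.S.A, example]
        System.out.println(tokenize("Is this a test? Yes, e.g. a U.S.A. example!"));
    }
}
```

    Note that with these rules the final dot of an abbreviation is still dropped when whitespace follows it ("e.g. a" yields "e.g" and "a"). Telling that apart from a sentence-ending dot would need something beyond a split pattern, such as a list of known abbreviations.
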
    Last edited by quad64bit; 01-20-2010 at 05:55 PM.
