Java Forums

Posted 04-12-2008, 09:40 PM by Java Tip (Moderator)
Soundex Algorithm Implementation in Java
This class implements the Soundex algorithm as described by Donald Knuth in Volume 3 of The Art of Computer Programming. The
algorithm hashes words (in particular surnames) into a small space using a simple model that approximates the sound of
the word when spoken by an English speaker. Each word is reduced to a four-character string: the word's first letter in uppercase,
followed by three digits. Double letters are collapsed to a single digit; for example, the "ss" in Gauss contributes a single 2.

EXAMPLES

Knuth's examples of various names and the soundex codes they map to are:

Euler, Ellery -> E460
Gauss, Ghosh -> G200
Hilbert, Heilbronn -> H416
Knuth, Kant -> K530
Lloyd, Ladd -> L300
Lukasiewicz, Lissajous -> L222
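
The pairs above can be checked mechanically. What follows is only a minimal sketch, not the class posted in this thread: the class name KnuthExamplesCheck, the encode() method and the CODES string are made up for illustration, and the encoder is a simplified one that treats H and W like vowels.

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not the Soundex class from the post. */
class KnuthExamplesCheck {
    // Digit for each letter A..Z; '0' marks letters that produce no digit.
    private static final String CODES = "01230120022455012623010202";

    /** Simplified Soundex: first letter plus up to three digits, zero-padded. */
    static String encode(String name) {
        StringBuilder out = new StringBuilder();
        char prevCode = '?';
        for (char c : name.toUpperCase(Locale.ROOT).toCharArray()) {
            if (out.length() == 4) break;
            if (c < 'A' || c > 'Z') continue;      // ASCII letters only
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c);                     // keep the first letter as-is
            } else if (code != '0' && code != prevCode) {
                out.append(code);                  // skip repeats of the previous code
            }
            prevCode = code;
        }
        // Pad with zeros; input without any letters yields an empty string.
        while (out.length() > 0 && out.length() < 4) out.append('0');
        return out.toString();
    }

    public static void main(String[] args) {
        // Each row: two names and the code Knuth lists for both.
        String[][] rows = {
            {"Euler", "Ellery", "E460"},
            {"Gauss", "Ghosh", "G200"},
            {"Hilbert", "Heilbronn", "H416"},
            {"Knuth", "Kant", "K530"},
            {"Lloyd", "Ladd", "L300"},
            {"Lukasiewicz", "Lissajous", "L222"}
        };
        for (String[] row : rows) {
            boolean ok = encode(row[0]).equals(row[2]) && encode(row[1]).equals(row[2]);
            System.out.println(row[0] + ", " + row[1] + " -> " + row[2] + (ok ? " ok" : " MISMATCH"));
        }
    }
}
```

Each row should report "ok" if the sketch agrees with the listed codes.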

LIMITATIONS

As the Soundex algorithm was devised in the United States in the early twentieth century, it handles only the 26 letters of the
English alphabet and models English pronunciation. This implementation ignores every other character.

Because it maps a large space (arbitrary-length strings) onto a small one (a single letter plus three digits), two strings that
end up with the same Soundex code are not necessarily similar. For example, both "Hilbert" and "Heilbronn" end up
with a Soundex code of "H416".
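
One way to live with this limitation is to treat a shared code as a candidate match only. The sketch below groups names into buckets by code. It is an illustration under assumptions, not part of the posted class: SoundexBuckets, encode(), bucket() and CODES are hypothetical names, and the encoder is a compact, simplified one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper for illustration; not the Soundex class from the post. */
class SoundexBuckets {
    // Digit for each letter A..Z; '0' marks letters that produce no digit.
    private static final String CODES = "01230120022455012623010202";

    /** Simplified Soundex: first letter plus up to three digits, zero-padded. */
    static String encode(String name) {
        StringBuilder out = new StringBuilder();
        char prevCode = '?';
        for (char c : name.toUpperCase(Locale.ROOT).toCharArray()) {
            if (out.length() == 4) break;
            if (c < 'A' || c > 'Z') continue;      // ASCII letters only
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c);
            } else if (code != '0' && code != prevCode) {
                out.append(code);
            }
            prevCode = code;
        }
        while (out.length() > 0 && out.length() < 4) out.append('0');
        return out.toString();
    }

    /** Groups names by code; names in one bucket are candidates, not proven matches. */
    static Map<String, List<String>> bucket(List<String> names) {
        Map<String, List<String>> buckets = new TreeMap<>();
        for (String name : names) {
            buckets.computeIfAbsent(encode(name), k -> new ArrayList<>()).add(name);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Hilbert", "Heilbronn", "Knuth", "Kant", "Euler");
        bucket(names).forEach((code, group) -> System.out.println(code + " " + group));
    }
}
```

Hilbert and Heilbronn land in the same H416 bucket although they are clearly different names, so anything found in a bucket still needs to be confirmed by a stricter comparison.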

The soundex() method is static, as it maintains no per-instance state; this means you never need to instantiate this class.

Code:
import java.util.Locale;

public class Soundex {

    /* Implements the mapping
     * from: AEHIOUWYBFPVCGJKQSXZDTLMNR
     * to:   00000000111122222222334556
     */
    public static final char[] MAP = {
        //A    B    C    D    E    F    G    H    I    J    K    L    M
        '0', '1', '2', '3', '0', '1', '2', '0', '0', '2', '2', '4', '5',
        //N    O    P    Q    R    S    T    U    V    W    X    Y    Z
        '5', '0', '1', '2', '6', '2', '3', '0', '1', '0', '2', '0', '2'
    };

    /** Convert the given String to its Soundex code.
     * Encoding stops at the first comma, so "Darwin, Ian" encodes "Darwin".
     * @return null if the given string can't be mapped to Soundex.
     */
    public static String soundex(String s) {
        if (s == null)
            return null;

        // Algorithm works on uppercase (mainframe era). Locale.ROOT keeps
        // the conversion independent of the default locale.
        String t = s.toUpperCase(Locale.ROOT);

        StringBuilder res = new StringBuilder();
        char prevCode = '?';

        // Main loop: find up to 4 chars that map.
        for (int i = 0; i < t.length() && res.length() < 4; i++) {
            char c = t.charAt(i);
            if (c == ',')
                break;

            // Text is already converted to uppercase. Algorithm
            // only handles ASCII letters, do NOT use Character.isLetter()!
            if (c < 'A' || c > 'Z')
                continue;

            char m = MAP[c - 'A'];
            if (res.length() == 0) {
                // First letter is installed unchanged, for sorting. Testing
                // the length rather than i == 0 also copes with input that
                // starts with a non-letter.
                res.append(c);
            } else if (m != '0' && m != prevCode) {
                // Double letters, and adjacent letters that share a code,
                // are coded only once.
                res.append(m);
            }
            prevCode = m;
        }
        if (res.length() == 0)
            return null;
        for (int i = res.length(); i < 4; i++)
            res.append('0');
        return res.toString();
    }

    /** main */
    public static void main(String[] args) {
        String[] names = {
            "Darwin, Ian",
            "Davidson, Greg",
            "Darwent, William",
            "Derwin, Daemon"
        };
        for (int i = 0; i < names.length; i++)
            System.out.println(Soundex.soundex(names[i]) + ' ' + names[i]);
    }
}