little more
a reference
Reference text book
String canonicalization
There can be some confusion about whether Strings are already canonicalized. There is no guarantee that they are, although the compiler can canonicalize Strings that are equal and are compiled in the same pass. The String.intern( ) method canonicalizes strings in an internal table. This is supposed to be, and usually is, the same table used by strings canonicalized at compile time, but in some earlier JDK versions (e.g., 1.0), it was not the same table. In any case, there is no particular reason to use the internal string table to canonicalize your strings unless you want to compare Strings by identity (see ). Using your own table gives you more control and allows you to inspect the table when necessary. To see the difference between identity and equality comparisons for Strings, including the difference that String.intern( ) makes, you can run the following class:
{
public static void main(String[] args)
{
System.out.println(args[0]); //see that we have the empty string
//should be true
System.out.println(args[0].equals(""));
//should be false since they are not identical objects
System.out.println(args[0] == "");
//should be true unless there are two internal string tables
System.out.println(args[0].intern( ) == "");
}
}
This Test class, when run with the command line:
java Test ""
gives the output:
true
false
true
Changeable objects
Canonicalizing objects is best for read-only objects and can be troublesome for objects that change. If you canonicalize a changeable object and then change its state, then all objects that have a reference to the canonicalized object are still pointing to that object, but with the object's new state. For example, suppose you canonicalize a special Date value. If that object has its date value changed, all objects pointing to that Date object now see a different date value. This result may be desired, but more often it is a bug.
If you want to canonicalize changeable objects, one technique to make it slightly safer is to wrap the object with another one, or use your own (sub)class.[5] Then all accesses and updates are controlled by you. If the object is not supposed to be changed, you can throw an exception on any update method. Alternatively, if you want some objects to be canonicalized but with copy-on-write behavior, you can allow the updater to return a noncanonicalized copy of the canonical object.
Note that it makes no sense to build a table of millions or even thousands of strings (or other objects) if the time taken to test for, access, and update objects in the table is longer than the time you are saving canonicalizing the