
An acceptable hashcode?
My assignment is to write my own hash function for a hashmap class that uses Integer id numbers for keys and maps them to Integer inventory numbers. Would it be acceptable Java programming just to use the id number as the hash code since each id number is unique or do I have to actually create a new number as the hashcode?

I'm not sure, Netbeans automatically overrides Hashcodes when you override equals, and there is a specific formula for it. I also learned it in a book I am reading. It is not the only way but a good way. I don't really understand why it needs to be complicated though. So your ID numbers are "Integer"s, not "int"s? In that case wouldn't it need to call the Integer's hashCode instead? Correct me if I'm wrong.
MK12

I actually think that my first post was wrong, they are ints not Integers. Thanks for posting.

The short answer is that with integer keys, in many cases, it actually turns out not to be such a bad choice to just use the number directly (partly due to the specific way Java HashMap is implemented). But there are some "worst cases" that you could improve on by using a better hash function. So it depends a bit on how your IDs are generated.
The aim of a hash function is generally to distribute items "randomly" within the "bins" that make up a hash table (see my introduction to hashing and Jonathan Shewchuk's video lecture on hash tables for info about the basic hash table structure). In the general case, where we don't know how many bins there are (and where that could change as the table is resized), we redefine our problem as trying to make it so that, for the hash code of a given key, each bit of the hash code has a roughly 50% chance of being set. (Or put another way, the hash code is a "random 32bit number" in some sense[1].) Then, taking that random number mod the number of bins (or taking the n bottom bits of that number if there are a power of 2 number of bins, as in Java HashMap) will always give us a random bin number. Where, as in Java, we take the bottom bits of the hash code as the bin number, it's especially important that those bottom bits are random (the top ones will be chopped off!), and so Java actually includes a secondary hash function (see the source code of HashMap) to "spread the bits" about. So so long as some portion of the bits in our hash code are reasonably random, our hash function will work reasonably well in practice. For an illustration of how this works with the example of the String hash code function, see an article on my web site about how hash codes work.
So coming back to our integers, if you list all of the integers between 0 and any power of 2 minus one (say, 07 or 015 or 031 etc), you'll see that for every bit position, exactly half of the numbers have that bit set. In other words, a list of consecutive numbers generally has the desired property of having a range of bit positions with a 50% probability of bits being set. (It has exactly that property for numbers in the range 0 to a power of 2 minus 1, e.g. 0511; if we have numbers in, say, the range 0767, then the bottom 9 bits are each set 50% of the time, but the tenth bit is set in only 1/3 of the numbers but this bias isn't too bad: overall, the bits are "random enough" to work for a hash function.)
Now, where things may not be quite so good is in rare cases where your integers have specific bit patterns that bias the chance of several of them "colliding" and ending up with the same bucket number. It doesn't sound like you'll get this with your example.
[1] When assessing random numbers in general, we wouldn't take bit probabilities on their own as a satisfactory criterion, but for assessing hash codes, I believe it is.

Edit: deleted in favor of Neil's much better and more detailed post.

P.S. To the original poster an Integer object is just a wrapper round an int. If you do:
Code:
HashMap<Integer,String> m = new HashMap<Integer,String>(5);
m.put(10, "Item number 10");
then the compiler will compile this to:
Code:
HashMap<Integer,String> m = new HashMap<Integer,String>(5);
m.put(Integer.valueOf(10), "Item number 10");
P.P.S. I should have mentioned that Java hash codes are 32 bits, but there's nothing terribly special about 32 bits. For example, you could implement a hash table with 64 bits (the size of a Java long). If you have a good hash function and a wide enough hash code (64 bits is probably about the minimum for many practical cases), you actually needn't store the keys in the hash table (as Java does) if the chance of two different keys giving identical hash codes is small enough.

Hello!
Hi Guys,
Just joined up, thought i would say Hi :)
Jenny

Please post these types of posts in the Announcements forum.