Results 1 to 8 of 8
  1. #1
    Join Date
    Feb 2009
    Posts
    9
    Rep Power
    0

    Default An acceptable hashcode?

    My assignment is to write my own hash function for a hashmap class that uses Integer id numbers for keys and maps them to Integer inventory numbers. Would it be acceptable Java programming just to use the id number as the hash code since each id number is unique or do I have to actually create a new number as the hashcode?

  2. #2
    MK12's Avatar
    MK12 is offline Senior Member
    Join Date
    Jan 2009
    Posts
    185
    Rep Power
    6

    Default

    I'm not sure, Netbeans automatically overrides Hashcodes when you override equals, and there is a specific formula for it. I also learned it in a book I am reading. It is not the only way but a good way. I don't really understand why it needs to be complicated though. So your ID numbers are "Integer"s, not "int"s? In that case wouldn't it need to call the Integer's hashCode instead? Correct me if I'm wrong.
    -MK12
    Tell me if you want a cool Java logo avatar like mine and I'll make you one.

  3. #3
    Join Date
    Feb 2009
    Posts
    9
    Rep Power
    0

    Default

    I actually think that my first post was wrong, they are ints not Integers. Thanks for posting.

  4. #4
    neilcoffey is offline Senior Member
    Join Date
    Nov 2008
    Posts
    286
    Rep Power
    6

    Default

    The short answer is that with integer keys, in many cases, it actually turns out not to be such a bad choice to just use the number directly (partly due to the specific way Java HashMap is implemented). But there are some "worst cases" that you could improve on by using a better hash function. So it depends a bit on how your IDs are generated.

    The aim of a hash function is generally to distribute items "randomly" within the "bins" that make up a hash table (see my introduction to hashing and Jonathan Shewchuk's video lecture on hash tables for info about the basic hash table structure). In the general case, where we don't know how many bins there are (and where that could change as the table is re-sized), we re-define our problem as trying to make it so that, for the hash code of a given key, each bit of the hash code has a roughly 50% chance of being set. (Or put another way, the hash code is a "random 32-bit number" in some sense[1].) Then, taking that random number mod the number of bins (or taking the n bottom bits of that number if there are a power of 2 number of bins, as in Java HashMap) will always give us a random bin number. Where, as in Java, we take the bottom bits of the hash code as the bin number, it's especially important that those bottom bits are random (the top ones will be chopped off!), and so Java actually includes a secondary hash function (see the source code of HashMap) to "spread the bits" about. So so long as some portion of the bits in our hash code are reasonably random, our hash function will work reasonably well in practice. For an illustration of how this works with the example of the String hash code function, see an article on my web site about how hash codes work.

    So coming back to our integers, if you list all of the integers between 0 and any power of 2 minus one (say, 0-7 or 0-15 or 0-31 etc), you'll see that for every bit position, exactly half of the numbers have that bit set. In other words, a list of consecutive numbers generally has the desired property of having a range of bit positions with a 50% probability of bits being set. (It has exactly that property for numbers in the range 0 to a power of 2 minus 1, e.g. 0-511; if we have numbers in, say, the range 0-767, then the bottom 9 bits are each set 50% of the time, but the tenth bit is set in only 1/3 of the numbers-- but this bias isn't too bad: overall, the bits are "random enough" to work for a hash function.)

    Now, where things may not be quite so good is in rare cases where your integers have specific bit patterns that bias the chance of several of them "colliding" and ending up with the same bucket number. It doesn't sound like you'll get this with your example.

    [1] When assessing random numbers in general, we wouldn't take bit probabilities on their own as a satisfactory criterion, but for assessing hash codes, I believe it is.
    Last edited by neilcoffey; 02-18-2009 at 02:50 AM. Reason: typo

  5. #5
    Fubarable's Avatar
    Fubarable is offline Moderator
    Join Date
    Jun 2008
    Posts
    19,316
    Blog Entries
    1
    Rep Power
    25

    Default

    Edit: deleted in favor of Neil's much better and more detailed post.

  6. #6
    neilcoffey is offline Senior Member
    Join Date
    Nov 2008
    Posts
    286
    Rep Power
    6

    Default

    P.S. To the original poster-- an Integer object is just a wrapper round an int. If you do:

    Java Code:
    HashMap<Integer,String> m = new HashMap<Integer,String>(5);
    m.put(10, "Item number 10");
    then the compiler will compile this to:

    Java Code:
    HashMap<Integer,String> m = new HashMap<Integer,String>(5);
    m.put(Integer.valueOf(10), "Item number 10");
    P.P.S. I should have mentioned that Java hash codes are 32 bits, but there's nothing terribly special about 32 bits. For example, you could implement a hash table with 64 bits (the size of a Java long). If you have a good hash function and a wide enough hash code (64 bits is probably about the minimum for many practical cases), you actually needn't store the keys in the hash table (as Java does) if the chance of two different keys giving identical hash codes is small enough.

  7. #7
    HafHothLact is offline Member
    Join Date
    Mar 2009
    Location
    US
    Posts
    4
    Rep Power
    0

    Default Hello!

    Hi Guys,

    Just joined up, thought i would say Hi :)

    Jenny

  8. #8
    Fubarable's Avatar
    Fubarable is offline Moderator
    Join Date
    Jun 2008
    Posts
    19,316
    Blog Entries
    1
    Rep Power
    25

    Default

    Please post these types of posts in the Announcements forum.

Similar Threads

  1. Is it acceptable to use a private object?
    By saul_2110 in forum New To Java
    Replies: 8
    Last Post: 12-08-2008, 06:04 PM
  2. Why Equals method should be over ridden in Hashcode?
    By skyineyes in forum New To Java
    Replies: 1
    Last Post: 05-26-2008, 04:13 PM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •