- What is hashCode() method
- Do java primitive types have their hashcode()?
- Does the Java enums have its hashCode() ?
- hashCode() method in Java arrays
- hashCode() method in List, Map, Set
- hashCode() method in String
- hashCode() method in java Objects
- How is the hashCode calculated? hashCode() JVM implementation
- hashCode() own implementations
- Additional hashCode() doubts
hashCode() are asked on most interviews (from junior to senior level). In this post, you will find out what this method is, how it works for particular types, how it works for primitives, what you have to watch out for during your implementations. Enjoy reading!
What is hashCode() method
hashCode() is a method that should contain a hash algorithm that calculates a 32-bit integer (hash) from the object data, representing that object.
This hash is used by some collections (e.g. HashMap and HashSet) to store references to objects, so that access to data is very fast. Access time depends on the performance of our hash algorithm, so it is worth paying attention to it!
Do java primitive types have their hashcode()?
Primitive types are not treated the same as ordinary objects. It is not possible to call the
hashCode() method on a primitive type. The only option is to use wrappers such as
Integer.hashCode() returns the value of the int we passed on to the method (in our case 3). Implementation of hashCode from Integer class:
hashcode() functionality exists in the Short and Byte classes. First it casts short/byte to the int and returns its value.
A little different hashCode() is in Long. Since Long is a 64-bit number we have to do some “compression” so that we can return the int value:
>>> is the right shift bit operation.
value >>> 32means that we shift 32 bits to right. It leave only 32 bits - and the rest of the position is completed by 0 (32 rightmost bits are discarded). Then, on shifted value a XOR operation is performed
Float numbers are more complicated. Java convert its floating-point values to an equicalent integer value using
Float.floatToIntBits(float value). The method also checks whether the value is NaN (Not-a-number):
floatToRawIntBits() depend on the JVM/JDK you are using. For openJDK the source is here.
Double convert its floating-point value to long using
doubleToLongBits() then does the same XOR trick as
Does the Java enums have its hashCode() ?
Enums uses Object’s
hashCode() . This is the only way to use hashCode() in the enum.
In addition, Enum’s hashcode() is
final to prevent developers from creating their own implementations. It is because within one JVM there can exist only one instance of each enum object (enums are singletons). This is enough to ensure that such implementation makes sense and is correct.
hashCode() method in Java arrays
Arrays do not provide its
hashCode() implementation. Just as enums use a method from the Object class.
It can cause a lot of problems because this hash depends on the magic described below. A better practice is to use
hashCode() from the helper class -
Arrays.hashCode() method returns a hash code based on the contents of the specified array. If the contents of the two arrays are equal then return the same value. You can see an example below:
Array class has several hashCode() methods - for primitive types (e.g. hashCode(float a), hashCode(int a)), which inside use type specific hashcode() from the wrapper classes. For other Objects it use:
As you can see, this method adds all the hash from all elements by multiplying them by a multiplier - which in this case is 31. You may be wondering why we need a multiplier of 31. Thanks to the multiplier, arrays with the same elements but different order has different hashcodes. For example, having a table with the elements
[20,10] - without a multiplier, hashcode for the first table would be:
1 + 10 + 20 = 30, and for the second table:
1 + 20 + 10 = 30. Both tables would have the same hashcode (30) even though the order of their elements is different. With a multiplier, their hashcode is - for the first array:
(31*1 + 10) *31+20 = 1291 and for the second array:
(31*1 + 20) *31+10 = 1591
If you are interested in why the multiplier is 31 - this is described in detail in the article. I recommend reading it!
hashCode() method in List, Map, Set
LinkedList use the same algorithm as
Arrays.hashCode() for the same reason.
Set, unlike List, uses hashcode() without multiplier:
In Set a multiplier is not needed - as the order of the elements does not matter (it is the same). This makes the hashCode calculation a bit faster.
Maps, hashCode adds all element hashes (element hash is xor from the key hash and hash values).
hashCode() method in String
This is String class hashCode() implementation:
If the string has Latin encoding, the
StringLatin1.hashCode method is used, and if the string is UTF-16 encoding, the hashcode will be generated in
tringUTF16.hashCode. The difference is that in latin1 each character is exactly one byte long. In utf8 a character can consist of more than one byte. Inside, the hashCode calculating functionality is the same as in List or
The advantage of the
hashCode from the String class is that the hashcode value is cached and calculated lazily during the first call of the method.
hashCode() method in java Objects
If your class hasn’t overridden the
hashCode() method, then it will use the one defined in its super class, probably in the
Object class. hashCode() in Object class is a native function (is dependant on the JVM). Documentation says :
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the JavaTM programming language.)
It is hard to guess how exactly
hashCode() works, as its implementation is placed in the native code of the JVM. In many places, you can find information that
hashCode() is simply an address in memory. Given that the JVM can relocate objects (e.g. during garbage collection cycles due to promotion or compaction) - after we calculate an object’s identity hash we must be able to retain it in a way that survives object relocation.
One solution is to write the hashcode calculated during the first call of the
hashCode() method into the object data (e.g object header). This way, if the object is moved to another memory area, it will still have the same, original hashcode. But despite the fact that it is often believed that - this is not the functionality of the most popular JVM implementation - OpenJDK…
How is the hashCode calculated? hashCode() JVM implementation
If we go deeper, to the JVM code (the implementation can be found here - line 511) we can deduce that the hashCode is not calculated from the address in the memory but from the thread state! You can find proof of that on StackOverflow.
There are two important things to remember. First of all, such hashCode calculating exist from OpenJDK 8. Secondly, hashCode calculating is implementation-dependent. In this post we checked OpenJDK. For comparison - JVM Zing implementation calculates hashCode based on a memory address.
This can be considered a tricky recruitment question! :)
hashCode() own implementations
Finally, it is important to mention the good practices when creating your own hashCode().
I will share with you the 3 most common ways to create your own hashCode():
- Using IDE (e.g. generating in IntelliJ, you can press ALT + Insert and select “equals and hashCode” from the context menu).
- Using libraries. For example, Lombok library, which will generate
- Writing your own implementation of
hashCode()- you must be careful not to make a mistake. A wrong
hashCode()can spoil the effectiveness of HashMap and bring many problems!
A simple example! Everyone likes simple examples! We have a well-known Person class:
Remember that when overwriting the
hashCode() method you must also overwrite the
equals() method, so as not to break the hashCode() & equals() contract. The equals and hashCode methods should meet the following conditions:
If x == y then x.equals(y) == true
If x.equals(y) == true then x.hashCode() == y.hashCode()
If x.hashCode() == y.hashCode() then x.equals(y) can either return true or return false
The last point is based on int scope (hashCode() returns int value). As a result,
hashCode() can return a finite number of results: from -2 147 483 648 to +2 147 483 647 - so, around 4 billion unique values. Let’s imagine, that we are building the application, that keeps more data. If we start invoking the
hashCode() method from more than 4 billion elements, it is possible, that first 4 billion ones will produce a unique result, but each next call will produce the integer value, that has been produced before. Therefore, it is important to combine
equals() to make sure that objects that have the same hashcodes are not different.
Additional hashCode() doubts
Do I have to implement hashCode/equals in each class?
Answer: No, if you don’t expect to use the
equals method (e.g., aren’t using
assertEquals from JUnit, or never mean to use this class as a key in a
Map, etc), the implementation will not be used -> it not necessary.
What will be the effect of badly implemented hashCode()
Anwer: Even if in hashCode you will return the same int for each element, e.g:
Your program will compile and run. The problem can be when you want to use collections using hashCode (HashMap, HashSet or other). In the above example hashCode - all elements will be placed into the same bucket in HashMap (as LinkedList elements). Effectiveness will decrease significantly.
Does null have its hashCode?
Of course you can’t call the
hashCode() method (like any other method) on null value. But we can pass null to
As you can see, Objects.hashCode() for null will return 0. You’re probably wondering how HashMap’s doing in that case.
HashMap has a special handling for
null values are fine as you don’t compute hash code for them in a
HashMap. This is the reason why
null keys work fine in
I hope you liked the article. If you have any questions, please feel free to comment. Thank you for your comments under the last post :)