How does Kotlin implement equals and hashCode?


#1

Hi,

I would like to understand a little more about how equals() and hashCode() are generated by Kotlin.

Does a plain “class Foo” in Kotlin get generated equals() and hashCode() methods, or does that only happen for “data class Foo”?

When Kotlin generates equals() and hashCode(), what implementation does it use? Could you document the pseudocode here?

Also, how does Kotlin handle the problem of a subclass with additional fields in its equals() and hashCode() implementation? More specifically, I am interested in how Kotlin addresses the issue that Scala solves with the “canEqual()” pattern described here: http://www.artima.com/lejava/articles/equality.html

Thanks,
Zemian


#2

By default, no specific equals and hashCode are generated (thus, they check object identity).

For data classes, hashCode and equals are based on property values. I’ll describe the algorithm in words rather than pseudocode:

hashCode:

  1. Calculate the hashCode of each property declared as a constructor parameter in the current class; call these h1, h2, …, hn
  2. Return h1 × 31^(n−1) + h2 × 31^(n−2) + … + hn
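
As a rough illustration of the two steps above, the sum can be computed with Horner's rule. The helper name combineHashes and the Point class are made up for this sketch, and it is not the compiler's actual output:

```kotlin
// Hypothetical helper: combines per-property hashes h1..hn as
// h1 * 31^(n-1) + h2 * 31^(n-2) + ... + hn, using Horner's rule.
// Int overflow simply wraps around, as it does in any JVM hash computation.
fun combineHashes(vararg hashes: Int): Int =
    hashes.fold(0) { acc, h -> acc * 31 + h }

// Illustrative data class with two constructor properties.
data class Point(val x: Int, val y: Int)

fun main() {
    val p = Point(1, 2)
    // Hand-computed value following the steps above: 1 * 31 + 2
    println(combineHashes(p.x.hashCode(), p.y.hashCode())) // 33
    // Hash generated by the compiler for the data class, for comparison
    println(p.hashCode())
}
```

Starting the fold at 0 leaves the first hash unchanged, so a class with a single property just returns that property's hash.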


equals (let parameter be named that):

  1. Check whether that is an instance of our class (instanceof check)
  2. For each property declared as a constructor parameter in the current class, check whether this.prop equals that.prop (by invoking the property's equals method).
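
Written out by hand for an ordinary class, the two equals steps could look roughly like the sketch below. Person and its properties are hypothetical, and the code the compiler really generates may differ in detail (for example, it may add an identity shortcut):

```kotlin
// Hypothetical hand-written class that mirrors the described steps.
class Person(val name: String, val age: Int) {
    override fun equals(other: Any?): Boolean {
        // Step 1: instance-of check (also rejects null)
        if (other !is Person) return false
        // Step 2: compare each constructor property via its equals method
        return name == other.name && age == other.age
    }

    // Kept consistent with equals: built from the same properties.
    override fun hashCode(): Int = name.hashCode() * 31 + age.hashCode()
}

fun main() {
    println(Person("Ann", 30) == Person("Ann", 30)) // true
    println(Person("Ann", 30) == Person("Bob", 30)) // false
    println(Person("Ann", 30).equals("Ann"))        // false
}
```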


As I have already said in another thread, data class inheritance is not supported. If we support it, we’ll consider a stricter algorithm for equals().


#3

This explanation doesn’t seem right to me. Consider this code:

data class IA(val ia: IntArray)

fun main(args: Array<String>) {
    val ia0 = IA(intArrayOf(0))
    val ia1 = IA(intArrayOf(0))
    println("ia0: ${ia0}, ia1: ${ia1}, ia0.ia: ${ia0.ia}, ia1.ia: ${ia1.ia}")
    println("ia0.hash: ${ia0.hashCode()}, ia1.hash: ${ia1.hashCode()}, ia0.ia.hash: ${ia0.ia.hashCode()}, ia1.ia.hash: ${ia1.ia.hashCode()}")
    println("ia0 == ia1: ${ia0 == ia1}, ia0.hash == ia1.hash: ${ia0.hashCode() == ia1.hashCode()}")
    println("ia0.ia == ia1.ia: ${ia0.ia == ia1.ia}, ia0.ia.hash == ia1.ia.hash: ${ia0.ia.hashCode() == ia1.ia.hashCode()}")
}

Which produces:

ia0: IA(ia=[0]), ia1: IA(ia=[0]), ia0.ia: [I@2cfb4a64, ia1.ia: [I@5474c6c
ia0.hash: 31, ia1.hash: 31, ia0.ia.hash: 754666084, ia1.ia.hash: 88558700
ia0 == ia1: false, ia0.hash == ia1.hash: true
ia0.ia == ia1.ia: false, ia0.ia.hash == ia1.ia.hash: false

I would have thought from the explanation given that the hash of ia0 and ia1 would be different because the hashes of ia0.ia and ia1.ia are different.

It is obviously important that equals and hash are consistent with one another, but unfortunately they are not :frowning:.

What am I missing?

– Howard.


#4

The correct way to get the hash code of an array in Java and in Kotlin is to use Arrays.hashCode(a). Otherwise you get the hash implementation of java.lang.Object, which is based on the memory location (I think).
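
A small sketch of the difference, using only java.util.Arrays and the Kotlin standard library functions contentHashCode() and contentEquals(); the variable names are just for illustration:

```kotlin
import java.util.Arrays

fun main() {
    val a = intArrayOf(0)
    val b = intArrayOf(0)

    // Content-based hash: depends only on the elements
    println(Arrays.hashCode(a))                         // 31
    println(Arrays.hashCode(a) == Arrays.hashCode(b))   // true
    // Kotlin's stdlib equivalent of Arrays.hashCode
    println(a.contentHashCode() == b.contentHashCode()) // true

    // Inherited from java.lang.Object: identity-based, so two distinct
    // arrays with the same contents will usually print different values
    println(a.hashCode())
    println(b.hashCode())

    // Equality on arrays follows the same split
    println(a == b)             // false (identity)
    println(a.contentEquals(b)) // true  (element-wise)
}
```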

Blame Java I guess :slightly_frowning_face:. Kotlin could decide to change the behavior, but this would lead to strange situations when using Java and Kotlin in the same project. Kotlin still has interop with Java as one of its core design principles, so this is a necessary evil, even if it makes the language worse when used on its own.


#5

The point I was making was that, as currently implemented in Kotlin 1.3, equals and hashCode are not really consistent for the data class shown. They are consistent if you use a class rather than a data class. Currently the data class returns 31 for the hash of both instances; ideally it would return a different hash for each, since the hashes of the underlying arrays are different. This is not a violation of the contract, however, since repeated hash values are allowed; it’s just that they aren’t very useful.

It is also not consistent with the description of the algorithm given in the above post from @geevee which would give a different hash value.

Just to be clear; I am not saying it should have value or reference semantics, just that it should be consistent and not return a constant for all hashes…


#6

Woops, you’re right. I missed that ia0 != ia1.

Not really. As I explained, the hash of an array is calculated differently here, so the hash values of the underlying data are the same. The problem is that the equality check is not consistent with the hash calculation for arrays. Either the hashCode implementation should use array.hashCode() instead of Arrays.hashCode(array), or the equals method should be changed to use Arrays.equals(a, b).
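
Until the generated code changes, one possible workaround is to override both methods in the data class so that they agree. This is only a sketch of the second option (content-based equality), reusing the IA class from the example earlier in the thread:

```kotlin
// Data class with explicit overrides so that equals and hashCode
// both use the array's contents rather than its identity.
data class IA(val ia: IntArray) {
    override fun equals(other: Any?): Boolean =
        other is IA && ia.contentEquals(other.ia)

    override fun hashCode(): Int = ia.contentHashCode()
}

fun main() {
    val ia0 = IA(intArrayOf(0))
    val ia1 = IA(intArrayOf(0))
    println(ia0 == ia1)                       // true
    println(ia0.hashCode() == ia1.hashCode()) // true
    println(ia0 == IA(intArrayOf(1)))         // false
}
```

With these overrides, instances that compare equal also share a hash code, so they behave as expected as keys in a HashMap or elements of a HashSet.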

It does not return a constant for all hashes. It just uses a hash function based on the elements of the array, which are the same in your example. If you change your example to use 1 instead of 0 in one of the arrays, you get a different hash.


#7

So I think we are in agreement that there is something wrong with the data class implementation, because it compares array properties by reference for equality but by value for the hash.

Where can you file bugs?


#8

https://youtrack.jetbrains.com/issues


#9

Thanks


#10

Also could you post a link to the issue you created? I’d like to follow it and see an official response.


#11

https://youtrack.jetbrains.com/issue/KT-28751