How does Kotlin implement equals and hashCode?

Hi,

I just want to understand a little more about how equals() and hashCode() are generated by Kotlin.

Does Kotlin generate equals() and hashCode() for a plain “class Foo” by default, or only for a “data class Foo”?

When Kotlin generates equals() and hashCode(), what implementation does it use? Can you document the pseudocode here?

Also, how does Kotlin address the concern about subclasses with additional fields in the equals() and hashCode() implementation? More specifically, I am interested in knowing how Kotlin addresses the issue that Scala handles with the “canEqual()” pattern described here: http://www.artima.com/lejava/articles/equality.html

Thanks,
Zemian

By default, no specific equals and hashCode are generated (thus, they check object identity).

For data classes, hashCode and equals are based on property values. I’ll write algorithm steps in words instead of pseudocode:

hashCode:

  1. Calculate hashCode for each property declared as a constructor parameter in the current class (call the results h1 … hn)
  2. Return h1 × 31^(n−1) + h2 × 31^(n−2) + … + hn
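To make the polynomial concrete, here is a minimal sketch that computes it by hand for a three-property class. The `Point` class and the `manualHash` function are hypothetical names made up for illustration, and whether the result matches the compiler-generated hashCode() exactly is an implementation detail rather than a documented guarantee.

```kotlin
// Hypothetical class, used only to illustrate the formula.
data class Point(val x: Int, val y: Int, val label: String)

// Hand-written version of h1 * 31^2 + h2 * 31 + h3, evaluated in Horner form.
fun manualHash(p: Point): Int {
    var result = p.x.hashCode()               // h1
    result = 31 * result + p.y.hashCode()     // h1 * 31 + h2
    result = 31 * result + p.label.hashCode() // (h1 * 31 + h2) * 31 + h3
    return result
}

fun main() {
    val p = Point(1, 2, "a")
    // 1 * 961 + 2 * 31 + 97 ("a".hashCode() is 97)
    println(manualHash(p)) // 1120
    // Compare against the generated implementation on your own compiler version.
    println(manualHash(p) == p.hashCode())
}
```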


equals (let parameter be named that):

  1. Check if that is an instance of our class (instanceof-check); if it is not, return false
  2. For each property declared as a constructor parameter in the current class, check if this.prop1 equals that.prop1 (by invoking the equals method).
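The two steps above can be sketched as a hand-written equals() on an ordinary class. `Person` is a hypothetical class, and this is only an approximation of what the compiler generates, not its actual output.

```kotlin
// Hypothetical class; equals() follows the two steps described above.
class Person(val name: String, val age: Int) {
    override fun equals(other: Any?): Boolean {
        // Step 1: instance check (also rejects null).
        if (other !is Person) return false
        // Step 2: compare each constructor property via equals().
        return name == other.name && age == other.age
    }

    // Kept consistent with equals(): equal objects produce equal hashes.
    override fun hashCode(): Int = 31 * name.hashCode() + age.hashCode()
}

fun main() {
    println(Person("Ann", 30) == Person("Ann", 30)) // true
    println(Person("Ann", 30) == Person("Ann", 31)) // false
    println(Person("Ann", 30).equals("Ann"))        // false
}
```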


As I have already said in another thread, data class inheritance is not supported. If we support it, we’ll consider a stricter algorithm for equals().

This explanation doesn’t seem right to me. Consider this code:

data class IA(val ia: IntArray)

fun main(args: Array<String>) {
    val ia0 = IA(intArrayOf(0))
    val ia1 = IA(intArrayOf(0))
    println("ia0: ${ia0}, ia1: ${ia1}, ia0.ia: ${ia0.ia}, ia1.ia: ${ia1.ia}")
    println("ia0.hash: ${ia0.hashCode()}, ia1.hash: ${ia1.hashCode()}, ia0.ia.hash: ${ia0.ia.hashCode()}, ia1.ia.hash: ${ia1.ia.hashCode()}")
    println("ia0 == ia1: ${ia0 == ia1}, ia0.hash == ia1.hash: ${ia0.hashCode() == ia1.hashCode()}")
    println("ia0.ia == ia1.ia: ${ia0.ia == ia1.ia}, ia0.ia.hash == ia1.ia.hash: ${ia0.ia.hashCode() == ia1.ia.hashCode()}")
}

Which produces:

ia0: IA(ia=[0]), ia1: IA(ia=[0]), ia0.ia: [I@2cfb4a64, ia1.ia: [I@5474c6c
ia0.hash: 31, ia1.hash: 31, ia0.ia.hash: 754666084, ia1.ia.hash: 88558700
ia0 == ia1: false, ia0.hash == ia1.hash: true
ia0.ia == ia1.ia: false, ia0.ia.hash == ia1.ia.hash: false

I would have thought from the explanation given that the hash of ia0 and ia1 would be different because the hashes of ia0.ia and ia1.ia are different.

It is obviously important that equals and hash are consistent with one another. Unfortunately, they are not :frowning:.

What am I missing?

– Howard.

The correct way to get a content-based hash code of an array in Java and in Kotlin is to use Arrays.hashCode(a). Otherwise you use the hash implementation of java.lang.Object, which is an identity hash based on the memory location (I think).
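A small sketch of the difference, using the Kotlin stdlib extensions contentHashCode() and contentEquals() alongside java.util.Arrays. The helper names `sameByContent` and `sameByReference` are made up for this example.

```kotlin
import java.util.Arrays

// Content-based comparison: looks at the elements.
fun sameByContent(a: IntArray, b: IntArray): Boolean =
    a.contentEquals(b) && a.contentHashCode() == b.contentHashCode()

// Reference-based comparison: == on arrays checks identity, not elements.
fun sameByReference(a: IntArray, b: IntArray): Boolean = a == b

fun main() {
    val a = intArrayOf(0)
    val b = intArrayOf(0)
    println(Arrays.hashCode(a) == a.contentHashCode()) // true, same algorithm
    println(sameByContent(a, b))   // true
    println(sameByReference(a, b)) // false, two distinct array objects
    // a.hashCode() and b.hashCode() are identity hashes and will usually differ.
}
```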

Blame Java I guess :slightly_frowning_face:. Kotlin could decide to change the behavior, but this would lead to strange situations when using Java and Kotlin in the same project, even if keeping it makes the language worse when used on its own. Kotlin still has interop with Java as one of its core design principles, and this is a necessary evil.

The point I was making was that, as currently implemented in Kotlin 1.3, equals and hashCode are not really consistent for the data class shown. They are consistent if you use a plain class rather than a data class. Currently the data class returns 31 for the hash of both; ideally it would return different hashes, since the hashes of the underlying arrays are different. It is not out of contract, however, since repeated hash values are allowed; it’s just that they aren’t very useful.

It is also not consistent with the description of the algorithm given in the above post from @geevee which would give a different hash value.

Just to be clear; I am not saying it should have value or reference semantics, just that it should be consistent and not return a constant for all hashes…

Woops, you’re right. I missed that ia0 != ia1.

Not really. As I explained, the hash of an array is calculated from its elements here, so the hash values of the underlying data are the same. The problem is that the equality check is not consistent with the hash calculation for arrays. Either the hashCode implementation should use array.hashCode() instead of Arrays.hashCode(array), or the equals method should be changed to use Arrays.equals(a, b).
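Until something changes in the compiler, one way to get the second option today is to override both methods yourself. This is a minimal sketch; `IAByValue` is a hypothetical variant of the IA class from the earlier post, not anything Kotlin generates.

```kotlin
// Hypothetical variant of IA that compares the array by content in both methods.
data class IAByValue(val ia: IntArray) {
    override fun equals(other: Any?): Boolean =
        other is IAByValue && ia.contentEquals(other.ia)

    override fun hashCode(): Int = ia.contentHashCode()
}

fun main() {
    val x = IAByValue(intArrayOf(0))
    val y = IAByValue(intArrayOf(0))
    println(x == y)                       // true
    println(x.hashCode() == y.hashCode()) // true
    println(x == IAByValue(intArrayOf(1))) // false
}
```

With both overrides in place, equality and hashing agree on value semantics, so the class behaves predictably as a key in a HashMap or an element of a HashSet.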

It does not return a constant for all hashes. It just uses a hash function which is based on the elements in the array, which are in your example the same. If you change your example to use 1 instead of 0 in the array you get a different hash.
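For reference, the element-based hash can be reproduced by hand. The sketch below assumes the algorithm documented for java.util.Arrays.hashCode(int[]) (start at 1, then multiply by 31 and add each element); `elementHash` is a made-up name.

```kotlin
// Hand-rolled version of the element-based hash for a non-null IntArray.
fun elementHash(values: IntArray): Int {
    var result = 1
    for (v in values) {
        result = 31 * result + v
    }
    return result
}

fun main() {
    println(elementHash(intArrayOf(0))) // 31, the value seen in the output above
    println(elementHash(intArrayOf(1))) // 32
    println(elementHash(intArrayOf(3, 4, 5)) == intArrayOf(3, 4, 5).contentHashCode()) // true
}
```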

So I think we are in agreement that there is something wrong with the data class implementation because it is by reference for equality and by value for hash.

Where can you file bugs?

https://youtrack.jetbrains.com/issues

Thanks

Also could you post a link to the issue you created? I’d like to follow it and see an official response.

https://youtrack.jetbrains.com/issue/KT-28751

I started wondering the same thing, and I’m not even using arrays.

My class, minus the irrelevant methods:

data class Complex(val real: Double, val imaginary: Double) {
    val conjugate: Complex by lazy { Complex(real, -imaginary) }

    override fun toString(): String {
        val builder = StringBuilder("complex(")
        if (real != 0.0 || imaginary == 0.0) {
            builder.append(real)
        }
        if (imaginary != 0.0) {
            if (real != 0.0) {
                if (imaginary.sign < 0) {
                    builder.append(" - ")
                    builder.append(-imaginary)
                } else {
                    builder.append(" + ")
                    builder.append(imaginary)
                }
            } else {
                builder.append(imaginary)
            }
            builder.append('i')
        }
        builder.append(")")
        return builder.toString()
    }
}

Quick test:

fun main() {
    val z = Complex(2.0, 0.0)
    println("$z")
    println("${z == z}")
    val z2 = z.conjugate
    println("$z2")
    println("${z == z2}")
}

Result:

complex(2.0)
true
complex(2.0)
false

But a naively implemented equals method which just checks each property one by one returns true. Even the one IDEA creates when I ask it to generate equals for me returns the expected result of true.

So how is the default equals method implemented? If it isn’t behaving in a trustworthy manner, I will just have to ignore its existence and implement it manually every time.

Now, you may say, “oh, you shouldn’t be using == to compare doubles”. Well, if I use == to compare doubles, I get a result more correct than whatever Kotlin is doing to compare doubles, so I think it’s the lesser of two evils at this point. == at least says that 0 and -0 are equal. Kotlin’s equality behaviour does not.

Your problem isn’t really related to how Kotlin generates equals(), but with your wrong assumption that 0.0 and -0.0 should be considered equal. Let’s ignore Kotlin entirely and try some Java instead:

    System.out.println(0.0d == -0.0d); // true
    System.out.println(Double.valueOf(0.0d).equals(Double.valueOf(-0.0d))); // false
    System.out.println(Double.valueOf(0.0d).compareTo(Double.valueOf(-0.0d))); // 1, so first is larger

As you can see, they are equal for primitive doubles, but they aren’t for wrapped ones. Honestly, I’m not sure what the reason behind this is, and it seems inconsistent, but the point is: you may get exactly the same behavior, unexpected as it is to you, even when using Java or writing your own equals().
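The same split can be reproduced in Kotlin. In this sketch the helper names are made up; the only assumption is the documented rule that == follows IEEE 754 when both operands are statically typed as Double, and falls back to equals() when they are typed as Any.

```kotlin
// Operands statically known to be Double: IEEE 754 comparison.
fun primitiveEquals(a: Double, b: Double): Boolean = a == b

// Operands typed as Any: == dispatches to the boxed Double.equals().
fun boxedEquals(a: Any, b: Any): Boolean = a == b

fun main() {
    println(primitiveEquals(0.0, -0.0))             // true
    println(boxedEquals(0.0, -0.0))                 // false
    println(primitiveEquals(Double.NaN, Double.NaN)) // false
    println(boxedEquals(Double.NaN, Double.NaN))     // true
}
```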

Well… when writing my own equals, I compare the primitives since that makes the most sense for comparing doubles.

Same deal with the equals automatically generated by IDEA.

Kotlin’s default equals implementation is the only one doing the dumb thing.

The point holds - since I can’t trust Kotlin’s default implementation, I’m forced to implement it myself every time, because I never know when it’s going to do something equally dumb.
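For anyone who does want IEEE-style equality in a data class, a manual override could look roughly like the sketch below. `ComplexIeee` and `normalizeZero` are hypothetical names, and there is a trade-off: with == on primitives, a value containing NaN is not equal to itself, which breaks the reflexivity that collections expect from equals().

```kotlin
// Hypothetical variant of the Complex class with primitive (IEEE 754) equality.
data class ComplexIeee(val real: Double, val imaginary: Double) {
    // Both sides are statically Double, so 0.0 == -0.0 is true here.
    override fun equals(other: Any?): Boolean =
        other is ComplexIeee && real == other.real && imaginary == other.imaginary

    // hashCode must agree with equals, so -0.0 is folded into 0.0 before hashing.
    override fun hashCode(): Int =
        31 * normalizeZero(real).hashCode() + normalizeZero(imaginary).hashCode()

    private fun normalizeZero(x: Double): Double = if (x == 0.0) 0.0 else x
}

fun main() {
    val z = ComplexIeee(2.0, 0.0)
    val conjugate = ComplexIeee(2.0, -0.0)
    println(z == conjugate)                       // true
    println(z.hashCode() == conjugate.hashCode()) // true
}
```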

You call 0.0 being equal to -0.0 a wrong assumption. I’d be curious where you learned mathematics.

No, it isn’t. Kotlin’s implementation internally uses the Double.compare() method from the Java stdlib, and this method says these two numbers differ.
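A quick way to see that the data class result lines up with Double.compare(). `Wrapper` is a hypothetical one-property data class used only for this check.

```kotlin
// Hypothetical data class with a single Double property.
data class Wrapper(val value: Double)

fun main() {
    // Double.compare treats 0.0 as greater than -0.0 and NaN as equal to itself.
    println(java.lang.Double.compare(0.0, -0.0))              // 1
    println(java.lang.Double.compare(Double.NaN, Double.NaN)) // 0

    // The generated equals() gives matching answers.
    println(Wrapper(0.0) == Wrapper(-0.0))             // false
    println(Wrapper(Double.NaN) == Wrapper(Double.NaN)) // true
}
```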

Reimplement your above Complex class in Java using Double as fields, generate equals() from IDE and you will see exactly the same behavior. The reason why you see it only in Kotlin’s data classes is just because you explored very few alternatives so far.

Please send the same question to Java authors, because they clearly lack math skills as I do :slight_smile:

As I said, I’m not entirely sure what the reason behind this choice is, but this is just how it works in Java. I guess this may be caused by the fact that floats/doubles are sometimes used to represent indeterminate forms. If 0.0 were equal to -0.0, then 1 / ∞ would equal 1 / -∞, which doesn’t make sense. But this is just my guess.

I know that the bit sequence of the floating point representations differs, but the fact remains that the most naive implementation (using the primitives - boxing them is adding more code and thus not the most naive implementation!) gets the right result, while Kotlin’s automatic equals method does not.

And the whole time, we’re led to believe that Kotlin’s pseudoprimitive types are treated as primitives whenever possible. What a lie that was!

So basically any time I want to use data classes now I have to second guess what Kotlin might have written for it. Because it’s doing stupid things in there for at least arrays and doubles, who says that it isn’t doing stupid things for ints? For strings? For anything? At the very least, Kotlin should document a table of which types are safe to have in your object for the default equality to behave correctly.

The only thing Kotlin does here is to ask Java stdlib if these two numbers differ or not. And Java stdlib responds that they differ. End of story.

And as a matter of fact, it uses primitives here.

But, if you use == for primitives, like you get in an autogenerated equals method, 0.0 == -0.0, so it works correctly. So at the very least, it’s a fact that what Kotlin’s doing doesn’t agree with what IDEA is doing, which is confusing at best.

IDEA generates something that you can then edit later. It doesn’t strive for ultimate correctness, but to just give you a sort-of-okay default. The Kotlin stdlib follows what Java provides as one of the options for how Double equality works. Now, if you read the Java docs related to that:

Compares two Double objects numerically. There are two ways in which comparisons performed by this method differ from those performed by the Java language numerical comparison operators (<, <=, ==, >=, >) when applied to primitive double values:
Double.NaN is considered by this method to be equal to itself and greater than all other double values (including Double.POSITIVE_INFINITY).
0.0d is considered by this method to be greater than -0.0d.
This ensures that the natural ordering of Double objects imposed by this method is consistent with equals.

And so I’m guessing that the real reason Kotlin uses this implementation is that data classes are meant to be stable data holders, so trivial changes like changing a field from a Double to a Double? shouldn’t impact the equality of the class. Just imagine how unexpected it would be if your code suddenly behaved differently just because you decided that the amount of time that has passed since an event cannot always be determined, due to connectivity problems or whatever, and so you changed your class from data class Event(val time: Double) to data class Event(val time: Double?), and now all of a sudden your equality implementation behaves very differently.

Almost like that. The reason why data and value classes use generic equals for any type including arrays and floating point numbers is to avoid unexpected equality behavior changes when you make your data/value class more generic. For example, if you had a data class like

data class BoxDouble(val value: Double)

and then decided to change it to hold any number by changing Double to a generic type T

data class BoxNumber<T : Number>(val value: T)

it would be unfortunate if the equality behavior of BoxNumber(0.0) was different from BoxDouble(0.0). Kotlin avoids that by choosing generic equals for all properties to compare their values.
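A short sketch that checks this consistency directly, using the two classes from this post:

```kotlin
data class BoxDouble(val value: Double)

data class BoxNumber<T : Number>(val value: T)

fun main() {
    // Both versions use equals()-style comparison, so they agree with each other.
    println(BoxDouble(0.0) == BoxDouble(-0.0)) // false
    println(BoxNumber(0.0) == BoxNumber(-0.0)) // false
    println(BoxDouble(Double.NaN) == BoxDouble(Double.NaN)) // true
    println(BoxNumber(Double.NaN) == BoxNumber(Double.NaN)) // true
}
```

Had BoxDouble used the primitive == instead, the first and third lines would flip, and making the class generic would silently change its equality.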
