Using .toString() of data classes as a dependency

Hi.

I’m working on a project where I have to create a SHA-1 hash of an object. Calculating the hash isn’t the problem, but I’m worried about the design.

The object itself is a data class, and I’m relying on its toString() function to provide the input for the hash function. But is a data class’s toString() implementation set in stone, or could you imagine it changing? If the implementation of toString() changes, the same object would produce a different hash under the new implementation.
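To make the setup concrete, a minimal sketch of this approach could look like the following. `Person`, `sha1Hex` and `hashOf` are hypothetical names used only for illustration, not taken from the actual project.

```kotlin
import java.security.MessageDigest

// Hypothetical data class standing in for the real object.
data class Person(val id: Int, val name: String)

// Hex-encoded SHA-1 digest of the UTF-8 bytes of the input.
fun sha1Hex(input: String): String =
    MessageDigest.getInstance("SHA-1")
        .digest(input.toByteArray(Charsets.UTF_8))
        .joinToString("") { "%02x".format(it) }

// The hash depends on the exact text the generated toString() produces,
// which currently looks like "Person(id=1, name=Ada)".
fun hashOf(person: Person): String = sha1Hex(person.toString())
```

Any change to the generated text, even a single space, would change every hash computed this way.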

I’m wondering if there might be a better alternative. Should I just always override toString() and provide my own implementation, or should I trust that the implementation of toString() for data classes will stay the same and avoid unnecessary boilerplate?

I see no reason for further changes to the implementation of data classes’ toString() (at least in the 1.x branch).

The only available formal language specification doesn’t (yet) specify the implementation details, so technically you cannot depend on the current behavior and must generate some form of stable output yourself. I suggest creating such strings with some sort of serialization (JSON from kotlinx.serialization, for example) and calculating the hash from that.

But as I said before, I don’t see any reason to change the current toString(), and I’m 99.9% sure you can just keep using it.

@Prototik Thx for the input. But isn’t the serialization solution prone to the same issue? How would such a solution work?

As for avoiding issues in this regard, it would probably require a minimal set of unit tests. So for what it’s worth, a change should be easy to catch.
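Such a guard could be as small as the sketch below, which pins the format that toString() currently produces. `Sample` and `toStringFormatIsUnchanged` are hypothetical names, and in a real project the comparison would live in a test of whatever test framework is in use.

```kotlin
// Hypothetical data class used only to pin the generated format.
data class Sample(val id: Int, val name: String)

// Returns true as long as the compiler-generated toString() keeps
// the "ClassName(prop=value, prop=value)" shape it has today.
fun toStringFormatIsUnchanged(): Boolean =
    Sample(1, "a").toString() == "Sample(id=1, name=a)"
```

If a future Kotlin version changed the format, this check would fail before any stored hash silently stopped matching.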

Serialization is just a quick way, as libraries for data transfer have strong guarantees about ordering, value representation and so on. Nothing stops you from providing this guarantee yourself (by writing the implementation manually or using “Generate hashCode/toString methods” from the IDE).
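Providing the guarantee by hand might look roughly like this. The `canonicalForm` function and its format are assumptions made up for this sketch. Keeping it separate from toString() leaves toString() free for debugging output.

```kotlin
import java.security.MessageDigest

// Hypothetical data class with a hand-written, explicitly owned format.
data class Person(val id: Int, val name: String) {
    // Fixed field order and separators; the name is length-prefixed so a
    // name containing ';' or '=' cannot be confused with a field boundary.
    fun canonicalForm(): String = "Person;id=$id;name=${name.length}:$name"
}

// Hex-encoded SHA-1 digest of the UTF-8 bytes of the input.
fun sha1Hex(input: String): String =
    MessageDigest.getInstance("SHA-1")
        .digest(input.toByteArray(Charsets.UTF_8))
        .joinToString("") { "%02x".format(it) }

// The hash now depends only on code that lives in the project itself.
fun stableHash(person: Person): String = sha1Hex(person.canonicalForm())
```

The trade-off is the boilerplate mentioned earlier: every new property has to be added to `canonicalForm` by hand.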

Ahhh, now I understand: because JSON is a standard, the JSON representation of my data class would guarantee the same output (unless, of course, my data model were to change).

This would of course create some extra overhead for each generation, but it’s definitely worth considering. Thanks.

I would not be too sure of that. (JSON) objects are basically maps, and you cannot assume a guaranteed order of the keys of a map (unless it is a specialized map that does have that guarantee).
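The point about key order can be illustrated with plain Kotlin maps. This is only a sketch; `canonical` is a hypothetical helper, not part of any serialization library.

```kotlin
// Two maps with the same entries, inserted in a different order.
val first: Map<String, Int> = mapOf("b" to 2, "a" to 1)
val second: Map<String, Int> = mapOf("a" to 1, "b" to 2)

// Sorting the keys before rendering gives one text form per map content.
fun canonical(map: Map<String, Int>): String =
    map.toSortedMap().entries.joinToString(",") { "${it.key}=${it.value}" }
```

Here `first == second` holds, yet `first.toString()` and `second.toString()` differ because `mapOf` preserves insertion order. Only the sorted form is the same for both, so only it is safe as hash input.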

And in JSON there are infinite ways to specify the same value. For example:

{"foo":"bar"}
{ "foo": "bar" }
{
    "foo": "bar"
}
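As a small sketch of why this matters for hashing, the three spellings above can be fed to SHA-1 directly. The names `compact`, `spaced`, `pretty` and `sha1Hex` are made up for this example.

```kotlin
import java.security.MessageDigest

// Hex-encoded SHA-1 digest of the UTF-8 bytes of the input.
fun sha1Hex(input: String): String =
    MessageDigest.getInstance("SHA-1")
        .digest(input.toByteArray(Charsets.UTF_8))
        .joinToString("") { "%02x".format(it) }

// The same JSON value written in three different ways.
val compact = """{"foo":"bar"}"""
val spaced = """{ "foo": "bar" }"""
val pretty = "{\n    \"foo\": \"bar\"\n}"

// One hash per distinct text, even though the JSON value is identical.
val distinctHashes: Set<String> = listOf(compact, spaced, pretty).map(::sha1Hex).toSet()
```

`distinctHashes` ends up with three entries, so the hash is only stable if the serializer always emits exactly the same text.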

I was thinking about putting this to a test using kotlinx.serialization. I’ve been thinking the same.

Removing the pretty-print features and just using raw JSON should get rid of this problem, I believe.

But then you would need the same guarantee for the serializer as you hoped to get for toString(). As long as the serializer has some freedom when generating the JSON, you would need somebody to guarantee you that the output is stable.

Let’s be honest though. I don’t think anyone really believes that the toString() implementation for data classes will change in the near future.
I haven’t seen any discussion about changing anything there, and I can’t come up with any argument for why you would want to. So unless you need a 100% guarantee that your implementation will work with every Kotlin version for the next 10,000 years, toString() should be fine.
Otherwise I suggest you override toString() yourself just to be sure. I guess you could also use an annotation processor to generate a function that builds the string if writing it manually is too much work, but this is an extreme solution.

But then you would need the same guarantee for the serializer as you hoped to get for toString(). As long as the serializer has some freedom when generating the JSON, you would need somebody to guarantee you that the output is stable.

@jstuyts Well, I believe I kind of have that guarantee as long as the JSON has to be valid: the output is guaranteed within the bounds of the JSON specification. My concern with the serializer solution is actually more (from experience) that serializers always require complex configuration to work on even mildly complex objects. They never work out of the box.

So unless you need a 100% guarantee that your implementation will work with every Kotlin version for the next 10,000 years, toString() should be fine.

@Wasabi375 My thought exactly.

Thanks for all the input. Mighty nice of you.

If you are worried about the implementation and it is critical for you, then implement it yourself :smile: