String literals, way to compute hashCode at compile-time?

So imagine you’re trying to pay attention to performance/cache coherency, but also want the freedom of working with String literals. You’d want something like this:

val foo = myMap["Name"]

To really do something like this:

const val ID = "Name".hashCode()
val foo = myMap[ID] // ID usually unique, can report in case of failure

Where ID is computed at compile time. Normally, to accomplish this, you’d need something like constexpr from C++, which, as far as I’m aware, isn’t anywhere on the radar for Kotlin. Is there another way to accomplish this in Kotlin? Would a compiler plugin be sufficient? Any tips here? Thanks.

How do you know it is not computed at compile time already? Did you run any benchmarks to confirm there is room for performance improvement, or is it just your guess that it could/should be improved?


Two things:
1. I believe that, since this is a general issue with Java’s HashMap, there might already be an optimization for it in ProGuard or another minifier, or in the JIT itself. If not, I really don’t think there’s much you can do here.
2. Hash codes are not guaranteed to be unique, just distributed well enough.

Now onto the solution:
If this behavior of only relying on the hashcode is exactly what you need, then your best bet is to use your own custom map implementation with a get method like this:

// key.hashCode() indexes the internal table directly
inline override operator fun get(key: K): V = myInternalHashcodeArray[key.hashCode()]

Because it’s inline, this code will be inlined into the resulting JVM bytecode at each call site, and then a minifier like ProGuard should be able to notice the constant expression of “taking the hashcode of a constant string” and optimize it out.
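A minimal sketch of what such a map could look like, assuming String keys and a fixed-size table. The class name HashSlotTable, its capacity and its fail-on-collision policy are made up for illustration. It only shows the shape of an inlined get; whether any minifier actually folds the hash of a literal is not something this sketch demonstrates.

```kotlin
// Hypothetical fixed-size table: one slot per (hash mod capacity).
// A slot collision is reported instead of being resolved.
class HashSlotTable<V : Any>(capacity: Int = 64) {
    // @PublishedApi lets the public inline get() below touch these internals.
    @PublishedApi internal val keys = arrayOfNulls<String>(capacity)
    @PublishedApi internal val values = arrayOfNulls<Any>(capacity)

    // floorMod keeps the index non-negative even for negative hash codes.
    @PublishedApi internal fun slot(hash: Int): Int = Math.floorMod(hash, keys.size)

    fun put(key: String, value: V) {
        val i = slot(key.hashCode())
        val existing = keys[i]
        check(existing == null || existing == key) { "slot collision: '$existing' vs '$key'" }
        keys[i] = key
        values[i] = value
    }

    // Inlined at each call site, so key.hashCode() appears next to the literal in the bytecode.
    @Suppress("UNCHECKED_CAST")
    inline operator fun get(key: String): V? {
        val i = slot(key.hashCode())
        // The equality check is still needed: equal slots do not imply equal keys.
        return if (keys[i] == key) values[i] as V? else null
    }
}

fun main() {
    val table = HashSlotTable<Int>()
    table.put("Name", 1)
    println(table["Name"])   // 1
    println(table["Other"])  // null
}
```

Note that the lookup still compares the stored key with the requested one, so the hash alone never decides the result.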

The true reality of the situation, though, is that this is really a premature optimization. You’re already dealing with maps using string literals, the small hit of using hashcode on a constant string shouldn’t matter that much. Again, benchmarking is pretty damn important in a case like this, since maintaining your own hashmap implementation is a bit of a hassle.

You do know that the hashcode is computed only once for any string, right?

It’s a good idea for literals imo: why do it at runtime if it could be done at compile time…

Well then simply 1) make sure you have your own map implementation with an inlined get (because no compiler, and I mean no compiler, can optimize the hashcode away if it only happens inside the HashMap itself since it is a platform class that can’t be modified) and 2) find a minifier that optimizes constant hashcode on literals.

Sure, there isn’t a reason not to do it, but the benefit from it is marginal.

On the JVM, a static field is the performance equivalent of constexpr:

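object Ids { // hypothetical holder; @JvmStatic only applies inside an object or companion object
    @JvmStatic
    val ID = "Name".hashCode()
}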

Also, as @vach noticed, on JVM/Android the hashcode is computed only once (with the exception of hashcode=0, which is used as a special value).
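To make the two points above concrete, here is a small sketch, assuming Kotlin/JVM 1.5 or newer. The names javaStringHash and Keys are made up for illustration. The helper re-implements the formula documented for java.lang.String.hashCode(), and the object shows the hash being stored once in a static field.

```kotlin
// Re-implements the documented java.lang.String.hashCode() formula:
// s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], with Int overflow wrapping.
fun javaStringHash(s: String): Int {
    var h = 0
    for (c in s) h = 31 * h + c.code
    return h
}

// Hypothetical holder object for precomputed IDs.
object Keys {
    // Evaluated once, when Keys is first initialized; later reads are plain static field reads.
    val NAME_ID: Int = "Name".hashCode()
}

fun main() {
    // The formula is part of the String specification, so both values agree on the JVM.
    println(javaStringHash("Name") == "Name".hashCode()) // true
    println(Keys.NAME_ID == javaStringHash("Name"))      // true
}
```

Because the formula is fixed by the specification, the same value could also be produced by a build step, but the static field already avoids recomputing it on every access.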

This has higher priority:

https://youtrack.jetbrains.com/issue/KT-7774

because it opens the door to what you are actually asking for:

https://youtrack.jetbrains.com/issue/KT-14652

The most heavily optimizing minifier I know of is R8, and it does not optimize String hash codes.

I don’t really see the point in this. Java is not C++ and some optimizations typical for C++ may be completely irrelevant to Java.

  1. My assumption is that hashCode() for string literals is calculated only once, so there is probably no real performance benefit of calculating it at compile time.
  2. As @kyay10 explained, this optimization is impossible without a custom map implementation that exposes its internals, because even if we replace a string literal with integer literal then map… will hashCode() this integer anyway. And I believe in most cases this integer hashCode() won’t be cached contrary to string hashCode(), so this may actually degrade the performance.
  3. Even if we somehow fix the above problem, map still has to perform equality check on the key. Map is not an array ID->value, they’re different data structures.
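A short illustration of why the equality check in the last point matters. The strings "Aa" and "BB" are a well-known pair with the same hashCode(); the map contents are placeholder values chosen for this example.

```kotlin
fun main() {
    val a = "Aa".hashCode()
    val b = "BB".hashCode()
    println(a == b) // true: both are 2112

    // Keyed by the Int hash: the two entries are indistinguishable, so the second overwrites the first.
    val byHash = HashMap<Int, String>()
    byHash[a] = "value for Aa"
    byHash[b] = "value for BB"
    println(byHash.size) // 1

    // Keyed by the String: the map falls back to equals() and keeps both entries.
    val byString = HashMap<String, String>()
    byString["Aa"] = "value for Aa"
    byString["BB"] = "value for BB"
    println(byString.size) // 2
}
```

Replacing the string key with its hash silently merges such keys, which is why a map cannot skip the equality check.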

I completely agree with @broot: there’s no advantage in doing this in Kotlin.
Additionally, you can’t actually do this in C++ either: std::hash::operator() is not constexpr; see std::hash<Key>::operator() - cppreference.com

And it is impossible for hash to be made constexpr since its value might change between program runs.
See std::hash - cppreference.com

Hash functions are only required to produce the same result for the same input 
within a single execution of a program; this allows salted hashes that prevent 
collision denial-of-service attacks.