String literals, way to compute hashCode at compile-time?

So imagine you’re trying to pay attention to performance/cache coherency, but also want the freedom of working with String literals. You’d want something like this:

val foo = myMap["Name"]

To really do something like this:

const val ID = "Name".hashCode()
val foo = myMap[ID] // ID usually unique, can report in case of failure

Where ID is computed at compile time. Normally to accomplish this, you’d need something like constexpr from C++, which as far as I’m aware, isn’t anywhere on the radar for Kotlin. Is there another way to accomplish this in Kotlin? Would a compiler plugin be sufficient? Any tips here? Thanks.

How do you know it is not computed at compiled time already? Did you run any benchmarks to confirm there is a room for performance improvement or is it just your guess that it could/should be improved?

1 Like

2 things:
1- I believe that, since this is a general issue with dealing with Java’s Hashmaps, there might be some optimization for it in Proguard or any other minifier, or in the JIT itself. If not, I really don’t think there’s much you can do here
2- Hashcodes are not guaranteed to be unique, but just distributed well enough.

Now onto the solution:
If this behavior of only relying on the hashcode is exactly what you need, then your best bet is to use your own custom map implementation with a get method like this:

inline override operator fun get(key: K): V = myInternalHashcodeArray[key.hashcode()]
}

because its inline, this code will be inlined in the resulting JVM bytecode, and then a minifier like Proguard should be able to notice the constant expression of “taking the hashcode of a constant string” and optimize it out.

The true reality of the situation, though, is that this is really a premature optimization. You’re already dealing with maps using string literals, the small hit of using hashcode on a constant string shouldn’t matter that much. Again, benchmarking is pretty damn important in a case like this, since maintaining your own hashmap implementation is a bit of a hassle.

You do know that hashcode is computed only once for any string right?

It’s a good idea for literals imo, why do it in runtime if it could be done in compile time…

Well then simply 1) make sure you have your own map implementation with an inlined get (because no compiler, and I mean no compiler, can optimize the hashcode away if it only happens inside the HashMap itself since it is a platform class that can’t be modified) and 2) find a minifier that optimizes constant hashcode on literals.

Sure, there isn’t a reason not to do it, but the benefit from it is marginal.

static is JVM constexpr performance equivalent:

@JvmStatic
val ID = "Name".hashCode()

Also, as @vach noticed, on JVM/Android hashcode is computed only once (with the exception of hashcode=0, which is used as special value).

This has higher priority:

https://youtrack.jetbrains.com/issue/KT-7774

because it opens the door to what you actually asking for:

https://youtrack.jetbrains.com/issue/KT-14652

the most heavily optimizing minifier i now is R8, and it does not optimize String hash codes

1 Like

I don’t really see the point in this. Java is not C++ and some optimizations typical for C++ may be completely irrelevant to Java.

  1. My assumption is that hashCode() for string literals is calculated only once, so there is probably no real performance benefit of calculating it at compile time.
  2. As @kyay10 explained, this optimization is impossible without a custom map implementation that exposes its internals, because even if we replace a string literal with integer literal then map… will hashCode() this integer anyway. And I believe in most cases this integer hashCode() won’t be cached contrary to string hashCode(), so this may actually degrade the performance.
  3. Even if we somehow fix the above problem, map still has to perform equality check on the key. Map is not an array ID->value, they’re different data structures.
3 Likes

I completely agree with @broot there’s no advantage in doing this in Kotlin.
Additionally you can’t actually do this in C++ either: C++ std::hash::operator() is not constexpr see std::hash<Key>::operator() - cppreference.com

And it is impossible for hash to be made constexpr since its value might change between program runs.
See std::hash - cppreference.com

Hash functions are only required to produce the same result for the same input 
within a single execution of a program; this allows salted hashes that prevent 
collision denial-of-service attacks.
2 Likes

Hey folks,

I appreciate your tips on this. I understand the desire not to prematurely optimize, but in gamedev every little bit counts. The idea is actually inspired from the C++ engine at a major studio I worked at some time ago, and I assumed (perhaps wrongly) that they did it for good reason.

On the JVM, not having to store the String literals in bytecode or pay for cache misses while fetching them (which isn’t really profilable via JMH) are still legitimate concerns imo. I could see the Int literals having a similar problem, though - I’m not sure the JVM handles these any differently, although it appears to have “constants” for some values. If I ever attempt to profile this with an amount of code sufficiently large enough to demonstrate benefit, then I’ll update you with my results. As it stands, JMH was not useful for this, either in the abstract or in my experiments with it on this particular issue.

I’m currently using ProGuard, and it’s been a while since I investigated, but from what I recall it did not have the features necessary to accomplish something like this. Ah well.

Some of you are referencing standard libraries in terms of applicability - though using them would be nice, I’m assuming here that basic data structures could be reimplemented to facilitate this. For cache coherency in a game engine, using Java’s HashMap is a non-starter anyway. (All those Entry instances floating around in memory start to matter very quickly. JMH doesn’t reveal this as it doesn’t clear the cache between invocations, but it in a real-world codebase the results can be quite dramatic.)

My understanding is that constexpr isn’t really on the menu for Kotlin any time soon, but of course if that ever changes I will be thoroughly investigating it :slight_smile:

Thanks for your feedback - I’m gonna keep this issue on the backburner, and check back into it if I ever see real-world performance issues with it.