Our backend is written in Kotlin, and the data lives in MongoDB.

We did some profiling, which revealed that the current bottleneck is that too much data gets transferred between MongoDB and the Kotlin backend: `get_by_id()` fetches the same data again and again.
We thought about caching all `get_by_id()` calls in an in-memory cache shared by all threads of this node, so that every thread on the node benefits from faster access to the cached data.
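A minimal sketch of that idea, assuming a hypothetical `Document` type and a `loadFromMongo` function standing in for the real MongoDB query (all names here are illustrative, not our actual code):

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Illustrative stand-ins for the real domain type and Mongo query.
data class Document(val id: String, val payload: String)

class DocumentCache(private val loadFromMongo: (String) -> Document) {
    // One map shared by all threads of this node.
    private val cache = ConcurrentHashMap<String, Document>()

    // computeIfAbsent runs the loader at most once per key, even when
    // several threads ask for the same id at the same time.
    fun getById(id: String): Document =
        cache.computeIfAbsent(id) { loadFromMongo(it) }
}
```

Each node would then hold a single `DocumentCache` instance; repeated `getById` calls for the same id hit MongoDB only once.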
The next step would be to implement cache invalidation: every modification would also need to update the in-memory cache.
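Sketched on top of the same idea (again with made-up names; `load` and `save` stand in for the real MongoDB calls), the write path would persist first and only then touch the cache:

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Illustrative stand-ins; `load` and `save` represent the MongoDB calls.
data class Record(val id: String, val payload: String)

class InvalidatingCache(
    private val load: (String) -> Record,
    private val save: (Record) -> Unit
) {
    private val cache = ConcurrentHashMap<String, Record>()

    fun getById(id: String): Record = cache.computeIfAbsent(id) { load(it) }

    // Persist first, then refresh the cached entry, so readers on this
    // node never see a value older than what was just written.
    fun update(record: Record) {
        save(record)                 // write to MongoDB
        cache[record.id] = record    // write-through; cache.remove(record.id) would evict instead
    }
}
```

Whether to refresh (write-through) or evict (invalidate) on update is a design choice; evicting is simpler, refreshing avoids a reload on the next read.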
Before implementing this, I want to know which different (or better) approaches exist. How can we avoid fetching the same data from MongoDB again and again?
Hi there.
That’s not a Kotlin-specific issue (but it is an interesting one). There are lots of ways to do caching; depending on the amount and complexity of your data, you might be better off using something like Redis, which can evict entries automatically (based on time or remaining space). If your amount of data is limited, you might get away with a simple `ConcurrentHashMap`, but I would not recommend it if your data is invalidated often: writes lock individual buckets under the hood, which hurts performance under contention.
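To illustrate the time-based eviction mentioned above in plain Kotlin (a sketch only; Redis does this for you, and the injectable clock exists purely to make the sketch testable):

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Time-based expiry on top of ConcurrentHashMap.
class TtlCache<K : Any, V : Any>(
    private val ttlMillis: Long,
    private val clock: () -> Long = System::currentTimeMillis
) {
    private data class Entry<V>(val value: V, val expiresAt: Long)
    private val map = ConcurrentHashMap<K, Entry<V>>()

    fun put(key: K, value: V) {
        map[key] = Entry(value, clock() + ttlMillis)
    }

    // Expired entries are dropped lazily, by the read that notices them.
    fun get(key: K): V? {
        val entry = map[key] ?: return null
        if (clock() >= entry.expiresAt) {
            map.remove(key)
            return null
        }
        return entry.value
    }
}
```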
As for cache invalidation techniques, I usually make the code that updates the data invalidate the corresponding cache entries after the update is successful, but there are many ways to do it, each with its pros and cons.
The folks at Stack Overflow wrote some really interesting articles about how they cache, here and here.
Caching is a very deep topic, and bad caching can be very counterproductive: you have to measure whether querying Mongo is more or less efficient than your in-memory cache for your workload. It is all about trade-offs.
Have a great day
Thank you very much. The link to the Nick Craver blog was helpful.