You can, apparently, add state to classes with extension properties with almost no overhead (a.k.a. mixins in Kotlin)

As you all know, it’s “impossible” in Kotlin to add actual state to a final class in a mixin-like fashion. Well, apparently there is a way. The simplest approach is a map from each object to the value of the property you want to add. The problem is that a plain map holds a strong reference to every key, which prevents the object from ever being garbage-collected, so it is effectively a memory leak. But we can use weak references instead, with almost no overhead. So… here’s the code, enjoy I guess:

import kotlinx.coroutines.launch
import kotlin.reflect.KProperty

class Example {
    override fun toString() = "Example Class"
}

interface IWeakReference<T> {
    val referent: T?
    fun clear()
}

open class JavaWeakReference<T>(value: T) : IWeakReference<T> {
    protected val queue = java.lang.ref.ReferenceQueue<T>()
    private val actualWeakRef = java.lang.ref.WeakReference<T>(value, queue)
    override val referent get() = actualWeakRef.get()
    override fun clear() {
        actualWeakRef.clear()
    }
}

interface WeakKeyReference<K, V> : IWeakReference<K> {
    val map: MutableMap<WeakKeyReference<K, V>, V>
}

class JavaWeakKeyReference<K, V>(key: K, override val map: MutableMap<WeakKeyReference<K, V>, V>) : JavaWeakReference<K>(key), WeakKeyReference<K, V> {
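    init {
        // Wait on a background coroutine until the GC enqueues this reference, then drop its entry.
        // Caveats: queue.remove() blocks the coroutine's thread, and `map` is a plain
        // mutable map that is modified here without synchronization.
        kotlinx.coroutines.GlobalScope.launch {
            queue.remove()
            map.remove(this@JavaWeakKeyReference)
        }
    }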
}

class ExtendedState<T> {
    private val map = mutableMapOf<WeakKeyReference<Any, T>, T>()
    operator fun getValue(thisRef: Any?, prop: KProperty<*>): T? = thisRef?.let {
        map[map.findWeakReferenceForKey(thisRef)]
    }

    operator fun setValue(thisRef: Any?, prop: KProperty<*>, value: T) {
        val key: WeakKeyReference<Any, T> = thisRef?.let {
            map.findWeakReferenceForKey(thisRef) ?: JavaWeakKeyReference(thisRef, map)
        } ?: return
        map[key] = value
    }
}

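// Linear scan for the weak key whose referent is this exact instance.
fun <K, V, R : IWeakReference<K>> MutableMap<R, V>.findWeakReferenceForKey(key: K): R? {
    for ((currentKey, _) in this) {
        // Identity (===), not equals(): the state belongs to one instance, not to every equal one.
        if (currentKey.referent === key) {
            return currentKey
        }
    }
    return null
}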

var Example.p: String? by ExtendedState()

fun main() {
    val examples = sequence {
        var i = 0
        while (true) {
            yield(Example().apply { p = i.toString() })
            i++
        }
    }
    examples.take(10000).toList()
        .apply { forEach { println(it.p) } }
        .asReversed()
        .forEachIndexed { index, example -> if (index % 5 == 0) println(example.p) }
}

P.S.: While this uses Java’s weak reference, it can also work with Kotlin/Native’s weak reference and possibly a future JS one. To support that, you just need to turn the WeakKeyReference interface into an expect class and provide the implementations in each platform’s module.


Again, this has almost no overhead because it still allows the gc to collect the object (and then the WeakReference deletes the key-value pair from the map, so even the value gets gced). To make this work perfectly on Kotlin/Native, though, we would either need to wait until Kotlin gives us something like Java’s reference queue, or we can check the referent whenever it is read and, if it is null, delete the key-value pair from the map.
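The second option (no reference queue, just dropping entries whose referent has turned null) could look roughly like the sketch below. `LazyPurgingState`, `Widget` and `label` are made-up names for illustration, and `java.lang.ref.WeakReference` only stands in for whatever weak reference type the target platform offers. This is an assumption about how such a delegate might be written, not the code from the post above.

```kotlin
import java.lang.ref.WeakReference
import kotlin.reflect.KProperty

// Hypothetical delegate: purges dead entries on every access instead of using a queue.
class LazyPurgingState<T> {
    // Each entry pairs a weak reference to the owning object with its value.
    private val entries = mutableListOf<Pair<WeakReference<Any>, T>>()

    operator fun getValue(thisRef: Any?, prop: KProperty<*>): T? {
        if (thisRef == null) return null
        purge()
        return entries.firstOrNull { it.first.get() === thisRef }?.second
    }

    operator fun setValue(thisRef: Any?, prop: KProperty<*>, value: T) {
        if (thisRef == null) return
        purge()
        // Replace any existing entry for this exact instance.
        entries.removeAll { it.first.get() === thisRef }
        entries.add(WeakReference(thisRef) to value)
    }

    // Drop every entry whose owner has already been collected.
    private fun purge() {
        entries.removeAll { it.first.get() == null }
    }
}

class Widget

var Widget.label: String? by LazyPurgingState<String?>()

fun main() {
    val w = Widget()
    w.label = "hello"
    println(w.label)        // hello
    println(Widget().label) // null
}
```

The trade-off is that stale entries are only removed when the property is touched again, so a delegate that is never accessed keeps its dead entries (and their values) around.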

Java provides WeakHashMap for this: WeakHashMap (Java Platform SE 7)
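For comparison, a delegate backed by `java.util.WeakHashMap` might look like the following minimal sketch. `WeakMapState`, `Sample` and `note` are invented names. Two things to keep in mind: WeakHashMap matches keys with `equals`/`hashCode`, so two distinct but equal objects would share one slot, and a stored value must not strongly reference its own key or the entry is never released.

```kotlin
import java.util.Collections
import java.util.WeakHashMap
import kotlin.reflect.KProperty

// Hypothetical delegate built on WeakHashMap; a sketch, not code from this thread.
class WeakMapState<T : Any> {
    // WeakHashMap holds its keys weakly and expunges stale entries by itself.
    // It is not thread-safe, hence the synchronized wrapper.
    private val map: MutableMap<Any, T> =
        Collections.synchronizedMap(WeakHashMap<Any, T>())

    operator fun getValue(thisRef: Any?, prop: KProperty<*>): T? =
        thisRef?.let { map[it] }

    operator fun setValue(thisRef: Any?, prop: KProperty<*>, value: T?) {
        if (thisRef == null) return
        // Assigning null clears the stored state for this object.
        if (value == null) map.remove(thisRef) else map[thisRef] = value
    }
}

class Sample

var Sample.note: String? by WeakMapState<String>()

fun main() {
    val s = Sample()
    s.note = "attached"
    println(s.note)        // attached
    println(Sample().note) // null
}
```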

I don’t think that “almost no overhead” is accurate, really, but I guess that depends on where you draw the line.


Well, the problem is that there is no guarantee that the hash of an object is unique, so it is technically safer to use a weak reference to the object itself instead of relying on a hash. There is definitely some overhead, but it’s probably not that significant if you use this in non-performance-critical parts of your app. The memory overhead is the map itself, the weak reference objects, and the stored values. The runtime overhead is that every access loops over the map’s keys to find the right entry, so lookups are linear in the number of objects that carry the property; that takes some time, but on modern processors it shouldn’t be that slow for modest counts. Also, the extra allocations are released once the actual object gets gced, so the memory overhead doesn’t accumulate. So while there is some overhead, this is probably most useful when you have a large codebase with library classes that you don’t control. Overall, the readability gain probably outweighs the overhead of this technique, especially with modern processors and faster gc.
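If the linear scan ever becomes a concern, one possible variation (an assumption on my part, not what the code above does) is a weak key that hashes with `System.identityHashCode` and compares referents with `===`. Lookups then go through a normal `HashMap`, and a hash collision only costs an extra identity comparison; it cannot mix up two objects. `IdentityWeakKey`, `IdentityState`, `Tag` and `note` are hypothetical names.

```kotlin
import java.lang.ref.ReferenceQueue
import java.lang.ref.WeakReference
import kotlin.reflect.KProperty

// Weak key that hashes and compares by referent identity.
private class IdentityWeakKey(referent: Any, queue: ReferenceQueue<Any>? = null) :
    WeakReference<Any>(referent, queue) {
    // Cached so the hash stays stable after the referent is cleared.
    private val hash = System.identityHashCode(referent)

    override fun hashCode() = hash

    override fun equals(other: Any?): Boolean {
        if (this === other) return true
        if (other !is IdentityWeakKey) return false
        val mine = get()
        return mine != null && mine === other.get()
    }
}

class IdentityState<T> {
    private val queue = ReferenceQueue<Any>()
    private val map = HashMap<IdentityWeakKey, T>()

    // Remove entries whose keys the GC has already enqueued (non-blocking).
    private fun purge() {
        while (true) {
            val stale = queue.poll() ?: break
            (stale as? IdentityWeakKey)?.let { map.remove(it) }
        }
    }

    @Synchronized
    operator fun getValue(thisRef: Any?, prop: KProperty<*>): T? {
        if (thisRef == null) return null
        purge()
        return map[IdentityWeakKey(thisRef)] // temporary lookup key, no queue
    }

    @Synchronized
    operator fun setValue(thisRef: Any?, prop: KProperty<*>, value: T) {
        if (thisRef == null) return
        purge()
        map[IdentityWeakKey(thisRef, queue)] = value
    }
}

data class Tag(val name: String)

var Tag.note: String? by IdentityState<String?>()

fun main() {
    val a = Tag("x")
    val b = Tag("x") // equal to a, but a different instance
    a.note = "first"
    println(a.note) // first
    println(b.note) // null
}
```

Polling the queue on each access replaces the blocking `queue.remove()` call, so no background coroutine is needed in this variant.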

I’ve actually written about it in this article, near the end. You can also see what it looks like in real code.


Wow, that’s quite cool ngl. The problem with your code (I think) is that even if you use WeakHashMap you are still comparing hashes, which are not guaranteed to be unique. So it’s probably better to store a reference to the actual WeakReference itself, like I did in my code. But wow, it seems the Kotlin community is great enough that we independently come up with the same really-weird-but-practical ideas.

Well, hash conflict is an issue, but usually you have pretty good hashcode generators for basic types. It could be properly optimized, of course. As I mentioned in the article, this is not something to use frequently, and it should be safely wrapped to avoid leaking the inner state outside (in my case the scope guarantees a proper lifetime), but otherwise, yeah, it works.


Yeah, I guess you could even use the same delegate that I created and just define the actual extension property inside a scope using by Delegate(). The problem is that you will need to pass that context around, which adds some boilerplate but might actually be more readable in certain cases. As for the lifetime part, I think that as long as you aren’t using something like a million of those delegates, the fixed overhead is only the Delegate object and the Map object for each property (per property, not per object, so it isn’t that high and could scale well with a lot of objects of the same type).

Regarding hashcode:
Using the hashcode is problematic because you will end up with many hashcode collisions, since the default hashcode is based on the memory address and objects start out in the small eden space. Once the hashcode has been requested, it is permanently assigned, and the object keeps returning that value even after the GC moves it. This results in map buckets holding many objects, which gives non-constant-time behavior when over-used. If the object has an overridden hashCode method and the object is mutable, then this breaks: the hashcode changes, so the lookup goes to a different bucket and never finds the existing entry.
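The mutable-key failure is easy to reproduce with a plain `HashMap`. In this small illustration, `Point` is a made-up data class whose generated `hashCode` depends on its mutable field:

```kotlin
// Hypothetical mutable key: the data class hashCode is derived from x.
data class Point(var x: Int)

fun main() {
    val map = HashMap<Point, String>()
    val p = Point(1)
    map[p] = "state"

    // Mutating the key changes its hashCode after insertion.
    p.x = 2

    println(map[p])   // null: the lookup uses the new hash and misses the entry
    println(map.size) // 1: the entry is still stored, just unreachable via p
}
```

A key type that hashes by identity (for example via `System.identityHashCode`) is not affected, because that value does not change when the object’s fields do.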

Regarding weak references:
Although this is pretty cool in principle, I don’t recommend using weak references as a means of enabling a new paradigm, since they add another phase to the garbage collection process and so affect the responsiveness of the application.