Do Deferreds hold on to references?

I’m converting some code from JDK8 futures to coroutines.

We have an in-memory cache (Caffeine) whose values are CompletableFutures. We keep those futures in the cache for a while after the future is completed. Our understanding is that CompletableFuture has only two references in it (result and stack), and it ensures that after a future is done, the stack is nulled out, so there’s no memory implication to hanging on to the future indefinitely.

I’m considering changing this cache to contain a Deferred. Does Deferred work the same way? It seems like the main Deferred implementation is DeferredCoroutine, which is an AbstractCoroutine, which has plenty of val references in it.

Will keeping a Deferred around for a while keep lots of objects alive?

If this is a problem, can I fix it just by continuing to store CompletableFutures in the map, creating them with Deferred.asCompletableFuture()?

Trying to answer for myself, I ran this code and set a breakpoint on the println:

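import kotlinx.coroutines.*
import kotlinx.coroutines.future.asCompletableFuture
import java.util.concurrent.CompletableFuture

fun main() {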
    val f1 = CompletableFuture.supplyAsync<Int> { 5 }
    f1.get()

    var d1: Deferred<Int>? = null
    var f2: CompletableFuture<Int>? = null
    runBlocking {
        d1 = async {
            delay(1)
            6
        }
        d1!!.await()

        val d2 = async {
            delay(1)
            7
        }
        f2 = d2.asCompletableFuture()
        d2.await()
    }

    println("hi ${f1.get()} ${d1!!.getCompleted()} ${f2!!.get()}")
}

So it seems pretty clear that CompletableFutures, including those made from Deferred.asCompletableFuture, end up being just thin wrappers around their results, whereas Deferreds hold on to quite a bit of context. I don’t yet understand the context well enough to know whether this will ever block things I care about from being GCed, but it certainly feels safer to cache CompletableFutures instead.
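
A minimal sketch of that option, assuming a plain ConcurrentHashMap as a stand-in for the Caffeine cache. The class name FutureBackedCache and the load parameter are hypothetical; asCompletableFuture and await come from the kotlinx.coroutines.future package. The Deferred is converted right away, so only the CompletableFuture is stored:

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.future.asCompletableFuture
import kotlinx.coroutines.future.await
import java.util.concurrent.CompletableFuture
import java.util.concurrent.ConcurrentHashMap

// Hypothetical cache: a ConcurrentHashMap stands in for the Caffeine cache.
class FutureBackedCache<K : Any, V>(private val scope: CoroutineScope) {
    private val futures = ConcurrentHashMap<K, CompletableFuture<V>>()

    // Starts the coroutine only if the key is missing, stores the converted
    // CompletableFuture (not the Deferred), and suspends until it completes.
    suspend fun get(key: K, load: suspend (K) -> V): V =
        futures.computeIfAbsent(key) { k ->
            scope.async { load(k) }.asCompletableFuture()
        }.await()
}

fun main() = runBlocking {
    val cache = FutureBackedCache<String, Int>(this)
    println(cache.get("a") { delay(1); 6 })
}
```

Whether this actually saves memory in a real workload is something only a profiler can confirm.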

I am not sure which vals you are talking about or why they matter. Remember that in Kotlin not every val is backed by an actual object; many are synthetic properties that use no memory. You should profile memory to see what actually uses it. Still, storing completed coroutines in a cache is probably not the best idea, because coroutines are not meant to be used outside their scope. You can still use CompletableFuture and mix it with coroutines, but the best way is probably to create a separate cache for results and use a completion handler to add new results to it.

With the current implementation, if two callers try to receive the same missing value from the cache at the same time, the first one will insert a future into the cache and perform the work to complete the future, and the second caller will piggy-back onto the first caller’s work rather than starting a second parallel lookup of the same data. This seems to work pretty well with CompletableFutures.
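
For readers who want to see the piggy-backing pattern concretely, here is a rough sketch using only the JDK. DedupingCache and loader are hypothetical names, and a ConcurrentHashMap stands in for the real Caffeine cache, so eviction is not modelled:

```kotlin
import java.util.concurrent.CompletableFuture
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for the Caffeine cache: a plain concurrent map.
class DedupingCache<K : Any, V>(private val loader: (K) -> V) {
    private val futures = ConcurrentHashMap<K, CompletableFuture<V>>()

    // computeIfAbsent is atomic per key, so only the first caller for a missing
    // key starts the lookup; later callers receive the same future.
    fun get(key: K): CompletableFuture<V> =
        futures.computeIfAbsent(key) { k -> CompletableFuture.supplyAsync { loader(k) } }
}

fun main() {
    val loads = AtomicInteger()
    val cache = DedupingCache<String, Int> { key ->
        loads.incrementAndGet()
        Thread.sleep(50) // simulate a slow lookup
        key.length
    }
    val first = cache.get("hello")
    val second = cache.get("hello") // piggy-backs on the first lookup
    println("${first.get()} ${second.get()} loads=${loads.get()}") // prints "5 5 loads=1"
}
```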

It totally makes sense that storing coroutines in a cache is a bad idea; the surprising bit to me is that the Kotlin coroutine equivalent of a CompletableFuture is itself a full coroutine.

And to be specific about interpreting my screenshot: both CompletableFutures (the one I completed directly and the one I created from a Deferred) only have two reference fields; one points to the result and the other one (by the end of the function) is null. On the other hand, the Deferred has a whole bunch of fields pointing to various context objects, which as you say is because it’s a whole coroutine itself.
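
The field layout can also be checked without a debugger. The helper below is a hypothetical sketch (instanceFieldNames is not a library function) that lists the non-static fields a class declares, including those inherited from its superclasses:

```kotlin
import java.lang.reflect.Modifier
import java.util.concurrent.CompletableFuture

// Collects the names of all non-static fields declared by a class and its superclasses.
fun instanceFieldNames(type: Class<*>): List<String> {
    val names = mutableListOf<String>()
    var current: Class<*>? = type
    while (current != null) {
        current.declaredFields
            .filterNot { Modifier.isStatic(it.modifiers) }
            .forEach { names += it.name }
        current = current.superclass
    }
    return names.sorted()
}

fun main() {
    // On the JDK versions I have looked at, this lists only "result" and "stack".
    println(instanceFieldNames(CompletableFuture::class.java))
}
```

The same helper can be pointed at the runtime class of a Deferred (for example `d1!!.javaClass` in the snippet above) to compare. Field names alone say nothing about what is still reachable after completion, so this complements a heap dump rather than replacing one.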

You seem to have a typical case of a data race. Since your access is already suspending, you can implement it with a Mutex:

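import kotlinx.coroutines.*
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock

@OptIn(ExperimentalCoroutinesApi::class) // getCompleted() is experimental
class ComputationCache<T : Any>(private val scope: CoroutineScope) {
    private val computationCache = HashMap<String, Deferred<T>>() // in-flight work
    private val resultCache = HashMap<String, T>()                // finished values
    private val mutex = Mutex()                                   // guards both maps

    suspend fun put(key: String, computation: Deferred<T>) {
        mutex.withLock { computationCache[key] = computation }
        // The completion handler is not a suspend function, so it cannot take
        // the Mutex itself; it launches a coroutine that does.
        computation.invokeOnCompletion { cause ->
            scope.launch {
                mutex.withLock {
                    // Only store a value if the computation succeeded.
                    if (cause == null) resultCache[key] = computation.getCompleted()
                    computationCache.remove(key)
                }
            }
        }
    }

    // Returns the cached value, waits for an in-flight computation, or returns null.
    suspend fun get(key: String): T? {
        val pending = mutex.withLock {
            resultCache[key]?.let { return it }
            computationCache[key]
        }
        return pending?.await()
    }
}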

I am not sure all the method names are correct, or whether the synchronization is sufficient for external access (it depends on the problem).

I’m not sure why that’s simpler than just using a data type like CompletableFuture that does what I want. What’s the data race in my code? (Note that I’m not just using a HashMap; I’m using a Caffeine cache that has various tunable cache invalidation policies etc.)

I did not say anything against CompletableFuture. You asked how to do it with coroutines. Also, I am not sure that CompletableFuture would be much more memory-efficient. You can’t tell without a memory profiler.