Memory leak when using coroutine context

jelmerk · March 17, 2024, 10:54am

When you run the following CLI app with a 1 GB heap, it crashes with an out-of-memory error when you pass it true and succeeds otherwise.

Depending on the argument, either runGaqlRequest or runGaqlRequest2 is used. The only difference between these methods is that runGaqlRequest switches the coroutine context with withContext(Dispatchers.IO). This seems to prevent the return value from being garbage collected, even though there are no more references to it in the run() method.

Why does this happen, and how can you avoid it?

For convenience, I created a Gradle project that demonstrates this problem: jelmerk/reproduce-kotlin-problem on GitHub.

import kotlinx.coroutines.*

class App(val crash: Boolean) {

    suspend fun run() {
        myReport()
        val b = ByteArray(600_000_000)

        println("Was able to allocate array with size ${b.size}")
    }

    suspend fun myReport(): List<String> {
        return if (crash) {
            runGaqlRequest()
        } else {
            runGaqlRequest2()
        }.map { "some keyword" }
    }

    suspend fun runGaqlRequest(): List<ByteArray> {
        return withContext(Dispatchers.IO) {
            listOf(ByteArray(600_000_000))
        }
    }

    suspend fun runGaqlRequest2(): List<ByteArray> {
        return listOf(ByteArray(600_000_000))
    }
}

fun main(args: Array<String>) {
    val crash = args.isNotEmpty() && args[0] == "true"

    val app = App(crash)

    runBlocking {
        app.run()
    }
}

darksnake · March 19, 2024, 7:15am

Dispatchers.IO creates a large thread pool and uses it to run tasks. Threads are heavy: each of them requires at least 2 MB of memory to run. So your problem is probably caused not by coroutines, but by multi-threading.

In order to check memory, you should use a profiler like JVisualVM and check which type of objects occupy the memory.
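If attaching a profiler at exactly the right moment is awkward, one option is to write a heap dump from inside the process and open the resulting file in VisualVM afterwards. The sketch below assumes a HotSpot-based JVM. The helper name dumpHeap and the file name are made up for illustration and are not part of the original project.

```kotlin
import com.sun.management.HotSpotDiagnosticMXBean
import java.io.File
import java.lang.management.ManagementFactory

// Hypothetical helper (not from the thread): writes a heap dump of the current JVM.
// Assumes a HotSpot-based JVM; the file name should end in ".hprof".
fun dumpHeap(path: String, liveOnly: Boolean = true): File {
    val file = File(path)
    file.delete() // dumpHeap refuses to overwrite an existing file
    val bean = ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean::class.java)
    // liveOnly = true restricts the dump to objects that are still reachable
    bean.dumpHeap(file.absolutePath, liveOnly)
    return file
}

fun main() {
    val dump = dumpHeap("before-second-allocation.hprof")
    println("Heap dump written to ${dump.absolutePath} (${dump.length()} bytes)")
}
```

Calling the helper right before the second large allocation in run() should capture the state of interest. In the dump, the large byte[] and the path from a GC root to it show what still holds the array.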

broot · March 19, 2024, 8:59am

Why would Dispatchers.IO consume several hundred megabytes of memory? The point of this example is that when using runGaqlRequest(), the reference to the big array is still live even though we don’t use it anymore.

Just to provide some context: the discussion originally started on Stack Overflow, in the question "kotlin - Memory leak when using coroutineScope". It looks like the coroutines machinery still keeps references to continuations of functions that have already returned. And because it keeps these continuations, it keeps their local variables as well. It seems these references are cleaned up only after the current coroutine suspends. But this is only my impression. I have no idea what’s happening here, and I may be misreading all of this.

darksnake · March 19, 2024, 9:31am

If Dispatchers.IO starts a thread pool with 100 threads, then it will consume 200 MB for those threads right away. In this particular case, I do not see a new launch, so it probably won’t start those threads right away. You can check it by replacing Dispatchers.IO with a different, single-thread dispatcher. It is also possible that allocation (not GC) works differently on captured primitive arrays. But to study that, one needs a profiler.

broot · March 19, 2024, 9:55am

Yes, we used VisualVM, and there I could find references to continuations of functions that had already returned. But I’m far from being an expert in this area, so I may be entirely misinterpreting this.

darksnake · March 19, 2024, 10:10am

Then try checking with a single-thread executor. If you use the main dispatcher, it is possible that you process all requests sequentially, so the previous data is de-allocated, whereas with Dispatchers.IO you start a new thread.
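The single-thread check suggested above could look roughly like this. It is only a sketch: runGaqlRequestSingleThread and its size parameter are hypothetical names introduced here, and the dispatcher is built from a plain single-thread executor. Whether the second allocation succeeds is exactly what the experiment is meant to find out, so no outcome is claimed.

```kotlin
import kotlinx.coroutines.asCoroutineDispatcher
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext
import java.util.concurrent.Executors

// Hypothetical variant of runGaqlRequest() that runs on one dedicated thread
// instead of Dispatchers.IO. The size parameter is added for illustration.
suspend fun runGaqlRequestSingleThread(size: Int): List<ByteArray> =
    // use { } closes the dispatcher afterwards, which shuts down the executor
    Executors.newSingleThreadExecutor().asCoroutineDispatcher().use { dispatcher ->
        withContext(dispatcher) { listOf(ByteArray(size)) }
    }

fun main(): Unit = runBlocking {
    // Same shape as run() in the original app: discard the result, then allocate again
    val keywords = runGaqlRequestSingleThread(600_000_000).map { "some keyword" }
    println("Request returned ${keywords.size} element(s)")
    val b = ByteArray(600_000_000)
    println("Was able to allocate array with size ${b.size}")
}
```

Run it with the same 1 GB heap as the original app so the results are comparable.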

fvasco · March 20, 2024, 3:31pm

I minimized the reproducer:

fun main() {
    for (crash in arrayOf(false, true)) {
        println("crash = $crash")
        runBlocking {
            allocate(crash)
            val b = ByteArray(600_000_000)
            println("Was able to allocate array with size ${b.size}")
        }
    }
}

suspend fun allocate(crash: Boolean): ByteArray {
    if (crash) yield()
    return ByteArray(600_000_000)
}

The only difference between the two executions is the suspension point if (crash) yield().
If crash is false, allocate returns the ByteArray directly and the result is discarded.
If crash is true, allocate first returns COROUTINE_SUSPENDED. After resumption, allocate returns the ByteArray, the result is assigned to the coroutine’s result variable, and it stays there while val b = ByteArray(600_000_000) is executed.
To fix my reproducer, the coroutine machinery itself would have to be modified.

Edit: the result variable should not be reassigned until the next coroutine resumption.
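One way to probe this explanation is to add a second suspension point between discarding the result and the next large allocation. The sketch below is an untested experiment, not a confirmed workaround. It rests on the assumption, taken from the impressions earlier in this thread, that the stale reference goes away once the coroutine suspends and resumes again. The names allocateAfterYield and runExperiment and the size parameter are hypothetical.

```kotlin
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.yield

// Same idea as allocate(crash = true) in the reproducer: suspend first, then allocate.
suspend fun allocateAfterYield(size: Int): ByteArray {
    yield()
    return ByteArray(size)
}

// Hypothetical experiment: returns the size of the second array if it could be allocated.
fun runExperiment(size: Int): Int = runBlocking {
    allocateAfterYield(size) // result is discarded, as in the reproducer
    // Assumption (unverified): suspending once more gives the coroutine machinery
    // a chance to drop its reference to the discarded array.
    yield()
    val b = ByteArray(size)
    b.size
}

fun main() {
    println("Was able to allocate array with size ${runExperiment(600_000_000)}")
}
```

If the second allocation still fails under a 1 GB heap, the assumption is wrong and the reference is being held somewhere else. A heap dump taken just before the allocation would show where.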

jelmerk · March 21, 2024, 10:10am

The thread it executes on doesn’t really matter. If you look at a heap dump of the process, you’ll see that a variable local to the main thread is preventing the large array from being garbage collected.

@fvasco's example also demonstrates this.

fvasco · March 21, 2024, 11:10am

This issue was reported 4 years ago:

https://youtrack.jetbrains.com/issue/KT-33986/Null-out-result-field-when-suspending-a-coroutine

or 7 years ago…

https://youtrack.jetbrains.com/issue/KT-16222/Coroutine-should-be-clearing-any-internal-state-as-soon-as-possible-to-avoid-memory-leaks

darksnake · April 1, 2024, 10:21am

No wonder it is not fixed. The assignee is not on the coroutines team anymore…
