Spring, Coroutines, Virtual Threads

I’m working on a legacy REST backend written in Kotlin using the Servlet stack / Tomcat / thread-per-request model.

In experimenting with going fully async/non-blocking, I’ve looked at Virtual Threads and Coroutines.

Virtual Threads have the advantage of not requiring any changes in the existing code.

Coroutines do require changes to go fully async, namely adding suspend in a bunch of places, but also require moving to R2DBC and various other “fixes”. Not a huge deal but touches a lot of places.

Further, with Spring’s ThreadLocal-based machinery and Java agent instrumentation in play, I worry about how to do context propagation in coroutines.

I am fully aware of structured concurrency, still lacking in Java until later this year. There’s also pinning, which from what I’ve gathered most Java libraries have addressed (like the Postgres driver), but I’m not sure if coroutines have fully addressed it.

Which leads me to my questions / dilemmas.

  1. In a legacy Spring Boot setup (not WebFlux, Flow, Rx, or Ktor), does it make sense to use Coroutines?
  2. I’ve heard of running Coroutines on top of Virtual threads, but wouldn’t that double memory allocations? Wouldn’t that make context propagation that much harder?
  3. Is it ever a good idea to use runBlocking in a Spring service?

Any insights are appreciated. Unclear on what folks are doing in similar setups to mine.

2 Likes

I never had a chance to compare a service running coroutines and VTs side by side, so take my opinion with a grain of salt. My general impression is that coroutines don’t provide that much value over VTs. VTs are much more integrated into the runtime itself, while coroutines are a kind of hack implemented in the bytecode. If we need multiplatform, need to support older JVMs, like the structured concurrency of coroutines, or like their API in general (flows, for example, are great), we can go with coroutines. But if we only need lightweight concurrency, VTs are probably better in the long run.

Coroutines propagate context using… well, CoroutineContext. If we need to integrate with code that uses thread locals, coroutines provide tools to do this.
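
For example, a minimal sketch (requestId here is a made-up thread local, standing in for whatever Spring or a Java agent reads):

import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.asContextElement
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Hypothetical thread local, standing in for whatever Spring or an agent reads.
val requestId = ThreadLocal<String?>()

fun main() = runBlocking {
    // asContextElement captures a value and re-installs it on whichever thread
    // the coroutine is dispatched to or resumes on after suspension.
    withContext(Dispatchers.Default + requestId.asContextElement(value = "req-42")) {
        delay(100)
        println(requestId.get())  // still "req-42", even on a pool thread
    }
}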

Coroutines couldn’t really solve the problem of thread blocking, so any blocking code is simply scheduled onto a larger thread pool meant for blocking. And we have to switch manually, as coroutines can’t detect blocking. So coroutines don’t have the problem of pinning, because… they never even got that far :wink:

We should generally avoid it. runBlocking is meant for bridging with code that is not coroutine-aware. Spring can run coroutines, so we shouldn’t need runBlocking there.
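
For reference, a sketch of what “Spring can run coroutines” means in practice, assuming a Spring version whose web layer can adapt suspend handler methods (WebFlux, or a recent Spring MVC); the controller and service names are made up:

import kotlinx.coroutines.delay
import org.springframework.stereotype.Service
import org.springframework.web.bind.annotation.GetMapping
import org.springframework.web.bind.annotation.RestController

// Hypothetical service with a suspending call (e.g. wrapping a non-blocking client).
@Service
class GreetingService {
    suspend fun fetchGreeting(): String {
        delay(10)  // stand-in for a real suspending call
        return "hello"
    }
}

@RestController
class GreetingController(private val service: GreetingService) {

    // Spring invokes the suspend function itself; no runBlocking needed.
    @GetMapping("/greeting")
    suspend fun greeting(): String = service.fetchGreeting()
}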

1 Like

Thanks for the replies. I wasn’t aware that coroutines reschedule blocking; are there resources to look into that?

I have seen coroutines having advantages over Loom. I’ll have to go back and find some of those.

I have a hard time advocating for coroutines in this particular app, because virtual threads accomplish a lot of the same (suspension, continuations), but do so without any code modification.

Explicit parallelization (CompletableFuture.supplyAsync) and the forthcoming try-with-resources (TWR) structured concurrency are definitely not as pleasing as the rest of Kotlin.

Guess that could be made a bit better with some Spring magic or a compiler plugin, but not sure the juice is worth the squeeze.
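
For context, the explicit-parallelization shape I’d end up with looks roughly like this (just a sketch; the loaders are stand-ins for real blocking calls):

import java.util.concurrent.CompletableFuture
import java.util.concurrent.Executors

// Stand-ins for blocking calls (e.g. JDBC); names are illustrative only.
fun loadUser(): String { Thread.sleep(100); return "alice" }
fun loadOrderCount(): Int { Thread.sleep(100); return 3 }

fun main() {
    Executors.newVirtualThreadPerTaskExecutor().use { executor ->
        // Explicit fan-out with CompletableFuture on virtual threads.
        val user = CompletableFuture.supplyAsync({ loadUser() }, executor)
        val orders = CompletableFuture.supplyAsync({ loadOrderCount() }, executor)
        println("${user.join()} has ${orders.join()} orders")
    }
}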

They don’t reschedule by themselves - we have to do it.

It is funny, but I looked quickly through the coroutines documentation and I didn’t find anything about handling blocking code. But I can assure you it is generally discouraged to run blocking code in coroutines. This is not a hard requirement. It only means that if we use a small thread pool for the best CPU utilization (the default) and a coroutine gets into blocking code, the thread will be blocked and can’t run any coroutines until it unblocks. Or, if we write a GUI application using coroutines and we block the main thread, it still causes the app to stop responding. But if we create our own thread pools for running coroutines, if we only block for short periods, or if performance isn’t critical for our application, technically we could block.

Coroutines provide a shared, bigger thread pool to offload blocking code to: Dispatchers.IO.
But we have to do it manually: withContext(Dispatchers.IO) { readFileContents() }.
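
Spelled out as a whole function, the pattern is simply this (a sketch; the file read stands in for any blocking call):

import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext
import java.io.File

// Expose a blocking read as a suspend function by hopping to Dispatchers.IO.
// The call still blocks a thread, but one from the IO pool, not the default dispatcher.
suspend fun readFileContents(path: String): String =
    withContext(Dispatchers.IO) {
        File(path).readText()  // blocking I/O
    }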

1 Like

Thanks again. Last question, is anyone running coroutines on virtual threads, or is that a bad idea?

I don’t have a definitive answer on this. Technically speaking, running coroutines on top of virtual threads is as simple as creating a dispatcher with Executors.newVirtualThreadPerTaskExecutor().asCoroutineDispatcher() and scheduling with it:

import java.util.concurrent.Executors
import kotlinx.coroutines.asCoroutineDispatcher
import kotlinx.coroutines.async
import kotlinx.coroutines.delay
import kotlinx.coroutines.withContext

val vtDispatcher = Executors.newVirtualThreadPerTaskExecutor().asCoroutineDispatcher()

suspend fun main() = withContext(vtDispatcher) {
    println("#1: ${Thread.currentThread()}")

    val deferred1 = async {
        println("#2: ${Thread.currentThread()}")
        delay(500)
        println("#3: ${Thread.currentThread()}")
        "hello"
    }
    val deferred2 = async {
        println("#3: ${Thread.currentThread()}")
        delay(1000)
        println("#4: ${Thread.currentThread()}")
        "world"
    }

    println(deferred1.await() + deferred2.await())
    println("#5: ${Thread.currentThread()}")
}

Result:

#1: VirtualThread[#20]/runnable@ForkJoinPool-1-worker-1
#2: VirtualThread[#25]/runnable@ForkJoinPool-1-worker-3
#3: VirtualThread[#26]/runnable@ForkJoinPool-1-worker-4
#3: VirtualThread[#28]/runnable@ForkJoinPool-1-worker-3
#4: VirtualThread[#30]/runnable@ForkJoinPool-1-worker-4
helloworld
#5: VirtualThread[#31]/runnable@ForkJoinPool-1-worker-1

However, it feels like both frameworks duplicate the same functionality; they do similar things in different ways. They don’t cooperate and aren’t aware of each other. If we suspend using coroutines, from the VTs’ perspective we just schedule multiple VTs. If we suspend using VTs, coroutines perceive this as thread blocking (but we have a potentially unlimited number of threads, so this is not a problem).

I see potential benefits of this, e.g. using coroutine APIs and tools while not having to worry about blocking code. But I don’t know; I never tried this pattern myself.

1 Like

Speaking as someone who absolutely loves coroutines… yeah, it sounds like in your case using VTs is the way to go.

2 Likes

Thanks. The direction I am leaning is:

  1. If I had to start a new application, I would start with Coroutines (most likely)
  2. If I have a legacy codebase built around the thread-per-request model, with limited need for concurrency control (other than the occasional launch and async), then Virtual Threads + CompletableFuture are the way to go

Watching Roman’s video “Coroutines and Loom behind the scenes” seems to support the above conclusions. He often refers to Virtual Threads as best suited for the “Virtual Thread per Request” model.

Right, they don’t cooperate right now, AFAIK. Roman mentioned in his talk (before he left JetBrains) that maybe Loom + coroutines could work together better at some point in the future. Not sure if that is something on the Kotlin roadmap or not.

I’m wondering about the same. We can easily imagine my whole fork-join example above being automatically translated by the coroutines machinery to VTs: when we do async, coroutines internally start a new thread, delay() becomes Thread.sleep, and await is a join or awaiting a future. Job done.

However, coroutines provide a much more advanced and lower-level API than Loom. Continuations are part of the official API and we can do many crazy things with them, not necessarily related to concurrency: we can create state machines, monads, generators, etc. I don’t think these are directly translatable to VTs. Also, coroutines are scheduled cooperatively, so they provide certain guarantees which VTs again can’t provide.
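
A tiny example of that non-concurrency use: the stdlib’s sequence builder is a generator built on the same suspension machinery.

fun fibonacci(): Sequence<Long> = sequence {
    // The block is a restricted coroutine: each yield suspends it
    // until the consumer asks for the next element.
    var a = 0L
    var b = 1L
    while (true) {
        yield(a)
        val next = a + b
        a = b
        b = next
    }
}

fun main() {
    println(fibonacci().take(10).toList())  // [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
}

No dispatcher or thread is involved here; the “suspension” is just the compiler-generated state machine.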

But of course, coroutines could use the native functionality of VTs wherever possible and still use their own implementation otherwise (but this would probably mean we still have suspend functions, even if we don’t need them most of the time). Or they could limit the functionality of coroutines when compiling the code with Loom support.

1 Like

A coroutine example I’m fond of is this, which will switch a UI component to an error state for 3 seconds and then switch it back.

mainScope.launch {
    view.setErrorState(true)
    delay(3000)
    view.setErrorState(false)
}

This always runs on one thread (the main one), doesn’t do anything that most people think of when they think “concurrent programming”, and is safe since the scope will be cancelled when the UI goes away. This is my counterargument whenever I hear someone say, “Coroutines are lightweight threads.”

Thanks for the example. I read from your response that you mean “Coroutines are more than lightweight threads”… i.e., like Roman mentioned, fine-grained concurrency. Or, more broadly, tools for doing more with concurrency than vanilla async/await.

I think the best definition of coroutines is “code that can suspend without blocking the thread it’s running on”. So I guess VTs are code that can block a thread, but you don’t care.

For me, “coroutines” means we have the ability to explicitly jump to another code location and stack. This comes from the name: a subroutine is when we call another part of the code and it becomes part of our execution flow; it is added to our stack, it becomes our child. A coroutine is when we call into another existing execution flow, our sibling, another stack.

Such a jump can be used for suspending (we jump out somewhere, and after some time someone jumps back to us), but it can be used for many other cases, again: state machines, generators, etc. Kotlin coroutines provide this capability with continuations. VTs don’t provide such a capability: we can only request to suspend or resume another VT, but we can’t request to jump to another VT.
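
A small sketch of that with the low-level API (the callback-style client is made up): suspendCoroutine hands us the continuation as a first-class object that we can stash and resume later, from anywhere.

import kotlin.coroutines.resume
import kotlin.coroutines.suspendCoroutine

// Hypothetical callback-style client, for illustration only.
interface CallbackClient {
    fun fetch(onDone: (String) -> Unit)
}

// The captured continuation is a "jump" back into this suspended stack;
// whoever calls resume decides when and from which thread we continue.
suspend fun fetchSuspending(client: CallbackClient): String =
    suspendCoroutine { cont ->
        client.fetch { result -> cont.resume(result) }
    }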

But I guess this is a digression from the main topic.

You might be interested in reading/contributing to this:

In my to-read list currently, but haven’t had the time yet.

I don’t fully understand the interplay between VTs in Loom and coroutines, so the following may be incorrect.

When a coroutine is started, it will get a VT as a carrier thread.

When a coroutine suspends, it is unmounted from the VT. Does that VT then get garbage collected?

When a coroutine resumes, I presume a new VT is created and the coroutine is mounted to this new VT.

What happens when the VT yields? For example, when it makes a JDBC call.

I presume that the coroutine is just hanging out in state on the VT, so the whole thing (coroutine + VT) is unmounted from the VT’s carrier thread. When the VT is resumed, the whole thing gets mounted on a platform thread and continues.

Is this right?

Are there currently any downsides to this approach? Obviously, there are a lot more allocations, as you create both a coroutine and a VT. You also create a new VT every time a coroutine resumes (I think).

Not sure of the other tradeoffs, risks.

And to my earlier question, is anyone doing this in production?

Thanks!

I believe you got it right and this is what I described in previous posts. I don’t see major downsides of this approach, only:

  • Added complexity - sometimes we suspend using VTs, sometimes using coroutines; both mechanisms are independent of each other, so developers, tools, etc. need to be aware of both mechanisms.
  • Potentially added overhead, e.g. we still use suspend functions, continuations, etc. even if we only ever suspend using VTs.

My understanding so far:

  • Coroutines are more lightweight. If you have many small CPU-bound tasks (e.g. actors, complex streaming APIs, reactive pipelines), coroutines will be better.
  • Virtual threads automatically convert blocking IO into async IO. If your workflow is very IO-bound (e.g. just calling a database driver), Virtual threads will be better.
  • In theory, modern frameworks should only use async IO, so Virtual threads’ auto-conversion shouldn’t help much, but in practice many widely used libraries still use blocking IO.

To your last point, a ton of apps use JDBC / JPA which is blocking.
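
Which is exactly where virtual threads help without code changes; a rough sketch of thread-per-request with blocking JDBC (the URL and query are placeholders, and a driver is assumed on the classpath):

import java.sql.DriverManager

fun main() {
    val url = "jdbc:postgresql://localhost:5432/app"  // placeholder URL

    // One virtual thread per "request"; the body is ordinary blocking code.
    val worker = Thread.ofVirtual().start {
        // While the query waits on the socket, the JDK parks the virtual
        // thread and frees its carrier; no code changes needed.
        DriverManager.getConnection(url).use { conn ->
            conn.createStatement().executeQuery("SELECT 1").use { rs ->
                rs.next()
                println(rs.getInt(1))
            }
        }
    }
    worker.join()
}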

1 Like