Coroutines with blocking APIs such as JDBC


#1

APIs such as JDBC expose blocking function calls, not suspending ones, for example Statement::executeQuery or Statement::executeUpdate.

This means that every operation in flight blocks a thread. If you are waiting for 10k queries to complete, you need 10k threads in your JVM.

Pools can reduce the number of threads created, but they cannot reduce the number of threads that are waiting.

This raises many issues:

  • Coroutines ARE NOT light-weight -> In this case, they require a whole thread!
  • Thread-safety issues -> This approach forces you to use several threads concurrently. They may usually be sleeping, but a bunch of them can wake up at the same time and modify shared state.

What are the best practices when dealing with these APIs?


#2

The same ones used for regular blocking code.


#3

@fvasco is right. But I will add that many DBs have non-blocking clients these days, which matters if you are writing new DB code, as it seems you are. Even the pure Java NIO-based ones that return CompletableFuture work well with coroutines or reactive libraries.
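
As a rough sketch of how a CompletableFuture-based client plugs into coroutines: kotlinx.coroutines provides an await() extension on CompletableFuture (in kotlinx.coroutines.future). The queryAsync function below is a hypothetical stand-in for a real non-blocking driver call, not any particular client's API.

```kotlin
import java.util.concurrent.CompletableFuture
import kotlinx.coroutines.future.await
import kotlinx.coroutines.runBlocking

// Hypothetical stand-in for a non-blocking client call that returns a CompletableFuture.
fun queryAsync(sql: String): CompletableFuture<List<String>> =
    CompletableFuture.supplyAsync { listOf("row for: $sql") }

// The coroutine suspends until the future completes instead of blocking its thread.
suspend fun query(sql: String): List<String> = queryAsync(sql).await()

fun main() = runBlocking {
    println(query("SELECT 1"))
}
```

With a wrapper like this, the rest of the code only sees a suspending function and does not need a dedicated thread per pending query.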


#4

Vert.x handles this with two thread pools: one for short runners and one for long runners (e.g. blocking i/o). The pool for long runners has a fixed size, so its number of threads cannot grow beyond a certain limit. The developer needs to check whether sufficient threads are available in the long-runner pool before starting a task that does blocking i/o.

So the developer has to think in advance, at program design time, about how to minimize blocking i/o. If things go awry, the code has to be prepared to spend the wait time doing other things until a long-runner thread becomes available.

This approach in vert.x shows that there is a dilemma for which there is no silver bullet. You have to know where blocking i/o is happening and have to minimize it in the program design.


#5

Maybe switching the context to Dispatchers.IO would help. Check this out if you haven’t already: https://medium.com/@elizarov/blocking-threads-suspending-coroutines-d33e11bf4761

IO-bound code does not actually consume CPU resources, so if we use the default dispatcher we may end up with a situation when, for example, on an 8-core machine with 8 threads allocated to the default dispatcher, all of the threads are blocked on IO, but they do not actually consume CPU, so our 8-core machine is underutilized. IO dispatcher allocates additional threads on top of the ones allocated to the default dispatcher, so we can do blocking IO and fully utilize machine’s CPU resources at the same time.

Technically, coroutines always require a whole thread when running. The advantage comes from them not being bound 1-to-1 to threads.

Coroutines are in fact lightweight. Launching 10k coroutines that all call a blocking method does not mean you have 10k threads: on Dispatchers.IO the default limit is 64 threads (or the number of cores, if that is larger). It is true that you cannot have all 10k blocked at the same time without 10k threads. A coroutine that calls a blocking method does block its thread until the call returns, and the remaining coroutines wait until a thread frees up.
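
A small sketch of that point, assuming Thread.sleep as a stand-in for a blocking JDBC call (blockingQuery and distinctThreadsUsed are hypothetical helper names). It launches many coroutines on Dispatchers.IO and counts how many distinct threads actually served them; the exact count depends on the machine, but it stays far below the number of coroutines.

```kotlin
import java.util.concurrent.ConcurrentHashMap
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Stand-in for a blocking call such as Statement.executeQuery (assumption: ~10 ms each).
fun blockingQuery(id: Int): Int {
    Thread.sleep(10)
    return id
}

// Runs `count` blocking calls from separate coroutines and returns
// the number of distinct threads that executed them.
suspend fun distinctThreadsUsed(count: Int): Int {
    val threads = ConcurrentHashMap.newKeySet<Thread>()
    coroutineScope {
        repeat(count) { id ->
            launch(Dispatchers.IO) {
                blockingQuery(id)
                threads.add(Thread.currentThread())
            }
        }
    }
    return threads.size
}

fun main() = runBlocking {
    println("threads used for 1000 coroutines: ${distinctThreadsUsed(1_000)}")
}
```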

As for thread safety, this is no different from using coroutines without blocking calls. Whenever you have concurrent operations on shared mutable state you must watch for race conditions. Even if you run all of your coroutines on a single thread, you may still have to worry about shared mutable state.
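
To illustrate the single-thread case, here is a minimal sketch (unsafeTotal and safeTotal are hypothetical names). All coroutines run on runBlocking's one thread, yet the unprotected version still loses updates because each coroutine suspends between its read and its write. Guarding the read-modify-write with a Mutex is one way to fix it.

```kotlin
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock
import kotlinx.coroutines.yield

// Read, suspend, then write: other coroutines run during the suspension,
// so the write is based on a stale value and increments are lost.
suspend fun unsafeTotal(): Int {
    var counter = 0
    coroutineScope {
        repeat(100) {
            launch {
                val seen = counter
                yield() // suspension point
                counter = seen + 1
            }
        }
    }
    return counter
}

// Same logic, but the whole read-modify-write runs under a Mutex.
suspend fun safeTotal(): Int {
    var counter = 0
    val mutex = Mutex()
    coroutineScope {
        repeat(100) {
            launch {
                mutex.withLock {
                    val seen = counter
                    yield()
                    counter = seen + 1
                }
            }
        }
    }
    return counter
}

fun main() = runBlocking {
    println("unsafe: ${unsafeTotal()}, safe: ${safeTotal()}")
}
```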


#6

It does not make sense to have thousands of DB connections. With an optimized database you should use something like 100-200 connections, maybe even fewer on a less beefy server. So introduce a thread pool of that size for database operations and offload all blocking operations to that pool. Your coroutine threads will then be able to run without blocking.
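
One possible way to set this up, as a sketch: wrap a fixed-size executor with asCoroutineDispatcher() and move every blocking call onto it with withContext. The pool size of 4, the daemon threads, and the names dbDispatcher, blockingUpdate and update are assumptions for the demo; in practice the size would match your connection pool.

```kotlin
import java.util.concurrent.Executors
import kotlinx.coroutines.asCoroutineDispatcher
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Assumption: 4 threads for the demo; size this to your connection pool in real code.
// Daemon threads so a forgotten close() does not keep the JVM alive.
val dbDispatcher = Executors.newFixedThreadPool(4) { r ->
    Thread(r, "db-worker").apply { isDaemon = true }
}.asCoroutineDispatcher()

// Stand-in for a blocking call such as Statement.executeUpdate.
fun blockingUpdate(id: Int): Int {
    Thread.sleep(5)
    return id * 2
}

// Suspending wrapper: the caller's thread is free while the pool thread blocks.
suspend fun update(id: Int): Int = withContext(dbDispatcher) { blockingUpdate(id) }

fun main() = runBlocking {
    val results = (1..20).map { async { update(it) } }.awaitAll()
    println(results.sum())
    dbDispatcher.close()
}
```

At most 4 blocking calls run at once here; the other coroutines stay suspended, not blocked, until a pool thread is free.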


#7

I would suggest using an async driver like jasync sql: https://github.com/jasync-sql/jasync-sql
P.S. I contribute to that project.