I’ve got a totally unexpected deadlock in my Java/Kotlin code and finally localaized it to this example:
import kotlinx.coroutines.experimental.*
fun main(args: Array<String>) {
val stream = (1..10).map{ num->
async(CommonPool, CoroutineStart.LAZY) {
println("I am running $num")
}
}.stream()
stream.parallel().forEach{ deferred->
runBlocking {
deferred.await()
}
}
}
Just tested it in kotlin online and it hangs. If I remove parallel()
, it works as expected.
My real much more complicated and it is not that simple to exclude this parallel processing without damaging Java part of the program. The problem is possibly is appearing because kotlin blocks all of the threads in the common pool, So it is probably could not be helped. But maybe there is a way around that.
UPDATE The problem seems to arise only for lazy coroutines. Removing CoroutineStart.LAZY
also solves the problem.
Found a workaround by creating deferred with CoroutineStart.DEFAULT
just when it should be started instead of starting previously created lazy deferred. It seems to be a bug though. Also I found that problem could be caused not only by parallel stream, but also by calling coroutine from strange places like groovy script.
Plus running runBlocking
inside of a for-loop instead of outside seems…unconventional, at best.
It could not be avoided since coroutine part is originally called from some Java/Groovy code. In order to move runBlocking
to the top level, one needs to rewrite everything in kotlin.
Filed an issue. I hope it will be fixed. My workaround works, but it forces me to completely avoid lazy coroutines and I really like them.
This is not a bug. This code creates a genuine deadlock. This code does runBlocking
inside stream.parallel()
which dispatches to CommonPool
(by default), so all the threads in common pool are blocked, but, at the same time, coroutines in this code are dispatched to the the same CommonPool
, which is blocked by runBlocking { deferred.await() }
, so they cannot execute. As a general rule, the code that runs inside parallel streams should never block (runBlocking
should not be used there). If this cannot be avoided, then coroutines shall be dispatched to a separate thread pool.
2 Likes
I figured that much. The problem is that I do not have parallel stream in my real code. I have a call from groovy and it still causes deadlock which is very hard to debug. I did not found what actually causes the dead lock.
The strange thing is that everything works fine for default coroutine and does not work for a lazy one.
I just tried to debug my code and runBlocking
is called from main thread and locks from the fist call if debugger is correct.
OK, I run some tests and actually tried to move runBlocking
to the top ensuring it is run only once for the whole executrion. The deadlock still occurs. So it is not connected to the problem I’ve described at the topic. I will try to localize it, but it is not that simple.
Parallel stream blocks CommonPool completely. It is so by design. You cannot combine parallel stream and coroutines running in CommonPool. You should dispatch your coroutines to another thread pool.
1 Like
I just made sure that I never use parallel stream in this code and switched default CommonPool
to custom fixed thread pool. The deadlock still appears. And it does not explain, why only lazy coroutines are locking.
I will search further.
Can you reproduce it with self-contained code?
I am currently trying to do just that. It works fine with one hand-made coroutine, but does not work in a framework. The framework has internal lazy computation dependencies (I was using CompletableFuture
to manage them before) and I think it could be somehow the source of the problem, but for now I still can’t localize it.
I will get back to you as soon as I find something.
I’ve spent almost two days trying to find the problem. For now I have two very similar calls to the same coroutine and everything works fine in one case and hangs in the other. I’ve eliminated any stream (or at least parallel streams in the process and ensured that actual call occurs from the main thread).
The only difference I see when debugging is that field _state
equals Empty{New}
in working case and List{New}[InvokeOnCompletion[kotlinx.coroutines.experimental.future.FutureKt$asCompletableFuture$2@7671cb68]]
in hanging case. Searching through my code I found that indeed there is second case there is a CompletableFuture
that depends on coroutine in question. What I don’t understand is why it affects the coroutine. In my case this dependent future should never be called anyway because it also depends on some coroutines which are never started (using CompletableFuture.allOf()
).
Tomorrow I will try to buuild a standalone example, but I hardly can imagine this to be intended behavior.
Thanks. Please, make sure you are using the latest version of coroutines library (current 0.20). There was a bug in coroutine state-management that was fixed recently and it did affect coroutines in List{New}
state.
1 Like
I wish I knew about it sooner. Seems to be working fine on 0.20.