Coroutines with debug duplicates exception

Hi all, I’ve found what I consider to be a bug with coroutines and exception handling.

In summary, if you throw an exception inside a suspending function, and the exception either has no cause, or has a cause with a different type to the exception, when the exception is thrown from the coroutine, it’s actually a new exception of the same type as the original exception, which has the original exception as a cause.

You can find my reproducible example here: coroutines-exception-bug/src/main/kotlin/au/com/skater901/Main.kt at main · Skater901/coroutines-exception-bug · GitHub

If you run my code, you’ll find that instead of catching and logging an IllegalArgumentException with no cause, it actually catches and logs an IllegalArgumentException which has a cause of IllegalArgumentException with the same message.

The reason this happens is because of these two functions: causeAndStacktrace(), and tryCopyException(exception: E).

The logic inside causeAndStackTrace() looks wrong to me; the main if condition is only triggered if the exception has a cause, and the cause is the same type as the exception. I don’t understand why it’s written this way; I would assume that in the majority of cases, the cause would NOT be the same type as the parent exception.

The other problem is tryCopyException(); it doesn’t really create a copy of the exception, but rather it replaces the existing exception with a new exception of the same type which has the original exception as the cause of this new exception.

Now it is worth noting that this only happens if coroutine debugging is turned on, but I’m finding with my work projects that coroutine debugging is turned on for automated testing, and I’m not sure why. The system property isn’t set, but the ASSERTIONS_ENABLED property evaluates to true, and I have no idea why.

That said, I don’t think that turning on debugging for coroutines should fundamentally change how exceptions are handled, resulting in this weird exception duplication.

You can fix this problem in my example by commenting out the coroutine debug line which is pretty obvious - this bug only happens when coroutine debugging is enabled - or by commenting out the withContext. Certain coroutine types don’t follow the code path that leads to those functions.

To close, I don’t know if this is actually an intended design feature of the code, but if it is, I’d really like an explanation on why, and what I can do about it to avoid my exception handling code getting all messed up. :slight_smile: Thanks!

I don’t know the background and I see this for the first time, but it definitely looks like an intentional behavior, not an overlook. I guess this is somehow related to the fact that while running a coroutine, the stacktrace may change, so sometimes it may be useful to capture multiple stacktraces, especially if we asked for the debug mode.

Anyway, while the behavior looks a little strange, I think it shouldn’t cause problems. Exceptions are for debugging, so even if we change their handling, that shouldn’t affect the behavior of our application.

1 Like

It’s causing problems in automated tests at my work. Any tests around error handling are affected. For example, we use Quartz for job scheduling, and often will catch any exceptions during the job execution, and rethrow a JobExecutionException with the original exception as the cause. Without coroutines, we can assert that a JobExecutionException will be thrown with a specific cause. With coroutines, we now have to assert that a JobExecutionException will be thrown, with a JobExecutionException as a cause, which has a specific cause.

Maybe you can make an argument that we shouldn’t be making assertions on the exception thrown, but I still think it’s weird that turning on debug mode effectively duplicates every exception thrown.

I ran into this problem previously. See this for the details and how to get around it.
TL;DR:

If the exception class implements CopyableThrowable, CopyableThrowable.createCopy is used. null can be returned from createCopy to opt-out specific exception from being recovered.

Note that if you disable this, the stacktrace is going to be absolutely useless

Nope, this is actually not true at all. I just tested it.

If I run my example project with coroutines debugging turned on, this is the exception stack trace I get:

java.lang.IllegalArgumentException: oh no!
  au.com.skater901.MainKt$main$1$1.invokeSuspend(Main.kt:11)
  au.com.skater901.MainKt$main$1$1.invoke(Main.kt)
  au.com.skater901.MainKt$main$1$1.invoke(Main.kt)
  kotlinx.coroutines.intrinsics.UndispatchedKt.startUndispatchedOrReturn(Undispatched.kt:61)
  kotlinx.coroutines.BuildersKt__Builders_commonKt.withContext(Builders.common.kt:163)
  kotlinx.coroutines.BuildersKt.withContext(Unknown Source)
  au.com.skater901.MainKt$main$1.invokeSuspend(Main.kt:10)
  _COROUTINE._BOUNDARY._(CoroutineDebugging.kt:42)
  au.com.skater901.MainKt$main$1.invokeSuspend(Main.kt:10)
Caused by: java.lang.IllegalArgumentException: oh no!
  au.com.skater901.MainKt$main$1$1.invokeSuspend(Main.kt:11)
  au.com.skater901.MainKt$main$1$1.invoke(Main.kt)
  au.com.skater901.MainKt$main$1$1.invoke(Main.kt)
  kotlinx.coroutines.intrinsics.UndispatchedKt.startUndispatchedOrReturn(Undispatched.kt:61)
  kotlinx.coroutines.BuildersKt__Builders_commonKt.withContext(Builders.common.kt:163)
  kotlinx.coroutines.BuildersKt.withContext(Unknown Source)
  au.com.skater901.MainKt$main$1.invokeSuspend(Main.kt:10)
  kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
  kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:104)
  kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:277)
  kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:95)
  kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:69)
  kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
  kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:48)
  kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
  au.com.skater901.MainKt.main(Main.kt:9)
  au.com.skater901.MainKt.main(Main.kt)

If I turn off coroutine debugging by commenting out the line where I set the system property, and then run my code, I get the following stack trace:

java.lang.IllegalArgumentException: oh no!
  au.com.skater901.MainKt$main$1$1.invokeSuspend(Main.kt:11)
  au.com.skater901.MainKt$main$1$1.invoke(Main.kt)
  au.com.skater901.MainKt$main$1$1.invoke(Main.kt)
  kotlinx.coroutines.intrinsics.UndispatchedKt.startUndispatchedOrReturn(Undispatched.kt:61)
  kotlinx.coroutines.BuildersKt__Builders_commonKt.withContext(Builders.common.kt:163)
  kotlinx.coroutines.BuildersKt.withContext(Unknown Source)
  au.com.skater901.MainKt$main$1.invokeSuspend(Main.kt:10)
  kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
  kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:104)
  kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:277)
  kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:95)
  kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:69)
  kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
  kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:48)
  kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
  au.com.skater901.MainKt.main(Main.kt:9)
  au.com.skater901.MainKt.main(Main.kt)

As you can see, in the second instance, where coroutines debugging is turned off, the exception stack trace is exactly the same as the cause exception when coroutines debugging is turned on. The only difference is that I only get one exception, instead of two exceptions where the original exception is the cause of a newly created exception.

You do get an extra stack trace to debug with coroutines debugging turned on, but imo it doesn’t provide any more info than the stack trace I get with coroutines debugging turned off; either way, I can see that the exception was thrown at line 11 in my Main.kt file. Personally, I think the stack trace is more confusing with coroutines debugging turned on, because now the exception that was actually thrown is no longer the primary exception; instead, it becomes the cause of some new exception that was magically thrown from deep inside the coroutines machinery.

This is a lucky situation where both stacktraces are the same. In general, the initial one is not very useful.

1 Like
fun main() {
    System.setProperty("kotlinx.coroutines.debug", "on")
    runBlocking {
        foo()
    }
}

suspend fun foo() {
    bar()
    println(1) // prevent tail-call optimization
}

suspend fun bar() {
    baz()
    println(1)
}

suspend fun baz() {
    coroutineScope {
        delay(100)
        error("Oh no")
    }
}

It produces:

Exception in thread "main" java.lang.IllegalStateException: Oh no
	at Test1Kt$baz$2.invokeSuspend(test1.kt:24)
	at _COROUTINE._BOUNDARY._(CoroutineDebugging.kt:46)
	at Test1Kt.baz(test1.kt:22)
	at Test1Kt.bar(test1.kt:17)
	at Test1Kt.foo(test1.kt:12)
	at Test1Kt$main$1.invokeSuspend(test1.kt:7)
Caused by: java.lang.IllegalStateException: Oh no
	at Test1Kt$baz$2.invokeSuspend(test1.kt:24)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTaskKt.resume(DispatchedTask.kt:235)
	at kotlinx.coroutines.DispatchedTaskKt.dispatch(DispatchedTask.kt:168)
	at kotlinx.coroutines.CancellableContinuationImpl.dispatchResume(CancellableContinuationImpl.kt:474)
	at kotlinx.coroutines.CancellableContinuationImpl.resumeImpl(CancellableContinuationImpl.kt:508)
	at kotlinx.coroutines.CancellableContinuationImpl.resumeImpl$default(CancellableContinuationImpl.kt:497)
	at kotlinx.coroutines.CancellableContinuationImpl.resumeUndispatched(CancellableContinuationImpl.kt:595)
	at kotlinx.coroutines.EventLoopImplBase$DelayedResumeTask.run(EventLoop.common.kt:493)
	at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:280)
	at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:85)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:59)
	at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:38)
	at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
	at Test1Kt.main(test1.kt:6)
	at Test1Kt.main(test1.kt)

In the second stacktrace there is a lot of noise and we don’t even see foo and bar anywhere. This is what happens if debug is disabled.

I must say I don’t fully understand the whole mechanism. For example, if we remove coroutineScope from the code sample, it somehow loses the ability to find foo and bar. recoverStackTrace receives a continuation of the main, not of baz, so it doesn’t know the full trace. But as far as I understand coroutines machinery, it should be able to know the exception was thrown when handling the continuation of baz, so I don’t know why it doesn’t work in this case.

This is similar to what I discovered when I was making my reproducible example. I had to use withContext, because if I just used runBlocking on its own, the code never gets into the path of copying the exception. A BlockingCoroutine has a different resumeWith function which doesn’t duplicate exceptions.

Documentation says:

When an exception is rethrown between coroutines (e.g. through withContext or Deferred.await boundary),

So this probably answers why coroutineScope/withContext is needed. I just don’t understand the decision to not recover the exact place where the exception was thrown. Or I don’t understand the technical limitation that disallows this.

1 Like

There is a library that doesn’t require any additional calls for recovering stack traces: GitHub - Anamorphosee/stacktrace-decoroutinator: Small lib for recovering stack trace in exceptions thrown in Kotlin coroutines