Rationale behind awaitAll's error handling?

I’m noticing some counter-intuitive error-handling behavior when using awaitAll. For example:

import kotlinx.coroutines.*

suspend fun main(): Unit = coroutineScope {
  // The child fails immediately; as a child of this scope it also cancels the parent.
  val deferred = async { throw Exception("fake") }
  try {
    listOf(deferred).awaitAll()
  } catch (e: CancellationException) {
    println("CancellationException: $e")
  } catch (e: Throwable) {
    println("Throwable: $e")
  }
}

This is caught in the first catch, but if I replace listOf(deferred).awaitAll() with just deferred.await(), then the exception is caught in the second block.

Although this is documented behavior:

If the Job of the current coroutine is cancelled or completed while this suspending function is waiting, this function immediately resumes with CancellationException. There is a prompt cancellation guarantee.

I still find it quite counter-intuitive. What advantage does doing it this way provide? When would one prefer this prompt cancellation over the more straightforward .forEach { it.await() }?

For the above code, I'm not sure we have any guarantee about whether we get CancellationException or fake, in either case: awaitAll() or await().

Note that two separate error-propagation mechanisms are involved here. They race, and whichever wins determines which exception you get. The first is that we invoke await()/awaitAll(), which rethrows the error from the deferred. The second is that we launched the deferred as our child, and a child's failure automatically cancels its parent.

Using .forEach { it.await() } doesn't guarantee you'll get fake either. Put a long-running deferred first, followed by the one that fails, and you will get CancellationException. That makes sense, because you never even reach await() on the child that failed. In fact, you don't have to await anything at all: replace your current awaitAll() with delay() and you will also get CancellationException. Maybe there is an opposite guarantee that awaitAll() always throws CancellationException, but my guess is that it may throw fake as well.
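To make the forEach case concrete, here is a minimal sketch. The function name awaitInOrder and the 10-second delay are my own choices for illustration, not something from your code. The outer try/catch is only there so the function can return what the inner catch observed, since coroutineScope itself rethrows the child's failure once its block has finished.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: reports which exception the inner try/catch observed.
suspend fun awaitInOrder(): String {
    var caught = "nothing"
    try {
        coroutineScope {
            // A long-running child goes first, so forEach suspends on it.
            val slow = async { delay(10_000); 1 }
            // This child fails right away, which cancels the parent scope.
            val failing = async<Int> { throw Exception("fake") }
            caught = try {
                listOf(slow, failing).forEach { it.await() }
                "no exception"
            } catch (e: CancellationException) {
                // Swallowed here only to mirror the question's code.
                "CancellationException"
            } catch (e: Exception) {
                "Exception: ${e.message}"
            }
        }
    } catch (e: Exception) {
        // coroutineScope rethrows the child's failure; ignored for this demo.
    }
    return caught
}

fun main() = runBlocking {
    println(awaitInOrder()) // prints "CancellationException"
}
```

The parent is suspended in slow.await() when the failing child cancels the scope, so it never gets as far as awaiting the child that actually failed.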

If you don’t like the behavior of errors automatically propagating from children to parent, simply use supervisorScope. Then awaitAll() is guaranteed to throw fake from the above code.
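As a rough sketch of that suggestion, this is the code from the question with coroutineScope swapped for supervisorScope. The wrapper function awaitAllUnderSupervisor is a name I made up so the result can be returned and printed.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: describes what awaitAll() threw under a supervisorScope.
suspend fun awaitAllUnderSupervisor(): String = supervisorScope {
    // Under a supervisor, a failing child does not cancel its parent,
    // so only the await()/awaitAll() propagation path remains.
    val deferred = async<Unit> { throw Exception("fake") }
    try {
        listOf(deferred).awaitAll()
        "no exception"
    } catch (e: CancellationException) {
        "CancellationException: ${e.message}"
    } catch (e: Exception) {
        "Exception: ${e.message}"
    }
}

fun main() = runBlocking {
    println(awaitAllUnderSupervisor()) // prints "Exception: fake"
}
```

With the automatic child-to-parent cancellation removed, there is no race left, and the second catch block is the one that runs.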

Also, if you don't plan to use the value returned from await()/awaitAll(), you don't need to call it at all. coroutineScope() automatically waits for all coroutines launched inside it, so there is no need to wait on them explicitly.
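A small sketch of that waiting behavior, assuming a helper named runWithoutExplicitWait and a 100 ms delay that I picked just for the demonstration:

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: records the order of events around a coroutineScope.
suspend fun runWithoutExplicitWait(): List<String> {
    val events = mutableListOf<String>()
    coroutineScope {
        launch {
            delay(100)
            events += "child finished"
        }
        // No join() or await() here; the block simply ends.
        events += "block finished"
    }
    // coroutineScope returns only after the launched child has completed.
    events += "scope returned"
    return events
}

fun main() = runBlocking {
    println(runWithoutExplicitWait())
    // prints [block finished, child finished, scope returned]
}
```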

I see a big difference between await and awaitAll.

The following code does not crash my app. Instead it silently cancels the coroutineScope, and if I then launch another coroutine in the same coroutineScope, the launch call throws no error but the coroutine never executes. So my app gets stuck instead of crashing.

coroutineScope.launch {
    val renders = ArrayList<Deferred<Unit>>()
    val job = coroutineScope.async {
        throw OutOfMemoryError()
    }
    renders.add(job)
    coroutineScope.launch {
        renders.awaitAll()
    }
}

But the following code crashes my app:

coroutineScope.launch {
    val renders = ArrayList<Deferred<Unit>>()
    val job = coroutineScope.async {
        throw OutOfMemoryError()
    }

    coroutineScope.launch {
        job.await()
    }
}

Are we not supposed to call awaitAll?

I believe the problem is not in awaitAll(), but in your code. There is a race condition in it, and even await() isn't guaranteed to surface the error. Even if you observe 100% consistent behavior, it may work differently elsewhere, e.g. on another machine.

The problem is that you launch the consumer of the results in the same scope as the failing jobs. That means that whenever a job fails, the scope is cancelled, and await()/awaitAll() is cancelled with it. We may not even reach the await()/awaitAll() line, because the coroutine is cancelled before that.

Why do you use so many coroutineScope.launch {} calls? Launch only once, then call async {} and wait directly in the current coroutine. Then it will work as expected.
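Something along these lines, as a sketch only. I don't know how your coroutineScope is built, so appScope, its SupervisorJob, the handler and the function name renderOnce are all assumptions of mine, and I throw IllegalStateException rather than OutOfMemoryError to keep the demo harmless. The handler stands in for whatever would normally crash the app.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicReference

// Hypothetical helper: returns the message of the exception that reached the handler.
suspend fun renderOnce(): String? {
    val reported = AtomicReference<Throwable?>(null)
    // Stand-in for the app's crash path (assumption: a handler on the scope).
    val handler = CoroutineExceptionHandler { _, e -> reported.set(e) }
    val appScope = CoroutineScope(SupervisorJob() + Dispatchers.Default + handler)

    // Launch once; async {} children belong to this coroutine, not to appScope.
    appScope.launch {
        val renders = listOf(
            async<Unit> { throw IllegalStateException("render failed") }
        )
        // Whichever exception awaitAll() throws here, this coroutine
        // still fails with the child's original exception.
        renders.awaitAll()
    }.join()

    appScope.cancel()
    return reported.get()?.message
}

fun main() = runBlocking {
    println(renderOnce())
}
```

Because the failing async is a child of the single launched coroutine, its failure makes that coroutine fail with the same exception regardless of which side of the race awaitAll() ends up on, and the failure is reported instead of being lost.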
