The first time a Loom coroutine suspends, the portion of the call stack from the top down to the coroutine entry is copied out into an array (actually 2 arrays). In Kotlin, the corresponding data will already have been copied into the linked chain of Continuation objects that get built by suspending function calls. The cost is of the same order for the data that is actually copied, but Kotlin will have performed more allocations, and Kotlin will also have built and discarded these continuation objects for calls to suspending functions that didn’t actually suspend. This cost that you pay when you don’t suspend is the one that bothers me, but even neglecting that, you can see that a Loom suspension will have a lower amortized cost.
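To make the "cost you pay when you don't suspend" concrete, here is a toy model in Java. It is not what the Kotlin compiler actually emits, and the real compiler can avoid some of this work in some cases. The names Cont, leaf, middle and allocationsForFastPath are all made up for illustration. Each "suspending" call allocates a continuation object linked to its caller's, whether or not it ends up suspending.

```java
// Toy model only: NOT the Kotlin compiler's actual output. All names are hypothetical.
class CpsAllocationSketch {
    static final Object SUSPENDED = new Object(); // marker meaning "this call really suspended"
    static int allocations = 0;

    // Stands in for a per-call continuation object, linked to the caller's one.
    static final class Cont {
        final Cont caller;
        Cont(Cont caller) { this.caller = caller; allocations++; }
    }

    // A "suspending" leaf: builds its continuation up front, then may or may not suspend.
    static Object leaf(boolean reallySuspend, Cont caller) {
        Cont self = new Cont(caller);
        return reallySuspend ? SUSPENDED : (Object) 42;
    }

    // A "suspending" caller: also builds its own continuation before calling the leaf.
    static Object middle(boolean reallySuspend, Cont caller) {
        Cont self = new Cont(caller);
        Object r = leaf(reallySuspend, self);
        if (r == SUSPENDED) return SUSPENDED; // the linked chain is what a resume would use
        return (Integer) r + 1;               // fast path: both objects are already garbage
    }

    // Runs n calls that never suspend and reports how many continuation objects were built.
    static int allocationsForFastPath(int n) {
        allocations = 0;
        for (int i = 0; i < n; i++) middle(false, null);
        return allocations;
    }

    public static void main(String[] args) {
        System.out.println(allocationsForFastPath(1000)); // prints 2000: two objects per call
    }
}
```

In this model nothing ever suspends, yet every call still pays for two short-lived objects. In the stack-copying approach, the same fast path is just an ordinary call and return.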
When a Loom coroutine is resumed, at least its top-most frame needs to be copied back to the call stack. The return address of this frame will be set to a handler so that when the top-most call returns, it will return into some code that will copy back the next-topmost frame, and so on, incrementally as the frames are required. Kotlin doesn’t have a directly corresponding cost here, but of course any frame copied back to the call stack must have been copied out at some point, and it’s copied back only once, so we can count this in the amortized cost of suspension.
When a Loom coroutine suspends again, its stack arrays will still contain any frames that were not resumed, so these do not need to be re-copied. It will make space at the end of the arrays if required and copy in any new frames. Again, Kotlin will have made Continuation objects for these frames, etc., etc., so the operation will be cheaper in Loom.
However, the very top-most frame in this case may be a frame that was copied out and in before. Kotlin will only make a Continuation object for a frame once, so this one frame represents an extra cost for Loom that Kotlin doesn’t have. A single stack frame is not a big thing, however, especially since Java doesn’t have any big value types, so this is essentially a small constant cost per suspend/resume that is dwarfed by all the other constant costs involved.
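The suspend, resume, suspend-again sequence described above can be simulated with a small Java toy. This is only a bookkeeping model of which frames get copied, not the JVM's implementation, and it uses a single list of frame names where the real thing uses arrays of raw stack data. The class and method names are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Toy bookkeeping model; frames are just names. All names here are hypothetical.
class LazyThawSketch {
    private final List<String> saved = new ArrayList<>(); // frames copied out, bottom first
    private final Deque<String> live = new ArrayDeque<>(); // frames on the thread stack, top first
    int copiedOut = 0;
    int copiedIn = 0;

    void call(String frame) { live.push(frame); }

    // Suspend: append only the live frames to the end of the saved array.
    void suspend() {
        Iterator<String> bottomFirst = live.descendingIterator();
        while (bottomFirst.hasNext()) { saved.add(bottomFirst.next()); copiedOut++; }
        live.clear();
    }

    // Resume: copy back just the top-most saved frame.
    void resume() { thawTop(); }

    // Return from the top frame; if the live stack runs dry, copy back the next saved frame.
    void ret() {
        live.pop();
        if (live.isEmpty() && !saved.isEmpty()) thawTop();
    }

    private void thawTop() {
        live.push(saved.remove(saved.size() - 1));
        copiedIn++;
    }

    List<String> savedFrames() { return saved; }

    static LazyThawSketch demo() {
        LazyThawSketch co = new LazyThawSketch();
        co.call("entry"); co.call("a"); co.call("b"); co.call("c");
        co.suspend();  // first suspension: 4 frames copied out
        co.resume();   // only "c" is copied back in
        co.call("d");
        co.suspend();  // "c" and "d" copied out; "entry", "a", "b" are untouched
        return co;
    }

    public static void main(String[] args) {
        LazyThawSketch co = demo();
        System.out.println(co.copiedOut + " out, " + co.copiedIn + " in, saved=" + co.savedFrames());
    }
}
```

In the demo, the second suspension copies only two frames, and one of them ("c") is the frame that had already been copied out and in once. That repeat copy is the small extra cost mentioned above.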
So, in terms of actual suspend/resume operations, Loom’s system is more efficient. Added to that, you have more efficient byte code, because it doesn’t have to indirect through a Continuation object, more compact byte code, and no red/blue function implementations.
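As an illustration of the "no red/blue functions" point, here is a small sketch using virtual threads, the public API that came out of Project Loom (final in JDK 21), rather than the low-level continuation mechanism discussed above. The names slowDouble and runOnVirtualThread are made up. The same ordinary method is called once from a regular thread and once on a virtual thread, with no special modifier and no second implementation.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch assuming JDK 21 or later. Class and method names are hypothetical.
class NoColorSketch {
    // An ordinary blocking method: no suspend modifier, no Continuation parameter.
    static int slowDouble(int x) throws InterruptedException {
        Thread.sleep(10); // blocks the caller; on a virtual thread the carrier is freed
        return x * 2;
    }

    // Runs the very same method on a virtual thread and waits for its result.
    static int runOnVirtualThread(int x) {
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            Callable<Integer> task = () -> slowDouble(x);
            return executor.submit(task).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(slowDouble(21));         // called from a platform thread: prints 42
        System.out.println(runOnVirtualThread(21)); // same method on a virtual thread: prints 42
    }
}
```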
The basic continuation mechanism in Loom is just better. This is not because the Kotlin guys made any mistakes, of course. It’s just the benefit of being able to mess with the VM.
But again, you know, Kotlin delivers more than just continuations. It’s a whole language that, among other things, provides a practical and easy-to-use coroutine model based on those continuations. Java has a long way to go before it matches Kotlin in that.