My experience with "select" in coroutines

frizzil · March 6, 2023, 7:56am

Hi folks, I wanted to talk a little about my experience with coroutine select in the context of a game engine.

First of all, I’m very happy that it exists in Kotlin! Coroutines are so incredibly useful, and there’s so much yet-unexplored potential for game design purposes, so thank you JetBrains!

The select part of the library seems experimental and little spoken of, so I wanted to talk about it a bit, and about where my needs differ from what’s provided. Here’s an example script taken directly from a combat tutorial I’m prototyping in my game (making use of a “dialogue DSL”):

    // Wait for enemy encounter:
    script.waitFor(PlayerScript.State.AGGROED)
    wait(1000.milliseconds)
    pauseGame = true
    defaultAdvance = GuiSpeech.AdvanceMode.AUTO
    show {
        - "Alright, good work! I'll pause the game so you can plan."
        + "Press the right mouse button or click the right thumbstick to lock on!~"
    }
    var lockedOn = false
    var takenDamage = false
    var ranOutOfStamina = false
    var startedBleedingOut = false
    var timesDead = 0
    whileSelect {
        if (!lockedOn) script.onState(PlayerScript.State.LOCKED_ON) {
            lockedOn = true
            soundFacade.playTutorialSuccess()
            wait(1000.milliseconds)
            show {
                - "Great, now your camera will track with the enemy!"
                + "On mouse and keyboard, you can look freely as the camera tracks."
                + "On controller, you can tilt the right thumbstick to adjust your aim.~"

                - "Try to lead the bad guy's position as you aim."
                + "The longer you charge your shot, the easier it will be to hit your target!"
                + "Keep an eye on your stamina though...~"

                - "Oh! And these guys are fast! Watch out when they disappear and break your lock!~"
            }
            true
        }
        if (!takenDamage) script.onEvent(PlayerScript.Event.HIT_BY_PROJECTILE) {
            takenDamage = true
            wait(500.milliseconds)
            show {
                - "Ouch! Make sure to move around while you're aiming - the first rule of combat is to survive!"
                + "You'll also lose your charge when you're hit, so watch out!~"
            }
            true
        }
        if (!ranOutOfStamina) script.onEvent(PlayerScript.Event.NOT_ENOUGH_STAMINA) {
            ranOutOfStamina = true
            wait(500.milliseconds)
            show {
                - "Oh no, you're out of stamina! You can either press E or the Y button to eat and regain it,"
                + "or you can look around for a green halo of light on the ground and blow it up.~"

                - "Running away is also an option, but these guys can be hard to shake!"
                + "Plus you're liable to run into more bad guys that way, so be careful!~"
            }
            true
        }
        if (!startedBleedingOut) script.onState(PlayerScript.State.BLEEDING_TO_DEATH) {
            if (player.vitals.bloodPct > 0.15) {
                wait(500.milliseconds)
                startedBleedingOut = true
                show {
                    - "You're bleeding out! AAAAHHHHH!"
                    + "...okay, it's not the WORST thing that could happen..."
                    + "But if you don't take care of it, you're _totally_ gonna die.~"

                    - "If you go search for one of those _red_ halos of light and blow it up,"
                    + "it should stop the bleeding, and recover a bit of your health to boot."
                    + "Those bad guys aren't gonna make it easy though, so be evasive!~"
                }
            }
            true
        }
        script.onState(PlayerScript.State.DEAD) {
            timesDead += 1
            withContext(GameDispatchers.General) { delay(1000.milliseconds) } // real time
            when (timesDead) {
                1 -> {
                    show { - "Ah well, it happens to the best of us! All you can do is get up and try again, eh?" }
                    script.waitFor(PlayerScript.State.SPAWNED)
                    wait(500.milliseconds)
                    pauseGame = false
                    show { - "Alright, let's find another bad guy and beat 'em up! I'm counting on you!" }
                }
                2 -> {} //...
            }
            script.waitFor(PlayerScript.State.AGGROED, true)
            wait(1000.milliseconds)

            pauseGame = false
            show { - "Alright, round ${timesDead + 1}! Here we go!" }
            pauseGame = true
            true
        }
        script.onState(PlayerScript.State.AGGROED, false) {
            wait(4.seconds) // wait for last enemy to explode
            pauseGame = true
            show {
                - "Great work! You've officially completed the tutorial."
                + "I hope you enjoy your time in the world of Sojourners!~"
            }
            false
        }
    }

To me, this feels like an incredibly natural pattern for a combat tutorial, where event order is out of our control, and yet some degree of structure is required. However, you’ll notice that I check the “do once” conditions before registering any select clause. This is to prevent infinite looping on “state” clauses, which invoke automatically if the state is currently active at selection start.

The “do once” pattern works well, but it isn’t particularly natural, nor is it efficient in terms of allocation - every time an event is selected, every single clause must be reconsidered and reallocated. I’m not sure what a better solution would be, but to me it definitely feels like an area for improvement… maybe a new selectRepeated function could work, where clauses are persisted across multiple select invocations and can be added/removed over time? As it stands, the only purpose of code between the clauses is to filter the clauses, but this could be readily encapsulated imo.
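
One way the “do once” bookkeeping could be encapsulated today, without new library support, is a small helper that skips clauses that have already fired. This is only a minimal sketch: `OnceSet` is a hypothetical class (not part of kotlinx.coroutines), and `CompletableDeferred.onAwait` stands in for a “state” clause, since it also fires immediately on every selection once completed. It does not solve the reallocation problem; it only hides the flags.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.selects.whileSelect

// Hypothetical helper (not a library class): remembers which one-shot clauses already fired.
class OnceSet {
    private val fired = mutableSetOf<String>()

    // Calls `register` only while `key` has not fired; the clause body calls `markFired` itself.
    fun clause(key: String, register: (markFired: () -> Unit) -> Unit) {
        if (key !in fired) register { fired.add(key) }
    }
}

@OptIn(ExperimentalCoroutinesApi::class)
fun runTutorial(): List<String> = runBlocking {
    val lockedOn = CompletableDeferred<Unit>() // stays "active" once completed, like a state
    val finished = CompletableDeferred<Unit>()
    val once = OnceSet()
    val log = mutableListOf<String>()

    launch {
        lockedOn.complete(Unit)
        delay(10)
        finished.complete(Unit)
    }

    whileSelect {
        // Without the guard this clause would win every iteration and loop forever.
        once.clause("lockedOn") { markFired ->
            lockedOn.onAwait { markFired(); log += "locked on"; true }
        }
        finished.onAwait { log += "finished"; false }
    }
    log
}

fun main() {
    println(runTutorial()) // prints [locked on, finished]
}
```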

The other thing I wanted to talk about was complexity in implementing my own select clauses - you’ll notice I have onState and onEvent methods which are implemented like so:

    context(SelectBuilder<R>)
    fun <R> onEvent(event: Event, block: suspend (Any?) -> R) {
        @OptIn(InternalCoroutinesApi::class)
        val clause = makeSelectClause1<Any?>({ getList(event) += it; true }) { getList(event) -= it }
        return clause.invoke(block)
    }

    @InternalCoroutinesApi
    inline fun <Q> makeSelectClause1(crossinline register: ((Q) -> Unit) -> Boolean, crossinline deregister: ((Q) -> Unit) -> Unit): SelectClause1<Q> {
        return object : SelectClause1<Q> {
            override fun <R> registerSelectClause1(select: SelectInstance<R>, block: suspend (Q) -> R) {
                if (select.isSelected) return
                val handle = { q: Q -> if (select.trySelect()) block.startCoroutine(q, select.completion); Unit }
                if (register(handle)) select.disposeOnSelect { deregister(handle) }
            }
        }
    }

Honestly there’s a lot to talk about here, and I’m sure there are reasons against this, but:

1. The pattern of passing a SelectInstance that can only be intercepted by implementing a SelectClause1 feels laborious, especially when you often can’t expose the clause externally anyway (see next point.) Why can’t SelectBuilder expose the instance directly? It doesn’t appear to be getting reused, after all! (See previous point.)
2. You’ll notice I’ve defined functions that accept parameters and the clause’s lambda, and don’t simply return a clause. This is because Kotlin syntax forbids calling invoke on a value returned from a function via lambda syntax - this is interpreted as an extra parameter to the original function instead, even if it doesn’t accept a lambda, resulting in a compile error! The existing syntax of clauses as invokable properties (e.g. onAwait) seems unnatural to me to start with, but this limitation makes it worse imo. It’s not obvious that you’ll run into this problem defining your own clauses.
3. There’s enough boilerplate involved that I felt the need to create an inline helper function to define anonymous classes at call sites. The pattern of creating a “handle” and disposing it, invoking block as a coroutine that chains with select.completion, etc. is very unintuitive, and I had to scrape it from kotlinx.coroutines library code to have any chance of getting it right (and I probably didn’t.) The pattern may be technically necessary and perfect in some sense, but from a user perspective I think this could use a lot of improvement.
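
The trailing-lambda limitation from the second point can be reproduced without any coroutine machinery. A stripped-down sketch, where `Clause`, `onThing` (a property) and `onThingFor` (a function) are hypothetical stand-ins for a select clause and its two possible declarations:

```kotlin
// Hypothetical stand-in for a select clause: something invokable with a lambda.
class Clause(private val label: String) {
    operator fun invoke(block: (String) -> String): String = block(label)
}

// Clause exposed as a property: trailing-lambda syntax calls `invoke` as expected.
val onThing = Clause("property")

// Clause returned from a function that takes a parameter.
fun onThingFor(name: String): Clause = Clause(name)

fun demo(): List<String> {
    val a = onThing { "got $it" }

    // Does not compile: the lambda is parsed as an extra argument to `onThingFor`,
    // not as an `invoke` on the Clause it returns.
    // val b = onThingFor("function") { "got $it" }

    // Spelling out `invoke` works, at the cost of the clause-like syntax.
    val b = onThingFor("function").invoke { "got $it" }
    return listOf(a, b)
}

fun main() {
    println(demo()) // prints [got property, got function]
}
```

This is why parameterized clauses end up as functions that take the block themselves, as with onEvent above.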

I suppose to sum up my concerns, it comes down to unintuitiveness, boilerplate, and multiple unavoidable gotchas. Even with my helper function, clauses can get pretty hard to understand, and I feel the need to define more than a few of them! Conceptually though, I feel like new clauses should be about as hard to add as a new suspend function that invokes suspendCancellableCoroutine (or even directly convertible from such definitions.)

Example complex clause:

    context(SelectBuilder<R>)
    fun <R> onState(state: State, value: Boolean = true, block: suspend () -> R) {
        @OptIn(InternalCoroutinesApi::class)
        val clause = makeSelectClause0({
            if (states[state.ordinal] == value) {
                it() // invoke immediately (can infinitely loop!)
                false // don't register for select dispose
            } else {
                getList(state, value) += it
                true
            }
        }) {
            getList(state, value) -= it
        }
        return clause.invoke(block)
    }

And for contrast, the equivalent suspend functions of my clauses:

    suspend fun waitFor(event: Event): Any? {
        val list = getList(event)
        return suspendCancellableCoroutine { cont ->
            list += cont
            cont.invokeOnCancellation { list -= cont }
        }
    }
    suspend fun waitFor(state: State, value: Boolean = true) {
        if (states[state.ordinal] == value) return
        val list = getList(state, value)
        return suspendCancellableCoroutine { cont ->
            list += cont
            cont.invokeOnCancellation { list -= cont }
        }
    }
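
For comparison, suspending functions like these can already be raced using only public API, by wrapping each one in `async` and selecting on `onAwait`. This is only a sketch: `raceAll` is a hypothetical helper, and it launches one coroutine per branch, so it does nothing for the allocation concerns above.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.selects.select

// Hypothetical helper: runs every branch concurrently, returns the first result,
// and cancels the branches that lost.
suspend fun <R> raceAll(vararg branches: suspend () -> R): R = coroutineScope {
    val racers = branches.map { branch -> async { branch() } }
    try {
        select<R> {
            racers.forEach { racer -> racer.onAwait { it } }
        }
    } finally {
        // Losers are cancelled so coroutineScope does not wait for them to finish.
        racers.forEach { it.cancel() }
    }
}

fun raceDemo(): String = runBlocking {
    raceAll<String>(
        { delay(200); "slow" },
        { delay(20); "fast" },
    )
}

fun main() {
    println(raceDemo()) // prints fast
}
```

Each branch can be any suspending call (for example one of the waitFor overloads), which gives the mix-and-match behaviour without touching @InternalCoroutinesApi.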

Thanks for reading, hopefully someone working on the select library finds this useful! If you have experience working with select, please feel free to share as well.

broot · March 6, 2023, 8:52am

I’m not saying this is wrong, but it feels to me like an overuse of select(). select() is one of the basic concurrency tools, and I’m not sure it is even meant to be extended with application logic.

Did you consider creating a channel or flow of events? Seems to me like a much simpler and more typical solution.

frizzil · March 11, 2023, 9:28pm

Sorry for the late reply, been a crazy week.

If I’m not mistaken, Flow and Channel cannot meet my needs for the following reasons:

1. They are both homogeneous, whereas I need the ability to await completion of any combination of suspended coroutines (not necessarily just “events.”) This is accomplished specifically by select, and would be invaluable for enemy AI, for example.
2. They are single consumer, i.e. their elements are removed once examined. I may have any combination of scripts running that are interested in the same “events.” I don’t want to duplicate the data structure for each consumer, either.

Why do you think select would be outside the general application domain? Perhaps implementing our own clauses would be (due to current complexity), but conceptually I think select belongs front-and-center for something like game AI, which I’ve seen precedent for in other game engines. All it means is “return the first clause that completes and dispose the others.” I don’t see the danger here that you might see with something like threading primitives, for example.

broot · March 12, 2023, 9:50pm

Yes, I mean writing our own clauses. All existing clauses are related to generic concurrency primitives, and you started extending this functionality with your application logic. Again, I’m not saying this is wrong, but to me it feels strange. In your implementation you used things like startCoroutine() and even SelectInstance, which is marked as @InternalCoroutinesApi - this is a good indicator that you went pretty low level.

I most probably don’t see the whole picture here, but at first sight your case looks like a common scenario where we want to process asynchronous events sequentially. The typical solution to this problem is an event loop. Why wouldn’t that work here?

frizzil · March 12, 2023, 11:59pm

Alright, I’m not sure what you mean by “event loop,” but I discovered SharedFlow and StateFlow, which appear to be designed specifically for events with multiple consumers… but do not appear to have a select clause yet. This article was helpful: “Shared flows, broadcast channels” by Roman Elizarov (Medium).

I’m going to investigate their implementation, and if it’s not too heavy I’ll probably write a select clause for them (if possible) then swap everything out to them. (The only reason my existing implementation would be better is because it can assume single-thread constraint and avoid synchronization, and may also avoid some allocations.)

broot · March 13, 2023, 12:33am

I don’t think this is possible. Flows are cold by design, which means they don’t do anything until we collect from them. We can’t take the “next” item the way we can from an iterator or a channel.

By “event loop” I mean something along these lines:

    private val queue = Channel<suspend () -> Unit>()

    private suspend fun dispatch(block: suspend () -> Unit) = queue.send(block)

    suspend fun main() = coroutineScope {
        launch {
            queue.consumeEach { it() }
        }

        script.addStateListener { state ->
            when (state) {
                PlayerScript.State.LOCKED_ON -> dispatch {
                    ...
                }
            }
        }
        script.addEventListener { ev ->
            when (ev) {
                PlayerScript.Event.HIT_BY_PROJECTILE -> dispatch {
                    ...
                }
                PlayerScript.Event.NOT_ENOUGH_STAMINA -> dispatch {
                    ...
                }
            }
        }
    }

Even better, replace listeners with flows.

frizzil · March 13, 2023, 12:44am

> The reason is that flows are by design cold

Actually, SharedFlow and subclasses are specifically an exception to this and are hot (see that article I linked). The real question is whether the SharedFlow API could permit a SelectClause0/1 implementation without library-level changes.

> Something along these lines:

In all honesty this pattern involves relatively fewer allocations, I’ll have to think about it some more. But the point still stands that each listener is constrained to listening only to Events, i.e. you can’t mix-and-match with other suspending functions (or select clauses in my case.) There’s also some boilerplate you’d need to deregister the listener upon completion (or move that aspect outside the running script, but use flags to enable listening… but now we’re getting into spaghetti FSM territory that coroutines are supposed to solve!)

Now that I’m thinking about it, I could remove the select call and have a generic suspending waitAnyEvent function, that returns any event, then we just loop on it until whatever condition is met. We lose the mix-and-match there as well, but for constrained cases it could be ideal. Downside is it’s not very efficient invoking a bunch of coroutines any time any event is called.
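
A rough sketch of that loop-on-any-event idea on top of a SharedFlow, which also covers the multiple-consumer requirement. The `Event` enum and `runScripts` are illustrative names only, and this is not a select clause: each “script” simply collects the shared stream until its own stop condition is met.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

// Illustrative event type; the real engine's events would go here.
enum class Event { HIT_BY_PROJECTILE, NOT_ENOUGH_STAMINA, ENEMY_DEFEATED }

fun runScripts(): Pair<List<Event>, Int> = runBlocking {
    // Extra buffer so the non-suspending tryEmit below can succeed
    // while the subscribers are not yet running.
    val events = MutableSharedFlow<Event>(extraBufferCapacity = 16)

    // UNDISPATCHED runs each script up to its first suspension, so both are
    // subscribed before anything is emitted.
    val seen = async(start = CoroutineStart.UNDISPATCHED) {
        events.takeWhile { it != Event.ENEMY_DEFEATED }.toList()
    }
    val hits = async(start = CoroutineStart.UNDISPATCHED) {
        events.takeWhile { it != Event.ENEMY_DEFEATED }
            .count { it == Event.HIT_BY_PROJECTILE }
    }

    // Both subscribers observe every event; nothing is consumed by one of them.
    listOf(
        Event.HIT_BY_PROJECTILE,
        Event.NOT_ENOUGH_STAMINA,
        Event.HIT_BY_PROJECTILE,
        Event.ENEMY_DEFEATED,
    ).forEach { events.tryEmit(it) }

    seen.await() to hits.await()
}

fun main() {
    val (seen, hits) = runScripts()
    println(seen) // prints [HIT_BY_PROJECTILE, NOT_ENOUGH_STAMINA, HIT_BY_PROJECTILE]
    println(hits) // prints 2
}
```

Collecting continuously, rather than calling something like first() in a loop, avoids re-subscribing between iterations and missing events emitted in the gap.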
