[QUESTION] parallel processing like LMAX disruptor

imagine that we have a data processing pipeline consisting of 3 steps:

val finalValue = fn3(fn2(fn1(originalValue)))

however, fn2 is CPU-intensive, so we want to parallelize it:

                    /-> fn2(on thread1) --\
fn1(originalValue) -+-> fn2(on thread2) --+-> fn3(...) -->
                    \-> fn2(on thread3) --/

Which approach is best?

I can think of two approaches:

One is to use two channels to fan out and fan in, but this doesn't preserve the order of the source data (originalValue), so we have to buffer, reorder, and potentially block to provide back pressure… too complex.
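For what it's worth, order can be preserved without explicit reordering by sending `Deferred` results through a single channel and awaiting them in submission order; the channel capacity also gives back pressure for free. A minimal sketch (the stage functions `fn1`/`fn2`/`fn3` and the `process` helper are hypothetical stand-ins):

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.channels.Channel

// Hypothetical stage functions, just for illustration.
fun fn1(x: Int) = x + 1
fun fn2(x: Int) = x * 10   // pretend this is the CPU-intensive step
fun fn3(x: Int) = x - 5

// Fan fn2 out over Dispatchers.Default while keeping the source order:
// the channel carries Deferred results in submission order, so the
// consumer just awaits them one by one. The channel capacity bounds
// the number of in-flight tasks, which provides back pressure.
fun process(values: List<Int>): List<Int> = runBlocking {
    val inFlight = Channel<Deferred<Int>>(capacity = 3)
    launch {
        for (v in values) {
            val staged = fn1(v)                                       // sequential stage
            inFlight.send(async(Dispatchers.Default) { fn2(staged) }) // parallel stage
        }
        inFlight.close()
    }
    val out = mutableListOf<Int>()
    for (d in inFlight) out += fn3(d.await())                         // ordered fan-in
    out
}

fun main() {
    println(process(listOf(1, 2, 3, 4, 5))) // [15, 25, 35, 45, 55]
}
```

This keeps the fan-in trivial at the cost of head-of-line blocking: a slow fn2 task delays the results queued behind it.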

Another approach is to use the LMAX Disruptor, a famous Java library, but this adds another dependency and is not Kotlin-friendly. (Similarly, we could use Java streams, but then we have to work with the stream API's rather complex internals for managing parallelism and stopping infinite streams.)
Most importantly, the Disruptor excels at low latency and high throughput, and it adds code/config complexity to achieve that, but not every project needs it. I'm not too concerned about latency, because I may later refactor the code to distribute fn2 not only across CPU cores (threads) but also across a cluster (different machines). A custom data processing pipeline with different performance targets may not be a good fit for Spark/Flink.

I do not think anyone could give you clear advice based on the data in the question, because there are a lot of details you are not talking about, like how time-consuming your tasks are. Let me just give you several notes:

  1. There are no problems with using Java APIs from Kotlin/JVM. You can just add a few extensions and make them quite Kotlin-friendly.
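As an example of what such an extension might look like (this `use` helper is made up, named by analogy with `Closeable.use`; it is not from any library), a single extension function can make a plain Java `ExecutorService` read idiomatically from Kotlin:

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors

// Hypothetical helper: run a block against a Java ExecutorService
// and always shut the pool down afterwards, Closeable.use-style.
fun <T> ExecutorService.use(block: (ExecutorService) -> T): T =
    try {
        block(this)
    } finally {
        shutdown()
    }

fun main() {
    val answer = Executors.newFixedThreadPool(2).use { pool ->
        pool.submit(Callable { 21 * 2 }).get()
    }
    println(answer) // 42
}
```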

  2. If you have huge amounts of data and are thinking about distributed computing, you probably should go with Kotlin/kotlin-spark-api (Kotlin bindings and extensions for Apache Spark).

  3. If you are working locally, then almost all the tools are there in the coroutines-core library. Parallel flow processing is not implemented yet because it is still being designed, but it is quite easy to write your own extension that does exactly that, like it is done in kmath.
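Such an extension can be sketched in a few lines. The name `mapParallel` is made up (no such operator ships in coroutines-core); it starts up to `concurrency` transforms ahead of the collector and awaits the `Deferred`s in emission order, so results stay ordered:

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

// Hypothetical order-preserving parallel map for Flow. `buffer` lets the
// upstream run ahead in its own coroutine, so up to `concurrency` async
// transforms are in flight while the collector awaits them in order.
fun <T, R> Flow<T>.mapParallel(
    concurrency: Int,
    transform: suspend (T) -> R,
): Flow<R> = flow {
    coroutineScope {
        this@mapParallel
            .map { value -> async(Dispatchers.Default) { transform(value) } }
            .buffer(concurrency)
            .collect { emit(it.await()) }
    }
}

fun main() = runBlocking {
    val squares = (1..5).asFlow().mapParallel(concurrency = 3) { it * it }.toList()
    println(squares) // [1, 4, 9, 16, 25]
}
```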

  4. The DataForge framework was designed for a similar yet much more complicated purpose (it propagates data alongside the analysis metadata). I do not think it is what you need, though; you should stick with coroutines.
