Kotlin sequences vs. Java streams

I am converting some existing Java code to Kotlin.

The Java code used streams.

Is there any reason to prefer continuing to use Java streams in Kotlin, or is there any reason to prefer redoing the streams as Kotlin sequences?

If I have third-party utilities that only work with streams or with sequences, obviously that would determine what I could or couldn’t use, but, if I don’t depend on any such third-party code, are there any benefits for performance, memory, flexibility, etc. of one or the other?

Streams are optimized using the invokedynamic instruction, which Kotlin does not use yet, so my guess is that streams should be faster. Streams also have the benefit of parallel evaluation, so in general I think streams should work better for large data sets. If I remember correctly, stdlib-jdk8 has methods to convert from streams to sequences and back, so you can safely use Stream-based internals and convert the result to a Sequence if you need it.
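For reference, the conversions mentioned above live in the kotlin.streams package as asSequence() and asStream(). A minimal sketch, where evenSquares and toStream are hypothetical helper names used only for illustration:

```kotlin
import java.util.stream.Stream
import kotlin.streams.asSequence
import kotlin.streams.asStream

// Hypothetical helper: take a Java stream, continue with Kotlin sequence operators.
fun evenSquares(input: Stream<Int>): Sequence<Int> =
    input.asSequence()
        .filter { it % 2 == 0 }
        .map { it * it }

// Hypothetical helper: go the other way, from a Kotlin sequence to a Java stream.
fun toStream(values: Sequence<Int>): Stream<Int> = values.asStream()

fun main() {
    println(evenSquares(listOf(1, 2, 3, 4).stream()).toList()) // [4, 16]
    println(toStream(sequenceOf(1, 2, 3)).count())              // 3
}
```

Both conversions stay lazy: nothing is evaluated until a terminal operation runs on the result.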

@darksnake Thanks.

Is there any reason why Kotlin kept sequences after Java 8 provided streams?

Was it to provide a similar feature for platforms other than Java (like JS)? (I assume this is the primary reason)

Was it to provide a similar feature for Java 7 and earlier?

Was it for backwards compatibility for older versions of Kotlin?

I think all the reasons you mentioned apply. Kotlin can target Java 1.6 and other platforms like JS and Native that do not have Stream. The Sequence API is also richer, so even if you do not have huge data, it still makes sense to use Sequence. Also, remember sequence generators, which are really powerful.
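To make "sequence generators" concrete, here is a small sketch using the sequence { } builder, assuming that is what was meant. The collatz function is a hypothetical example, not something from the posts above:

```kotlin
// Hypothetical example: lazily produce the Collatz chain starting at `start`.
// Assumes start >= 1. Each yield() suspends the block until the consumer
// asks for the next element.
fun collatz(start: Int): Sequence<Int> = sequence {
    var n = start
    while (n != 1) {
        yield(n)
        n = if (n % 2 == 0) n / 2 else 3 * n + 1
    }
    yield(1)
}

fun main() {
    println(collatz(6).toList()) // [6, 3, 10, 5, 16, 8, 4, 2, 1]
}
```

The generator keeps its loop state in ordinary local variables, which is hard to express with Stream.iterate without packing the state into a helper object.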

Hi @rgoldberg,
in my experience both types provide good performance.

I prefer Kotlin Sequence because it is handier; moreover, you can build a Sequence using a coroutine. (To be fair, you can define the same thing for streams: fun <T> stream(block: suspend SequenceScope<T>.() -> Unit) = sequence(block).asStream())
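A short sketch of how that stream helper could be used. The helper is the one-liner from the post; countdown is a hypothetical function added only to show a call site:

```kotlin
import java.util.stream.Collectors
import java.util.stream.Stream
import kotlin.streams.asStream

// The helper from the post: a Java Stream backed by the coroutine-based sequence builder.
fun <T> stream(block: suspend SequenceScope<T>.() -> Unit): Stream<T> =
    sequence(block).asStream()

// Hypothetical usage: yield values from a plain loop, consume them as a Stream.
fun countdown(from: Int): Stream<Int> = stream {
    for (i in from downTo 1) yield(i)
}

fun main() {
    println(countdown(3).collect(Collectors.toList())) // [3, 2, 1]
}
```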

@fvasco Thanks for the info.

Why are sequences handier than streams?

I haven’t yet used coroutines. I’ve heard about them, but haven’t even read up on what they are. Will add that research to my todo list.

It is a personal opinion.

Let me propose an exercise: I have left an undocumented piece of code below. Is it understandable?
Try to rewrite this example using Java Stream and see for yourself what I mean!

fun main() {
    val fibonacciSequence = sequence {
        var a = 0
        var b = 1

        while (true) {
            yield(a)
            val f = a + b
            a = b
            b = f
        }
    }

    fibonacciSequence
            .take(10)
            .groupBy { it % 2 == 0 }
            .forEach(::println)
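}

import java.util.stream.Collectors.groupingBy
import java.util.stream.Stream

// Each Pair holds two consecutive values; the Fibonacci numbers are the second components.
val fibonacciStream = Stream.iterate(Pair(1, 0)) { (a, b) ->
    Pair(b, a + b)
}

fibonacciStream
    .limit(10)
    .map { it.second }
    .collect(groupingBy<Int, Boolean> { it % 2 == 0 })
    .forEach(::println)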

Great job @rgoldberg!

You should lift .map { it.second } to the first block. Moreover, you should use mapToInt().

Choose your preferred API.

I suspect that your implementation requires a bit more allocation in the TLAB; however, the performance impact may be negligible.

What do you mean by that? I can’t put it in the iterate UnaryOperator argument, because I need both the previous & current values from the Pair. I shouldn’t put it before limit(10), because why perform maps before limits?

Why should I do that? I would have initially used an IntStream instead of a Stream<Int>, except that groupingBy only works with the latter. If I mapToInt(), that returns an IntStream. When grouping, a Map<Boolean, List<Integer>> would probably be created, which would need Integer, not int. Unless they used int[] instead of the list, but that would have resizing issues… Maybe I’m missing something, and there’s some way to easily do this without requiring Integers
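To illustrate the boxing point: starting from an IntStream, the values have to be boxed again before groupingBy can collect them into lists. A sketch, where groupByParity is a hypothetical name and the 1..limit range is just sample data:

```kotlin
import java.util.stream.Collectors
import java.util.stream.IntStream

// Hypothetical helper: IntStream has no groupingBy collector, so the ints are
// boxed back into Integer objects before they can be grouped into lists.
fun groupByParity(limit: Int): Map<Boolean, List<Int>> =
    IntStream.rangeClosed(1, limit)
        .boxed() // Stream<Integer>: one box per element
        .collect(Collectors.groupingBy<Int, Boolean> { it % 2 == 0 })

fun main() {
    println(groupByParity(6))
}
```

So for this particular pipeline, mapToInt() would only postpone the boxing rather than avoid it.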

fibonacciStream should be Stream<Int>

No, you are right,
I missed this detail.

I wrote a detailed article comparing the pros & cons of streams vs. sequences:

Although I didn’t include it in the article, I just wanted to mention that the invokedynamic JVM instruction doesn’t actually come into play here. You won’t get invokedynamic just because you’re using streams: the lambda that you pass into the stream is defined in Kotlin, so it will be generated the same way as when you use sequences. So invokedynamic is not a factor when comparing streams vs. sequences when both are used from Kotlin code. If anything, the lambdas are eliminated altogether when using the inline terminal operations for sequences, and having no lambda at all is ideal.

However, the answer is not clear-cut, especially in certain autoboxing scenarios (see the article for specifics).
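One visible consequence of the terminal operations being inline: a lambda passed to Sequence.forEach can return from the enclosing function, which only compiles because the lambda body is inlined at the call site. A minimal sketch (firstNegative is a hypothetical example function):

```kotlin
// Hypothetical example: Sequence.forEach is an inline function, so `return it`
// below is a non-local return from firstNegative itself, not from a lambda object.
fun firstNegative(values: Sequence<Int>): Int? {
    values.forEach { if (it < 0) return it }
    return null
}

fun main() {
    println(firstNegative(sequenceOf(3, -1, -7))) // -1
    println(firstNegative(sequenceOf(1, 2)))      // null
}
```

The same pattern is not possible with Stream.forEach, because its argument is a Consumer object rather than inlined code.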
