# Faster Collection.map{} function by 44%

Hi,
default implementation of kotlin map function uses MutableCollection<>.add() method for building of result list. The `add` method has unnecessary overhead (checking of capacity size, update of size of list,…)

My idea is to use array as target and then convert that into list.
I tried two implementations - one iterator based and one index based and gained 5-15% performance. It is not game changer, but map is quite frequently used function, so it could have real-world impact.

``````package test

import org.openjdk.jmh.annotations.*
import java.util.concurrent.TimeUnit

@State(Scope.Benchmark)
@Fork(1)
@Warmup(iterations = 10)
@Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
class TestBenchmark {
private lateinit var data: List<String>

@Setup
fun setUp() {
data = List(100) { it.toString() }
}

@Benchmark
fun mapDefault() {
data.map { it + it }
}

@Benchmark
fun mapArray1() {
data.mapArray1 { it + it }
}

@Benchmark
fun mapArray2() {
data.mapArray2 { it + it }
}
}

inline fun <T, reified R> List<T>.mapArray1(transform: (T) -> R): List<R> {
val result = arrayOfNulls<R?>(size)
for (i in indices)
result[i] = transform(this[i])
return result.asList() as List<R>
}

inline fun <T, reified R> Collection<T>.mapArray2(transform: (T) -> R): List<R> {
val iterator = iterator()
return Array(size) { transform(iterator.next()) }.asList()
}
``````

mapDefault: 1277464 ops/s - baseline
mapArray2: 1342551 ops/s + ~ 5% ops/s
mapArray1: 1480874 ops/s + ~ 15% ops/s

1 Like

This is interesting, I’ll look into it when I have some free time. Some comments for now:

1. `it + it` may a pretty heavy operation compared to what we try to benchmark. My intuition says it should take much more than the mapping itself. I suggest using ints, not strings.
2. You should either return the results of `map` or push it into the blackhole. Otherwise, JVM may entirely remove the `map()` and then you benchmark an empty function.
3. Arrays also do size checks. If using lists, theoretically we do 2 such checks vs 1, but collections are most probably highly optimized in JVMs, so I wouldn’t assume what we see in the source code is what we run. It is good you looked for benchmarks.

But generally, I think the idea makes sense.

1 Like

Thanks broot, you are right. Now improvement is even more significant

Performance of mapArray2 is worse than default,
default
mapDefault 3928969 - baseline
mapArray1 5659579 ~ +44%
mapArray2 3279785 ~ -17%

``````package test

import org.openjdk.jmh.annotations.*
import org.openjdk.jmh.infra.Blackhole
import java.util.concurrent.TimeUnit

@State(Scope.Benchmark)
@Fork(1)
@Warmup(iterations = 10)
@Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
class TestBenchmark {
private lateinit var data: List<Int>

@Setup
fun setUp() {
data = List(100) { it }
}

@Benchmark
fun mapDefault(blackhole: Blackhole) {
blackhole.consume(data.map { it + it })
}

@Benchmark
fun mapArray1(blackhole: Blackhole) {
blackhole.consume(data.mapArray1 { it + it })
}

@Benchmark
fun mapArray2(blackhole: Blackhole) {
blackhole.consume(data.mapArray2 { it + it })
}
}

inline fun <T, reified R> List<T>.mapArray1(transform: (T) -> R): List<R> {
val result = arrayOfNulls<R?>(size)
for (i in indices)
result[i] = transform(this[i])
return result.asList() as List<R>
}

inline fun <T, reified R> Collection<T>.mapArray2(transform: (T) -> R): List<R> {
val iterator = iterator()
return Array(size) { transform(iterator.next()) }.asList()
}

``````

@ybznek
please consider that not all `List` are instance of `RandomAccess`.

3 Likes

What is generally better is using a sequence because it isn’t building a collection and just lazily handling one item at a time

1 Like

I wouldn’t say sequences are “generally better”. It is like N*M vs M*N. Of course, at least theoretically we can do it in less CPU cycles and less allocations, but in practice I believe sequences are generally slower as of Kotlin 1.9. Maybe they could be improved in the future, but we are not there yet.

The thrust of the original question was trying to optimize how the allocation is done to speed up the map. If you do not actually need the collection it will generally be “better” because none of that allocation is ever done.

What is the source of this belief? I see no way that they are “generally slower” and certainly not slower than mapping from one collection into another collection.

I think the main point of the question is not to optimize allocation or avoid using collections. Point is to use the lower level array for the majority of the workload and then only wrap it into a collection as the last step - instead of working through the wrapper all the time.

I read such claims a few times in the internet, with some benchmarks to prove it. I never tried to test this myself, but the explanation actually makes a lot of sense. Sequences add overhead, they are complicated and can’t be inlined. Loops are obviously simple to the JVM and can be easily optimized. If Kotlin compiler will be ever able to fully inline sequences to produce a single loop with just operations inside, that will be for sure better. But for now it generates a loop with a chain of calls between operators. And apparently, this is generally slower than going multiple times in a loop.

See for example: Sequences and Inlined Lambdas , but as said above, I think I saw other similar discussions either here, on Stackoverflow or maybe Kotlin bugtracker.

Of course, there are cases where sequences are better, but speaking about the general superiority or about the default approach which we should choose, the answer is not that simple.

In the corrected example, I suspect that the reason `mapArray2` has worse performance than the built-in `map` is because you’ve involved multiple `List` implementations. The `setUp` creates a `java.util.ArrayList`, and `map` also creates a `java.util.ArrayList`, but in using `Array.asList()`, you create a `java.util.Arrays.ArrayList`. When the JVM is dealing with multiple implementations of an interface, it doesn’t get as many optimization opportunities.