# Map() processed all items and can be very slow

I am working with huge lists and have some performance problems:
I narrowed it down to the following code:

``````    var counter1 = 0
var counter2 = 0
val list = ArrayList<Point>(100)
for (i in 0 until 100) list.add(Point(i, 0))

list.map {
it.x
++counter1
}.subList(0, 5).forEach {
++counter2
}
Log.i("XXXXXXXXXXXXX", "counter1 == \$counter1, counter2 == \$counter2")
``````

I expected the output to be:
I/XXXXXXXXXXXXX: counter1 == 5 counter2 == 5

But it actually is:
I/XXXXXXXXXXXXX: counter1 == 100, counter2 == 5

This means that map() processes all elements even if only a part of them are really needed. Why is it not implemented in a way that it returns an Iterator over the transform function?
This would be way more efficient for huge lists or am I missing something here?
Are there other more efficient versions similar to map() or do I have to write my own transformation function?

Collection transformations donâ€™t work on iterators, but on collections. `map()` produces entirely new list, the list â€śdoesnâ€™t knowâ€ť it will be filtered at a later step - it has the same size as the original list.

If you look for lazy transformations then you should use sequences:

``````list.asSequence()
.map { ... }
.take(5)
.toList()
``````
2 Likes

Ah thanks, this makes sense. So I should use sequences over lists whenever possible:

``````    var counter1 = 0
var counter2 = 0
val list = ArrayList<Point>(100)
for (i in 0 until 100) list.add(Point(i, 0))

list.asSequence().map {
it.x
++counter1
}.take(5).forEach {
++counter2
}
Log.i("XXXXXXXXXXXXX", "counter1 == \$counter1, counter2 == \$counter2")
``````

Now itâ€™s printing:
I/XXXXXXXXXXXXX: counter1 == 5, counter2 == 5

Well, I donâ€™t think we should use sequences whenever possible, but whenever we can benefit from them. Your example is a case when it is reasonable to choose sequences over collections.

As for now sequences are generally a little slower and they require additional code. I personally use collections by default and switch to sequences if needed.

2 Likes

I second what broot said.
Koddos for doing the proper loop of looking for the performance problem instead of pre-optimizing.

Definitely donâ€™t use sequences whenever possible.
A rule of thumb is to switch to sequences for performance when youâ€™re dealing with large collections and many functional operations. Donâ€™t pre-optimize but you can consider if your collection will be unbounded and prepare for that.

And of course use sequences when the kind of data youâ€™re working with should be represented as a stream of items of unknown (potentially infinite) length.