Performant and elegant iterator over custom collection


#1

Hi! First of all, I would love to give my feedback and opinions about the language before: I did not believe in love at first sight but after a week and a half using Kotlin, here I am, completely in love. Heart_eyes:
That being said, my question:
I am a video game developer and I have to be very careful with the performance and the allocated memory. I’m slowly moving my code to Kotlin and it’s time to write my custom collections but I’m not sure what’s the best way to iterate over them.
I’m decompiling to Java so I can see what’s going on under the hood and figure out the best way to do it.
I started with a very basic wrapper class:

class IntList(ordered: Boolean = true, capacity: Int = 16) {
    private var items = IntArray(capacity)
    var size: Int = 0
        private set
    var ordered: Boolean = ordered
        private set

    fun add(value: Int) {
        var items = this.items
        if (size == items.size) items = resize(Math.max(8, (size * 1.75f).toInt()))
        items[size++] = value
    }

    operator fun get(index: Int): Int {
        if (index >= size) throw IndexOutOfBoundsException("index can't be >= size: $index >= $size")
        return items[index]
    }

    operator fun set(index: Int, value: Int) {
        if (index >= size) throw IndexOutOfBoundsException("index can't be >= size: $index >= $size")
        items[index] = value
    }

    private fun resize(newSize: Int): IntArray {
        val newItems = IntArray(newSize)
        val items = this.items
        System.arraycopy(items, 0, newItems, 0, Math.min(size, newItems.size))
        this.items = newItems
        return newItems
    }
}

And at that point I was ready to start iterating over the array. By ‘best way’ I mean (in order of priority): the most performant for the CPU; the cheapest in allocated memory and for garbage collection; and the cleanest and clearest.
In Java I would do:

IntList list = new IntList();
for(int i = 0, n = list.size; i < n; i++) {
    int value = list.get(i);
}

but I know Kotlin can do better than that (in terms of clarity and simplicity).

I’ve seen the source code for IntArray and List classes, but that wasn’t really helpful because they are basically their Java counterparts with some extension methods and properties and it’s clear that the compiler is doing some special dark magic (specially for the primitive arrays) and I was wondering if there’s a way to achieved that same super optimal results for my class.
When I code in Kotlin

val list = IntArray(0) // or intArrayOf(someValues)
for(i in list) {
    val value = i
}
for(i in list.indices) {
    val value = list[i]
}
for(i in 0..list.lastIndex) {
    val value = list[i]
}

I get a super optimal Java

int[] list = new int[0];

int i;
int var3;
for(var3 = 0; var3 < list.length; ++var3) {
    i = list[var3];
}

i = 0;
var3 = ArraysKt.getLastIndex(list);
int var10000;
if(i <= var3) {
    while(true) {
        var10000 = list[i];
        if(i == var3) {
            break;
        }

        ++i;
    }
}

i = 0;
var3 = list.length - 1;
if(i <= var3) {
    while(true) {
        var10000 = list[i];
        if(i == var3) {
            break;
        }

        ++i;
    }
}

But instead if I do

val list = IntList()
for(i in list) {
    val value = i
}
for(i in list.indices) {
    val value = list[i]
}
for(i in 0..list.lastIndex) {
    val value = list[i]
}

I can’t achieve any similar Java output.
For the first for, I couldn’t implement the iterator in any way without getting created an Iterator object.
For the second one I tried adding two very basic extension properties (imitating the IntArray implementation):

val IntList.lastIndex: Int
    get() = size - 1

val IntList.indices: IntRange
    get() = IntRange(0, lastIndex) // or 0..lastIndex (it's the same)

but I ended up with an IntRange created every time:

IntRange var10000 = IntListKt.getIndices(list);
i = var10000.getFirst();
var3 = var10000.getLast();
if(i <= var3) {
    while(true) {
        list.get(i);
        if(i == var3) {
            break;
        }
        ++i;
     }
}

And only the third option gave me the same Java output as with the primitive array. But it’s basically the same boilerplate (not that bad) way as the one I would have written in Java.

Is there any way I can get more optimal compiled Java code using any of the first two options? Anything I can tell the compiler to achieve better results? Do you know any other performant and “pretty” (or elegant) way of iterating over a custom collection?

I hope I made my self clear (sorry if my English wasn’t so good) and thanks in advance!


#2

After I wrote the question I thought of implementing a lazily initiated iterator that is reseted every time it is retrieved by the iterator() operator function. But in that case I wouldn’t be able to do something like this

for(i in list) {
    for(j in list) {
        // something
    }
}

It would be really great if there is any way of telling and helping the compiler get an optimal output.
I don’t mind if my IntList class gets a little less clear or if it gets more complicated if, in consequence, I’ll got better performance and a cleaner and simpler API.


#3

What code is generated when you make the indices extension property inline? It should give more visibility to the compiler, but I have not tested that it is sufficient.


#4

The compiler has special case optimizations for loops iterating over arrays and IntRanges. There is no possibility to apply those optimizations to user classes at this time.


#5

You can write inline forEach function which will be compiled to optimal code.

inline fun forEach(block: (Int) -> Unit) {
    var index = 0
    val size = size
    while (index < size) {
        block(get(index))
        index++
    }
}

#6

I also tried that but IntelliJ (despite it built and ran the project properly) couldn’t decompile it. I don’t remember right now which Exception was thrown.
But then I tried making it an inline function and the results were the same as with the not inline property.
Thanks!


#7

Yeah, I read that in the reference and since my indices property is, after all, an IntRange, I was hoping it would get optimized as well.
It would be really great to have a way to communicate and help the compiler a little bit achieving those kind of ‘special case optimization’ (despite it would be a little more cumbersome).


#8

This is a good alternative (specially for the first simplest form (for(i in list))) but almost always I prefer using for loops instead of forEach except in very few cases.
Anyway, maybe I will stick with this option if it is the most optimal. :ok_hand:
Thanks!


#9

In this Google I/O session, Jake Wharton from Square talks about making a GroupView behave more like other collections and suggests the @vbezhenar way for iteration:

Thanks again! :slight_smile:


#10

I suggest you to create a good benchmark. Iterating with creating an additional Iterator is an extremely common case with Java code, so I’m sure, that it’s optimized good enough. It may look like creating an object, calling its methods, etc, but JIT might just inline everything and in machine code it’ll be a simple loop.


#11

Three tipps for you:

  1. Use this snippet as an allocation-measure: http://www.rationaljava.com/2015/07/measuring-allocations-programmatically.html
    With it you can exactly see how many bytes are created inside your loops.

  2. If you do micro-benchmarks: use jmh.

  3. Here are some examples about the type of looping Kotlin allocates iterators for: https://stackoverflow.com/questions/37609071/array-list-iteration-without-extra-object-allocations