Release compilation mode

Hi,

Are there any topic or issues on github/youtrack, which are related to the following idea: add release mode into kotlin compiler for JVM?

The idea is from native optimizing compilers. C# language has the same logic (see optimization details here)

The main goal is ability to recompile code with optimizations, which can be human unreadable. Of course this can lead to debugging issues which are expected.

What optimizations I’d expect (all are related to the private/internal visibility):

  • Re-using property value, which was read before. The simplest case is when you read the same property twice inside the function, however both reads return the same result for readonly property from data class
  • Operations reordering
  • Converting lambda-like loops to foreach loop
  • Caching toString result (for immutable input only of course)
  • Method inlining

Sorry, but are you talking about kotlin? If you are, then what platform do you mean?

If you we consider Kotlin-jvm, then all the optimizations you listed are implemented in all regimes.Probably the same goes for kotlin-native since it is done by LLVM. Some of the function inlining is done by the kotlin compiler so will work on all platforms.

The question is for Kotlin-jvm

all the optimizations you listed are implemented in all regimes

Am I right that you are talking about jit optimizations? In this case kotlin should generate code, which will be friendly optimized by jit. If yes then question: am I right, that code below will be converted to the simple loop?

        // input start
        val ints = (0..120).toList()
        // input finish

        // code sample start
        ints.filter { it / 5 > 3 }
                .map { it.toString() }
                .filter { it.contains("1") }
                .map { it.length.toString() }
                .joinToString() 
        // code sample finish

So what I’m talking about: in some cases Kotlin compiler knows more about code that jit compilers (for example - immutability, loops like above). C# can optimize such code in release mode. So question: are there any discussion about the same mode for kotlin?

Please read the documentation. filter and map operation on arrays and lists are marked as inline (see here), which means that they are transformed to loops by kotlin compiler before translating to the bytecode.

Other things are optimized by jit, meaning there is not reason to optimize them before.

@darksnake, I know about inlining. My question about loop conversion.

Let’s go step-by-step.

Firstly compiler will split operations to have something like this:

        val var1 = ints.filter { it / 5 > 3 }
        val var2 = var1.map { it.toString() }
        val var3 = var2.filter { it.contains("1") }
        val var4 = var3.map { it.length.toString() }
        val var5 = var4.joinToString()

Then compiler will apply function inlining, e.g. first loop will be like this:

        val var1 = ArrayList<Int>()

        for (element in ints){
            if(element / 5 > 3){
                var1.add(element)
            }
        }

        val var2 = var1.map { it.toString() }
        val var3 = var2.filter { it.contains("1") }
        val var4 = var3.map { it.length.toString() }
        val var5 = var4.joinToString()

Then compiler will expand all loops, so we will have something like this:

        val var1 = ArrayList<Int>()

        for (element in ints){
            ...
        }

        val var2 = ArrayList<String>()

        for (element in var1){
            ...
        }

        val var3 = ArrayList<String>()

        for (element in var2){
            ...
        }
        
        val var4 = ArrayList<String>()

        for (element in var3){
            ...
        }
        
        val var5 = ...

So as you can see (please correct me if I wrong) we will have separate foreach loops, each of them will create iterator on head + temporary array (before jit optimizations of course)

However kotlin compiler can convert this set of loops to the single loop, so we will have something like this:

        val var5 = StringBuilder()

        for (element in ints){
            if(element / 5 > 3){
                val var2 = element.toString()

                if(var2.contains("1")){
                    if(!var5.isEmpty()){
                        var5.append(", ")
                    }
                    
                    var5.append(var2)
                }
            }
        }

I know, that some JITs can sometimes convert separate loops to the single loop. However, my question: are there any discussions to apply the following optimization on the kotlin-to-bytecode compilation step?

Other things are optimized by jit

Am I right that your point is that all JIT compilers have all optimizations which I listed above?

For the first part, why ask if you can just check. Just decompile the generated code to java and see how it works.

For the second part, I am sure about operation reordering and inlining. Function result caching is also done (getters and toString from your list), but I am not sure about how, when and the efficiency.

Interesting, will recheck and return …

@darksnake, let’s return to discusson.

Please see kotlin code:

Kotlin code
fun main(vararg args: String) {
    // input start
    val ints = (0..120).toList()
    // input finish

    // code sample start
    val res = ints.filter { it / 5 > 3 }
            .map { it.toString() }
            .filter { it.contains("1") }
            .map { it.length.toString() }
            .joinToString()
    // code sample finish

    println(res)
}

Please see below decompiled java code - it is exact what I said above - we have several independent loops.

decompiled java code
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import kotlin.Metadata;
import kotlin.collections.CollectionsKt;
import kotlin.jvm.functions.Function1;
import kotlin.jvm.internal.Intrinsics;
import kotlin.ranges.IntRange;
import kotlin.text.StringsKt;
import org.jetbrains.annotations.NotNull;

@Metadata(
   mv = {1, 1, 11},
   bv = {1, 0, 2},
   k = 2,
   d1 = {"\u0000\u0014\n\u0000\n\u0002\u0010\u0002\n\u0000\n\u0002\u0010\u0011\n\u0002\u0010\u000e\n\u0002\b\u0002\u001a\u001f\u0010\u0000\u001a\u00020\u00012\u0012\u0010\u0002\u001a\n\u0012\u0006\b\u0001\u0012\u00020\u00040\u0003\"\u00020\u0004¢\u0006\u0002\u0010\u0005¨\u0006\u0006"},
   d2 = {"main", "", "args", "", "", "([Ljava/lang/String;)V", "ktest_main"}
)
public final class StartKt {
   public static final void main(@NotNull String... args) {
      Intrinsics.checkParameterIsNotNull(args, "args");
      byte var2 = 0;
      List ints = CollectionsKt.toList((Iterable)(new IntRange(var2, 120)));
      Iterable $receiver$iv = (Iterable)ints;
      Collection destination$iv$iv = (Collection)(new ArrayList());
      Iterator var6 = $receiver$iv.iterator();

      Object item$iv$iv;
      int it;
      while(var6.hasNext()) {
         item$iv$iv = var6.next();
         it = ((Number)item$iv$iv).intValue();
         if (it / 5 > 3) {
            destination$iv$iv.add(item$iv$iv);
         }
      }

      $receiver$iv = (Iterable)((List)destination$iv$iv);
      destination$iv$iv = (Collection)(new ArrayList(CollectionsKt.collectionSizeOrDefault($receiver$iv, 10)));
      var6 = $receiver$iv.iterator();

      String var13;
      while(var6.hasNext()) {
         item$iv$iv = var6.next();
         it = ((Number)item$iv$iv).intValue();
         var13 = String.valueOf(it);
         destination$iv$iv.add(var13);
      }

      $receiver$iv = (Iterable)((List)destination$iv$iv);
      destination$iv$iv = (Collection)(new ArrayList());
      var6 = $receiver$iv.iterator();

      String it;
      while(var6.hasNext()) {
         item$iv$iv = var6.next();
         it = (String)item$iv$iv;
         if (StringsKt.contains$default((CharSequence)it, (CharSequence)"1", false, 2, (Object)null)) {
            destination$iv$iv.add(item$iv$iv);
         }
      }

      $receiver$iv = (Iterable)((List)destination$iv$iv);
      destination$iv$iv = (Collection)(new ArrayList(CollectionsKt.collectionSizeOrDefault($receiver$iv, 10)));
      var6 = $receiver$iv.iterator();

      while(var6.hasNext()) {
         item$iv$iv = var6.next();
         it = (String)item$iv$iv;
         var13 = String.valueOf(it.length());
         destination$iv$iv.add(var13);
      }

      String res = CollectionsKt.joinToString$default((Iterable)((List)destination$iv$iv), (CharSequence)null, (CharSequence)null, (CharSequence)null, 0, (CharSequence)null, (Function1)null, 63, (Object)null);
      System.out.println(res);
   }
}

As you can see, kotlin does not have at least one optimization from above.

So the question is still actual: are there any discussions to apply the following optimization on the kotlin-to-bytecode compilation step?

I’m not sure about the optimizations, but for this specific example have you looked at sequences?

Sequences are made for lazy evaluation like Java’s Stream. Using a sequence performs the operation in a single loop as needed by a terminal operation like joinToString()

I don’t know if there is a similar optimization applied by the compiler in some special case. JIT may do the optimization after a few passes through the code.

Here’s some links

@arocnies, I agree with you, I can rewrite code above with effective way.

However my question was about automatic optimizations by compiler. Stream is a good example, because sequence-based code will still create unnecessary objects on heap in comparison with simple for loop.

JIT may do the optimization after a few passes through the code.

Sometimes Kotlin knows more about code than JIT. For example, read-only data classes can be mutable from JIT side (because you can change backing field by using reflection). So these items can be optimized by Kotlin (and could not be by JIT).

Let’s look at the optimizations one by one.

One of the features of the JVM model is that class files are independent and can be independently released. As such before runtime you don’t know whether you are using the same data class as original or a version that has different semantics. I know that Java does violate this in the case of compile time constant substitution, but that doesn’t make it right or mean that Kotlin should. It may optimize cases where the data is in the same class. But the jit needs to optimize the case in general anyway (as it knows - sort of - the full loaded state) should optimizes anyway. (ps. ignore classloaders)

The correct reordering is machine architecture dependent. Leave it to the JIT (and CPU) that know the best ordering.

As said, this already happens using inline functions. The implementation currently works, but a more powerful solution might be an option (but not a priority - it works as it is).

This requires being able to analyse the behaviour of toString. Even an immutable type could have a dynamic toString (it may call Random…, inspect some global…). Add the bit about not being able to trust the actual toString implementation until runtime (it could be substituted) and it is clear that the JIT does this better.

Method inlining is certainly already done in the JIT, and can be achieved with the inline function. It is not always clear that inlining is correct, but inlining at compiler level could be implemented (for locally defined functions - for out of class functions see earlier), together with for example constant folding. However as the JIT does this the main benefit is code size (and sometimes method count - in case of Android) it has little priority.