StringBuilder and Kotlin


#1

I checked some string concatenation bytecode output today, and it looks like Kotlin's internal handling of string concatenation is really nice, even better than explicitly using a StringBuilder with append calls.

https://gist.github.com/jrenner/e074d009614df27fd950

Is it fair to say StringBuilder is much less necessary in Kotlin?


#2

Well, the difference between using explicit StringBuilder vs string interpolation or concatenation is that StringBuilder, being a Java class, is guarded with not-null assertions all over the place, so one can say that the compiler does a poor job handling explicit calls to StringBuilder. This is the only thing that makes concatenation/interpolation byte code better.

It is not fair to say that StringBuilder is less necessary in Kotlin: its main purpose is optimizing multiple concatenations, for example, when one concatenates strings in a loop. Without a StringBuilder such code has quadratic time complexity, which can kill your program's performance on longer inputs.
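
To make the loop case concrete, here is a minimal sketch of the two approaches. The function names are made up for illustration. The first version copies everything built so far on every iteration, which is where the quadratic cost comes from; the second appends into one growing buffer.

```kotlin
// Repeated += on a String: each iteration copies the whole result so far.
fun joinWithConcat(parts: List<String>): String {
    var result = ""
    for (part in parts) {
        result += part   // allocates a new String every time
    }
    return result
}

// One explicit StringBuilder: each part is appended to the same buffer.
fun joinWithBuilder(parts: List<String>): String {
    val sb = StringBuilder()
    for (part in parts) {
        sb.append(part)
    }
    return sb.toString()
}

fun main() {
    val parts = List(1000) { "x" }
    // Both produce the same string; only the amount of copying differs.
    println(joinWithConcat(parts) == joinWithBuilder(parts))
}
```

Whether the difference matters in practice depends on the input size, so it is worth measuring before rewriting anything.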


#3

Well, the difference between using explicit StringBuilder vs string interpolation or concatenation is that StringBuilder, being a Java class, is guarded with not-null assertions all over the place, so one can say that the compiler does a poor job handling explicit calls to StringBuilder. This is the only thing that makes concatenation/interpolation byte code better.

But all the tests except the last one are unnecessary, as you'd get an NPE anyway. The JIT surely optimizes it away, but you could reduce the bytecode size by omitting them, couldn't you?

It is not fair to say that StringBuilder is less necessary in Kotlin: its main purpose is optimizing multiple concatenations, for example, when one concatenates strings in a loop. Without a StringBuilder such code has quadratic time complexity, which can kill your program's performance on longer inputs.

Can we write something like

val sb = StringBuilder()
for (x in someCollection) {
    sb += "${x.a} ${x.b} "
}

? Can we get it optimized to

sb.append(x.a).append(x.b)

rather than

sb.append(StringBuilder().append(x.a).append(x.b))

? It produces no garbage and is twice as fast, but not exactly equivalent in case of an exception, but this could be handled. Usually, sb is a local variable which can't survive the exception anyway. In the remaining cases, shortening sb in a catch clause to its original length would do, right?
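
Written out by hand, the catch-clause idea might look roughly like the sketch below. Item and appendItem are hypothetical names used only for illustration, and this is not what the compiler generates; it only shows that StringBuilder.setLength can undo a partial append.

```kotlin
// Hypothetical element type, only for illustration.
class Item(val a: String, val b: String?)

// Appends "a b " for one item directly into sb, without a temporary builder.
// If anything throws midway, sb is truncated back to its original length.
fun appendItem(sb: StringBuilder, item: Item) {
    val mark = sb.length
    try {
        sb.append(item.a).append(' ')
        sb.append(item.b!!).append(' ')   // throws when b is null
    } catch (e: Exception) {
        sb.setLength(mark)                // undo the partial append
        throw e
    }
}

fun main() {
    val sb = StringBuilder("start:")
    appendItem(sb, Item("x", "y"))
    println(sb)
    try {
        appendItem(sb, Item("p", null))
    } catch (e: NullPointerException) {
        println(sb)   // same content as before the failed call
    }
}
```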


#4

But all the tests except the last one are unnecessary, as you'd get an NPE anyway. The JIT surely optimizes it away, but you could reduce the bytecode size by omitting them, couldn't you?

In theory, yes, but that would require some intimate knowledge about how StringBuilder is implemented, i.e. the compiler would have to treat StringBuilder specially (setting problems of extending StringBuilders aside here).

It produces no garbage and is twice as fast, but not exactly equivalent in case of an exception, but this could be handled. Usually, sb is a local variable which can't survive the exception anyway. In the remaining cases, shortening sb in a catch clause to its original length would do, right?

Again, in theory, this is possible. But it's clearly a micro-optimization; we'd need to make sure it's worth the trouble before implementing it.


#5

But all the tests except the last one are unnecessary, as you'd get an NPE anyway. The JIT surely optimizes it away, but you could reduce the bytecode size by omitting them, couldn't you?

In theory, yes, but that would require some intimate knowledge about how StringBuilder is implemented, i.e. the compiler would have to treat StringBuilder specially (setting problems of extending StringBuilders aside here).

I didn't mean anything StringBuilder specific, but method chaining in general, or equivalently a sequence of calls like

something = something.doSomething(…)
something = something.doSomethingElse(…)

No null check for something is needed after the first row, as the second would throw anyway.

It produces no garbage and is twice as fast, but not exactly equivalent in case of an exception, but this could be handled. Usually, sb is a local variable which can't survive the exception anyway. In the remaining cases, shortening sb in a catch clause to its original length would do, right?

Again, in theory, this is possible. But it's clearly a micro-optimization; we'd need to make sure it's worth the trouble before implementing it.

I'd bet it's pretty important (a factor of two for a common operation), but I'm sure there are more important things to do now. Moreover, it's a hidden optimization that changes nothing but speed, so there's no need to hurry. Maybe the JIT does it itself; in theory it could.


#6

I use the following method extensions to achieve method chaining

 
/** Execute a function with this as parameter, then return this. */
fun <T> T.self(f: (x: T) -> Unit): T {
    f(this)
    return this
}

/** Execute a function with this as receiver, then return this. */
infix fun <T> T.me(f: T.() -> Unit): T {
    f()
    return this
}

val user = User() me {
    setName("...")
    setAge(18)
}
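
For comparison, current versions of the Kotlin standard library ship apply and also, which behave much like me and self respectively. A minimal sketch, assuming a hypothetical User class with two mutable properties:

```kotlin
// Hypothetical User class, standing in for the one used above.
class User {
    var name: String = ""
    var age: Int = 0
}

fun main() {
    // apply: the block runs with the object as receiver, like `me`.
    val user = User().apply {
        name = "..."
        age = 18
    }
    // also: the block receives the object as a parameter, like `self`.
    val same = user.also { println("${it.name} ${it.age}") }
    // Both functions return the original object.
    println(same === user)
}
```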


#7

No null check for something is needed after the first row, as the second would throw anyway.

To guarantee this, we'd need to know that chained calls actually return "this", and not null or anything else. Which we can't know.


#8

No null check for something is needed after the first row, as the second would throw anyway.

To guarantee this, we'd need to know that chained calls actually return "this", and not null or anything else. Which we can't know.

No, all you need to know is kotlin/jvm/internal/Intrinsics.checkReturnedValueIsNotNull.

INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
DUP
LDC "StringBuilder"
LDC "append"
INVOKESTATIC kotlin/jvm/internal/Intrinsics.checkReturnedValueIsNotNull (Ljava/lang/Object;Ljava/lang/String;Ljava/lang/String;)V
LDC "c"
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;

No, what I mean is that the first red line is redundant, as the second would throw anyway. It's a virtual call, so it has no choice but to throw if its receiver is null. Or am I missing something? The produced message, maybe?


#9

I see what you are saying. Yes, the only difference would be the produced message and the predictability of the exception class. The latter might be addressed by having checkReturnedValueIsNotNull() throw NullPointerException with a nice message. Thanks for the idea, we'll think about it.

Incidentally, the assertion generation logic has been changed in the recently released M9, and currently the behavior is exactly as you suggest (assertions are gone from some other places too), but as we are planning on putting some assertions back, this is not final yet.