Why do the primitive array classes need to be exposed?


I started looking into Kotlin a few weeks ago, and am a big fan. I won’t list all the things I like about it, except to say that it seems like a very pragmatic language. I was glad to see that the builtin kotlin package was cleaned up a bit between Beta 4 and RC1, moving most of the specialized builtin-related types, like LongRange, into more specific packages. The major exception is the special array types, like LongArray, which are still in the kotlin package.

I’m curious why those classes are necessary at all. If Array had been made an interface instead of a class, those classes could simply implement the Array interface, and only their factory functions (e.g. intArrayOf(42)) would have to be exposed. The fact that those particular array types are, in practice, implemented via something other than the main Array implementation for performance reasons could then be a hidden implementation detail. That would both give the Kotlin team more flexibility to change how those optimizations work, and eliminate some slightly confusing clutter in the builtins package (the fact that the recommended type for an array of integers isn’t Array<Int> is a little surprising).

Even if that’s not possible: if the *Array classes really do exist purely for performance reasons, and Array<Int> works perfectly well, do the optimized classes need to be in the builtins package at all? To me, a performance optimization that isn’t necessary for fundamental use of the language would make more sense in a less prominent location.



If your proposal is to make Array<Int> into an alias for IntArray, that’s rather easy.

The problem is that an Array<Int> would work nothing like an Array<X> where X is a reference type.

Arrays of non-primitive types are polymorphic over reference types. This means you can have a single copy of the code that handles all the different types satisfying some bound. For instance, the following function can handle arrays of any reference type.

fun test(x: Array<out Any>) { ... }

From there you have two choices. The first is to forbid Array<Int> from being passed to this function and others like it. If you do this, you have gained nothing over IntArray: you still need to write a special case for each primitive type, and you cannot reuse any code. In that case, it’s in fact better to have a clearly distinct name!
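That first choice is, in fact, what Kotlin does today: a boxed Array<Int> flows through generic code like any other reference-typed array, while an IntArray is simply rejected at compile time. A small sketch:

```kotlin
// Arrays of reference types share one code path; IntArray does not.
fun test(x: Array<out Any>): Int = x.size

fun main() {
    val boxed: Array<Int> = arrayOf(1, 2, 3)
    println(test(boxed))         // fine: the Int elements are boxed references
    println(test(arrayOf("a")))  // fine: any reference type works

    val primitive: IntArray = intArrayOf(1, 2, 3)
    // test(primitive)           // does not compile: IntArray is not an Array<out Any>
}
```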

The second choice is to make test and its ilk handle primitive types. You can do this by creating specialized versions of your functions for primitive types.

However, there is a problem. Types spread around quickly, and you cannot know via static analysis alone where primitive types will end up being used. This means you need to special-case everything, which leads to a huge increase in the size of the generated bytecode. It can in fact be much worse than (old size) × (number of primitive types), because functions/classes with multiple type parameters need to handle every combination of primitive and reference types: with eight primitive types plus the reference case, each parameter has nine possibilities, so three type parameters already give 9^3 = 729 variants! It’s simply not practical.

Scala does allow opt-in primitive specialization for selected functions/classes via the @specialized annotation, so that’s a possible avenue.


Array<Int> represents the boxed version, I think. I agree it’s confusing, but that’s the fault of the JVM rather than Kotlin, which just has to live with this complexity for now.

It probably doesn’t make sense for Kotlin to go the Scala route and implement its own concept of value types and specialisation until Valhalla is much more mature, and Kotlin could target Java 10 or whatever version it’s released in. Having Kotlin and Java specialised generics fail to interop would be a crying shame: I’d rather wait.

Of course, whilst Valhalla is in some ways going to simplify things, in other ways it’ll just make them even more complicated, because then List<Int> might refer either to a list of boxes or to a version of List specialised to int, and presumably there will need to be some sort of new syntax to determine which it is. Good luck explaining the difference to programming newbies!


I have the exact same question, mostly because I recall that Scala has primitive arrays hidden away. But Scala has a much more complicated type system as well.

This feels like a break in the design: Kotlin does not have the concept of primitive types, but wait, for arrays it does. So I agree with the OP: this should be hidden away more. The constructor Array() could be overloaded for primitive/built-in types to construct, say, an IntArray under the hood while exposing the type Array<Int>.
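A minimal sketch of what hiding the specialization behind an interface might look like. Note this is runnable Kotlin, but BoxlessArray, IntBackedArray, and boxlessArrayOf are hypothetical names invented for illustration, not anything in the Kotlin standard library:

```kotlin
// Hypothetical: an array interface whose primitive-backed implementation
// is an invisible implementation detail, as the OP proposes for Array.
interface BoxlessArray<T> {
    val size: Int
    operator fun get(index: Int): T
    operator fun set(index: Int, value: T)
}

// Backed by a native IntArray, but callers never see that.
class IntBackedArray(private val backing: IntArray) : BoxlessArray<Int> {
    override val size: Int get() = backing.size
    override fun get(index: Int): Int = backing[index]
    override fun set(index: Int, value: Int) { backing[index] = value }
}

// Only the factory function is exposed; the return type is the interface.
fun boxlessArrayOf(vararg values: Int): BoxlessArray<Int> = IntBackedArray(values)

fun main() {
    val a = boxlessArrayOf(1, 2, 3)
    a[1] = 42
    println(a[1])  // 42
}
```

The catch, per the discussion above, is that every get/set through the interface still boxes, so this hides the specialized type without recovering its performance.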

@norswap Instead of special-casing everywhere, could you not box the whole array on demand? It’s clear that an IntArray should only be used as Array<Int> and not as Array<out Int>, but why bother the programmer with that? But then, Kotlin does not have implicit boxing. Huh.


that’s the fault of the JVM rather than Kotlin, which just has to live with this complexity for now

That’s not a very good argument on its own. There definitely are other solutions (you mention Scala yourself).

presumably there will need to be some sort of new syntax to determine which it is

The compiler will have to figure out the details. Otherwise, keep clear.


I’m not sure I follow.

  • Array<Int> is always applicable wherever Array<out Int> is expected (it is more specific). The problem here is not the out bound at all. It’s the fact that the memory layout of an array of integers is quite different from that of an array of object pointers, and the code expects to manipulate object pointers.

  • What do you mean by “box on demand”? Do you mean an IntArray to Array<Int> conversion? If it is to be on demand, nothing prevents you from writing a function that does just that; language support buys you very little here.

  • By the way, as far as I understand, where there isn’t specialization, Scala just forces unboxing the whole array, which can be quite costly. Definitely not something you’d want to happen implicitly.
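The on-demand conversion mentioned in the second bullet already exists as ordinary standard-library functions, which illustrates why language support adds little here:

```kotlin
fun main() {
    val primitive: IntArray = intArrayOf(1, 2, 3)

    // Box the whole array: IntArray -> Array<Int> (copies and boxes every element)
    val boxed: Array<Int> = primitive.toTypedArray()

    // And back again: Array<Int> -> IntArray (copies and unboxes every element)
    val unboxed: IntArray = boxed.toIntArray()

    println(boxed.joinToString())  // 1, 2, 3
    println(unboxed.sum())         // 6
}
```

Both directions copy the entire array, which is exactly the whole-array conversion cost discussed in the third bullet.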

But honestly, if such solutions are fine, why not just use Array<Int> directly?


Regarding your first bullet: I may have gotten things the wrong way around. My bad.

Definitely not something you’d want to happen implicitly.

Depends. Do you want more abstraction or less? Kotlin seems to pull its punches on a few occasions, leaning more towards programmer optimization than compiler/runtime optimization. Whether that’s “better” than Scala’s approach, time will tell.

But honestly, if such solutions are fine, why not just use Array<Int> directly?

Because that’s always inefficient. The point is that you might want the ease of having the type Array<Int> with as much performance as possible, for as long as possible. That seems not quite possible with Kotlin at this point.

Furthermore, note how Scala arrays can be used as collections thanks to implicit conversions, whereas in Kotlin you have to convert explicitly.
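For comparison, here is what the explicit Kotlin conversions look like, using the standard-library toList() and asList() extensions:

```kotlin
fun main() {
    val a = intArrayOf(3, 1, 2)

    // No implicit conversion: turning the array into a collection is a call.
    val copied: List<Int> = a.toList()   // copies into a boxed List
    println(copied.sorted())             // [1, 2, 3]

    // asList() instead wraps the array in a List view without copying eagerly.
    val view: List<Int> = a.asList()
    println(view.contains(2))            // true
}
```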

I guess this is really about higher-level design decisions: do we want to do more stuff implicitly, maybe with “hidden” performance impacts that may be a problem for unwary programmers, or force everybody to do more things explicitly? If you have Java on one end of the spectrum and Scala on the other, Kotlin falls somewhere in between. Maybe it’s a good compromise – we’ll see.

In this specific case, I maintain that the resulting language/API is weird: no primitive types, but primitive array types. We don’t abstract primitive types away completely, but we also don’t keep them.


We’re comparing having a native array (int[]) and a wrapped one (Integer[]).

The native one is faster to read and write, but the wrapped one does not need to be converted wholesale each time it crosses a generic boundary.

If you are going to use generic operations, the native array only performs better when the ratio of native accesses to wrapped accesses (inside the generic methods) is very high: in the generic method you pay the regular wrapped read/write overhead, plus the overhead of wrapping the whole array.
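The trade-off can be sketched concretely. A generic function can only accept the wrapped form, so feeding it a native IntArray means paying for a whole-array conversion up front:

```kotlin
// A generic function: only the wrapped (boxed) form fits its signature.
fun <T> firstElement(xs: Array<T>): T = xs[0]

fun main() {
    val native = IntArray(1000) { it }   // fast native reads and writes
    val wrapped = native.toTypedArray()  // one-time cost: boxes every element
    println(firstElement(wrapped))       // 0

    // Had we started with Array<Int> (Integer[]), there would be no wrapping
    // step, but every single element access would go through a box instead.
}
```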