On Generics


#1

Hi!

I spent the last couple of days playing with Kotlin, and thought I’d give my feedback. In general, I’m extremely impressed with the language and its direction. I think it’s got real potential and I’m very tempted to hacking on an llvm backend. However, the generic system worries me a little.

Variance and type reification is super nice, so I’m really glad you’ve identified that. But I’m a bit concerned that you might be falling into the same traps as scala and its type reification (ala Manifests). I feel the Tuple{1-23} and {X}Array is evidence of this.

And sooner or later, people will need things like scala’s HList’s (An arbitrarily nested Tuple2 with a whole bunch of sugar) e.g. https://github.com/milessabin/shapeless

Most of which could be avoided by providing by starting with a better designed “generic” system. I think D is an extremely good example of it done pretty well, and C++11 to a much lesser extent.

I personally think a type system really needs to support variadic generics (e.g. Tuple<vararg T> ) along with the ability to provided specialized definitions e.g. “Array<Short>” that “code-generates” a specialized version.

Now I know I’m asking too much, but the other thing I find very missing in scala is the ability to put values in the type system. Often there’s times I want something like: FixedSizeArray<32, Int> or even a NodeDepth<3> (as you’ll see in the Scala sources, they often resort to using a preproccessor to code-generate something like FixedSizeArray32<Int> and NodeDepth3.

Thanks!
Keep up the awesome work


#2

Hi Eric,

I’d be curious to see you expand on this and give a few examples where you think these advanced features would be useful, in particular the Shapeless type (awesome technical work but still a bit of a curiosity to me) and dependent types (still struggling to find some practical uses to these, more details here).


#3

Hi Cedric,

Sure, no problems.

I think there’s a few misconceptions in your blog, for instance “
First, we need to create our list without putting any elements in it, but doesn’t this violate the constraint?” isn’t much of a problem, and has never stopped the C++ guys (e.g. http://www.boost.org/doc/libs/1_50_0/doc/html/array.html ). The trick is to have a constructor similiar to the one in Kotlin. e.g.

FixedSizeArray[Size: Int, Type](init: (int: Index) -> Type)

(i.e. a function that can initialize it). You can also fill the Array with some default value (mutability and dep typing are largely orthogonal!)


Before I start providing examples, I should make it clear that I’m just advocating a generic system more along the lines of D or C++, than some full blown dependent typing like Agda.


So my background is from C++, and the last 9 or so months I’ve been using Scala. Here’s some cases I’ve run into, that either scala’s dependent method types helped – or I wish I had a more C++ish type system:


Example 1: We are working on an abstraction layer on top of Hadoop, where hadoop requires that all Key and Values implement the “Writable” interface. So we want to provide a default implementation for all the common types. This is straight forward, until you run into types that people wrote code generators for [because the type system limited them]. E.g. Tuple1,  … Tuple23, ProductN, FunctionN, etc. etc. So now we have to decide between copying and pasting hundreds of lines of code, or writing our code generation for scala’s code generated stuff ewwwww. Had this been a case of Tuple[N: Int]  (like the new C++11 one), this would’ve been far superior and we could’ve written it properly.

Example 2: [Note this is a simplification of the problem we solved using Scala’s dependent typing here: https://github.com/NICTA/scoobi/blob/master/src/main/scala/com/nicta/scoobi/application/Persister.scala#L33 ] I’m going to give a far more understandable analogy here.

Let’s say you want to do a database/relational style equijoin of something that has the type  Collection<K, V> and Collection<K, W> – for arguments sake, lets say you want to end up with something of the type Collection<K, #(V, W)>. This is good and true, but it doesn’t really hold when you nest them:

a join b join c

You wouldn’t want to end up with Collection<K, #(V, #(W, X)> but you’d rather: Collection<K, #(V, ,W, X)>  – in short, you want a different depending on one of the types ;D This problem actually comes up a lot when you understand it. For instance, I find boost::asio parser nicer to use than the scala parser combinators because of this issue (and because its smart enough to know when you parse something that returns a ‘Unit’ you don’t want to add it to the type etc.)

Example 3: For the love of god, I really something that has half the power of C++'s boost::multi_index

Example 4: For efficiency, sometimes you want to move things from being stored in each instance of the object, but rather in the type. Like in Scala’s persistent HashMap they have:

“Note: the builder of a hash map returns specialized representations EmptyMap,Map1,…, Map4 for maps of size <= 4.”. But with a nice type system, you can just do this: https://github.com/espringe/wisp/blob/87e2e682b5230ae9fa373cc6af23b2f6ea1e7266/persistent/map.hpp#L59

without having to (again!) resort to code generation. As an added bonus, you have some nice typesaftey (Map1 can only point to a Map2 etc.)

Example 5: Type safe feature vectors. We do a lot of machine learning, and having typesafe feature vectors would be simply amazing. A feature vector is basically the result of data extraction, and what you train your machine learning algorithm on. They are typically are from a few hundred elements, to a few thousand. They’re fixed width, and fixed type. They generate a model which takes in data of the same. Right now we have to throw away all type saftey, basically. But evne if we could preserve the size would be very useful, and stop a lot of mistakes. And having fulll type information (i.e a big ass tuple, where every element in the tuple must meet some requirement) would just be great

That’s just off the top of my head, but I could probably provide some more if you’re still not convinced :slight_smile:


#4

Hi Eric,

Thanks for your interest in Kotlin.

What I gathered from your post is that you request a template-based approach to generic classes:

  • ability to specialize implementation for particular arguments,
  • ability to metaprogram with (hopefully, compile-time-constant) numbers in types.


Please, don’t call C++ “a nice type system”. To our standards of understandability, C++ has a very non-nice type system: too permissive, too cryptic errors.
Maybe “concepts” somewhat fix that one day, but they didn’t quite work out so far… And tey seem to be a rather complex machinery.

We investigated template-like design options and went for what we have now (no reification at all, arrays specialized for primitives) for a bunch of reasons mainly coming from the constraints imposed by the Java ecosystem:

  • Dynamic class loading (crucial for very any applications) makes any kind of implicit (C+±style) templating very hard
    • if we are to generate specialized classes, in what class loader should they sit?
    • and if not in the system one, how will different libraries talk to each other?
    • if we are for runtime generation, this problem does not go away and we have performance penalties on top…
    • This proposal attempts to work around this issue: KT-744, the story there is mostly about “if you want to build a code generator into your language, you’d better isolate it somehow”.
  • Reification (even something Scala-like) messes up the Java interfaces of Kotlin code, which is problematic in a heterogenous project.


This does not mean that there’s no way we can do (some of) what you request. The question is: how much of it is both really necessary and possible to implement without compromising other aspects of the language.
It seems that rather few people actually want all that power. But these people are very bright and passionate, and I like them, so I’ll do my best to please them :slight_smile:

I would be very interested to discuss what you propose further. My experience is that on early stages of such discussions it can be much easier to talk (e.g. over skype), than to exchange forum messages/emails. If you would like to talk it over, you can reach me at "andrey dot breslav at jetbrains dot com" to arrange a call.

P.S. On the llvm back-end: there are some preliminary discussions about this going on in our team. If you would like to participate, we’ll find a way to arrange it.


#5

abreslav wrote:

Hi Eric,

Thanks for your interest in Kotlin.

What I gathered from your post is that you request a template-based approach to generic classes:

  • ability to specialize implementation for particular arguments,
  • ability to metaprogram with (hopefully, compile-time-constant) numbers in types.

This is correct

Please, don’t call C++ “a nice type system”. To our standards of understandability, C++ has a very non-nice type system: too permissive, too cryptic errors.
Maybe “concepts” somewhat fix that one day, but they didn’t quite work out so far… And tey seem to be a rather complex machinery.

I agree, and I also agree the lack of ‘concepts’ or ‘requirements’ that is one of the main reasons they’re so painful to use. However, I do think a lot of this complexity is artifical and just an artifact of the accidental nature of C++ templates. I think Walter Bright proves this with D.


We investigated template-like design options and went for what we have now (no reification at all, arrays specialized for primitives) for a bunch of reasons mainly coming from the constraints imposed by the Java ecosystem:

  • Dynamic class loading (crucial for very any applications) makes any kind of implicit (C+±style) templating very hard
    • if we are to generate specialized classes, in what class loader should they sit?
    • and if not in the system one, how will different libraries talk to each other?
    • if we are for runtime generation, this problem does not go away and we have performance penalties on top…
    • This proposal attempts to work around this issue: KT-744, the story there is mostly about “if you want to build a code generator into your language, you’d better isolate it somehow”.

All good points, and something I don't any any insights into. But I'll give it some thoughts

  • Reification (even something Scala-like) messes up the Java interfaces of Kotlin code, which is problematic in a heterogenous project.
  • This does not mean that there's no way we can do (some of) what you request.

    Indeed. Scala also has these "evidences" that make calling scala from Java hell on earth. [ You effectively need to create a 'simple' wrapper first in scala, then only call that simple wrapper from java ]

    The question is: how much of it is both really necessary and possible to implement without compromising other aspects of the language.

    It seems that rather few people actually want all that power. But these people are very bright and passionate, and I like them, so I'll do my best to please them :)

    This reminds me of a conversation I had with a coworker:

    Me: I really find Scala more fun than C++, but the lack of metaprogramming is driving me insane
    Him: What are you talking about? you always hated metaprogramming in C++
    Me: But I love when people and libraries have made use of it
    Him: True, that’s pretty awesome

    I would be very interested to discuss what you propose further. My experience is that on early stages of such discussions it can be much easier to talk (e.g. over skype), than to exchange forum messages/emails.
    If you would like to talk it over, you can reach me at “andrey dot breslav at jetbrains dot com” to arrange a call.

    P.S. On the llvm back-end: there are some preliminary discussions about this going on in our team. If you would like to participate, we’ll find a way to arrange it.

    I don't think there's anything more I can offer regarding the generics/templates, as you're obviously fully apprised of all the issues in play. But the llvm backend does interest me, and something I'd enjoy contributing to /  participating in the discussions. For now though, I'll fork the project on github and try get my bearings.


    #6
    • Dynamic class loading (crucial for very any applications) makes any kind of implicit (C++-style) templating very hard
      • if we are to generate specialized classes, in what class loader should they sit?
      • and if not in the system one, how will different libraries talk to each other?

    I don't know enough about this problem, but perhaps it can be alleviated by only allowing specializations of a class from within the same "module" it was defined in or something?

    Of course, that would prevent you from emulating things like haskells type classes (e.g. std::char_traits, std::allocator or the compator classes in the stl) – but they can be very complex, and  can’t actually think of when that has been useful for anything other than performance hacks – so I wouldn’t care about losing it.

    I think you’d still have to resort to runtime class generation, but at least you should at least know what classloader has to do it and with some pre-create hints it might not be too bad at all. (e.g. Tuple$1 to Tuple$10 might be pre-generated by the class loader – but anything more would be created on the fly).

    Thanks for listening to my ramblings


    #7

    I don't know enough about this problem, but perhaps it can be alleviated by only allowing specializations of a class from within the same "module" it was defined in or something?

    This means that I can't use templates for public library classes, like collections.

    I think you'd still have to resort to runtime class generation, but at least you should at least know what classloader has to do it and with some pre-create hints it might not be too bad at all. (e.g. Tuple$1 to Tuple$10 might be pre-generated by the class loader -- but anything more would be created on the fly).

    Most of the time the right class loader is the system class loader that we have no control over.