Compound extension


#21

Now that I understand your proposal better, let me go back and address some of the concerns I didn’t address directly.

I understand that calling a.doSomething() is equivalent to calling to calling doSomething() , but leaving it all up to compiler could make everything very complicated.

I am not sure the alternative you are proposing. In both proposals the compiler must decide what doSomething() resolves to. We are just discussing what rules it would use to do so.

Consider situation where B is a subclass of A , then the function [A, B].f will shadow [B].f

First, this problem already exists in extension and is dealt with by explicit resolution rules. The above example is equivalent to the following,

open class A {
    fun B.f() {}
}

class B : A() {}

fun B.f() {}

fun t() {
    val a = A()
    val b = B()
    with (b) {
        with (a) {
            f()
        }
    }
}

The scoping rules dictate that the fun B.f() {} in A will be called not the module level fun B.f() {}. This is because the resolution stops looking for extensions once a match is found and the A's fun B.f() {} is found as A's scope is consulted first. My proposal would not change this.

Second, shadowing will exist in any proposal and it exists in yours as well. Given the same example, with both [A, B].f and [B].f in scope, you will need a rule that decides which is applies when in a scope chain containing A, B.

You are trying to mix type-based and order-based approach, which probably won’t end well.

My proposal is simply given [A, B].f is in scope, A is considered to have an extension member B.f.

My proposal is recursive, in that, [A, B, C].f implies A has [B,C].f which in turns implies B has C.f. My original syntax is more clear in this regard as it would then state that A.B.C.f implies A has an extension member B.C.f which implies B has an extension member C.f. In other words, A.B.C.f can be seen as extending A with B.C.f. With this understanding it is clear what is being extended, A in this case, and what the extension is, B.C.f, and when it applies, when A is part of the unqualified scope.

It introduces no new concepts to the language or new ways to look up members in scope. It affects the collection of extensions.


#22

Could you please explicitly describe resolution rules you are talking about? I proposed two different schemes.

  • One requires the set of function types to be a subset of current scope types. The order does not matter. Maybe one receiver could cover two types simultaneously. In this case we need some kind of additional rule to treat duplicate type problem to guarantee backward compatibility. Maybe we can introduce something like “latest matters rule” so G, A, B, A === G, B, A. Function definitions with same type parameters but different order are considered to be the same definition.

  • Second requires the the list of receivers to be sublist of current scope list (in precisely the same order). It automatically solves ambiguity problems, but limits call flexibility.

It seems like your idea is closer to type-based resolution, but it require more specific rules. For example, I’ve just proposed a mechanism to treat duplicating types, but it was a simple example. Some types will be subtypes of others, so rules must be very clear and concise. Also they should be tested against all examples presented so far.

Both approaches have their own pros and cons. If we can’t clearly decide which approach to take, KEEP should contain both possibilities until one of them would be discarded by some test or by community feedback.


#23

Could you please explicitly describe resolution rules you are talking about?

I was referring to your second proposal.

The first one is not compatible with the current A.f extension and is ambiguous with regard to operators such as [A, B].unaryPlus(). It is unclear which type receives the operator. I believe these are fundamental flaws with the unordered approach.

It seems like your idea is closer to type-based resolution, but it require more specific rules.

The only change I made was to how extensions are collected. All existing resolution rules and scoping rules are unaffected. I am not sure what more specific rules you are referring to.

I believe what makes it seem more complicated is you are trying to understand my proposal in terms of your original two proposals. It is not based on either. It should be understood on its own.

Both approaches have their own pros and cons. If we can’t clearly decide which approach to take, KEEP should contain both possibilities until one of them would be discarded by some test or by community feedback.

Sure. Can you fork my proposal and and add a section describing the rules? I don’t believe I can do them justice.


#24

Next iteration of resolution proposal

Here I will try to combine best parts of all current proposal and account for backward compatibility. The resolution could be done in following steps:

  1. Create a list of actual contexts G, A, B, C, ... where G is a global context. Top level non-extension functions have only G as context. Class members have that class as context. Extension function have their receivers added to context where they are defined. Running a function with receiver adds that receiver to the list of contexts where this function is defined.Types in the list could be duplicating. We will call those actual types Cy where y is the index
  2. Normalize context list by walking list from left to right an removing elements which have the same type or subtype to the right. Meaning G, A, B, A will be reduced to G, B, A. Context list uses compile-types so types with different parameters are considered to be different types so List<Int> won’t replace List<Double>.
  3. Checking the definition of the function. Function receiver list is written in form of [R1, R2, R3]. Duplicate types or ambiguities throw compile error. Types could have parameters defined outside type list like fun <T> [T, Operation<T>].doSomething(). We will call receiver types Rx where x is the index.
  4. Binding of extension function. Each of types R in receiver set is matched against the elements of context list from right to left, binding it to first match and thus creating a map Rx to Cy. If there is a bound pair for each of Rx then function is considered resolved and bound to context. Multiple R could be bound to the same context C, so it is possible to have just one context for multiple receiver types if it matches them both.

The normalization step is probably could be avoided in this scheme. The results of this procedure are the decision about binding and binding map Rx to Cy.

This resolution is done via declared receiver types Rx which are then substituted by actual runtime objects representing Cy.

Compatibility check

  • [A].f === A.f.Extension function with single argument should work exactly like existing extension function. It seems like it does. It is resolved and bound to the closest context matching its type.

  • Match current member receivers strategy. Seems to be working the same way. It resolves to the closest context even if this context implements both receivers.


#25

Some additional thoughts

I was thinking a lot about this context-based resolution idea and I think that it is possible to go a little further (not immediately, mind you, it is the idea for the future). I am leaving this global context G everywhere in my schemes and it does not play any important role because it usually is empty. But in fact it could provide a lot of opportunities.

File level context

G is basically resolved to file which does not have any type of its own. But if there was a way to bind a context or a set of contexts to the file itself (not proposing any syntax solution, but it should be quite easy). Then everywhere in my schemes above we will just replace G with G, F1, F2. Meaning that all classes and functions in this file will have additional implicit contexts, just like type-classes. Of-course, it means that Fs could only be singletons in this case.

Extension classes and interfaces

We consider a class or interface to be top-level non-empty context. It seems to be not so hard to add a set of external contexts to the class context. It could look like class [T1,T2].MyClass{}. In this case the instance of this class could be created only in context bound to both T1 and T2 and all members of this class will have additional implicit contexts, meaning member of MyClass will be able to call members of T1 without additional declarations. From the implementation point of view, the instances of T1 and T2 could be class constructor implicit parameters (or a single map parameter, which is probably better), it should not violate anything, even Java compatibility (you will have to pass instances of contexts to create class instance). The interfaces could work the same way, just provide synthetic (invisible) property which holds the Map<KClass, Any> of actual receiver.

Note: There will be some problem with class type parameter syntax here since unlike functions, type parameters in classes are declared after class name, but probably it could be solved.

This second proposal is in fact much more flexible than first one since we can use non-singleton contexts and define them explicitly on class creation. Also, both ideas probably could cover most of type-classes use-cases.


#26

I like the idea of extension classes.


#27

I don’t really. I think they sound fun from a theoretical point of view, but they don’t really add much to the language. Also we already have inner classes which (if I understand this proposal correctly) do pretty much the same, except that they have to be declared inside the class, so no extension.
But my main problem with extension classes is that I just don’t see the benefit. They make the language more complex and I can’t think of any real example where I would want to use them. Why not just use a normal class and pass references to them? Yes it’s a bit more verbose but I’d argue that’s a good thing.


#28

In fact they cover the same spot as type-classes could and people are pretty sure we need those. It is not the same as inner class, because

  1. inner class must be declared inside the class from the beginning, you can’t add it to existing class
  2. inner class do not give you type sums as extension classes could (we are talking about multiple receivers)

Anyway, right now, I do not propose it to be implemented immediately. It is just a by-product of this general direction of thinking when considering language evolution. We need to know possibilities when deciding one or another way.


#29

In my project we follow a pattern that would benefit from extension classes. It goes as follows:

fun <T> T.foo() where T : Context1, T : Context2 {
    // Some logic that needs both Context
}

class Client(
   override val propNeededByCtx1: Dependency1
   override val propNeededByCtx2: Dependency2
 : Context1, Context2 {

    fun test() {
        // I can invoke:
        foo()
    }
}

Context1 and Context2 are interfaces that have some dependencies and provide some logic in default functions that use those dependencies.

Now with extension classes I could declare Client like this (pseudo syntax):

class [Context1,Context2].Client {
    fun test() {
        // I can invoke:
        foo()
    }
}

The creation of instances will probably be more verbose in this case. But this is usually a Singleton created by a factory, so not a big problem.


#30

I completely forgot about where construct. It also allows to create type sums, but it is in fact rather limited because you can’t construct a sum from two different objects. Also the syntax is not pleasant. It seems like multiple receivers could allow to deprecate it.


#31

@darksnake I did some investigation on how the Kotlin compiler resolves extension functions and I have come to the conclusion that your take on how extension function should resolve is closer to how the compiler works than mine.

However, I would make some changes because,

  1. The G in your example doesn’t appear necessary as the global scope (file level scope) is never passed as a parameter so doesn’t need to be in the list of contexts.
  2. The normalization step seems unnecessary as it would fall out in resolution. Also the distinction between “compiler types” is unnecessary as the compiler only deals with compiler types and Java types for example is a platform mapping.
  3. The compiler refers to what you call a context as an implicit receiver so we should switch to this vernacular to avoid confusion.

I would then recast your steps to the following,

  1. Create an ordered list of implicit receivers This already exists in the compiler as the ImplicitScopeTower

  2. Checking given an extension function with receivers T1Tn and a receiver scope type of R and implicit receiver scopes in the scope tower of I1Im an extension is valid if R :> Tn and there exists some permutation of (Ti, Ij) in i 1n-1 and j in 1m where Ij :> Ti and for each element in the permutation all i and j values are in increasing order. If multiple valid permutations are possible then the last permutation is selected when the permutations are ordered by the values of i and j.

  3. Report ambiguous calls If multiple valid candidates are possible the call is ambiguous and an error is reported.

For example, a declaration fun [A, B, C].foo() would have a receiver type list with three elements T1 = A, T2 = B and T3 = C. For the following invocation of foo.

class A {}
class B {}
class C {}

fun [A, B, C].foo() {}

fun t(a: A, b: B, c: C) {
  with (a) {
    with (b) {
       c.foo()
    }
  }
}

It would have an implicit scope tower of a, b with I1 = A and I2 = B. The R would be C and is valid because C :> C. The permutation receivers [(T1, I1), (T2,I2)] which is valid as A :> A and B :> B.

This is compatible with the current semantics as the current semantics are the above without the permutation of the remaining receivers with the implicit scope tower as n is always 1 which results in a valid empty permutation [ ].

This can be done fairly efficiently if the permutations are produced by pairing Tn-1 to some Ij and then pairing Tn-2 to some Ik where k < j until a valid permutation is found. The first valid pairing found will be the last permutation given the ordering described above. This can be accomplished in worst case O(n*m) steps.

One thing to consider is whether the ordering of j is strictly necessary. That is would the following permutation be valid [(T1, I2), (T2,I1)] which is what you would get for,

fun t(a: A, b: B, c: C) {
  with (b) {
    with (a) {
       c.foo()
    }
  }
}

#32

I agree about normalization step. It appeared during thinking process. Later I found it unnecessary, but decided to keep it because it makes the procedure somehow easier to understand from human perspective.

I keep writing global context everywhere for two purposes:

  1. Just to designate the root of “tower” as / in Linux file tree.
  2. As I mentioned here in future it is possible that global context could be substituted by something. We are not there yet, just want to keep ourselves to possibilities.

As for receiver types versus context. It it a question of terminology which is not very important at the moment. I think that it is better call them receivers when called inside the extension and using this@ notation, but when we are talking about function use site, the context therm is appropriate. The thing is that this feature we discussing is quite important to the style of architecture design I call “context-oriented programming” where instead of object-oriented polymorphism, you get explicit context-based behavior substitution. I wrote an article about it (and probably will write more since people seem to like it). It is not that important though.

Will you PR KEEP proposal anytime soon? I think since we agree on the main point (resolution) we can iron out the details in the KEEP.


#33

I will leave it out of the normative section then as it is unnecessary and falls out when the the scopes are searched in order. The non-normative section can introduce normalization as to simplify the intended behavior.

I want the proposal to stand by itself, at least in the normative sections. The non-normative section can talk about capturing the file scope as a potential receiver parameter but that can be done as a separate proposal. Therefore, I will not talk about G in the normative section but just refer to implicit scopes in general. As G can be thought of as an implicit scope it is included indirectly already. However, to allow G to be capture there needs to be a way of describing G's type which has impact on the language as a whole, not just extensions. If G can be described such that you can declare a type T where G :> T then G is included implicitly.

I agree. I will putting folding the result of our discussions today and probably into tomorrow. I will create the PR sometime Friday (I am in PST, GMT+8).

Chuck.


#34

#35

I think that the key points of discussion results should be added to proposal.