Rationalizating Pure Functions on Kotlin


#1

Hi there!

The purpose of this thread is to discuss the degree to which we can/can’t write Pure Functions in Kotlin. For starters, this topic has been discussed before, and I understand that Kotlin does not enforce Pure Functions as a language feature. If I’m correct, this is mostly due to the fact that Kotlin and Java are “Pass by value of reference”.

At first, I thought this was a non issue because I’ve been hearing plenty of talk about “val” being a way to create “immutable data” (not necessarily from the Kotlin team, just on the interwebz). However, I’ve come to understand that val does not mean immutable, it means read only. So if I pass an argument to a function, even though it’s a val (which is always the case for function arguments), this only assures you that you can’t change the underlying object in the scope of the function itself.

So, here’s the point I’m interested in discussing. If we look at a definition of a Pure Function as far as the Haskell community is concerned (I’m willing to assume that they probably know better than the average imperative programmer about this topic), it requires:

  1. Its return value is the same for the same arguments (no variation with local static variables, non-local variables, mutable reference arguments or input streams from I/O devices).

  2. Its evaluation has no side effects (no mutation of local static variables, non-local variables, mutable reference arguments or I/O streams).
    (source)

So, we could achieve these standards in the context of some “toy problem” where we don’t require any concurrency or data I/O, and there’s no possible way that a given mutable reference argument will ever change. This is typically the kind of code snippet I see when I read about people writing “Pure Functions” in Kotlin.

However, the moment any mutable reference to an argument exists outside of the scope of the function itself (which appears to be possible whenever working with var properties), the function is very not pure, at least by the above definition.

If the above reasoning is correct, then I get the following requirements for writing Pure Functions in Kotlin:

  1. If a given object can be verified to never have a mutable reference associated to it , in the context of the entire system, it can be regarded as truly immutable; therefore it could be used as an argument for a Pure Function (read: its immutable enough).
  2. One may not determine from a given function, whether or not it is pure, based on looking at it in isolation. This is because truly immutable arguments can be achieved, but are not enforced by the compiler.

Anyways, I’d like to affirm that I’m not here as a functional purist (I have an imperative upbringing myself), but I care very strongly about the fact that there might be a large number of people eagerly teaching beginners how to write what I’ll jokingly call “Contextually Pure Functions”, which appears to me as a major misunderstanding of the concept.

I’d be happy to hear any thoughts on this, including if my facts/reasoning are wrong at any point. Just trying to straighten some definitions out for myself, and possibly others by extension.

EDIT I’ve added a sample here to demonstrate my point that it’s possible to have a situation where unexpected state changes occur when concurrency is introduced. I’m admittedly trying to fail at concurrency and immutability, but my point is that it appears that you can’t determine if a function is pure in Kotlin, unless you look at it’s arguments in the context of the whole system.

import kotlinx.coroutines.experimental.delay
import kotlinx.coroutines.experimental.launch
import kotlinx.coroutines.experimental.runBlocking

fun main(args: Array<String>) = runBlocking {
    //Blin.a is initialized to 0
    val b = Blin(0)
    val c1 = launch {
        printCurrentValue(b)
    }

    val c2 = launch {
        b.a = 1
    }

    c1.join()
    c2.join()

}

class Blin(var a: Int)

//Looking at this function alone, I have no idea of expecting b's state to change.
suspend fun printCurrentValue(b: Blin) {
    doThing(b.a)
    delay(1000L)
    doThing(b.a)
}

//assume doThing is some long running operation
fun doThing(x: Int) {
    println("I was called with $x")
}

#2

So basically you’re saying:

  1. A pure function always returns the same value for a given input.
  2. Pure functions cannot have side effects.

And that:

  1. We could write immutable types (just like in Java, using all final variables).
  2. We could write pure functions using our immutable types, no IO, no randomness, and causing no side effects.
  3. A function often can’t be known as pure without searching outside of the function to check for mutable types or IO devices or random number generators or side effects.
  4. The compiler doesn’t check that pure functions are really pure.

Is my understanding of what you said correct?

For me, I don’t see how pure functions could reasonably be checked by the compiler. But, maybe a developer could indicate a function is pure using something like Kotlin Contracts? This means we we’d have to trust the developer that the function is pure.


#3

First of all, you can think of their as being two vals: one is immutable the other is read only. Because val can be used as shorthand for a delegated getter, class vals are often merely read-only instead of truly immutable (if you’ve used a lot of smart-casting within Kotlin, you’ve probably seen and felt the difference). But if the class/property is final and it is not using a getter (or delegating a getter), then it is immutable. And for local variables within a method scope, val is immutable.

Certainly it’s POSSIBLE to write pure functions within Kotlin. (note: stateful things doing stateful things with your functions RESULT does not make it impure; all that matters is what’s going on in the EVALUATION). But one needs to be careful of framework changes that can occur during evaluation. For example if a function CALLS a non-pure function, then IT is a non-pure function. And since Java has no framework for marking/validating methods as pure/non-pure, though it would be POSSIBLE to write a language framework that identified pure and non-pure functions, it would be extremely limited in its interoperability with Java. And if you throw an exception (such as divide by zero), then it is not a pure function. So there’s a little more involved than just making sure a method has no access to mutable variables with a longer lifespan than the method itself. And from the compiler/pre-compiler perceptive, there would be pretty much no way to guarantee a method is a pure function if it uses Java code except, as arconies mentioned, some form of annotation based entirely on trust which very much lessens the purpose/power of pure functions).


#4

Blockquote
So basically you’re saying:

  1. A pure function always returns the same value for a given input.
  2. Pure functions cannot have side effects.

Blockquote

This isn’t my definition to be exact. I’m supposing that it is the most accurate, based on Haskell being a popular and “Purely Functional” language. One of the big problems, is that there appears to be at least two different working definitions of a Pure Function.

  1. That’s kind of the issue. We can write immutable reference types, but we are not restricted from from having another mutable reference type which points to the same underlying object (as far as I’m aware). If all reference types are immutable (val), then this isn’t really an issue since you can still be sure that no one is messing with the state of the underlying object. That’s the main concern, the state of the underlying object being potentially open to change.

  2. Yes, but only presuming that no mutable reference exists for any of a function’s arguments. If we can assure that, then I believe one must conclude that the arguments can’t be guaranteed to be immutable. Even if it will be 99% of the time, this is not acceptable to any definition of a Pure Function which makes sense to me (so far anyways).

  3. From what I can see, yes. You can’t look at a function in isolation and be assured that it is pure, for the above reasons.

  4. Certainly it doesn’t.

Note that I’m not actually advocating for a new language construct, nor compile time checking. If such a solution comes out of my philosophical rambling, then great. I don’t expect it to though; really just trying to understand exactly what’s going on here regarding technical “purity”.

Unfortunately, in absence of a solution like that, it does appear that we must rely on the devs to come up with their own conventions for avoiding accidentally mutable function arguments. Fortunately I don’t think it’s likely to run in to that as an issue unless you write crappy code, but that’s beside the point of being technically correct.


#5

In an effort to imagine how this could be possible in Kotlin, I’ve come up with this:

NOTE: I’m not advocating this idea, it’s only a thought experiment

Here’s how I imagine it being done:

Add a new pure modifier for functions:

pure fun foo() { /*... */ }

Pure functions may only call other pure functions.
Pure functions may only use pure types.

Here’s an example

pure fun makeGreeting(name: String) : String { // String would have to be a pure type to be allowed.
  val uppercaseName = name.toUpperCase() // toUpperCase() would have to be marked a pure function.
  return "Welcome $uppercaseName!" // Returning pure type, so this would be allowed.
}

Again, I’m not sure I like this idea, but I’m absolutely up for the discussion around the concepts. It just takes me a bit to get out of my “minus 100 points” mode and into a “what if anything goes” mode :wink:


#6

“And for local variables within a method scope, val is immutable.”

Edit: I get that the reference itself is immutable; should have clarified that

Is this actually correct? The point I’m hung up on is this:
If I pass an argument to a function, I get that it is of type “val”. This means that I can’t change the underlying state associated to it within the scope of said function. However, if another reference (which happens to be a var) to the same state existed outside of the scope of that function, I don’t see any reason why it couldn’t be changed, thus breaking immutability.

Naturally this wouldn’t be a big deal unless threading was involved, but I don’t really care. State is either immutable or it isn’t.


#7

You’re right but wrong about why.

Maybe it’s just the wording.

Here you say:

It’s true the state could change, but you do not need another reference to exist or to be var.

This is not true.

Here’s an example:

class Cat(var name: String) // This type is mutable
val Bob = Cat("Bob") // here's our val reference. This does NOT mean that "can’t change the underlying state associated to it within the scope of said function"

fun petCat(cat: Cat) {// The cat variable comes in as a `val`
  cat.name = "John Smith" // Here we mutate the val.
}

#8

I take your point that I was incorrect there; thanks for clarifying.

That being said, the above function would not be pure as it changes state of course.


#9

Ok, I did get some input from Roman on twitter which seemed to confirm my hypothesis:

“Make val not var! (and your code will be pure). If you only use vals in your classes, then they are immutable and it is enforced. Pragmatically, the fewer vars you have in your code the better. Ideally, no vars at all.”

source

So in other words, if you’re doing a good job as a programmer, you should be able to ensure immutability by using “val” everywhere.


#10

A var and a val can’t “share the same reference” in the pointer sense, they can only share WHAT they’re referencing. You can’t get a reference to a val (other than by invoking it), you can only get what it references. And once you change that reference, you’re changing YOUR reference, not it.

To give an example.

data class Thing(val immutable: Int)

val a = Thing(0)  // a is immutably referencing an immutable object
var b = a // b is referencing the same object
var b = Thing(1) // a and b are now referencing different objects

Of course if the thing they are referencing is not ITSELF immutable, then you can have an immutable reference to something mutable, which could very easily violate the Pure Function principals and many other things:

data class Thing(var mutable: Int)

val a = Thing(0)  // a is immutably referencing an immutable object

fun doSomething( thing: Thing)
{
   // Black Box that might very well change the value of a.mutable
}

It is important to not mistake val as a guarantee that the entire chain of things it reference is immutable, just that IT is immutable. Such a guarantee is certainly possible in Kotlin and it COULD even exist at the language level (which is kind of the idea of some imutable syntax proposed), but for pure data it’s pretty easy to keep track of that on your own. But when it comes to workflow, truly pure functions and pure function chains can be far more difficult to track.


#11

I understand that point very well. Forgive me if I’m not using the best words, but what I’m talking about is a mutable reference, and an immutable reference, pointing to the same location in memory.

Edit after misunderstanding a point you made


#12

If val a: Int exists and var b: Int exists, then in no situation will changing b change a (talking about actual immutable vals, not delgates/getters). In this specific case that is because a and be were never pointing to the same place in memory, they were only pointing to different places in memory that shared the same value. The same is true if it were val a: Thing and val b: Thing. They were both pointing to different places in memory that THEN was pointing to the same memory where the class data is stored. But in general you shouldn’t be trying to think about things in the memory level with abstract languages. The language contracts are what you should be concerned with.


#13

If we go back to compiler checks: Checking for val-only does still not ensure that an object is immutable. You could still implement the getter as an impure function

val foo: Int = 0
    get() = value++ // I hope value accesses the backing field :)

I still kind of like the idea of adding pure to the possible contracts. Maybe this will be simpler once project valhala adds value types. That way we could restrict pure functions to only take value types as parameters, which would simplify compile time verification of them. That way they might be more restricted than pure functions defined in other languages.


Another idea might be to add trusted functions in the contract

fun pure() {
   contract {
       pure(::doSomething, ::doSomethingElse)
    }
    doSomething()          // this could be defined in java and therefor is not marked as pure, but we trust it
    doSomethingElse()
}

This does not however solve how to stop pure functions from depending on or changing global state.


#14

If val a: Int exists and var b: Int exists, then in no situation will changing b change a (talking about actual immutable vals, not delgates/getters). In this specific case that is because a and be were never pointing to the same place in memory, they were only pointing to different places in memory that shared the same value.

I’m talking about the specific situations where and a and b point to the same location in memory. I’m never talking about situations where a and b don’t point to the same location in memory. If there’s something I keep saying that implies otherwise, would you mind pointing it out? I’m not being disingenuous here, really trying to grasp where the disconnect is.

I’ve added a sample to op to demonstrate the situations I’m speaking of, hopefully that will clarify things.


#15

As I said, it’s impossible for val a and var b to point to the same place in memory. Your example is of val a and var b pointing to different places in memory which then point to the same class which itself is not immutable, which I already talked about. If Blin’s sub-components are not immutable then Blin cannot realistically be used in a pure function unless the function just doesn’t look at the mutable sub-components. In Kotlin language, val a being immutable does not imply that a.x is immutable. Which is not to say that it guarantees that it isn’t, either. But regardless, in your example there’s no way for doThing to change x because parameters in Kotlin are immutable. (Not the components of parameters. Again, immutability in Kotlin is only contracted at the root level, not at all subcomponent levels).


#16

I think I understand the confusion here; it’s admittedly probably my usage of pointer/reference jargon which I’m very not fond of due to situations like this.

When I say val a and var b pointing to the same location in memory, I was speaking of them sharing a number which can be used to look up the same object (i.e. they both have unique addresses, both can be used to find the same object or location in memory, which I thought meant that they “point” to it). I’m not saying that I can have a situation where I look up the address of “a”, and it is the same address of “b”. Definitely not trying to say that, because it doesn’t make any sense.

Your example is of val a and var b pointing to different places in memory which then point to the same class which itself is not immutable, which I already talked about.

This is still, and always has been, what I’m (trying) to talk about.


#17

I think both of us should stop talking about the low-level notions of “pointers” in a language that has no concept of them because we’re clearly don’t mean the same thing by “points to”.

In any event, yes, if by “truly immutable” you mean “immutable throughout its entire hierarchy” it is correct to say that this is not enforced by the compiler. This doesn’t mean that pure functions can’t exist in Kotlin (especially considering that if you’re using Basic Types, then immutability is the same as “true immutability” ), just that there is no language framework to track and designate functions as pure.


#18

One question is how safe we want this to be.

For example, even if all references to an object were an immutable type, what’s to stop someone downcasting to a mutable type? Or even if an object had no mutators at all, what’s to stop someone accessing its internals via reflection?

I suspect those aren’t worth protecting against. But I think this could be useful even without ultimate safety. (Compare how type parameter checking can be defeated in similar ways, due to type erasure; but it’s still an extremely useful feature.)

This does sound like a good match for the forthcoming contracts functionality. (I like the idea of contracts, but it sounds far too manual a process right now. Ideally, I think it should be something the compiler could infer for itself; annotations would only be for double-checking, like Java’s @Override annotation or Scala’s @tailrec.)


#19

This is reminding me of the early days of java and arguments about pass by value vs pass by reference on Usenet news groups. I would always clarify it by asking the question, “Does java pass objects by reference or by value?” The answer is neither! Java has absolutely nothing in the language that represents objects. It only has references to objects which are passed by value.

I think the thing missing from this discussion is the idea of const references and const functions from C++. If you had the ability to achieve const correctness you could assure that a function is pure. Achieving const correctness however was always a huge task on C++.


#20

This is a pretty great conversation, thanks all for having it.

For my part, I think adding a pure modifier on a function would be a fantastic idea. This would introduce a compile-time check that passes iff the function depends on other pure functions and types that are immutable ‘to the bottom’.

For val properties declared using get() syntax, the property itself would only be determined as immutable if get() itself were marked pure.

It should additionally be possible to ‘force’ purity using contracts/annotations; both to resolve the Java interop scenarios, as mentioned above, but also to allow for code that introduces non-compile-time verifiable optimizations that are effectively pure (for example memoization via lookup table).

I actually think this modifier could potentially replace the inline modifier, which I have concerns with anyway. Inlining is a compiler technique to make code faster; I don’t think it should be referenced in theoretically declarative code. I see the value in telling the compiler that something doesn’t need to be retained for reflection; but it seems to me like pure plus something like unre (inverse of reified) would together serve the same (and more) purpose.

I also think it would be good to infer purity contractually, where possible, in the same way that auto typing is available by inference.