Null-safety and initialization order

Dear community,

What have I misunderstood about null safety if I’m surprised that the following code causes a NullPointerException? (cf. play.kotlinlang.org/)

class DelusiveNullSafety {

    init { accessProperty() }  // Causes NPE

    private val nullSafeString: String = "property"

    init { accessProperty() }  // Works

    private fun accessProperty() = println("Access: ${nullSafeString.length}")
}
fun main() { DelusiveNullSafety() }

The documentation explains two exceptions in the context of initialization, but (to my understanding) they do not cover the presented example.

I would appreciate very much any hints about further documentation or explanation about this issue. Thank you very much in advance and kind regards, Daniel Schaffrath

Two points worth explaining here:

  • In the JVM, all fields have a default value of 0, false, or null (even for references that are non-nullable in Kotlin). That’s nearly always hidden from you, because Kotlin makes you specify an initial value. But there are very rare cases where you can see a field before it has been initialised, and that’s when you get the default value. This is one of those cases!

  • Property initialisers and init blocks are executed in textual order, i.e. the order they appear in the code. So in this case, it’ll run the first init block before assigning “property” to nullSafeString (while it still has its default null value, hence the NPE); then it’ll make that assignment; then it’ll run the second init block.

That explains the behaviour you’re seeing.
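To make the two points concrete, here is a minimal sketch that records what each init block sees instead of crashing. The class InitOrderDemo and its peek() helper are made up for illustration, and the behaviour described assumes Kotlin on the JVM; peek() returns String? so the not-yet-assigned field can be observed without dereferencing it.

```kotlin
// Hypothetical demo class, not taken from the original post.
class InitOrderDemo {
    // Declared first, so it is ready before the init blocks below run.
    val log = mutableListOf<String>()

    init { log += "first init block sees: ${peek()}" }

    private val text: String = "property"

    init { log += "second init block sees: ${peek()}" }

    // Returning String? lets us look at the raw field value
    // without calling anything on it.
    private fun peek(): String? = text
}

fun main() {
    for (line in InitOrderDemo().log) println(line)
}
```

On the JVM the first entry should read "first init block sees: null" and the second "second init block sees: property", which is the textual order described above.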

(Technically, property initialisers and init blocks are implemented as part of each constructor, after calling the superclass constructor but before running the constructor body. So code in a constructor can usually assume that all the initialisation has been done — in its class and any superclasses, though not in any subclasses.)

This demonstrates why special care is needed when constructing a class — it’s important to ensure that everything is initialised before it’s accessed (not just to avoid seeing the default values, but also because invariants etc. may not have been established). In particular, it’s a bad idea to access methods or properties that could be overridden in subclasses: that can result in subclass methods being called before the subclass has been fully constructed — which can cause serious bugs that are much harder to find than the one here.
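A small sketch of the overridable-member case, assuming Kotlin on the JVM. Base, Derived and describe() are hypothetical names chosen for this illustration; the point is only that the override runs before the subclass initializer has assigned label.

```kotlin
// Hypothetical classes for illustration only.
open class Base {
    // Runs during Base's construction, i.e. before any Derived initializer has run.
    val describedDuringInit: String = describe()

    open fun describe(): String = "base"
}

class Derived : Base() {
    private val label: String = "derived"

    // Reads label, which is still unset while Base is being constructed.
    override fun describe(): String = "label=$label"
}

fun main() {
    val d = Derived()
    println(d.describedDuringInit) // value captured while Derived was half-built
    println(d.describe())          // value once construction has finished
}
```

The first line printed should be "label=null" and the second "label=derived". No exception is thrown here, which is part of what makes this kind of bug harder to spot than a plain NPE.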

Thank you very much @gidds for your explanation.

Still, I don’t understand the rationale behind the decision to execute property initialisers and init blocks in textual order, thereby sacrificing the very nice property of null safety. Also, I don’t see any advantage or special feature in executing the initialisers in that order. What would be an argument against first initializing all properties and then calling the init blocks (in textual order)? Wouldn’t that save null safety?

That’s unfortunately not true. Moving all property initializers before the init blocks doesn’t solve null safety.

For example this works:

class SomeClass {
    private val string1: String
    init { string1 = "foo" }
    private val string2: String = string1
    fun foo() = println(string2.length)
}
fun main() {
    SomeClass().foo()
}

But if the initializer of string2 ran before the init block, as you’re suggesting, it wouldn’t, since string2 would be null.

Generally, by combining init blocks and initializers you can create arbitrarily complex chains of dependencies which the compiler can’t figure out.

Really Kotlin’s promise of null safety is “the compiler will not let you write code which sets/uses null when that thing is non-nullable.” Which is still pretty good.

Actually, this is not quite true: the code shared in the question does not cause a compile-time error; it produces an NPE at runtime. Therefore, if Kotlin made such a promise, it seems that it could not fulfill it.

I think my original question suggests that I haven’t understood the function/purpose of an init block. The following code does not cause an NPE.

class DelusiveNullSafety {

    constructor() { accessProperty() }

    private val nullSafeString: String = "property"

    private fun accessProperty() = println("Access: ${nullSafeString.length}")

    init { accessProperty() }
}
fun main() { DelusiveNullSafety() }

So it seems that init blocks allow for some very “low-level” initialization/construction of an instance. But why is that at all necessary? Or in other words: Which “expressiveness” does Kotlin gain by allowing these low-level init blocks? @al3c Your example might serve as an example, but isn’t it a little “over-constructed” and one could simply move "foo" into the property initializer of string1? I assume there are more abstract/complex use-cases where init blocks really shine?

The documentation about classes explains:

The class header can’t contain any runnable code. If you want to run some code during object creation, use initializer blocks inside the class body.

To me, this seems to be an artificial limitation. Why not allow code in (primary) constructors, get rid of init blocks and thereby save null-safety (and also allow for non-redundant secondary constructors)?

The code doesn’t try to set a non-nullable property to null, or pass a null argument to a function that takes a non-nullable parameter. Therefore the code compiles. That is what Kotlin promises.

Now based on how the JVM initializes memory and the intricacies of class instance initialization, it’s possible to write code that accesses a property whose backing field hasn’t yet been set to the correct non-null value. But asking the compiler to catch that is a tall order, and Kotlin doesn’t promise that it completely eliminates NPEs.

I assume the statement “the compiler will not let you write code which sets/uses null when that thing is non-nullable” is your abstract/summarized version of the null safety page on kotlinlang.org? Or do you have another reference?

Regardless of that, I don’t understand how one could summarize this page in this way. The page explains “The only possible causes of an NPE in Kotlin are:” followed by a few bullet points, none of which IMHO refers to the setting/usage aspect you mention. It would be very kind of you if you could elaborate a little more.

The NPE caused by the code in my original post most probably falls into the category of the bullet point “Data inconsistency with regard to initialization”. But I think it neither falls into the listed sub-category “leaking this” nor “superclass constructor calls an open member”, or does it? If so, could someone please elaborate? If not, the documentation might need an update.

To be honest, I’m not exactly sure what you mean. There’s nothing magic or low-level in init blocks. The problem with initialization is that it is hard to provide the full null-safety without compromising on the developer experience. We would have to ban calling any functions/properties in the constructor, because otherwise it is hard to make sure they won’t touch uninitialized fields.

How is it different than the current solution? From the technical perspective, Kotlin’s solution with codeless primary constructor and initialization blocks is pretty much the same as in Java. When compiling, both sections are simply merged into a single constructor anyway. I think it was done like this purely for code conciseness and clarity. We can have a primary constructor as a part of the class header, we can define both properties and constructor params at the same time, which greatly decreases the code bloat. Otherwise, it doesn’t change too much.

I’m not providing an answer to the whole debate but it’s worth noting at this point that init blocks, together with property initializers, conceptually make up the “body” of the primary constructor. Because of how primary constructors work from a purely syntactical viewpoint, they cannot (again, just syntactically) have a body like secondary constructors. Therefore you can use init to put code into the body of the primary.

Because of that, I would generally discourage calling instance methods inside an init block. I don’t know enough about the design background of init to definitively say that this should be a hard constraint enforced by the compiler, but it’s definitely a good rule of thumb. If you want to pack more complex initialization logic into a function, you can put that function into the companion object. Also, whenever possible it might be more elegant to initialize properties directly (for example with function calls) than to initialize anything in an init block. Init blocks are more useful for situations where you only want to execute code that doesn’t initialize anything (log statements, for example).
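As a rough sketch of that rule of thumb: the Greeting class and buildText() below are hypothetical names, chosen only to show initialization logic living in the companion object while the init block does nothing but log.

```kotlin
// Hypothetical example class, not from the thread.
class Greeting(name: String) {
    // Initialised directly from a function call; no init block needed for this.
    val text: String = buildText(name)

    // An init block that does not initialise anything, it only logs.
    init { println("Created greeting: $text") }

    companion object {
        // Lives in the companion object, so it has no access to the
        // half-constructed instance and can only work with its arguments.
        private fun buildText(name: String): String = "Hello, ${name.trim()}!"
    }
}

fun main() {
    Greeting("  Daniel ")
}
```

Because buildText() cannot see any instance state, it cannot accidentally read a property that has not been assigned yet, whatever order the declarations appear in.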

Concerning ordering, yes, even if the compiler doesn’t enforce it, it’s also a good rule of thumb to place init blocks after all properties. That’s beneficial even if you only think about readability.

Aren’t they contradicting statements? I agree with you, it is better to avoid calling member functions/getters/lambdas in the constructor. And by “constructor” I mean both property initializers and init blocks as they are the same. Unfortunately, as you said, sometimes it is more elegant to actually call a function and in this case we are leaving the safe zone and have to be careful as Kotlin can’t protect us anymore from NPEs.

Property initializers vs init blocks is just a matter of visual taste. They are pretty much the same thing.

“Hiding” the access (in an init block) to a non-nullable property through a function call (like accessProperty() in the example above) seems to be indeed a thing. The following code is generating a compile time error requesting nullSafeString to be initialized.

class DelusiveNullSafety {
    init { println(nullSafeString.length) }  // Variable 'nullSafeString' must be initialized
    val nullSafeString: String = "property"
}

Whereas the following code does not, and runs fine.

class DelusiveNullSafety {
    constructor() { println(nullSafeString.length) }  // Prints "8"
    val nullSafeString: String = "property"
}

So, I’m still puzzled about the nature of init blocks and Kotlin’s (very worthwhile) (cl)aim to solve The Billion Dollar Mistake.

But this is really relatively simple.

All languages struggle with the problem of partially initialized instances, and many of them try to solve or partially solve it by enforcing certain rules, order of execution, etc. Kotlin is no different here. In Kotlin, the order of execution is:

  1. Super constructor.
  2. Primary constructor (only setting props passed to constructor).
  3. Property initializers and init blocks top to bottom.
  4. Secondary constructor.

At each step it is guaranteed that the previous stages are fully initialized, e.g. when running a constructor we can be sure the superclass was fully initialized, when running a secondary constructor we know the current class was initialized, etc. The tricky part is step 3, because in this step we often need to use results of the very same stage. Kotlin helps with this by disallowing direct reads of properties declared “below” the current one. It also requires that all properties are initialized directly in the constructor; they can’t be initialized in a custom function. This ensures all props are properly set after running the constructor. But by calling a function in the constructor, we get out of the safe zone for reading properties: Kotlin doesn’t ensure we only read already initialized props this way.
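The ordering can be observed with a small trace. This is only a sketch: Parent, Child and the record() helper are hypothetical names, and the labels in the messages refer to the numbered steps in the list. Step 2 is not separately observable, so it shows up as id already being set in the first property initializer.

```kotlin
// Hypothetical top-level helper: appends a message and returns it.
fun MutableList<String>.record(message: String): String {
    add(message)
    return message
}

open class Parent(log: MutableList<String>) {
    init { log.record("1. super constructor") }
}

class Child(log: MutableList<String>, val id: Int) : Parent(log) {
    // By now the superclass is done and id (step 2) is already set.
    private val first = log.record("3. property initializer, id=$id")

    init { log.record("3. init block") }

    private val second = log.record("3. another property initializer")

    // Secondary constructor: its body runs after everything above.
    constructor(log: MutableList<String>) : this(log, 42) {
        log.record("4. secondary constructor body")
    }
}

fun main() {
    val log = mutableListOf<String>()
    Child(log) // one argument, so this goes through the secondary constructor
    for (line in log) println(line)
}
```

The trace comes out in the order 1, 3, 3, 3, 4, with the three step-3 entries in the order they appear in the class body.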

While I can’t say what the reason for this is, I guess it was a deliberate decision. The Kotlin authors could very easily have solved the problem by disallowing calls to any member code in the constructor. I believe some languages do this. But they chose to temporarily loosen null safety instead of banning functionality which people need. This is the lesser evil. They made more such compromises, e.g. they allow passing this to another object during initialization. This is also not null-safe, but good for developer experience.
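Passing this out during initialization can be sketched like this, assuming Kotlin on the JVM. Registry and Widget are hypothetical classes made up for the illustration; the registry receives the widget before its name has been assigned.

```kotlin
// Hypothetical classes for illustration only.
class Registry {
    val descriptions = mutableListOf<String>()

    fun register(widget: Widget) {
        descriptions += widget.describe()
    }
}

class Widget(registry: Registry) {
    // Hands out a reference to this object before name has been assigned.
    init { registry.register(this) }

    private val name: String = "widget"

    fun describe(): String = "name=$name"
}

fun main() {
    val registry = Registry()
    val widget = Widget(registry)
    println(registry.descriptions) // what the registry saw during construction
    println(widget.describe())     // what the finished object reports
}
```

The registry should end up holding "name=null", while the finished widget reports "name=widget".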

Also, please note this problem is much broader than just null safety. The problem is about partially initialized objects, and an NPE is only one example of the potential issues. If we e.g. create an empty list and add items to it in the constructor, but accidentally read its contents before that: boom, we have a problem again, even though it is not related to nulls.
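A minimal sketch of that list scenario, with a hypothetical Inventory class. Nothing here is null at any point, yet the first read still sees a state the class was never meant to expose.

```kotlin
// Hypothetical example class, not from the thread.
class Inventory {
    private val items = mutableListOf<String>()

    // Read too early: the list exists, but nothing has been added yet.
    val countSeenEarly: Int = items.size

    init {
        items += "apple"
        items += "pear"
    }

    // Read after the init block, so both items are counted.
    val countSeenLate: Int = items.size
}

fun main() {
    val inventory = Inventory()
    println("early=${inventory.countSeenEarly}, late=${inventory.countSeenLate}")
}
```

This compiles without complaint, and countSeenEarly is 0 while countSeenLate is 2.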
