Rust-style variable shadowing

Preface

Having worked with Rust recently, I’ve come to like one very specific feature a lot: Variable shadowing.
In Rust it is possible to shadow variable names without warnings or errors, even in the same scope, and I propose adding this feature to Kotlin.
Here is an example of how it works in Rust:

let foo: String = "4".to_string();
let foo: Option<i32> = foo.parse().ok(); // parse foo into the equivalent of Int?
let foo: i32 = foo.unwrap_or(1234); // equivalent of foo ?: 1234

How it works

Variable shadowing means that it’s possible to create a new variable with the same name as another variable that has already been declared within the same scope.
This means that it is possible to instantiate a variable with type String, and then later have another variable with the same name in the same scope, but of type Int, or even just with a different value stored inside.

Difference to mutation

Mutation means actually changing the value an existing variable holds, so every place that still refers to that variable (a lambda that captured it, for example) sees the change as well. This is the idea of var.
Variable shadowing, however, is different in that it does not change the original variable, but creates a new, independent one.
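
To make the distinction concrete, here is a minimal Rust sketch. The function names are made up for illustration and are not part of the proposal; the first function mirrors the preface example, the second uses plain mutation, and the third shows that a shadowing binding leaves the original one untouched.

```rust
/// Shadowing: every `let` creates a new, independent binding,
/// and each binding may have a different type.
fn parse_with_shadowing(input: &str) -> i32 {
    let foo: &str = input.trim();
    let foo: Option<i32> = foo.parse().ok();
    let foo: i32 = foo.unwrap_or(1234);
    foo
}

/// Mutation: a single binding whose value changes over time.
/// Its type has to stay `i32` for its whole lifetime.
fn count_with_mutation(limit: i32) -> i32 {
    let mut total = 0;
    for n in 1..=limit {
        total += n;
    }
    total
}

/// A shadowing `let` inside a block does not modify the outer binding.
fn shadow_leaves_original_untouched() -> (i32, i32) {
    let value = 1;
    let inner = {
        let value = value + 41; // new binding, visible only inside this block
        value
    };
    // `value` still refers to the original binding here.
    (value, inner)
}
```

Calling `parse_with_shadowing("oops")` falls back to the default, just like `?: 1234` would in Kotlin.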

Why this is great

There are several reasons why variable shadowing is a great feature.

1. Increase in readability

It’s not uncommon to have long method call chains that transform a given piece of data, like in the Rust example above. In Kotlin, one would currently implement the same code as follows:

val foo = "4".toIntOrNull() ?: 1234

This is not a problem. But now imagine a much longer call chain, where it becomes necessary to put things into separate variables:

val myUserData = "some; stuff: 12".trim()
val parsedUserData = myParser.parse(myUserData)
val sanitizedUserData = sanitizeUserData(
  option1 = "foo", option2 = "bar", option3 = "baz",
  parsedUserData
)
val updatedSanitizedUserData = sanitizedUserData.copy(name = someNewName)
val stringifiedUpdatedSanitizedUserData = gson.toJson(updatedSanitizedUserData)
println(stringifiedUpdatedSanitizedUserData)

To be fair, this is a contrived example and not really good code (although I’ve actually seen variable names similar to stringifiedUpdatedSanitizedUserData…), but it shows the improvement this feature could bring:

val userData = "some; stuff: 12"
val userData = myParser.parse(userData)
val userData = sanitizeUserData(option1 = "foo", option2 = "bar", option3 = "baz", userData)
val userData = userData.copy(name = someNewName)
val userData = gson.toJson(userData)
println(userData)

The reason this is an improvement is that userData is a good name (please imagine it actually being good, haha), and that after parsing and sanitization, the data is still very well described by the term “userData”.

The most important difference imo is when using .copy on immutable data classes.
If you store data in immutable classes and then update them, you will very rarely want to work with the old data afterwards. I’ve made the mistake of accidentally giving someData to functions after creating someDataUpdated = someData.copy(...) multiple times.

TL;DR: It saves you from having to either put everything into one big, unwieldy call chain or think of names for the intermediate variables, and it can also save you from accidentally working with outdated data when updating immutable classes and storing them in updatedData-style variables.
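
The stale-data point can be sketched in Rust, where struct update syntax plays roughly the role of Kotlin’s copy. The UserData struct and both functions below are hypothetical stand-ins, not code from any real project.

```rust
#[derive(Debug, Clone, PartialEq)]
struct UserData {
    name: String,
    age: u32,
}

/// Hypothetical consumer of the data, standing in for "save it" or "send it".
fn describe(user: &UserData) -> String {
    format!("{} ({})", user.name, user.age)
}

fn rename_and_describe(user_data: UserData, new_name: &str) -> String {
    // Struct update syntax: take every field from the old value except `name`.
    let user_data = UserData {
        name: new_name.to_string(),
        ..user_data
    };
    // From this line on, `user_data` can only mean the updated value,
    // so the outdated one cannot be passed along by accident.
    describe(&user_data)
}
```

With a separate `updatedUserData`-style name, both the old and the new value would stay reachable, which is exactly where the mix-up described above can happen.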

2. Making mutability even less necessary

Many times you see (mostly) beginners make variables mutable because they want to keep working on them over the course of several function calls. An example:

var input = readLine() ?: return null
logger.info("user entered: $input")
input = input.trim().capitalize()
doSomething(input)

This, again, is obviously a contrived example, but I’m sure you can remember a point in time where you or someone you worked with made a variable mutable just to be able to change it in this way (non-dynamically).
This could be prevented if you could do

val input = readLine() ?: return null
logger.info("user entered $input")
val input = input.trim().capitalize()
doSomething(input)

While this does not make a big difference here, keeping things immutable is a great goal which in more complex scenarios CAN be a big factor.

Changing var to val

There are still some valid use-cases for mutable variables and for-loops that mutate them.
In these cases, most of the time the variable will only be mutated within the loop, and will be treated as immutable afterwards. Enforcing this would be great.
This could be done like this:

var someAggregatedNumber = 0
for(.....) { someAggregatedNumber = someFunction(someAggregatedNumber) }
val someAggregatedNumber = someAggregatedNumber

Having mutability be contained as tightly as possible is always a good idea!
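
For comparison, this is roughly what the same pattern looks like in Rust today. It is a small sketch with a made-up function name: the accumulator is mutable only while the loop runs and is then rebound as immutable under the same name.

```rust
fn sum_of_squares(values: &[i32]) -> i32 {
    // Mutable only for the duration of the aggregation.
    let mut total = 0;
    for v in values {
        total += v * v;
    }

    // Shadow the mutable binding with an immutable one of the same name.
    let total = total;
    // total += 1; // would no longer compile: `total` is immutable from here on

    total
}
```

The commented-out line is the whole point: after the rebinding, the compiler rejects any further mutation.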

Reasons this might not be a good idea

There are obviously some arguments against this proposal, some of which I hope to address here.
Let me start by saying that there are languages that have this feature, and I’ve not yet heard anyone complain about it. These are mostly languages that encourage immutable data, even in local scopes, similar to Kotlin.

Variable-shadowing could make you use the wrong value accidentally

This is of course a problem. Indeed, you would be more likely to use a shadowed variable accidentally and get unexpected results than without this feature. This is the biggest argument against it.
BUT:

  • This problem already exists when using nested scopes, although most IDEs give warnings about name shadowing, which would of course not be the case if this was implemented
  • Making shadowing a language-feature would most likely cause you to keep this in mind, especially when actively working with shadowed declarations. As long as you’re aware of the feature, you might be even less likely to make this mistake than you would have been with it not being on your mind at all.
  • Most of the time I’d see this feature be useful when also changing types, like I did in the parsing example. In this case there would be compiler errors if you tried to use your parsed data somewhere where the string version was expected, making the error obvious
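
The last bullet can be illustrated with a short Rust sketch. The helper names and the 8080 fallback are assumptions chosen for the example: once the name has been rebound to a different type, using it where the old type is expected is a compile error rather than a silent bug.

```rust
/// Hypothetical function that only accepts the textual form.
fn needs_text(s: &str) -> usize {
    s.len()
}

fn parse_port(raw: &str) -> (usize, u16) {
    let port = raw.trim();
    let length = needs_text(port); // fine: `port` is still text here

    // Shadow the text with its parsed form, falling back to an assumed default.
    let port: u16 = port.parse().unwrap_or(8080);
    // needs_text(port); // rejected by the compiler: expected `&str`, found `u16`

    (length, port)
}
```

This only covers shadowing that changes the type; a shadow with the same type gets no such help from the compiler.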

What are the performance implications?

Well… I don’t know. Maybe encouraging the use of intermediate variables over method chains could have a negative performance impact, but most likely this would be optimized away.
On the other hand, this would encourage the use of normal function calls over method chains containing .let calls for static functions, which could potentially help performance (although I guess .let calls are most likely optimized to normal function calls under the hood anyway).

Just use Mutability and Call-chains

I already addressed mutable variables as a non-alternative, as they do not give the same guarantees and don’t allow for type changes.
Call chains are an alternative most of the time, but can get unreadable when many static calls are needed or call chains get too deep into lambda territory. Having intermediate variables is a great alternative that currently faces the naming problem.

+1

I do not see how it improves readability, especially when the types are different. Also, reassigning the value seems to be bad code style. In Kotlin most of that could be achieved by using scope functions like also.

Interesting idea. It’s cool to see Rust has the feature; it’s always better to see an example of how a feature affects a language instead of being the first.

I’m not convinced the gains are substantial enough to switch a warning into a feature and allow it to be used more broadly.

Currently, the general use case of shadowing is a warning and discouraged in most cases, as it is easy to create confusion. There are low-cost alternatives to name shadowing (not always as readable but still low cost).

This proposal would have to be beneficial enough to not just overcome the standard minus 100 points but also the additional negative stance already held against name shadowing.

I think there are plenty of arguments to be had in favor of name shadowing in specific use cases, but the general case is what holds me back. Allowing variable switching is pretty scary knowing that I’ll have to deal with other coders’ code–especially if I don’t have the help of a linter stopping them from swapping variables without a warning or helping me detect them.

Agreed, and Kotlin already has a feature to allow this for the specific use case. Just add @Suppress("NAME_SHADOWING") to the file, function or statement.

I was going to counter that by pointing out an important difference: you can still access the value from an outer scope, using an @ qualifier.

But, having tried it, I find that you can’t! (You can qualify this, along with return, break, and continue; but I think that’s all. Maybe that should be an addition to the language?!)

You live and learn… :slight_smile:

Anyway, I think I’m with @arocnies on this: I can see benefits in some relatively common cases, but I suspect it’s likely to cause confusion too.

In particular, people will expect to be able to cut-and-paste code within a function, with it either failing to compile (if moved above a relevant definition) or compiling and running as expected. This would break that assumption, and could silently cause bugs.

It also fails to help in some rather similar common cases, such as:

var someValue = getSomeValue()
if (someValue.isBlank())
    someValue = getDefault()

// …someValue never changes again…

If you’re used to name shadowing, you might expect to be able to use it here, too — but you can’t. (Unless you make the if an expression and add an else clause, which is arguably better logic but longer-winded and more awkward.)
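
For reference, this is how the expression form mentioned in the parenthesis looks in Rust, where `if` is always an expression. It is a minimal sketch with made-up function names; the name stays the same and no mutable variable is needed.

```rust
/// Hypothetical fallback provider.
fn get_default() -> String {
    "default".to_string()
}

fn resolve(some_value: String) -> String {
    // Rebind the same name to either the fallback or the original value.
    let some_value = if some_value.trim().is_empty() {
        get_default()
    } else {
        some_value
    };
    some_value
}
```

The `else` branch is mandatory here, which is the extra verbosity the post refers to.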

So I’d want to see more unambiguous benefits from this before I could recommend it.

Ultimately, I want Kotlin to stay easy to learn and to reason about. There’s too much unnecessary complexity in the world, and I love Kotlin for avoiding most of it!

@lkowarschick I forgot to mention that I really like how you organized the idea.
Here’s some related discussion I found after googling around for more info after finding the standard Rust doc on it pretty short: https://www.reddit.com/r/rust/comments/2cho4g/why_does_rust_need_local_variable_shadowing/

I don’t see how this is different from making the variable mutable…
If you were talking about fields it would be a different story, but functions are most of the time procedural.

In my eyes the difference between var and val in a procedural function is that with a var you have to check whether the name still refers to the value it was initialized with, while with a val you don’t.

What you’re asking, if I’m right, is to make it possible to let the same name refer to a different value. This means that now you need to perform the same checks for var and val.

Can you explain why I’m wrong?

The readability gain is not evident to me.
This example already works and it is as messy as yours, IMHO.

    val userData = "some; stuff: 12"
            .let { myParser.parse(it) }
            .let { sanitizeUserData(option1 = "foo", option2 = "bar", option3 = "baz", it) }
            .copy(name = someNewName)
            .let { gson.toJson(it) }
    println(userData)

Mutability isn’t “necessary”.

    val input = run {
        val line = readLine() ?: return null
        logger.info("user entered: $line")
    }
            .trim().capitalize()
    doSomething(input)

"Changing var to val": have you considered the fold function?

This can lead to confusion, as it may be unclear which variable subsequent uses of the shadowed variable name refer to, which depends on the name resolution rules of the language.
Wikipedia

Example:

fun main() {
    val check = true

    // many many lines later

    println(check) // What should be the type and the value of check?
}

I’d do this differently:

val someValue = getSomeValue().takeUnless { it.isBlank() } ?: getDefault()

But that’s beside the point.
Variable shadowing would help in your example, as you could put

val someValue = someValue

after your if-statement, making it immutable from that point.

First of all: yes, method chaining with .let calls is an alternative, with some relevant drawbacks:

  • It’s arguably more confusing for beginners, which is an important factor for many.
  • It falls flat as soon as you have to use some nested lambdas within the let calls, because then you’ll have to name your variable anyway, or work with a shadowed it, which imo is a lot worse than shadowing the outer variable name

Using run is an alternative as well, with similar drawbacks. Introducing nested scopes and especially receiver scopes makes code a lot more complex, because it makes the variable you’re working with even less obvious than if you reassigned it to the same name (especially talking about run, this can get confusing very fast, and is not really an improvement, as you now don’t have your variable named at all).
Also, your example for using run is incorrect: you’d need to return the actual line value at the end of the block, adding another line to it.

Yes, folding is an option. But again, it is a lot less readable and expressive in some cases. As I said, most loops can and should be replaced with map, filter, and fold. But there are still cases where a loop is a lot more readable… which is where my point is. In these cases, you don’t want to choose between guarantees of having something immutable and readability of using the loop, so with variable-shadowing, you could have both.

Some more prior Knowledge from the Rust-community

I’ve asked some experienced Rust developers in the Rust Discord for their opinions and experiences on variable shadowing, which I’ll now share.

Some context for Rust

Before sharing the feedback the Rust community gave me though, let’s give some context on this community.
Mozilla started developing Rust for Firefox, as a way to get rid of all memory-related vulnerabilities and bugs they had. With this goal in mind, Rust’s primary focus has always been safety, correctness and an “avoid all potential bugs, even trading in some developer productivity for it” philosophy.

Memory-related problems can reasonably be compared to accidental use of the wrong variable, the main danger that variable-shadowing brings with it. Rust, whilst having a compiler that actively punishes you for doing anything that could potentially be not 100% what you want, still allows and encourages variable shadowing.

A language that maximises for usability, readability and arguably for general developer comfort should thus be at least as able to introduce this feature without facing any noticeable problems in software correctness.

Now let’s look at the thoughts I got from a few Rust-developers:

1

“let x = x.unwrap() makes perfect sense - x is the same ‘value’, even if before it was Option and now it is String.
when there’s shadowing, I can treat a value as a semantic value, rather than a lexical value.”

let user_input = user_input.parse::<i32>().unwrap() is something I like being just able to do. it is, both before and after, the user input.
without variable shadowing, I’d have to name one of them *_str or the other *_int , for no good reason.
it’s the compiler’s job to figure out my local types, I shouldn’t need Systems Hungarian just because

This is my main point. Variable names have semantic meaning, which is a lot more important for readability than their lexical meaning.
When we read code, we rarely actually read it line-by-line, trying to follow every statement exactly. What we actually do is skim over the code, notice any important parts and follow the main data-flow.
This data flow becomes easier to follow when the variable names do not change.

2

and there’s no ‘contract’ to keep about the code. Most pushback against shadowing is some vague ‘but what if I dragged code around and it accidentally compiled and I wasn’t paying attention’, and if mut/let/types/whatever else still compiled and you were using the same variable name when you didn’t mean it and you’re dragging code around without paying attention to the target site? then it’s declaratively your own fault.

Another good point. There are edge-cases where variable-shadowing can make your code accidentally do stuff you didn’t want, but these are a lot more rare than you might think.
How often do you really copy-paste code from one context into another context that is using the same variable-names for different things, but with the same type? I cannot remember doing this ever. While there might be cases where this hurts you, as long as it is a feature of the language you would be aware of it.
To pretty much solve this, you could also introduce some special highlighting in IDEs for places where variables are shadowed. Not as a warning, but as a reminder.

3

I don’t really have detailed thoughts, when I do use it I like that it’s there. It’s more useful in Rust than other languages as you can use it to easily change the mutability of a variable.

This argument still applies to Kotlin, even if a little less. In Rust there is a difference between mutable references and immutable references, where immutable references are (ignoring some special cases) deeply immutable. Changing mutability thus has a bigger effect there than it would in Kotlin, but the compile-time guarantees that immutability offers still stand, especially when talking about performance-critical code, where mutability is often required in some places.

4

“One of the ways it’s certainly helped me is that if I keep the same name for something throughout the entire function, I can easily trace back how I came by that value, Whereas if I had to name it foo here and bar there and baz somewhere else, I’ve completely forgotten that I was investigating quux by the time I’ve reached the string-parsing point.
so it certainly makes code more readable”

This relates to the point of semantic meaning vs lexical meaning. While it might be a tiny bit harder to follow the exact, detailed data-flow of a function in some edge-cases, having variables named consistently increases “skimability”, which is a very important factor for readability.

5

“I’ve rarely been bitten by using shadowing, and all those times it was some compile-time error that was moderately easy to figure out.
What has happened to me more often, in languages that don’t allow shadowing, is that I accidentally refer to the old variable instead of the new one.”

This is important experience. I think to actually evaluate if the problems caused by variable shadowing are as bad as some might think they are, we should try to get some more experience from the Rust community, as they have had this feature for a long time and will know about its consequences and implications a lot better than we do.

Prohibiting the use of outdated variables is the main counter-argument to “you might accidentally use variables you didn’t intend to”. While shadowing variables can cause accidental use of “new” values, it does in turn prohibit accidental use of outdated values. This second problem, while maybe a bit more rare, is most likely a lot more fatal. Using outdated values, for example in a database update, can mean that you lose important changes. Losing something is a lot harder to notice than things generally misbehaving, so catching a less likely, but more fatal bug might be more important than catching a still rare, easier to find and less likely to be fatal bug.

To be honest, I don’t think there is a perfect way to resolve this issue. Some people like this warning, others don’t. There are good arguments for both sides.
I think the best way to solve this would be if Kotlin provided some sort of feature that would allow turning off specific warnings. This way every team can decide for themselves if they want to have this warning or not.
Looking through YouTrack I found an issue that asks for exactly this: https://youtrack.jetbrains.com/issue/KT-8087

It is currently not allowed at all.
I think turning this into a warning would be a great way to start introducing it. It won’t hurt anyone, as most people generally follow compiler warnings unless they actively intend to do what they are warned about, in which case the warnings can be suppressed, maybe even on a project level like you mentioned @Wasabi375 .
We could then play around with it, and see if it does actually bring any noticeable increase in ambiguity, in which case it will stay a warning forever. If we conclude that it does not in fact hurt, the warning can disappear.

To @arocnies’ point

I disagree that this feature should have that big a barrier to overcome.
In essence, the point of the “minus 100 points” is to avoid including unnecessary features. This is done to keep language-complexity as low as possible, which is an important goal.
BUT: In this case, we aren’t actually adding a feature. We’re removing a limitation. Language-complexity is a result of syntactical constructs, stdlib features and structural concepts regarding things like the type-system. Having name-shadowing be allowed does not add complexity to the language on any of these levels.

View it like this: a beginner can look at the list of scope functions (let, run, also, apply, ...) and think “oh god this is complex”. A beginner will see delegated implementation and things like by lazy and will have to look deeply into these to actually understand them.
But when a beginner sees variable shadowing being allowed, the only reaction will be “oh, wow, cool, I guess”, or maybe “Oh, this could be dangerous, I hope there’s a warning about this”. But it will not make the language itself any harder to learn, understand or comprehend.

I agree that it will need to overcome the general negative connotation that variable shadowing brings with it, but it’s not a language feature that increases complexity in any way.

My point and your point are not enough.

Therefore, the “minus 100 points” rule has already been applied to the current behavior, so its value is at least 100 points.

However, a request has to pass 100 points to be considered better than the current behavior.

I really like the idea of shadowing a var to a val. There’s been many times I have to change names mid function to prevent mistakes but then I end up using the original name by mistake if I forget I copied it to a val. It would also be nice if this worked with function parameters so you can change the parameter and continue using the same name to prevent confusion.

Normalizing parameters is a common requirement where this kind of shadowing is a big help.

fun doHttpThing( method: String ...) {
   val method = Method.valueOf(method.uppercase())
   //It's better to have one method that's valid instead of one that's valid and one that isn't
   //It's better if the one valid one is immutable
   ...
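
A runnable Rust version of the same idea might look like the sketch below. The Method enum, parse_method and the error message are invented for illustration; the point is only that the raw parameter is shadowed by its validated, immutable form.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Method {
    Get,
    Post,
}

/// Hypothetical normalizer: case-insensitive lookup of a known method.
fn parse_method(method: &str) -> Option<Method> {
    match method.to_uppercase().as_str() {
        "GET" => Some(Method::Get),
        "POST" => Some(Method::Post),
        _ => None,
    }
}

fn do_http_thing(method: &str) -> Result<String, String> {
    // Inside the closure, `method` is still the raw `&str` parameter;
    // the new binding only takes effect after this statement.
    let method = parse_method(method)
        .ok_or_else(|| format!("unsupported method: {}", method))?;

    // Only the validated `Method` is reachable under this name from here on.
    Ok(format!("sending {:?} request", method))
}
```

After the shadowing line there is exactly one `method` in scope, and it is the valid one.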

It would also be great to be able to undeclare variables that shouldn’t be used anymore

I think unsetting variables will become important when you have functions that should be broken up into smaller functions…
I can be wrong though…

This feature makes sense in Rust and has its places, but it does not in JVM languages and the preface example doesn’t make sense in Rust. Let me explain.

Regarding readability, the call chain of parsing and validating data, I and probably a lot of people would agree that changing the original variable through data type validation and parsing belongs in a different variable name.

Regarding mutability, I’ve definitely written more than a few functions that shadow an always-val parameter with a var to change it, but looking back I’ve never thought that was good form to replace the parameter with a variable.

Regarding the fact that in Rust it’s actually a good feature vs. not in Kotlin… it’s because Rust as a language deals with lower level types and it strives to make the programmer deal with those low-level types with safer methods and implications. No matter what programming language or environment you’re dealing with, what we’re inevitably talking about here is parsing, converting, and validating data in some form or fashion. Your proposal/discussion is implying that the data that a programmer deals with should have a clever way of bypassing the process of conversion and validation via type safety, and I think that’s just not a good way of doing things in a JVM language, or any high-level language that doesn’t have raw machine pointers or other low-level conversions. To be fair, I don’t think your example of Rust name shadowing in the preface is a good example of why it’s allowed in Rust - in fact, I’d argue that Rust language experts would call that code snippet a bad practice. The first foo is an input that needs to be converted, parsed, and validated to an output, and a programmer should clearly define the difference in source code.
