Proposal for the Option/Maybe datatype

I’ve got a corner case where I wanted both null and Option<T> together. I was happy to borrow the code from the Internet, though really I might have been better off creating a custom sealed class hierarchy.
I used a map as a cache of results of attempts to find something: null meant “not looked yet”, None meant “looked and not found”, Some<T> meant “looked and found”. But I’m not taking advantage of map etc. here. I chose to stick with the usual Option data type rather than invent my own names because I thought it would be more familiar, but it is perhaps misleadingly familiar, so maybe that was the wrong choice.

Not that I think it needs to be part of the standard library, but I did come across a case where null and Maybe/Optional had different constraints. In particular, I was working on a MediatorLiveData<T> where T was unbounded and I needed to have a private property that held the last value. I therefore needed a way to represent that a value had not yet been set. Using null to signify this wouldn’t work since null is a valid T. Using an Optional<T> works, though. Even if T itself is Optional<R> (which would be strange but valid), I then have Optional<Optional<R>> and I can distinguish whether the T has not been set or the R has not been set (and my MediatorLiveData<T> doesn’t even need to worry about R). With T? I can’t distinguish whether the property has not been set from whether it’s been set to null.

In my case I just wrote a quick sealed class Maybe<out T> with an object Empty: Maybe<Nothing>() that I initialize my property with and a class Value<T>(val value: T): Maybe<T>() for everything else.
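For readers who want to see it spelled out, a minimal sketch of that hierarchy might look like the following. The `LastSeen` holder and its `record`/`describe` functions are hypothetical names used only for illustration; they stand in for the private "last value" property described above.

```kotlin
// Two-state wrapper: Empty means "never assigned", Value wraps any T (including null).
sealed class Maybe<out T> {
    object Empty : Maybe<Nothing>()
    class Value<T>(val value: T) : Maybe<T>()
}

// Hypothetical holder standing in for the private "last value" property.
class LastSeen<T> {
    private var last: Maybe<T> = Maybe.Empty

    fun record(value: T) {
        last = Maybe.Value(value)
    }

    // Exhaustive over the sealed hierarchy, so no else branch is needed.
    fun describe(): String = when (val current = last) {
        is Maybe.Empty -> "not set"
        is Maybe.Value -> "set to ${current.value}"
    }
}
```

With `LastSeen<String?>`, recording null moves the holder from "not set" to "set to null", which is exactly the distinction a plain `T?` property cannot make.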

I’m just giving my 2 cents because you asked the community (I haven’t read the other comments yet). As someone who comes from the object-oriented world, I think I use null exactly the way Optional/Maybe is used. I have some functions of my own that make it easier to work with null my way. So for me, no, I’m fine with just null, but if others need it I don’t want to be the reason why they don’t get it. But doesn’t Kotlin Arrow include them? I thought that’s where you find all the functional stuff.

This is the one rare case where Option/Maybe is a good idea. But even then there are workarounds that don’t require it. In your case you could use an implementation that is similar to Kotlin’s lazy implementation (although it’s a bit hacky).
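As far as I know, the standard library's lazy implementation keeps a private marker object in an `Any?` field to mean "not initialized yet". A rough sketch of the same trick applied to a "last value" holder could look like this; `Unset` and `LastValue` are hypothetical names, not anything from the stdlib.

```kotlin
// File-private sentinel marking "no value assigned yet".
// It can never collide with a real T, not even with null.
private object Unset

class LastValue<T> {
    // Holds either the sentinel or a genuine T, hence the Any? type.
    private var slot: Any? = Unset

    val isSet: Boolean get() = slot !== Unset

    fun set(value: T) {
        slot = value
    }

    // The cast is safe because slot only ever holds Unset or a T.
    @Suppress("UNCHECKED_CAST")
    fun getOrElse(fallback: T): T = if (slot === Unset) fallback else slot as T
}
```

The hacky part is the unchecked cast: the compiler no longer tracks the two states, so every read has to remember to compare against the sentinel first.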

That said, in my experience it is extremely rare that you have to differentiate between null and unassigned. In most cases I would prefer to solve this with a sealed class. That way null becomes the unassigned state and you create an object for the null case.
In any case this is so rare that I don’t think adding Maybe/Optional to the stdlib is a good idea. Most people will just use it when they should use null instead, and that will make Kotlin worse, especially if it gets misused in popular libraries.

I would even argue that using null for whatever purpose in this case can lead to misunderstanding. I would therefore prefer using a non-null sealed class with separate sub-types for null / unassigned state.

I would argue this case isn’t that rare. In Kotlin we are very strict about nullability. Null is not just a default/missing value as in Java: it has to be set explicitly and often it has a very specific meaning. Whenever we see a String? variable in the code, we know it is nullable for some reason and that its null value is just as important as the other values it could contain.

The problem arises if we have some kind of container for other values, especially a parameterized one. As above, if we use Map<String, String?> then we explicitly say we need to store null values in it. Null values are important to us, they mean something; otherwise we would use Map<String, String> instead. Unfortunately, it is not that easy to distinguish whether a key is missing from the map or maps to a null value.

I know Map isn’t a great example, because it partially depends on how maps were designed and implemented in Java. I’m not saying we should change the Map interface. Still, I think Optional/Maybe in the stdlib would be beneficial when designing some data structures and APIs in Kotlin.
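To make the Map ambiguity concrete, here is a small sketch. The `settings` map and the `describeKey` function are made up for illustration.

```kotlin
// Nullable values are allowed, so null is a legitimate stored value here.
val settings: Map<String, String?> = mapOf("proxy" to null)

// A plain get returns null both for a missing key and for a stored null,
// so an extra containsKey check (the `in` operator) is needed to tell them apart.
fun describeKey(key: String): String = when {
    key !in settings -> "absent"
    settings[key] == null -> "present, null"
    else -> "present, non-null"
}
```

Both `settings["proxy"]` and `settings["missing"]` evaluate to null; only the second lookup in `describeKey` separates the two cases, at the cost of touching the map twice.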

Examples of real use cases:

  • Parsers/serializers for formats like JSON, XML, protobuf, etc. When parsing we very often need to distinguish {"foo": null} from {}. The same applies when generating the data. In fact, the Apollo GraphQL client for Java/Kotlin implements Optional/Maybe as its Input class.
  • Reading user input, configuration options or some other kind of parameters when there are default values. We need to know whether the user explicitly set a parameter to null or whether we should use the default value.
  • Storing partial updates to data objects. For example, we fetched a database entity and we would like to track in-memory changes to it before pushing them to the db. Or the user was presented with a data form and we need to know which fields have been modified.
  • Calculating and storing differences between two Kotlin objects of the same type. This is similar to the above.

I implemented Optional/Maybe in one of my projects: OptionalValue.kt. I used an inline class with a special Absent object, so I believe it should be pretty efficient CPU/memory-wise, but I haven’t tested it thoroughly yet.

Both T? and Option<T> are actually a sum of two types: T and a single-value type (null or “Option.EMPTY”).

@broot I see that you actually describe cases where two such types are not enough: you need three. And you propose to solve this by a combination (a sum) of these two types (or by Option<Option<T>>).
I would call this a naïve implementation of a more complex type.

So I think that, instead of introducing another similar type for general use, we would do better to create or use a generic type that naturally has three kinds of value, e.g. NULL | EMPTY | T.
This could be implemented in Kotlin as a sealed generic class with three subclasses… (I see something similar here: How to get objects in invariant generic sealed classes?)
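A minimal sketch of such a three-valued type, under the assumption that T itself is kept non-nullable so the three states stay distinct. The names `Tri`, `Empty`, `Null`, `Value` and `render` are placeholders, not an existing API.

```kotlin
// Three states in one type: Empty (nothing there), Null (explicit null), Value (a real T).
// T is bounded by Any so that Value can never smuggle in a second kind of null.
sealed class Tri<out T : Any> {
    object Empty : Tri<Nothing>()
    object Null : Tri<Nothing>()
    data class Value<T : Any>(val value: T) : Tri<T>()
}

// Exhaustive handling of all three states.
fun <T : Any> Tri<T>.render(): String = when (this) {
    is Tri.Empty -> "empty"
    is Tri.Null -> "null"
    is Tri.Value -> "value: $value"
}
```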

More on types here: GitHub - hmemcpy/milewski-ctfp-pdf: Bartosz Milewski's 'Category Theory for Programmers' unofficial PDF and LaTeX source and in Kotlin: Union types - #27 by damianw

TLDR: I don’t agree :wink:

Yes, technically speaking it allows storing 3 distinct states: non-null value, null, absent/undefined. Still, I like to think about it conceptually as a generic wrapper with only two states (present/defined, absent/undefined) that contains any arbitrary Kotlin type. As Kotlin supports nullable types, we have 3 states in total. I believe this is more… precise or consistent (?).

Let’s assume we have some kind of entity object:

data class User(
    val username: String,
    val age: Int? // null means "no information" and this is different than 0.
)

Now, we need a way to describe changes made to this object, for example to have an audit log of changes. We need 2 states for username (changed to X, not changed) and 3 states for age (changed to X, changed to “no info”, not changed), so we could implement this as:

data class UserChanges(val username: String?, val age: Option<Int>)

But this is not really consistent, because we handle nullable and not nullable types differently. For each property we would need to think: “How many states do we need here?” and “not changed” would be sometimes represented as null and sometimes as Option.EMPTY. It would be much more straightforward and consistent to just wrap each property with Option<>.

Alternatively, we can do this as:

data class UserChanges(val username: Option<String>, val age: Option<Int>)

, but then we lose information about nullability and always see all three states, even if NULL state is never used (e.g. for username). We can also do:

data class UserChanges(val username: Option<String>, val age: Option<Int?>)

, but then I think we lose the whole point of our three-state Option type. We introduced it in the first place to avoid a “two-state in a two-state” situation. However, looking at the last example, we clearly see that we are actually back at the starting point. We have an Option type which tells us there could be 2 states (present, absent) and there is a type parameter T which could provide one or two states. If we design Option as a three-state type, we introduce several problems:

  • It is a little confusing, because we first said it is a three-state type and now we say: well, not always, it’s 2 or 3, depending on its type parameter T.
  • For a similar reason, it would be impossible to provide a clean API for reacting exhaustively to its contents. I mean something like a when statement for enums or Result.fold(). It is impossible, because Option<String> has 2 states and Option<Int?> has 3 states.
  • In most cases we would convert null to a special Option.NULL value only to convert it back to null when reading.
  • Arguably, in most cases we don’t really want to separate all three states, but only present and absent. In our original User class we had age stored as Int?, so I think it is a reasonable assumption that when reading a list of changes to some user (for example to apply them to another user), we really need only two states: age not changed, age changed to a new Int? value.

If we design Option with only two states, present and absent, everything becomes simpler and cleaner. Option doesn’t care about null values and doesn’t handle them in a special way. It just passes its T type through (which could be nullable), and that is it.

And yes, it could be implemented as a sealed class. I prefer an inline class, because it does the same thing but is more memory efficient. Maybe more CPU-efficient as well, but I’m not sure about this.
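A short sketch of how the two-state design plays out for the User example. The names `Empty`, `Some`, `orElse` and `applyChanges` are assumptions made for the sketch (the thread itself only mentions Option.EMPTY), and `User` is redeclared so the snippet stands alone.

```kotlin
// Two-state wrapper; nullability is left entirely to the type parameter T.
sealed class Option<out T> {
    object Empty : Option<Nothing>()
    data class Some<T>(val value: T) : Option<T>()
}

// Empty means "not changed", so fall back to the current value.
fun <T> Option<T>.orElse(fallback: T): T = when (this) {
    is Option.Empty -> fallback
    is Option.Some -> value
}

data class User(val username: String, val age: Int?)

// Each property is wrapped the same way; age stays nullable inside the wrapper.
data class UserChanges(
    val username: Option<String> = Option.Empty,
    val age: Option<Int?> = Option.Empty
)

fun User.applyChanges(changes: UserChanges): User = copy(
    username = changes.username.orElse(username),
    age = changes.age.orElse(age)
)
```

Here `UserChanges(age = Option.Some(null))` means "age changed to no information", while the default `UserChanges()` leaves the user untouched, and the applying code never has to treat null specially.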
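For comparison, a rough sketch of the inline (value) class approach on the JVM. This is not the actual OptionalValue.kt, only an illustration of the idea under the assumption that the wrapper stores `Any?` plus a private marker object; `AbsentMarker`, `of`, `absent` and `getOrElse` are hypothetical names.

```kotlin
// File-private marker for the absent state; callers cannot pass it in as a value.
private object AbsentMarker

// The wrapper compiles down to its single Any? field in most usages,
// so no extra wrapper object is allocated per value.
@JvmInline
value class OptionalValue<T> private constructor(private val raw: Any?) {
    val isPresent: Boolean get() = raw !== AbsentMarker

    // The cast is safe because raw is either AbsentMarker or a genuine T.
    @Suppress("UNCHECKED_CAST")
    fun getOrElse(fallback: T): T = if (isPresent) raw as T else fallback

    companion object {
        fun <T> of(value: T): OptionalValue<T> = OptionalValue<T>(value)
        fun <T> absent(): OptionalValue<T> = OptionalValue<T>(AbsentMarker)
    }
}
```

Because the underlying field is `Any?`, a primitive such as Int stored this way is still boxed; the saving is only the wrapper allocation.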

Sorry for a long comment, it is hard to explain it in a few sentences.

We both agree that usage of Option or some other types is a matter of personal preference and of a task at hand. And any desired type can be easily implemented.
But in any case this is not about lack of something in Kotlin core :slight_smile:

A major benefit is a substantial performance improvement for fundamental types.

val a: Int? = 1          // boxing occurs
val b: Optional<Int> = 1 // hypothetical: no boxing would occur

If value classes accepted a generic value, Optional could then be implemented with a value class (as in @broot’s example) while avoiding boxing (unlike in @broot’s example).

I’m not sure if I understand you correctly. At the JVM level everything is either a primitive or an object. If it is a primitive then it can’t store any information beyond the value itself. We can’t create a primitive integer with nullability or any other extra state.

Well… actually we can, by reserving some integer values for these extra states, so e.g. NULL is stored as Int.MIN_VALUE and ABSENT is Int.MIN_VALUE + 1. Using this technique and Kotlin’s value classes we can create a nullable Int that doesn’t need boxing. Even my OptionalValue example could be rewritten for integers and I believe it would avoid boxing. Unfortunately, we would need to implement a separate OptionalValue for each primitive type. Is this what you asked for?
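A sketch of the reserved-value technique for Int, using a single reserved value for the absent state (a second one could be reserved for null in the same way). `OptionalInt` and its members are hypothetical names chosen for the example.

```kotlin
// Value class over a primitive Int: in most usages it is represented as a plain int.
@JvmInline
value class OptionalInt private constructor(private val raw: Int) {
    val isPresent: Boolean get() = raw != ABSENT_RAW

    fun getOrElse(fallback: Int): Int = if (isPresent) raw else fallback

    companion object {
        // Reserved bit pattern; the price is that this one Int can no longer be stored.
        private const val ABSENT_RAW = Int.MIN_VALUE

        val ABSENT = OptionalInt(ABSENT_RAW)

        fun of(value: Int): OptionalInt {
            require(value != ABSENT_RAW) { "Int.MIN_VALUE is reserved for the absent state" }
            return OptionalInt(value)
        }
    }
}
```

The value range shrinks by one, and the wrapper is still boxed when it is used as a nullable type, as a generic type argument, or as Any.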

How can you avoid boxing in Optional<Int>? That will either need to store an Int? OR be something that needs to be manipulated via an interface so that you have a “present” and an “absent” implementation.

To avoid boxing with Optional<T>, it can be implemented by having the compiler translate Optional<Short> to a value class holding an Int, translate Optional<Int> to a value class holding a Long, and so on. One of the unused bits would be used as the “value present” flag.

Of course the compiler would then have to forbid Optional<Long> and Optional<Double>.
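Until a compiler does this automatically, the widening trick can be written by hand. The sketch below packs an Int into a Long and uses bit 32 as the "value present" flag; `PackedOptionalInt` and its members are hypothetical names.

```kotlin
// Hand-written version of the widening idea: the low 32 bits carry the Int,
// bit 32 is the "value present" flag, so the full Int range stays usable.
@JvmInline
value class PackedOptionalInt private constructor(private val bits: Long) {
    val isPresent: Boolean get() = (bits and PRESENT_FLAG) != 0L

    // toInt() keeps the low 32 bits, which restores negative values as well.
    fun getOrElse(fallback: Int): Int = if (isPresent) bits.toInt() else fallback

    companion object {
        private const val PRESENT_FLAG = 1L shl 32

        val ABSENT = PackedOptionalInt(0L)

        fun of(value: Int): PackedOptionalInt =
            PackedOptionalInt(PRESENT_FLAG or (value.toLong() and 0xFFFF_FFFFL))
    }
}
```

Unlike the reserved-value approach, every Int (including Int.MIN_VALUE) can be stored, at the cost of doubling the storage size.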

This is somewhat unorthogonal but as you correctly point out, there’s currently no other way on the JVM.