Language Design: Check and just return if false

I wonder if Kotlin could have a native check or ensure function that simply immediately returns - instead of throwing exceptions - when it fails.

Use case:

I am parsing the same ill-formatted data with hundreds of different parsing strategies using a visitor-like approach. My code has thousands of visitor objects, one for each parsing strategy, and each visit function has several lines of argument checks before executing the actual parser. The first visitor to successfully parse the data wins. The key here is that the conditions are false 99% of the time, so I don’t want to use the regular check function, which would throw exceptions almost all the time.

Because there are so many conditions, the if becomes quite large and the readability quite poor.

fun visit(): MyObj? {
    return if (
        condition1 &&
        condition2 &&
        condition3 &&
        ....
    ) {
        MyObj(...)
    } else {
        null
    }
}

It would be nice to place those conditionals outside the if, line after line, and have the compiler turn them all into a massive IF before running the rest of the code.

fun visit(): MyObj? {
    ensure(condition1)
    ensure(condition2)
    ensure(condition3)
    ....
    return MyObj()
}

I tried using lots of if (!condition) return null but inverting complicated conditions really messes the readability up… even when I break everything down into their own little functions.

Working on a negative is not a natural way of reading things.

Needless to say, performance is important. I don’t want to add any indirections that might slow down checking the conditionals.

Arrow solves this pretty well:

fun visit(): MyObj? = nullable {
    ensure(condition1)
    ensure(condition2)
    ensure(condition3)
    ....
    return MyObj()
}

ensure is really what it’s called! Also, it doesn’t need you to specify a value if you use a builder that has a singleton value (like null), but otherwise you could also do:

fun visit(): MyObj = merge {
    ensure(condition1) { myObj1 }
    ensure(condition2) { myObj2 }
    ensure(condition3) { myObj3 }
    ....
    return MyObj()
}

Under the hood nullable runs those conditions one by one and throws to stop the processing as soon as it happens, capturing the exception and returning null. Am I correct?

It would be similar if I did this:

fun visit(): MyObj? = runCatching {
    check(condition1)
    check(condition2)
    check(condition3)
    ....
    return MyObj()
}.getOrNull()

It’s kinda what I want to avoid. The overhead does present a significant slowdown in the software because it would be throwing all the time.

It’s similar-ish, but Arrow’s is safer because it marks the exceptions specially so that it doesn’t catch extra ones. Also, the biggest cost of exceptions is stack trace creation, which Arrow’s raise has turned off, so the slowdown is not that significant.

Still… it’s hard to justify creating +1000s of (nullable chain + exception) instances just to throw them all away in the next line of code. The garbage collection on Android runs crazy :frowning:

Let’s go back to the initial issue then. What’s the issue with inverting?

if (!(
  condition1 &&
  condition2 &&
  condition3 &&
  ....
)) return null

The main issue is readability.

This

if (!condition1) return null
if (!condition2) return null
if (!condition3) return null

is more readable than this

if (!(
  condition1 &&
  condition2 &&
  condition3 &&
  ....
)) return null

But a positive check would win, IMO:

ensure(condition1) 
ensure(condition2)
ensure(condition3) 

How about:

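import kotlin.contracts.ExperimentalContracts
import kotlin.contracts.contract

@OptIn(ExperimentalContracts::class)
inline fun ensure(condition: Boolean, exit: () -> Nothing) {
  contract {
    returns() implies condition
  }
  // exit returns Nothing, so it must leave the caller (e.g. via return) or throw
  if (!condition) exit()
}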
//usage
ensure(condition1) { return null }
ensure(condition2) { return null }
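
A side benefit of the contract, as a minimal self-contained sketch: because `returns() implies condition` tells the compiler the condition held whenever `ensure` returns normally, later lines get smart casts. `firstWordLength` is a made-up function for illustration only, not something from the original code.

```kotlin
import kotlin.contracts.ExperimentalContracts
import kotlin.contracts.contract

@OptIn(ExperimentalContracts::class)
inline fun ensure(condition: Boolean, exit: () -> Nothing) {
    contract { returns() implies condition }
    if (!condition) exit()
}

// Hypothetical example function, only here to show the smart cast.
fun firstWordLength(text: String?): Int? {
    ensure(text != null) { return null }
    // Thanks to the contract, text is smart-cast to String from here on.
    ensure(text.isNotBlank()) { return null }
    return text.trim().substringBefore(' ').length
}
```

Since both `ensure` and the lambda are inlined, this should compile down to plain ifs and returns, though that is worth confirming in the bytecode.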

Just 2 cents from a person who pretends they know the answer (but they don’t :wink: ): this behavior can be achieved through a hack, by utilizing suspend functions. Suspend functions essentially provide exception-less, propagating exit behavior. And while I have never used Arrow, I believe it utilizes this technique instead of exceptions (at least in some cases). So my suggestion is to verify whether the solution provided by @kyay10 really uses exceptions. It seems like exceptions are the only possible option (and that is true in classic programming), but maybe they actually are not.

So Arrow did indeed use suspend for this, but they stopped a long while back. You can use suspend for this, because suspend can implement exceptions without having exceptions in the language, but you can imagine that the cost of that is pretty comparable to exceptions (without a stack trace). suspend functions will create some continuation objects at least, so it’s similar to the cost of creating an exception object.
There is a hack here one can do where you have a singleton exception (with stack trace disabled) that you throw and catch, and the cost of that will be very, very minimal since it’ll be immediately caught so it won’t destroy any stack frames really.
Ultimately, I think benchmarking is the way here though.
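
To make the singleton-exception hack concrete, here is a rough JVM-only sketch. The names `EnsureFailed`, `ensureOrThrow`, `orNull` and `parseEven` are all invented for illustration; none of them come from Kotlin's standard library or from Arrow, and this is not a claim about how Arrow does it internally.

```kotlin
// Hypothetical singleton exception: allocated once, never captures a stack trace.
object EnsureFailed : RuntimeException() {
    // Overriding fillInStackTrace skips the expensive stack walk (JVM assumption).
    override fun fillInStackTrace(): Throwable = this
}

// Hypothetical helper: throws the shared instance instead of allocating a new exception.
fun ensureOrThrow(condition: Boolean) {
    if (!condition) throw EnsureFailed
}

// Hypothetical builder: catches only the marker exception and maps it to null.
inline fun <T> orNull(block: () -> T): T? =
    try {
        block()
    } catch (e: EnsureFailed) {
        null
    }

// Made-up usage example.
fun parseEven(n: Int): String? = orNull {
    ensureOrThrow(n >= 0)
    ensureOrThrow(n % 2 == 0)
    "even:$n"
}
```

No exception object is allocated per failure, but the throw and catch are still there, so whether this beats plain ifs is something only a benchmark on the target device can say.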

ensure(condition1) { return null }

Def closer. It’s just going to be a bunch of { return null } everywhere :slight_smile:

Real example:

fun parse(tag: Array<String>): Base64AndHexEncodedAuthor? {
    ensure(tag.hasIdx(2)) { return null }
    ensure(tag[0] == TAG_NAME) { return null }
    ensure(tag[1].length == 64) { return null }
    ensure(tag[2].length == 32) { return null }
    ensure(tag[1].isBase64()) { return null }
    ensure(tag[2].isHex()) { return null }
    return Base64AndHexEncodedAuthor(tag[1], tag[2])
}

It could be like this with a new operator:

fun parse(tag: Array<String>): Base64AndHexEncodedAuthor? {
    ensure(tag.hasIdx(2)) 
    ensure(tag[0] == TAG_NAME) 
    ensure(tag[1].length == 64)
    ensure(tag[2].length == 32)
    ensure(tag[1].isBase64())
    ensure(tag[2].isHex())
    return Base64AndHexEncodedAuthor(tag[1], tag[2])
}

You also have the when syntax:

fun parse(tag: Array<String>): Base64AndHexEncodedAuthor? {
	return when {
		tag.hasIdx(2) -> null
		tag[0] == TAG_NAME -> null
		tag[1].length == 64 -> null
		tag[2].length == 32 -> null
		tag[1].isBase64() -> null
		tag[2].isHex() -> null
		else -> Base64AndHexEncodedAuthor(tag[1], tag[2])
	}
}

fun parse(tag: Array<String>): Base64AndHexEncodedAuthor? {
	return when {
		tag.hasIdx(2)
		|| tag[0] == TAG_NAME
		|| tag[1].length == 64
		|| tag[2].length == 32
		|| tag[1].isBase64()
		|| tag[2].isHex()  -> null
		else -> Base64AndHexEncodedAuthor(tag[1], tag[2])
	}
}

fun parse(tag: Array<String>): Base64AndHexEncodedAuthor? {
	when {
		tag.hasIdx(2)
		|| tag[0] == TAG_NAME
		|| tag[1].length == 64
		|| tag[2].length == 32
		|| tag[1].isBase64()
		|| tag[2].isHex()  -> return null
		else -> {
			// [...]
			return Base64AndHexEncodedAuthor(tag[1], tag[2])
		}
	}
}

That’s a great suggestion actually because one can do:

fun parse(tag: Array<String>): Base64AndHexEncodedAuthor? {
    return when (false) {
      tag.hasIdx(2),
      tag[0] == TAG_NAME,
      tag[1].length == 64,
      tag[2].length == 32,
      tag[1].isBase64(),
      tag[2].isHex() -> null
      else -> Base64AndHexEncodedAuthor(tag[1], tag[2])
    }
}

It’s a bit unorthodox for when, but hey, it works pretty well here, and does exactly what you want.

The issue with when is that you have to invert the conditions and negate every one of them. Working on a negative is not a natural way of reading things.

Point here is that the last two answers forgot to invert them from my example :slight_smile:

But also, when is just not a good solution for readability. I wouldn’t want to be handed code like that and have to understand it.

Could you just create a DSL for this?

class ConditionEvaluator<T> {
    private val conditions = mutableListOf<Boolean>()
    private lateinit var resultProvider: () -> T

    fun ensure(condition: Boolean) {
        conditions.add(condition)
    }

    fun result(resultProvider: () -> T) {
        this.resultProvider = resultProvider
    }

    fun resolve(): T? = if (conditions.all { it }) resultProvider() else null
}

fun <T> ensureConditions(conditions: ConditionEvaluator<T>.() -> Unit): T? = ConditionEvaluator<T>().apply(conditions).resolve()

fun parse(tag: Array<String>): Base64AndHexEncodedAuthor? = ensureConditions {
    ensure(tag.hasIdx(2)) 
    ensure(tag[0] == TAG_NAME) 
    ensure(tag[1].length == 64)
    ensure(tag[2].length == 32)
    ensure(tag[1].isBase64())
    ensure(tag[2].isHex())

    result { Base64AndHexEncodedAuthor(tag[1], tag[2]) }
}

I think OP’s main idea was that it compiles to almost the same bytecode as regular ifs and returns. In your code ensureConditions introduces 3+ heap allocations, constructors, etc.

Also, it doesn’t short-circuit - it performs all checks even if the first fails. This can be fixed by passing lambdas to ensure, but this means even more object initialization.

The last example by @kyay10 was correct as it compared results to false (when (false)). However, this is so bad for reading that I personally would keep as far from this solution as possible :wink:

Hmm, I thought I saw in one of the examples above that it was evaluating all if conditions and not short-circuiting, but on re-reading, that doesn’t seem to be the case. I guess I misread the code.

I think the extra initialization stuff could be solved with some inlining… I’m also wondering about using a Flow or Sequence where you provide all the conditions and it evaluates them lazily and then only returns a result at the end. Of course, that’s more allocations and stuff.
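
For what it's worth, the lazy-evaluation idea could look roughly like this. `ensureLazily` and `parsePair` are made-up names for the sketch, and it does allocate one lambda per condition plus the vararg array and the sequence, so it is an illustration of the idea rather than a performance fix.

```kotlin
// Hypothetical helper: each condition is a lambda, so it only runs when reached.
fun <T> ensureLazily(vararg conditions: () -> Boolean, result: () -> T): T? =
    // all {} stops at the first condition that returns false.
    if (conditions.asSequence().all { it() }) result() else null

// Made-up usage example.
fun parsePair(tag: Array<String>): Pair<String, String>? = ensureLazily(
    { tag.size > 2 },
    { tag[1].length == 4 }, // only evaluated if the size check passed
    result = { tag[1] to tag[2] }
)
```

The sequence is not strictly needed here, since `all` on the array itself stops early as well; it is only there to mirror the idea above.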

I guess there is no way to solve this - without adding extra allocations, memory usage, degraded performance - in the language as-is. That said… does it even need to be solved? I actually don’t think there’s much difference between the two, visually.

fun parse(tag: Array<String>): Base64AndHexEncodedAuthor? {
    ensure(tag.hasIdx(2))
    ensure(tag[0] == TAG_NAME)
    ensure(tag[1].length == 64)
    ensure(tag[2].length == 32)
    ensure(tag[1].isBase64())
    ensure(tag[2].isHex())
    return Base64AndHexEncodedAuthor(tag[1], tag[2])
}
fun parse(tag: Array<String>): Base64AndHexEncodedAuthor? = if (
    tag.hasIdx(2) &&
    tag[0] == TAG_NAME &&
    tag[1].length == 64 &&
    tag[2].length == 32 &&
    tag[1].isBase64() &&
    tag[2].isHex()
)
    Base64AndHexEncodedAuthor(tag[1], tag[2])
else
    null

Is the second one really that much harder to read than the first one? I think the only confusing/cluttering thing about it is the &&.

One additional approach:

fun main(): String? {
    ensureAll({ return null }) {
        ensure { cond1() }
        ensure { cond2() }
        ensure { cond3() }
    }
    return "hello"
}

data class EnsureScope(var result: Boolean)

inline fun ensureAll(ifFail: () -> Nothing, block: EnsureScope.() -> Unit) {
    EnsureScope(true).apply {
        block()
        if (!result) ifFail()
    }
}

inline fun EnsureScope.ensure(predicate: () -> Boolean) {
    if (result) {
        result = predicate()
    }
}

Or alternatively:

ensureAll {
    ensure { cond1() }
    ensure { cond2() }
    ensure { cond3() }
}.ifFail { return null }

It compiles to something like:

val scope = EnsureScope(result = true)
if (scope.result && !cond1()) scope.result = false
if (scope.result && !cond2()) scope.result = false
if (scope.result && !cond3()) scope.result = false
if (!scope.result) return null

So it is fully inlined, it performs a single allocation and it short-circuits, but only partially.

If the code is single-threaded, we could use a singleton and skip the allocation. If multi-threaded, we could potentially use thread-locals to avoid allocations, but I have no idea what the associated performance cost would be.

Generally speaking, I think most of the above is overkill. It would be nice if inlined functions could return from outer functions, but I believe this is not possible, so a regular if or ensure(cond) { return null } makes the most sense to me.

Comparing the bytecode between

fun main(args: Array<String>) {
    if (args[1] == "2") return
    if (args[1] == "3") return
    if (args[1] == "4") return
    println("hello")
}

and

fun main(args: Array<String>) {
    ensureAll {
        ensure { args[1] != "2" }
        ensure { args[1] != "3" }
        ensure { args[1] != "4" }
    }.ifFail { return }

    println("hello")
}

class EnsureScope(var passing: Boolean = true) {
    inline fun ensure(predicate: () -> Boolean) {
        if (passing) passing = predicate()
    }

    inline fun ifFail(ifFail: () -> Unit) {
        if (!passing) ifFail()
    }
}

inline fun ensureAll(block: EnsureScope.() -> Unit) = EnsureScope().apply(block)

puts the ensure solution at roughly 8x the bytecode (57 vs. 441 lines). It works, but as expected it adds an extra object, an extra boolean, extra lambdas, and a boolean check in each statement. And I am not sure the readability is actually improved :frowning: