Add a NOT Type

I think it would be incredibly useful to make the type system say, “This method can never accept an X” or “This implementation can never be used as a Y”.

My specific use case is a TaintedString class. It represents untrusted input from a user which can be stored in a database, but must be encoded different ways before being written out to HTML, PDF, CSV, a log file, in email, etc… This class has identical methods to CharSequence and String, but must never be interchangeable with either. It would be nice to (an)notate it with !String, !CharSequence to prevent it from being passed as, or implicitly converted to either one. This is particularly important with the .toString() method, though I guess I could make that throw an exception at runtime. Still, I’d like to catch this at compile time.

This is maybe not a fully-formed idea. Just wanted to throw it out there and see what people thought.

What would that give you over the existing solution of making it an unrelated type?

After all, you probably don’t want it to have all the same methods as CharSequence or String, as so many of those methods return one of those types.

String is a particularly unrepresentative example, as it’s the only type to which every other type can be converted (thanks to Any.toString()). (Having .toString() throw an exception would break the contract and be likely to cause lots of collateral damage; probably better to have it return one of your special encodings.)

you probably don’t want it to have all the same methods as CharSequence or String, as so many of those methods return one of those types.

The TaintedS version of the CharSequence/String methods all return TaintedS results instead of Strings, and many are overloaded to take either a String or a TaintedS. They are just there so that you can use a TaintedS in the most familiar possible way (like a String) but safely.

Having .toString() throw an exception would break the contract and be likely to cause lots of collateral damage; probably better to have it return one of your special encodings.

The .toString() implementation is just as you suggest:

    @Deprecated("Don't use this accidentally - Encode before using!")
    override fun toString(): String = "⛔$str⛔"

    /**
     * Using this trusts user input, defeating the purpose of this
     * class, but if you really want that, here it is.  Tip: search
     * your project for usages of this method!
     */
    override fun unsafeRaw(): String = str

I still wish this could be prevented at compile-time because inevitably, someone writes:

"Hello " + myTaintedString

And you have to look for the No-Entry signs at runtime to know anything has gone wrong. I guess somehow preventing the implicit String conversion on a class would solve my strongest motivating issue. But there are other motivations…
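To make the problem concrete, here is a minimal sketch of the implicit conversion. The class below is a stripped-down, hypothetical stand-in for TaintedString (only the two methods shown earlier), and greet is an invented helper:

```kotlin
// Hypothetical, stripped-down stand-in for the TaintedString class discussed above.
class TaintedString(private val str: String) {
    // Marker-wrapped output so accidental conversions stand out in logs.
    override fun toString(): String = "⛔$str⛔"

    // Deliberate escape hatch; named so that usages are easy to search for.
    fun unsafeRaw(): String = str
}

// Compiles without complaint: String.plus accepts Any?, so toString() is
// called implicitly on the tainted value.
fun greet(name: TaintedString): String = "Hello " + name

fun main() {
    println(greet(TaintedString("<b>Bob</b>"))) // prints: Hello ⛔<b>Bob</b>⛔
}
```

Nothing in the signature of greet hints that untrusted input has leaked into a plain String, which is why the markers are currently the only signal.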

What would that give you over the existing solution of making it an unrelated type?

Consider a library with an Interface: AlternatingCurrent. The library has sub-interfaces TwoPhase and ThreePhase, Ac120V and Ac240V. You are writing a method .foo() and want to allow any combination of these except ThreePhase. If you only cared about runtime, you could use an exception:

fun foo(ac: AlternatingCurrent) {
    if (ac is ThreePhase) {
        throw IllegalArgumentException("Can't handle ThreePhase.")
    }
    // do stuff...
}

To see this at compile time, you could use an imaginary Union typealias, essentially allowing one of the three types:

typealias AcNot3Phase = (TwoPhase | Ac120V | Ac240V)
fun foo(ac: AcNot3Phase) {
    // do stuff...
}

But if there is ever another AlternatingCurrent sub-interface, such as Ac110V, this method won’t handle it. Using an intersection type with the imaginary NOT would allow additional sub-interfaces to pass:

fun <T> foo(ac: T) where T : AlternatingCurrent, T : !ThreePhase {
    // do stuff...
}

Like with the Expression Problem, both of the above solutions have loopholes. Still, I think they could be useful. I guess I’m asking for intersection types and a NOT type. Or an alternative that I can’t imagine at the moment…

Let your toString() function return “TaintedString($value)”. This will reduce the likelihood of the toString() function being misused.

@fatjoe79 Thanks - I’m starting to implement most .toString() methods that way! For this one case though, I found that the symbols often show up red and are easier to notice than regular text, especially when they appear in a block of text (such as a log file).

This feature (union with exclude) exists in TypeScript with the Exclude<T, U> type.

It could be nice to have a union type with exclusions, like:

 fun foo(ac: AlternatingCurrent exclude ThreePhase) 

 // or the equivalent, as long as AlternatingCurrent has only these 4 sub-interfaces
 fun foo(ac: TwoPhase | Ac120V | Ac240V)

Hmm, the question is what !ThreePhase actually is as a type; it is more a collection of types.
What happens if I declare object: !ThreePhase? And what about equivalence operators == and !=, just like in Swift?

The only problem I see so far is Java compatibility. How could Java understand this kind of function signature?


I don’t think there is any solution in Java for union type or Not Type.

Or maybe with a specific annotation:

 void foo(@union(or={TwoPhase.class, Ac120V.class, Ac240V.class}) Object ac);
 void foo(@union(or={AlternatingCurrent.class}, not={ThreePhase.class}) Object ac);

As with @NotNull and @Nullable, I think Java compatibility will pose problems. I’ve been using “OneOf_” classes for a while in Java. I think the most critical component of the implementation is the match method. That ensures that, when you use the Union type, the compiler forces you to consider all the possible actual types. If you redefine your Union type later, the compiler forces you to update all dependent code. I think Scala provides something like this. Kotlin (or any JVM language really) could generate Java classes like this when Union types are used.

I left a note in that code that there’s a javax.lang.model.type.UnionType but admit I haven’t done much more than glance at that class.
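For readers who have not seen the pattern, here is a rough Kotlin sketch of a two-alternative “OneOf” with a match method. It is not the Java code referred to above; OneOf2, First, Second and describe are names invented for this illustration:

```kotlin
// Hypothetical two-alternative union; all names are illustrative only.
sealed class OneOf2<out A, out B> {
    data class First<A>(val value: A) : OneOf2<A, Nothing>()
    data class Second<B>(val value: B) : OneOf2<Nothing, B>()

    // Callers must pass one handler per alternative, so none can be forgotten.
    fun <R> match(onFirst: (A) -> R, onSecond: (B) -> R): R = when (this) {
        is First -> onFirst(value)
        is Second -> onSecond(value)
    }
}

fun describe(input: OneOf2<Int, String>): String =
    input.match(
        onFirst = { "number ${it + 1}" },
        onSecond = { "text of length ${it.length}" },
    )

fun main() {
    println(describe(OneOf2.First(41)))     // prints: number 42
    println(describe(OneOf2.Second("abc"))) // prints: text of length 3
}
```

If a third alternative were added (say a OneOf3 with a three-handler match), every call site would stop compiling until it supplied the extra handler, which is the property described above. Each alternative is also built from a single value, at the cost of an explicit wrapper at the call site.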

@christophels

Even if we annotate, Java has to understand the semantics involved. But because annotations must not change semantics (apart from some AOP hackery), we can’t extend the semantics with annotations.

@TrombaMarina
I had implemented something similar for union types:

package org.utils.union;

import org.utils.function.Function1;

public class Union2<Param0, Param1>
{
    public Object elem;
    private ActiveType activeType;

    public Union2(Param0 elem0, Param1 elem1)
    {
        if (null == elem1)
        {
            // elem1 is absent, so the first alternative is the active one.
            this.activeType = ActiveType.Param0;
            this.elem = elem0;
        } else
        {
            this.activeType = ActiveType.Param1;
            this.elem = elem1;
        }

    }

    @SuppressWarnings("unchecked")
    public <Result> Result apply(Function1<Param0, Result> lambdaParam0, Function1<Param1, Result> lambdaParam1)
    {

        if (this.activeType == ActiveType.Param0)
        {
            return lambdaParam0.apply((Param0) this.elem);
        } else
        {
            return lambdaParam1.apply((Param1) this.elem);
        }
    }

    @SuppressWarnings("unchecked")
    public <Result0, Result1> Union2<Result0, Result1> applyH(Function1<Param0, Result0> lambdaParam0,
            Function1<Param1, Result1> lambdaParam1)
    {
        if (this.activeType == ActiveType.Param0)
        {
            return new Union2<Result0, Result1>(lambdaParam0.apply((Param0) this.elem), null);
        } else
        {
            return new Union2<Result0, Result1>(null, lambdaParam1.apply((Param1) this.elem));
        }
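    }

    // Tag recording which alternative is currently stored in elem.
    private enum ActiveType
    {
        Param0, Param1
    }
}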

It is neither bug-free nor complete, and it does not work for null values, but there are more serious problems in getting this to work in Java.

You can’t construct a union type from a single value; you need to pass the others as well.
Why?
Because you can’t express that two generic type parameters don’t overlap. Therefore, you can’t overload the constructor by type parameter.

Further, you can’t express a subtyping relationship from UnionN to Union(N+1) types.
You can’t express implicit conversion or subtyping between a union type and its type parameters.
You can’t destructure union types existentially into their type parameters, which is why I implemented apply and applyH here as a workaround.

I was thinking of annotations more along the lines of the Java Checker Framework, with compile-time checks but without any AOP features.

But I have to admit that I didn’t try it, and I don’t even know if it’s possible.

Yes, compiler plugins would do the trick, but they still feel foreign.

The problem is that you can still ignore the required compiler plugin.

I think this is turning into 2 questions. One is about Kotlin supporting Union types, with or without a NOT clause. The other is about this specific use case, where I need to prevent implicit conversion of two types (TaintedS and TaintedB) to a String. Both Java and Kotlin allow you to say "Hello " + myObject and I’d like to be able to prevent, or at least find all places where that happens at compile time. I’ll take a look at the Java Checker Framework. They have a Tainting example specifically! I wonder if it works with Kotlin?

For the tainted question, you might be able to write a lint rule (e.g. Detekt plugin) to catch misuse of your TaintedString.

I’d love to see union types in Kotlin, but as long as Java interop is a priority I don’t see it, or any other more powerful type feature such as advanced type constraints like LiquidHaskell, making it to Kotlin. Inline classes almost didn’t make it because of it.

I didn’t study your idea in depth, but it sure sounds like a case that is better handled as a wrapper class. Instead of a TaintedString class, would it make more sense to have a generic Tainted wrapper class sort of like Optional in Java?

Since both the java.lang.Object Java class and the Kotlin Any class have a public toString method, I don’t see how a wrapper class can prevent its usage at compile time (and this is the original problem with TaintedString, which is already a wrapper).
Maybe the toString() function should be moved out of the Any class and into an interface (Stringable) to solve this.

You will never be able to prevent anyone from calling a toString on any class. Period. Any more discussion on that point is useless.

BUT you are fully in control of what the implementation of toString does when it is called. There is no reason that toString actually has to return any of the content. By default it returns ClassName@hashcode, or you can choose some other placeholder that doesn’t let the user do anything useful. If you want to be super strict, you can override toString with a return type of Nothing and throw an exception. That way, anyone who calls it directly gets warnings if they try to do anything after the call, because the compiler knows the method never returns. It won’t give the warning in all cases, however, and it has no effect on Java code.

And the wrapper lets you better control access to the value. I would probably do something like this which requires passing a lambda to get access to the value:

class Tainted<T>(private val value: T) {
     fun useTaintedValue(function: (T) -> Unit) = this.also { function(value) }

     override fun toString() = "Tainted Value"
     // override fun toString() = throw Error("Not allowed to convert to a String")
}

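You could even make the lambda an extension lambda on a class that defines some context to prevent you making certain calls. Imagine there is some function Bar.foo(String) that you do not want the value passed to. (This shadowing only works when foo is itself an extension function; a member function of Bar always wins over an extension.) You can do something like this: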

class Tainted<T>(private val value: T) {
    object Context {
        fun Bar.foo(s: String) = throw Error("Cannot call Bar.foo here")
    }

    fun useTaintedValue(function: Context.(T) -> Unit) = this.also { Context.function(value) }

    override fun toString() = "Tainted Value"
}
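As a usage sketch of the simpler wrapper, the snippet below encodes the value at the only point where it is reachable. The escapeHtml helper is hypothetical and deliberately minimal; it is not part of the suggestion above, and a real project would use a vetted encoder:

```kotlin
// Copy of the simple wrapper from the post above, so the sketch is self-contained.
class Tainted<T>(private val value: T) {
    fun useTaintedValue(function: (T) -> Unit) = this.also { function(value) }

    override fun toString() = "Tainted Value"
}

// Hypothetical, deliberately minimal HTML escaper (assumption for illustration only).
fun escapeHtml(raw: String): String =
    raw.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")

fun main() {
    val comment = Tainted("<script>alert(1)</script>")
    val page = StringBuilder()

    // The raw value is only reachable inside the lambda, where it is encoded at once.
    comment.useTaintedValue { page.append("<p>").append(escapeHtml(it)).append("</p>") }

    println(page)            // prints: <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
    println("Got: $comment") // prints: Got: Tainted Value
}
```

The implicit conversion in the last line still compiles, but it only ever yields the placeholder, never the untrusted content.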

For the union types problem - namely exclusions - what’s wrong with overloads that take the specific types (minus the excluded types) and delegate to a common private function?

With the AC example above, you’d have three methods defined with the same name that take the appropriate value. The type that’s excluded just doesn’t get a method. The actual functionality is in a private method that takes the base type. Optionally throw if it somehow gets an instance of the excluded type.

(I’m on my phone, so please excuse the lack of an example, but I think it should be easy enough to imagine.)
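Since no example was given, here is one possible reading of the overload idea as a sketch. The interfaces are stand-ins for the library types from the AC example; Wiring, fooImpl, the String return value and the two supply classes are invented for illustration:

```kotlin
// Stand-ins for the library interfaces named earlier in the thread.
interface AlternatingCurrent
interface TwoPhase : AlternatingCurrent
interface ThreePhase : AlternatingCurrent
interface Ac120V : AlternatingCurrent
interface Ac240V : AlternatingCurrent

object Wiring {
    // One public overload per permitted type; ThreePhase simply gets none.
    fun foo(ac: TwoPhase): String = fooImpl(ac)
    fun foo(ac: Ac120V): String = fooImpl(ac)
    fun foo(ac: Ac240V): String = fooImpl(ac)

    // Shared logic on the base type, with a runtime guard for values that
    // implement ThreePhase in addition to a permitted interface.
    private fun fooImpl(ac: AlternatingCurrent): String {
        require(ac !is ThreePhase) { "Can't handle ThreePhase." }
        return "handled ${ac::class.simpleName}"
    }
}

class HouseholdSupply : Ac120V
class SneakySupply : Ac240V, ThreePhase

fun main() {
    println(Wiring.foo(HouseholdSupply())) // prints: handled HouseholdSupply
    // Wiring.foo(object : ThreePhase {}) would not compile: no matching overload.
    println(runCatching { Wiring.foo(SneakySupply()) }.isFailure) // prints: true
}
```

Two caveats with this sketch: a value whose static type implements two permitted interfaces at once (for example TwoPhase and Ac120V) makes the call ambiguous at compile time, and a new sub-interface such as Ac110V is rejected until an overload is added, which is the same limitation as the union typealias.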

Sure it’d be tedious to do this manually and some kind of “not type” shorthand or operator would be nice; the compiler can surely generate that though.

Am I missing something?