Kotlin Serialization

I’ve been thinking a bit about the problem of flexible/multiple serialization output/input formats. Probably the best way to do that would be to leave the specifics to a user-defined handler that could support multiple formats. For simple POJOs (or annotation-only serialization) this would mean that the handler would just get the fields to store (name, type, value) and would do whatever it wants (format v1, format v2, JSON, XML, …).

As long as there is the ability to do this, no specific support for multiple formats needs to be built into the system. There will probably also be serialization formats that are incompatible with others, so they would not support multi-format.
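To make the handler idea concrete, here is a minimal sketch of what such a handler could look like. All names (FieldSink, JsonSink, FormSink, Point) are hypothetical and not part of any Kotlin serialization API, and string escaping is deliberately left out.

```kotlin
// Hypothetical names throughout: FieldSink, JsonSink, FormSink and Point
// are illustrations only, not part of any Kotlin serialization API.
import kotlin.reflect.KClass

/** Receives one call per stored field and decides how to encode it. */
interface FieldSink {
    fun field(name: String, type: KClass<*>, value: Any?)
    fun result(): String
}

/** Encodes the fields as a flat JSON object (no string escaping). */
class JsonSink : FieldSink {
    private val parts = mutableListOf<String>()
    override fun field(name: String, type: KClass<*>, value: Any?) {
        val encoded = if (type == String::class && value != null) "\"$value\"" else value.toString()
        parts += "\"$name\":$encoded"
    }
    override fun result() = parts.joinToString(",", "{", "}")
}

/** Encodes the same fields as key=value pairs (no URL encoding). */
class FormSink : FieldSink {
    private val parts = mutableListOf<String>()
    override fun field(name: String, type: KClass<*>, value: Any?) {
        parts += "$name=$value"
    }
    override fun result() = parts.joinToString("&")
}

class Point(val label: String, val x: Int) {
    // The class only lists its fields; it knows nothing about the format.
    fun writeTo(sink: FieldSink): String {
        sink.field("label", String::class, label)
        sink.field("x", Int::class, x)
        return sink.result()
    }
}

fun main() {
    val p = Point("origin", 0)
    println(p.writeTo(JsonSink()))  // {"label":"origin","x":0}
    println(p.writeTo(FormSink()))  // label=origin&x=0
}
```

The class stays format-agnostic; supporting a "format v2" would only mean writing another sink.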

A use case that I’ve run into is in HTTP (POST) request handling: creating an instance of a form backing bean and populating it with values of the POST request (and applying validation rules), and serializing a form backing bean into HTTP form parameters for making a POST request.

Is this a kind of use case that is being considered for the Kotlin serialization framework?

Especially in the former case there will be many cases of receiving invalid data, where it is extremely useful to have control over the error handling, for instance in order to report back to the user which fields were invalid and for what reason.
Moreover, it is useful to have access to the invalid object in code.

Right now my form backing classes have nullable fields with @NotNull validation annotations, which looks weird next to the other annotations that describe what data is considered validly formatted.
Controller code handles the invalid form: Spring has a binding-result instance which is populated with the errors, and these are rendered in HTML. The controller code, however, also has an instance of the (invalid) form bean, which is part of the Model and which is also used to populate the HTML with previously entered values (in the same template in which a valid instance would be rendered).

There is a disconnect here between the compiler-valid and application-valid state of objects: one needs to make application-invalid states valid for the compiler in order to report on those states in a user-friendly way.

It is not a huge issue for me, especially since so far it has come up only in HTTP form handling.
I do not see a clean solution for such use cases, but if Kotlin serialization could address this in a clean way without impacting the application (Spring) error handling, it would be nice.
There might also be other cases where control over the error handling, and having (partially) invalid objects available to application code during error handling, could be useful.
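One way to picture what I mean, as a rough sketch only: the binding step returns the raw input and the errors alongside an optional, fully valid object, so the form class itself can keep non-nullable fields. SignupForm, FieldError, Binding and bindSignup are made-up names, not Spring or Kotlin serialization API.

```kotlin
// Hypothetical sketch: SignupForm, FieldError, Binding and bindSignup are
// made-up names, not Spring or Kotlin serialization API.
data class SignupForm(val name: String, val age: Int)

data class FieldError(val field: String, val reason: String)

/** Outcome of binding: the raw input is always kept for re-rendering. */
class Binding<T>(
    val raw: Map<String, String>,
    val value: T?,                 // non-null only when every field was valid
    val errors: List<FieldError>
) {
    val isValid: Boolean get() = errors.isEmpty()
}

fun bindSignup(params: Map<String, String>): Binding<SignupForm> {
    val errors = mutableListOf<FieldError>()

    val name = params["name"]?.takeIf { it.isNotBlank() }
    if (name == null) errors += FieldError("name", "must not be blank")

    val age = params["age"]?.toIntOrNull()
    if (age == null) errors += FieldError("age", "must be a whole number")

    // The non-nullable form object is only built from values that passed the checks.
    val form = if (name != null && age != null) SignupForm(name, age) else null
    return Binding(params, form, errors)
}

fun main() {
    val result = bindSignup(mapOf("name" to "Ada", "age" to "abc"))
    println(result.isValid)     // false
    println(result.errors)      // one error, for the "age" field
    println(result.raw["age"])  // abc, still available for the HTML template
}
```

This keeps the previously entered values available to the template without forcing every field of the form class to be nullable.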

Is there any update to this effort?

Will we see a preview soon?

Thank you

I can confirm that we still plan to implement cross-platform Kotlin Serialization as a part of the overall effort on cross-platform (JVM/JS/Native) Kotlin. There is no update yet, nor a timeline that we’ve committed to. Stay tuned.

You can play with the prototype implementation as explained here: GitHub - elizarov/KotlinSerializationPrototypePlayground
See README.md in the repo for how to get started and what the limitations are. It is very far from being feature-complete and supports only the JVM backend, but it shows the general direction.

I’m using the serialization prototype in a sandbox project of mine. I found it fairly easy to just use Gradle with the installed Kotlin 1.1.2-2, instead of installing a substitute plugin, and add the Gradle plugin you provide on the prototype site to the compiler.

There’s just one issue with that: I had to add the Gradle plugin to the buildscript dependencies as well to get it to compile. This is not listed in the readme.md, which might confuse people.

Aside from that I think it’s pretty neat to have most of the repetitive stuff generated, implementing the binary output was straightforward!

It seems to be related to the fact that in my project I’m configuring Kotlin via plugin DSL:

plugins {
    id "org.jetbrains.kotlin.jvm" version "1.1.2"
}

If you do it this way, then you don’t need to have a buildscript section at all.

I’ve added that info to the readme. Thanks a lot.

I’m wondering, is there a generic way to obtain the KSerializer? I’m currently detecting it through some magic (basically checking for primitives and for the companion object to be a KSerializer), but this way I can’t get to the serializers for Lists and Maps.
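For reference, the kind of lookup I mean is roughly the following JVM-only sketch. Serializer is a stand-in for the prototype's KSerializer (whose real members differ), and the hand-written companion stands in for the plugin-generated one; it also assumes the companion object is unnamed and public, so that it is exposed as a static field called Companion.

```kotlin
// Stand-in for the prototype's KSerializer; the real interface has other members.
interface Serializer<T> { fun describe(): String }

object IntSerializer : Serializer<Int> { override fun describe() = "Int" }
object StringSerializer : Serializer<String> { override fun describe() = "String" }

class User(val name: String) {
    // Hand-written stand-in for a plugin-generated companion serializer.
    companion object : Serializer<User> { override fun describe() = "User" }
}

/** Checks a few built-in types first, then looks for a companion that is a Serializer. */
fun findSerializer(type: Class<*>): Serializer<*>? = when (type) {
    Int::class.javaObjectType, Int::class.javaPrimitiveType -> IntSerializer
    String::class.java -> StringSerializer
    else -> try {
        // An unnamed companion object is compiled to a static field named "Companion".
        type.getDeclaredField("Companion").get(null) as? Serializer<*>
    } catch (e: NoSuchFieldException) {
        null
    }
}

fun main() {
    println(findSerializer(User::class.java)?.describe())    // User
    println(findSerializer(String::class.java)?.describe())  // String
    println(findSerializer(List::class.java))                // null
}
```

The last call shows the gap: a Class for List carries no element type, so there is nothing to select an element serializer with.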

There is no direct way for obtaining serialisers for generic classes like List<T> yet and serialisation of user-defined generic classes is not supported yet either. Here is the planned approach. Assume that you have a generic class and some other serializable class:

@Serializable class MyBox<T>(val value: T)
@Serializable class MyData(val a: Int, val b: Int) // whatever

In order to obtain its serialiser for a particular type substitution, you’d use a plugin-generated function serializer on its companion object like this: MyBox.serializer(MyData)

Ideally, we’d like the following to work, too:

val box: MyBox<MyData> = JSON.parse(s)

However, the latter requires quite complex changes to the inner workings of Kotlin inline functions with reified type parameters. See also https://youtrack.jetbrains.com/issue/KT-15992
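To illustrate the shape of MyBox.serializer(MyData), here is a hand-written sketch of the composition. It is not the generated code: the Serializer interface is a simplified stand-in for KSerializer, and both companions are written by hand where the plugin would generate them.

```kotlin
// Simplified stand-in interface; the prototype's KSerializer differs.
interface Serializer<T> {
    fun write(value: T): String
}

class MyData(val a: Int, val b: Int) {
    // Hand-written stand-in for what the plugin would generate.
    companion object : Serializer<MyData> {
        override fun write(value: MyData) = "{\"a\":${value.a},\"b\":${value.b}}"
    }
}

class MyBox<T>(val value: T) {
    companion object {
        // The element serializer is passed in explicitly because T is erased at runtime.
        fun <T> serializer(element: Serializer<T>): Serializer<MyBox<T>> =
            object : Serializer<MyBox<T>> {
                override fun write(value: MyBox<T>) =
                    "{\"value\":${element.write(value.value)}}"
            }
    }
}

fun main() {
    // `MyData` used as an expression refers to its companion object.
    val boxSerializer = MyBox.serializer(MyData)
    println(boxSerializer.write(MyBox(MyData(1, 2))))  // {"value":{"a":1,"b":2}}
}
```

Passing the element serializer as an argument is what lets the same MyBox code serve every type substitution.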

I shared some ideas about serialisation here: Generated JSON-Serialisation for Kotlin | by Fabian Zeindl | Medium

It’d be great to be able to combine deserialization with immutability and non-nullability in an elegant way. This gets a little ugly at the moment when you add properties to a class over time but need to be able to deal with objects that were serialized before the new properties were added. For example:

class MySerializableClass : Serializable {
  // We start off with just this property.
  var someValue: String? = null

  // Later we add an immutable, non-null property.
  val listOfThings: List<Thing> = LinkedList()

  // Deserializing an older serialized object without a listOfThings property
  // sets it to null by default. The way we're supposed to initialize such
  // properties is in a readResolve() method, but we can't.
  fun readResolve(): Any {
    if (listOfThings == null) {    // Warning - listOfThings isn't nullable
      listOfThings = LinkedList()  // Error - listOfThings is immutable
    }
    return this
  }
}

We end up having to make all newly-added properties nullable and mutable because readResolve() needs to be able to do a null test and initialize them. This means sprinkling our code with ?. even though we know the property will, in reality, never be null.

It gets a little uglier: In production code, in order to make it harder for ourselves to forget to deal with older objects as well as to get rid of code duplication, we don’t autoinitialize the property but instead only initialize in our readResolve() method, which is also called from our init {} block.

class MySerializableClass : Serializable {
  var someValue: String? = null
  var listOfThings: List<Thing>? = null

  init {
    readResolve()
  }

  fun readResolve(): Any {
    if (listOfThings == null) {
      listOfThings = LinkedList()
    }
    return this
  }
}

One possible cleaner solution would be some way to mark a method as a post-deserialization initializer, and have the compiler automatically insert code to initialize any properties that don’t already have values. So we’d end up with

class MySerializableClass : Serializable {
  var someValue: String? = null
  val listOfThings: List<Thing> = LinkedList()

  @PostDeserialization
  fun readResolve(): Any = this
}

Alternately, or in addition, this could cause the method in question to be treated as an initializer or constructor in cases where you want to do something other than just initialize a property with a simple default value, e.g., if you need to compute the value of a non-nullable transient property that would normally be computed in a constructor.

This is off-topic here (since the thread is about “Kotlin Serialization”), but there is also an open issue on better support of “Java Serialization” in Kotlin that is supposed to somewhat address the problem you’ve outlined: https://youtrack.jetbrains.com/issue/KT-14528. Feel free to add your comments there.

Admittedly I used Java-style serialization examples in the hopes of making my comment more concrete, but I actually don’t think the things I’m talking about are specific to Java serialization at all. These seem like they apply generally to any Kotlin serialization system, even when running in JavaScript or native environments:

  • You can encounter a serialized representation of an object that lacks one or more non-nullable properties, e.g., because the properties were added recently. They need to be initialized to some non-null value.
  • If the serialization system doesn’t call constructors (reasonable since constructors can have side effects) and supports some notion of transient properties, then transient vals need to get their values from somewhere.
  • Ideally you’d like to solve both problems with a minimum of code duplication or boilerplate.

In Kotlin serialization we are generating a “deserializing constructor” that takes care to properly initialize missing fields with the corresponding initializer from the source code.
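As a rough illustration only (this is hand-written, not the plugin's actual output), such a constructor can be pictured as taking a bitmask of the fields that were present in the input and falling back to the declared defaults for the rest. The parameter name seen and the bit layout are assumptions made for this sketch.

```kotlin
class Thing(val id: Int)

class MySerializableClass(
    var someValue: String? = null,
    val listOfThings: List<Thing> = emptyList()
) {
    // Hand-written stand-in for a generated "deserializing constructor".
    // Assumed layout: bit 0 of `seen` marks someValue as present in the input,
    // bit 1 marks listOfThings. A generated version would reuse the property
    // initializers from the source; this sketch has to repeat them by hand.
    constructor(seen: Int, someValue: String?, listOfThings: List<Thing>?) : this(
        if ((seen and 1) != 0) someValue else null,
        if ((seen and 2) != 0 && listOfThings != null) listOfThings else emptyList()
    )
}

fun main() {
    // An "old" object that was serialized before listOfThings existed.
    val old = MySerializableClass(seen = 1, someValue = "x", listOfThings = null)
    println(old.someValue)             // x
    println(old.listOfThings.isEmpty()) // true
}
```

The property stays an immutable, non-nullable val, and no readResolve() is involved.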

I implemented an annotation processing based framework inspired by the serialization prototype. I think I found some solutions that might be interesting to include in this effort.

Github page: GitHub - lukashaertel/kfunnels: Annotation processor generating serialization and deserialization from Kotlin classes.
Reddit post: https://redd.it/6x7h5s

Nice work. We’ve also moved significantly past the latest published prototype (support for optional fields with defaults, class hierarchies, a JS backend and more) and are now busy pushing the corresponding extension points into the upstream Kotlin compiler, so that the Kotlin Serialization plugin can be used without having to build your own version of the Kotlin compiler. We plan to publish details soon.

Awesome! One of the key reasons why I started the annotation-based approach was that I had trouble getting the serialization to work with Kotlin 1.1.4, as one of the methods it uses has been removed (FindClassInModuleKt).

I currently parse the kotlin.Metadata using the JvmProtoBufUtils, which limits the processor to JVM, and also some data is missing. In kfunnels I try to find out if there’s just one type that is applicable for a field, which in Kotlin is conveniently very common as classes are final by default.

The Metadata does not include all of that information (types that are interfaces are not flagged as interfaces). Luckily the annotation processor environment can reproduce some of that information, but I bet that plugging into the compiler with all the resolved types opens up quite a few possibilities that would be nice to have (e.g. reducing the search scope of serializers for sealed classes). I just hope I can work around that lack of data with some educated guesses.

Anyway, looking forward to the updates! I’ve been a big fan of the serializers, I loathe hand-rolling code for such trivial efforts, and I think being able to have a serial representation allows for some nice features like easier hashing and equality (which probably requires more inlining), proper pretty printing and positional parser generators (which is one of the topics that I want to try kfunnels out on next).

It has arrived. Kotlin version 1.1.50 was released and it has all the extension points that enable the prototype of Kotlin serialization to work both on JVM and on JS. The usage of the prototype compiler plugin and runtime library is explained here: GitHub - Kotlin/kotlinx.serialization: Kotlin multiplatform / multi-format serialization

It comes with JSON, CBOR and Protobuf support out of the box. More to come. Most of the planned use cases for serialization of Kotlin classes are supported, with the notable exception of generic user-defined classes, but all the standard collections are fully supported in a type-safe way.

Your feedback is welcome. Please, submit your issues and suggestions directly to the GitHub project issues.

Great work!

You mentioned supporting class hierarchies. AFAICT this blurs the line between ‘static’ and ‘dynamic’ serialisation further, because now you can have a static type Foo and the stream contains sub-class Bar, which is not statically referenced anywhere, yet it’s valid to construct a Bar and assign it to the field of type Foo. This is not necessarily a problem, but does imply either some sort of whitelisting again, or the assumption that a type found in a serialised type automatically whitelists all sub-types.

We indeed blur the line between static and dynamic somewhat. Moreover, we already support a certain level of “dynamicity” for JSON in a non-standard, but Jackson-compliant way. It is still a work in progress (only on JVM and with many limitations), but consider this case anyway:

@Serializable open class Base { var a: Int = 42 }
@Serializable class Derived : Base()
@Serializable class Container(val b: Base)

Now, evaluating JSON.stringify(Container(Derived())) produces {"b":["Derived",{"a":42}]}. As you can see, “dynamic” runtime type of “Derived” was represented with an array of the type name and the actual value, so deserializing it produces the original types which can be verified by evaluating JSON.parse<Container>("{b:[Derived,{a:42}]}").b::class which produces class Derived.

On JVM this kind of deserialization is currently implemented via reflection, which, indeed, opens the same can of worms as any other form of dynamic serialization. However, you don’t have to use it and, unlike standard JVM deserialization (which is always reflective), here reflection only kicks in if you have explicitly declared your serializable class to be open. We also plan to support “closed world” dynamic serialization where you explicitly list all the serializable classes. On Kotlin/Native and Kotlin/JS that is going to be the only way, as both platforms are currently designed with the “closed world” assumption in mind and perform dead code elimination (DCE) based on that assumption.
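To give a feel for the “closed world” idea, here is a small sketch that is not the library's API: Registry, encode and decode are hypothetical names, and decode receives the already-parsed type name and field value rather than parsing JSON itself.

```kotlin
// Hypothetical sketch of "closed world" polymorphism; Registry, encode and
// decode are illustrative names, not kotlinx.serialization API.
open class Base { var a: Int = 42 }
class Derived : Base()

/** Only classes registered here can ever be created from input. */
object Registry {
    private val factories = mutableMapOf<String, () -> Base>()

    fun register(name: String, factory: () -> Base) {
        factories[name] = factory
    }

    fun create(name: String): Base =
        factories[name]?.invoke()
            ?: throw IllegalArgumentException("Type not registered: $name")
}

/** Writes the ["TypeName",{...}] pair described above. */
fun encode(value: Base): String =
    "[\"${value::class.simpleName}\",{\"a\":${value.a}}]"

/** Rebuilds an object from an already-parsed type name and field value. */
fun decode(typeName: String, a: Int): Base =
    Registry.create(typeName).also { it.a = a }

fun main() {
    Registry.register("Base") { Base() }
    Registry.register("Derived") { Derived() }

    println(encode(Derived()))                       // ["Derived",{"a":42}]
    println(decode("Derived", 7)::class.simpleName)  // Derived
}
```

No reflection is needed, and a type name that was never registered is rejected instead of being instantiated.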
