Deserialize pom.xml using kotlinx.serialization and xmlutil

Hi :slightly_smiling_face: I am writing a POM solver and need to deserialize POM files. I am using @pdvrieze/xmlutil, but I am stuck on the POM properties.

I need to deserialize both cases:

<properties>
  <a.property>1</a.property>
  <property>
    <name>another.property</name>
    <value>2</value>
  </property>
</properties>

This is what I was going for (though it was mainly written by ChatGPT4, based on here):

@Serializable
@XmlSerialName(
    value = "project",
    namespace = POM_XML_NAMESPACE,
)
data class ProjectModel(
  // ...
  val properties: Properties? = null,
  // ...
)

@Serializable(with = PropertiesSerializer::class)
@XmlSerialName(
    value = "properties",
    namespace = POM_XML_NAMESPACE,
)
data class Properties(
    val properties: Map<String, String> = emptyMap()
)

object PropertiesSerializer : KSerializer<Properties> {
    override val descriptor: SerialDescriptor = MapSerializer(String.serializer(), String.serializer()).descriptor
    override fun deserialize(decoder: Decoder): Properties {
        decoder as? XML.XmlInput
            ?: throw SerializationException("${this::class.simpleName} can be used only by XML")
        val reader = decoder.input as? XmlBufferedReader
            ?: throw SerializationException("${this::class.simpleName} can be used only by XmlBufferedReader")
        val properties = mutableMapOf<String, String>()
        while (reader.hasNext() && reader.eventType != EventType.END_ELEMENT) {
            if (reader.eventType == EventType.START_ELEMENT) {
                val localName = reader.localName
                if (localName == "property") {
                    var name: String? = null
                    var value: String? = null
                    while (reader.hasNext() && !(reader.eventType == EventType.END_ELEMENT && reader.localName == "property")) {
                        if (reader.eventType == EventType.START_ELEMENT) {
                            when (reader.localName) {
                                "name" -> name = reader.consecutiveTextContent()
                                "value" -> value = reader.consecutiveTextContent()
                            }
                        }
                        reader.nextTag()
                    }
                    name ?: throw SerializationException("Property name is not specified")
                    value ?: throw SerializationException("Property value is not specified")
                    properties[name] = value
                } else {
                    properties[localName] = reader.consecutiveTextContent()
                }
            }
            reader.nextTag()
        }
        return Properties(properties)
    }
    override fun serialize(encoder: Encoder, value: Properties) {
        val output = encoder as? XML.XmlOutput
            ?: throw SerializationException("This class can be saved only by XML")
        val writer = output.target
        writer.startTag(POM_XML_NAMESPACE, "properties", "")
        value.properties.forEach { (k, v) ->
            writer.startTag(POM_XML_NAMESPACE, k, "")
            writer.text(v)
            writer.endTag(POM_XML_NAMESPACE, k, "")
        }
        writer.endTag(POM_XML_NAMESPACE, "properties", "")
    }
}

But no luck. It says that it cannot find a match for the tag and errors with:

nl.adaptivity.xmlutil.serialization.UnknownXmlFieldException: Could not find a field for name (org.jetbrains.packagesearch.maven.ProjectModel) {http://maven.apache.org/POM/4.0.0}project/{http://maven.apache.org/POM/4.0.0}properties (Element)
  candidates: {http://maven.apache.org/POM/4.0.0}modelVersion (Element), {http://maven.apache.org/POM/4.0.0}groupId (Element), {http://maven.apache.org/POM/4.0.0}artifactId (Element), {http://maven.apache.org/POM/4.0.0}version (Element), {http://maven.apache.org/POM/4.0.0}name (Element), {http://maven.apache.org/POM/4.0.0}description (Element), {http://maven.apache.org/POM/4.0.0}url (Element), {http://maven.apache.org/POM/4.0.0}organization (Element), {http://maven.apache.org/POM/4.0.0}parent (Element), {http://maven.apache.org/POM/4.0.0}packaging (Element), {http://maven.apache.org/POM/4.0.0}value (Element), {http://maven.apache.org/POM/4.0.0}dependencies (Element), {http://maven.apache.org/POM/4.0.0}dependencyManagement (Element), {http://maven.apache.org/POM/4.0.0}licenses (Element), {http://maven.apache.org/POM/4.0.0}developers (Element), {http://maven.apache.org/POM/4.0.0}scm (Element), {http://maven.apache.org/POM/4.0.0}issueManagement (Element) at position Line number = 47
Column number = 15
System Id = null
Public Id = null
Location Uri= null
CharacterOffset = 1907

where line 47 is:

    <properties> 

Do you have any clue?

This one is very weird. The list of candidates indicates that you are at the top level of a POM document, and it is clear that the structure doesn’t even trigger the custom serializer for properties. To figure this one out, it may help to temporarily remove all the other properties from the model (make a small reproducible case). I don’t see anything wrong with the code you’ve included, but assuming this is not the second occurrence of properties, the error appears to be outside the custom serializer.

Through debugging I figured out where the issue is:

private fun XmlDescriptor.toNonTransparentChild(): XmlDescriptor {
    var result = this
    while (result is XmlInlineDescriptor || // Inline descriptors are only used when we actually elude the inline content
        (result is XmlListDescriptor && result.isListEluded)
    ) { // Lists may or may not be eluded

        result = result.getElementDescriptor(0)
    }
    if (result is XmlMapDescriptor && result.isListEluded && result.isValueCollapsed) { // some transparent tags
        return result.getElementDescriptor(1).toNonTransparentChild()
    }
    return result
}

I declared the custom serializer’s descriptor as MapSerializer(...).descriptor. This function replaces the map’s descriptor with the descriptor of its value! As a result, the QName to match becomes {...}value instead of {...}properties (which is why {http://maven.apache.org/POM/4.0.0}value shows up in the candidates list of the error above), and all the parsing logic that follows fails.

This function is used in:

nl.adaptivity.xmlutil.serialization.XmlDecoderBase.TagDecoder#init#L461
init {
    val polyMap: MutableMap<QName, PolyInfo> = mutableMapOf()
    val nameMap: MutableMap<QName, Int> = mutableMapOf()

    for (idx in 0 until xmlDescriptor.elementsCount) {
        //                                                 ⬇ here!!!
        val child = xmlDescriptor.getElementDescriptor(idx).toNonTransparentChild()

        if (child is XmlPolymorphicDescriptor && child.isTransparent) {
            for ((_, childDescriptor) in child.polyInfo) {
                /*
                 * For polymorphic value classes this cannot be a multi-value inline. Get
                 * the tag name from the child (even if it is inline).
                */

                val tagName = childDescriptor.tagName.normalize()
                polyMap[tagName] = PolyInfo(tagName, idx, childDescriptor)
//                        nameMap[tagName] = idx
            }
        } else {
            nameMap[child.tagName.normalize()] = idx
        }
    }
    polyChildren = polyMap
    nameToMembers = nameMap

}
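
The effect of this lookup can be sketched with a toy model. The types below (Desc, ElementDesc, MapDesc) and the nameToMembers function are hypothetical stand-ins written for this illustration, not xmlutil's real classes; only the map branch of toNonTransparentChild() is mirrored, and the flag values are assumed to be what a plain map descriptor gets by default.

```kotlin
// Toy stand-ins for xmlutil's descriptors; these are NOT the library's real types.
sealed interface Desc { val tagName: String }

data class ElementDesc(override val tagName: String) : Desc

data class MapDesc(
    override val tagName: String,
    val value: Desc,
    val isListEluded: Boolean,
    val isValueCollapsed: Boolean,
) : Desc

// Mirrors the map branch of toNonTransparentChild(): a transparent map is replaced by its value.
fun Desc.toNonTransparentChild(): Desc = when (this) {
    is MapDesc -> if (isListEluded && isValueCollapsed) value.toNonTransparentChild() else this
    is ElementDesc -> this
}

// Builds the tag-name -> field-index table in the same spirit as the init block.
fun nameToMembers(children: List<Desc>): Map<String, Int> =
    children.withIndex().associate { (idx, child) -> child.toNonTransparentChild().tagName to idx }

fun main() {
    val properties = MapDesc(
        tagName = "properties",
        value = ElementDesc("value"),
        isListEluded = true,
        isValueCollapsed = true,
    )
    val table = nameToMembers(listOf(ElementDesc("groupId"), properties))
    // The map field ends up registered under "value", so a <properties> tag finds no match.
    println(table)
}
```

With both flags set to false in this model, the field would be registered under "properties" instead, which corresponds to keeping the wrapper tag.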

I do not have enough XML knowledge to understand why this logic is needed, but for my current use case I believe it is a bug, and I have no fix to propose.

Additionally, here is the min repro:

@Serializable
@XmlSerialName(
    value = "project",
    namespace = POM_XML_NAMESPACE,
)
data class PropertiesRepro(
    @Serializable(with = MavenPomPropertiesXmlSerializer::class)
    @XmlSerialName(
        value = "properties",
        namespace = POM_XML_NAMESPACE,
    )
    @XmlElement
    val properties: Map<String, String>? = null
)

object MavenPomPropertiesXmlSerializer : KSerializer<Map<String, String>> {
    private val mapSerializer = MapSerializer(String.serializer(), String.serializer())
    override val descriptor: SerialDescriptor = mapSerializer.descriptor

    private val Decoder.xmlReaderOrNull
        get() = (this as? XML.XmlInput)?.input as? XmlBufferedReader

    private val Encoder.xmlWriterOrNull
        get() = (this as? XML.XmlOutput)?.target

    private fun XmlWriter.encodeProperties(properties: Map<String, String>) {
        properties.forEach { (k, v) ->
            startTag(POM_XML_NAMESPACE, k, "")
            text(v)
            endTag(POM_XML_NAMESPACE, k, "")
        }
    }

    private fun XmlBufferedReader.decodeProperties(): Map<String, String> = buildMap {
        while (hasNext() && eventType != EventType.END_ELEMENT) {
            if (eventType == EventType.START_ELEMENT) {
                val localName = localName
                if (localName == "property") {
                    var name: String? = null
                    var value: String? = null
                    while (hasNext() && !(eventType == EventType.END_ELEMENT && localName == "property")) {
                        if (eventType == EventType.START_ELEMENT) {
                            when (localName) {
                                "name" -> name = consecutiveTextContent()
                                "value" -> value = consecutiveTextContent()
                            }
                        }
                        nextTag()
                    }
                    name ?: throw SerializationException("Property name is not specified")
                    value ?: throw SerializationException("Property value is not specified")
                    set(name, value)
                } else {
                    set(localName, consecutiveTextContent())
                }
            }
            nextTag()
        }
    }

    override fun deserialize(decoder: Decoder) =
        decoder.xmlReaderOrNull?.decodeProperties() ?: mapSerializer.deserialize(decoder)

    override fun serialize(encoder: Encoder, value: Map<String, String>) =
        encoder.xmlWriterOrNull?.encodeProperties(value)
            ?: mapSerializer.serialize(encoder, value)

}

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
  <properties>
    <maven.version>3.0.5</maven.version>
    <maven.compiler.source>1.7</maven.compiler.source>
    <maven.compiler.target>1.7</maven.compiler.target>
    <classWorldsVersion>2.6.0</classWorldsVersion>
    <commonsCliVersion>1.4</commonsCliVersion>
    <commonsLangVersion>3.8.1</commonsLangVersion>
  </properties>
</project>

This is not actually a valid way to create your descriptor. The “correct” way is to use the SerialDescriptor function that lets you create a derived descriptor with a new name. In this case the “issue” is that the map is used in transparent mode (the wrapper tag properties is left out). Possibly the easiest fix is to change that configuration option.
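
A minimal sketch of the derived-descriptor idea, assuming the experimental SerialDescriptor(serialName, original) factory from kotlinx.serialization is available in the version in use. The serial name "maven.properties" is just an illustrative choice. The derived descriptor delegates to the original, so it still reports the MAP kind; on its own this renames the descriptor but probably does not change the transparent-map handling.

```kotlin
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.builtins.MapSerializer
import kotlinx.serialization.builtins.serializer
import kotlinx.serialization.descriptors.SerialDescriptor

private val mapSerializer = MapSerializer(String.serializer(), String.serializer())

// Wraps the map descriptor under a new serial name instead of reusing it as-is.
// Assumption: the experimental SerialDescriptor(serialName, original) factory exists
// in the kotlinx.serialization version on the classpath.
@OptIn(ExperimentalSerializationApi::class)
val propertiesDescriptor: SerialDescriptor =
    SerialDescriptor("maven.properties", mapSerializer.descriptor)

@OptIn(ExperimentalSerializationApi::class)
fun main() {
    println(propertiesDescriptor.serialName) // the new name
    println(propertiesDescriptor.kind)       // kind is taken from the wrapped map descriptor
}
```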

Oh I see. Well fml it was easier than expected:

object MavenPomPropertiesXmlSerializer : KSerializer<Properties> {
    private val fallbackSerializer = MapSerializer(String.serializer(), String.serializer())

    override val descriptor: SerialDescriptor = buildClassSerialDescriptor("maven.properties") {
        element("properties", fallbackSerializer.descriptor)
    }

    private val Decoder.xmlReaderOrNull
        get() = (this as? XML.XmlInput)?.input as? XmlBufferedReader

    private val Encoder.xmlWriterOrNull
        get() = (this as? XML.XmlOutput)?.target

    private fun XmlWriter.encodeProperties(properties: Map<String, String>) {
        startTag(POM_XML_NAMESPACE, "properties", "")
        properties.forEach { (k, v) ->
            startTag(POM_XML_NAMESPACE, k, "")
            text(v)
            endTag(POM_XML_NAMESPACE, k, "")
        }
        endTag(POM_XML_NAMESPACE, "properties", "")
    }

    private fun XmlBufferedReader.decodeProperties(): Map<String, String> = buildMap {
        if (localName != "properties") throw SerializationException("Expected properties tag")
        nextTag()
        while (hasNext()) {
            if (localName == "properties" && eventType == EventType.END_ELEMENT) break
            if (eventType == EventType.START_ELEMENT) {
                if (localName == "property") {
                    var name: String? = null
                    var value: String? = null
                    while (hasNext() && !(eventType == EventType.END_ELEMENT && localName == "property")) {
                        if (eventType == EventType.START_ELEMENT) {
                            when (localName) {
                                "name" -> name = consecutiveTextContent()
                                "value" -> value = consecutiveTextContent()
                            }
                        }
                        nextTag()
                    }
                    name ?: throw SerializationException("Property name is not specified")
                    value ?: throw SerializationException("Property value for name '$name' is not specified")
                    set(name, value)
                } else {
                    set(localName, consecutiveTextContent())
                }
            }
            nextTag()
        }
    }

    override fun deserialize(decoder: Decoder) =
        Properties(decoder.xmlReaderOrNull?.decodeProperties() ?: fallbackSerializer.deserialize(decoder))

    override fun serialize(encoder: Encoder, value: Properties) =
        encoder.xmlWriterOrNull?.encodeProperties(value.properties)
            ?: fallbackSerializer.serialize(encoder, value.properties)

}

It works now :slight_smile: Thank you a lot!


Great to see it works. To some degree, having the transparent map as an element should also “work”. In any case, because you use a custom serializer, even a descriptor without children would “work” (the format is not actually doing any serialization/deserialization itself). It just has to treat things like maps, lists, etc. specially (which puts expectations on their behaviour). The behaviour in the default policy is the same as for lists, so an @XmlChildrenName annotation at the use site would also have worked to mark the need for the container tag.
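
A sketch of where such an annotation would go, assuming xmlutil's @XmlChildrenName takes a value and an optional namespace like @XmlSerialName does. The child name "property" is an arbitrary choice for illustration, and this fragment was not run against the POM above. It only shows the annotation placement; the custom serializer from the min repro would still be attached to the field with @Serializable(with = ...).

```kotlin
import kotlinx.serialization.Serializable
import nl.adaptivity.xmlutil.serialization.XmlChildrenName
import nl.adaptivity.xmlutil.serialization.XmlSerialName

const val POM_XML_NAMESPACE = "http://maven.apache.org/POM/4.0.0"

@Serializable
@XmlSerialName(value = "project", namespace = POM_XML_NAMESPACE)
data class PropertiesRepro(
    // In the thread's repro, @Serializable(with = MavenPomPropertiesXmlSerializer::class) sits here too.
    @XmlSerialName(value = "properties", namespace = POM_XML_NAMESPACE)
    // Assumption: naming the children marks the map as non-transparent, so the
    // <properties> container tag is matched by name instead of being left out.
    @XmlChildrenName(value = "property", namespace = POM_XML_NAMESPACE)
    val properties: Map<String, String>? = null,
)
```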