kaml icon indicating copy to clipboard operation
kaml copied to clipboard

YamlScalar loses type information

Open pschichtel opened this issue 2 years ago • 5 comments

Describe the bug

I've written a tiny function to convert YamlNode's into JsonElement's (the stuff I'm reading is always JSON compatible). As part of that I have to convert YamlScalar to JsonPrimitive.

I generally I don't have an issue with kaml's approach to scalars, however it is a little unfortunate that it even loses whether the scalar was a string (as-in: the value was quoted). I'm not sure how much type information snakeyaml profiles, but it would be great to retain as much as possible to make format conversions easier.

My current converter tries to convert the string value from the scalar into the various number types until one of them work. If everything fails I assume it to be a string. However that will lead to scalars like '1' to be interpreted as a number on the JSON side, even through it was originally explicitly quoted.

Reproduction repo

private inline fun <T : Any> tryConversion(conversion: () -> T): T? {
    return try {
        conversion()
    } catch (e: YamlScalarFormatException) {
        null
    }
}


private fun yamlNodeToJsonElement(yamlNode: YamlNode): JsonElement {
    return when (yamlNode) {
        is YamlList -> JsonArray(yamlNode.items.map(::yamlNodeToJsonElement))
        is YamlMap -> JsonObject(yamlNode.entries.map { (key, value) -> Pair(key.content, yamlNodeToJsonElement(value)) }.toMap())
        is YamlNull -> JsonNull
        is YamlScalar -> {
            tryConversion { JsonPrimitive(yamlNode.toBoolean()) }
                ?: tryConversion { JsonPrimitive(yamlNode.toByte()) }
                ?: tryConversion { JsonPrimitive(yamlNode.toShort()) }
                ?: tryConversion { JsonPrimitive(yamlNode.toInt()) }
                ?: tryConversion { JsonPrimitive(yamlNode.toLong()) }
                ?: tryConversion { JsonPrimitive(yamlNode.toFloat()) }
                ?: tryConversion { JsonPrimitive(yamlNode.toDouble()) }
                ?: JsonPrimitive(yamlNode.content)
        }
        is YamlTaggedNode -> yamlNodeToJsonElement(yamlNode.innerNode)
    }
}

Steps to reproduce

use a document like

some-string: '1'

parse it into a YamlNode and convert it to a JsonElement of the same type.

Expected behaviour

Some why to extract type information from the original yaml document.

Actual behaviour

All type information is lost.

Version information

0.46.0

Any other information

No response

pschichtel avatar Jul 06 '22 10:07 pschichtel

kaml is not intended to be a general-purpose YAML parsing library: its job is to provide support for YAML in kotlinx.serialization. If you need detailed information about the structure or format of the source YAML document, I'd encourage you to use a YAML parsing library like SnakeYAML directly.

charleskorn avatar Jul 06 '22 10:07 charleskorn

ah ok, so the argument is that since kotlinx.serialization is focused on object mapping, type info would be taken from the object being deserialized. But I wonder if this would make it hard to write e.g. an Any deserializer. JsonPrimitive is essentially similar to YamlScalar, but they have a isString flag that can be supplied to indicate that it really was a string (and a field to check it). all other types have (mostly) disjoint representations.

pschichtel avatar Jul 06 '22 10:07 pschichtel

straight from JsonPrimitive's code:

public sealed class JsonPrimitive : JsonElement() {

    /**
     * Indicates whether the primitive was explicitly constructed from [String] and
     * whether it should be serialized as one. E.g. `JsonPrimitive("42")` is represented
     * by a string, while `JsonPrimitive(42)` is not.
     * These primitives will be serialized as `42` and `"42"` respectively.
     */
    public abstract val isString: Boolean

pschichtel avatar Jul 06 '22 10:07 pschichtel

If you'd like to add a PR for isString to YamlScalar, I'd be open to that.

However, I suspect you'll run into other similar issues for other scenarios, so something like SnakeYAML will likely be better suited to your use case.

charleskorn avatar Jul 06 '22 11:07 charleskorn

Yeah I definitely agree that directly using SnakeYAML is probably the better approach for my use-case, but the improvement might still be nice.

pschichtel avatar Jul 06 '22 12:07 pschichtel

This issue has been automatically marked as stale because it has not had any activity in the last 60 days. It will automatically be closed if no further activity occurs in the next seven days to enable maintainers to focus on the most important issues. If this issue is still affecting you, please comment below within the next seven days. Thank you for your contributions.

stale[bot] avatar Sep 04 '22 14:09 stale[bot]

This issue has been automatically closed because it has not had any recent activity.

stale[bot] avatar Sep 11 '22 17:09 stale[bot]