kotlinx.serialization icon indicating copy to clipboard operation
kotlinx.serialization copied to clipboard

Strings are converted to Int in non-lenient mode

Open spand opened this issue 1 year ago • 4 comments

Describe the bug It seems that strings will be automatically converted to Int by default in Json. That is highly surprising and especially so when lenient is set to false.

To Reproduce

    @Serializable
    data class Asd(
        val itsAnInt: Int,
    )

    @Test
    fun asd() {
        Json.decodeFromString<Asd>("""{"itsAnInt":"1"}""")
    }

Expected behavior I expected it to throw an exception

Environment

  • Kotlin version: 1.9.10
  • Library version: 1.6.0
  • Kotlin platforms: JVM
  • Gradle version: 7.5.1

spand avatar May 31 '24 11:05 spand

Can you explain why it is problematic for strings to be parsed as integers? From my experience, people find it more convenient when they don't have to think whether a particular value was quoted, since the serialization side is not always in their control. kotlinx.serialization itself always produces unquoted integers on encodeToString though. Also it simplifies working with maps (see #2438)

sandwwraith avatar May 31 '24 12:05 sandwwraith

While this is for 32-bit integers, I've also seen integers larger than 53 bits quoted so as to not lose precision with JS clients/servers/proxies. Having them transparently unwrapped when a proper 64-bit integer type is available is convenient.

JakeWharton avatar May 31 '24 13:05 JakeWharton

Concretely we had a bug where a developer had used Int in the type definition where it should have been a String since the possible values were more than just quoted numbers. You could put that on the developer but at least if this feature had not existed the developer would have been forced to use String and handle the "not a number" case explicitly.

In the abstract it is quite odd that such behavior is standard because

  • It forces users to support behavior they might not have expected or wouldnt want to (Hyrum's Law). Concretely if we would need another tool to support the same json then we probably couldnt or it would be annoying like it would have been in kotlinx.serialization without this feature. All because we are forced to support this feature for every number.
  • Principle of Least Surprise. No one in our team expected such a feature to exist. I really find it odd to have such a feature enabled by default. Especially in contrast to ignoreUnknownKeys that we enable so many places to be forward compatible (which still has the correct default imo).
  • There is something off about not being able generate the same json as whats encoded. I assume people quote numbers because they cannot fit into a JS number despite json having no such restriction. So why cannot we generate the same ? A @JsonQuoted opt-in seemed more appropriate ?

I believe I have made my point here so I will stop unless you want more.

Obviously I understand the convenience for such features given the amount of crazy json out there but to have it anything other than opt-in I dont really get.

spand avatar Jun 03 '24 13:06 spand

@JakeWharton I can see the point of quoting for long, but I also agree with @spand. I've seen too much going wrong with "almostJson", "almostXML" etc. formats. It should not be too challenging to have a configuration flag for this behaviour (say strictInts).

@sandwwraith The "problematic" aspect is that it creates a dialect (non-serialization) producers of messages get away with an incorrect message without this being signalled

pdvrieze avatar Jun 03 '24 14:06 pdvrieze

Duplicates #1042

sandwwraith avatar Jan 06 '25 15:01 sandwwraith