kryo-macros
kryo-macros copied to clipboard
OOM on some BigDecimal inputs during serialization
For a serializer generated for a case class which contains a BigDecimal field, if a number with sufficiently big scale supplied (like BigDecimal("3E1000000000")
), serialization logic blows in memory causing OOM.
This could be a possible source for DoS attacks.
It seems that using of toPlainString
wasn't the best idea: https://github.com/evolution-gaming/kryo-macros/blob/efe51b446763d5ac702c54c68248b3ee119b6958/macros/src/main/scala/com/evolutiongaming/kryo/Serializer.scala#L123
But even in case of using of toString
users can be affected after parsing of such values. Just try to pass the parsed value of that big decimal number to the following function and see what will happen:
scala> val f = (x: BigDecimal) => x + 1
f: BigDecimal => scala.math.BigDecimal = $$Lambda$1210/93981118@790ac3e0
So, we should avoid returning of parsed BigDecimal with too big exponent or with MathContext.UNLIMITED.
As example to mitigate possible DoS attacks jsoniter-scala uses safe defaults that limit a scale value and number of significant digits. Also, it sets MathContext.DECIMAL128
by default:
https://github.com/plokhotnyuk/jsoniter-scala/blob/41573be3df9df4a6f9d29c8b5931cec0c951c7ce/jsoniter-scala-macros/src/main/scala/com/github/plokhotnyuk/jsoniter_scala/macros/JsonCodecMaker.scala#L65-L67
@plokhotnyuk I think safe or unsafe to use is out of scope for a serializer. I believe any valid value of any supported type should be reversely serializable without loosing data, that is:
deserialize(serialize a) shouldEqual a
Using toString
solves this issue, also the change is perfectly format-backward-compatible.
@migesok so will there be a pull request ? :)
@t3hnar what about changing the representation from textual to binary?
As example, will it be acceptable if BigDecimal
values be serialized as byte arrays of mantissas followed by int of scale?
In any case, the value of MathContext
and a limit for scale can be configured by some system property or by introducing some configuration parameters for the macros, isn't it?
@plokhotnyuk why not? it is a binary format at the end.