jsoniter-scala
Consider separation of decoding and encoding traits
Sometimes only serialization or parsing is required.
As a workaround, use the ??? operator for the non-required encoding or decoding methods in custom codecs.
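A minimal sketch of that workaround (the Event type, its fields, and the object name are made up for illustration):
import com.github.plokhotnyuk.jsoniter_scala.core._

object EncodeOnlyExample {
  case class Event(id: Long, name: String)

  // Encode-only codec: decoding is deliberately left unimplemented with ???
  implicit val eventEncoder: JsonValueCodec[Event] = new JsonValueCodec[Event] {
    def encodeValue(x: Event, out: JsonWriter): Unit = {
      out.writeObjectStart()
      out.writeNonEscapedAsciiKey("id")
      out.writeVal(x.id)
      out.writeNonEscapedAsciiKey("name")
      out.writeVal(x.name)
      out.writeObjectEnd()
    }

    def decodeValue(in: JsonReader, default: Event): Event = ??? // never called for write-only usage

    def nullValue: Event = null
  }

  def main(args: Array[String]): Unit =
    println(new String(writeToArray(Event(1L, "created")), "UTF-8"))
}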
@plokhotnyuk
I'm running into this issue too. But it's bigger than just using ??? when implementing codecs. E.g. I have certain fields I don't want to emit into the JSON, so the codec configs probably need to be separate too.
Do you want to parse them, but their values should not be serialized back to JSON?
Sometimes that's the case, yes. Other times the opposite.
Why not just use different case classes or codecs for that?
The main idea behind generating a paired encoder and decoder is that with them you can make sure that sequential serialization and parsing do not lose any information.
If your case does not share this goal, then creating a custom codec or an alternative macro that can generate such cases could be an option...
Also, I'm thinking about how to make the make macro open for extensions...
Why not just use different case classes or codecs for that?
Sure, more code can solve almost anything. But it's boilerplate copying of near-identical structures, and really bothersome. Right now I have around 50 sealed case classes that I would have to duplicate, for just one response.
The main idea behind generating a paired encoder and decoder is that with them you can make sure that sequential serialization and parsing do not lose any information.
Agreed, this is essential for two-way codecs.
Right now, I need to output an ADT's content as just a struct, e.g.
{
"foo": 5,
"bar": "BAR"
}
However, that doesn't seem possible, because the codec is two-way and thus must include some type information. I'm left with no choice other than to reflectively convert to a Map[String, Any] and manually supply the needed codecs.
@nilskp please describe your initial problem: show an ADT that you have and provide an example of the JSON representation that needs to be achieved for it.
E.g.
sealed abstract class Foo
case class Bar(a: Int, b: Float) extends Foo
case class Baz(x: String) extends Foo
So, given a Seq[Foo], I'd like to end up with this (one-way) grouped representation:
{
"bars": [
{"a": 5, "b": 54.23},
{"a": 98, "b": 7.3},
],
"bazs": [
{"x":" Hello"},
{"x": "World"}
]
}
If you need the most efficient implementation, then generate codecs for the leaf case classes and manually create a custom codec for the sequence, like this:
import com.github.plokhotnyuk.jsoniter_scala.core._
import com.github.plokhotnyuk.jsoniter_scala.macros._

import scala.reflect.ClassTag

sealed abstract class Foo
case class Bar(a: Int, b: Float) extends Foo
case class Baz(x: String) extends Foo

// Encode-only base class: decoding is intentionally left unimplemented.
abstract class JsonValueEncoder[A] extends JsonValueCodec[A] {
  def nullValue: A = null.asInstanceOf[A]
  def decodeValue(in: JsonReader, default: A): A = ???
}

object Examples {
  def main(args: Array[String]): Unit = {
    implicit val barCodec: JsonValueCodec[Bar] = JsonCodecMaker.make(CodecMakerConfig())
    implicit val bazCodec: JsonValueCodec[Baz] = JsonCodecMaker.make(CodecMakerConfig())
    implicit val codec: JsonValueCodec[Seq[Foo]] = new JsonValueEncoder[Seq[Foo]] {
      def encodeValue(xs: Seq[Foo], out: JsonWriter): Unit = {
        out.writeObjectStart()
        encodeSeqField[Bar]("bars", xs, out)
        encodeSeqField[Baz]("bazs", xs, out)
        out.writeObjectEnd()
      }
    }

    // The ClassTag makes the `case a: A` pattern a real runtime check; without it the
    // abstract type pattern would be erased and every element would match.
    def encodeSeqField[A](name: String, as: Seq[Any], out: JsonWriter)(implicit codecA: JsonValueCodec[A], ct: ClassTag[A]): Unit = {
      out.writeNonEscapedAsciiKey(name)
      out.writeArrayStart()
      as.foreach {
        case a: A =>
          out.writeComma()
          codecA.encodeValue(a, out)
        case _ =>
      }
      out.writeArrayEnd()
    }

    println(new String(writeToArray(Seq[Foo](Bar(5, 54.23f), Baz("Hello"), Bar(98, 7.3f), Baz("World")),
      WriterConfig(indentionStep = 2)), "UTF-8"))
  }
}
And the output:
{
"bars": [
{
"a": 5,
"b": 54.23
},
{
"a": 98,
"b": 7.3
}
],
"bazs": [
{
"x": "Hello"
},
{
"x": "World"
}
]
}
Right, I already have a workaround. But should all that code be necessary? I have a sealed trait with over 30 subclasses that I haven't yet gotten to. It should be a simple configuration for one-way use cases, i.e. separation of encoding and decoding.
@nilskp I have generalized it, the last version above should be +2 lines per leaf class now.
Won't this lead to "[,"?
out.writeArrayStart()
xs.foreach {
case bar: Bar =>
out.writeComma()
barCodec.encodeValue(bar, out)
case _ =>
}
Shouldn't this be possible with a config option, where the decodeValue method simply ends up as ??? (Nothing)?
I.e. CodecMakerConfig(discriminatorType: DiscriminatorType), replacing discriminatorFieldName?
sealed abstract class DiscriminatorType
case class Field(name: String = "type") extends DiscriminatorType
case object Key extends DiscriminatorType
case object Bare extends DiscriminatorType
@nilskp
- writeComma() is a smart routine which works together with writeArrayStart(), so it will not emit an unwanted "," when used properly.
- To close this issue you can create and use the following abstract class (I have updated the example snippet above to show its usage):
abstract class JsonValueEncoder[A] extends JsonValueCodec[A] {
def nullValue: A = null.asInstanceOf[A]
def decodeValue(in: JsonReader, default: A): A = ???
}
- Your feedback is valuable and I'm happy to help you, but please create a separate issue for code generation of your case with sum types that need to be grouped by class.
I know I'm pretty late to the party. I have the same issue with split encoders/decoders. It would be nice if JsonValueCodec[A] would extend a JsonValueEncoder[A] and a JsonValueDecoder[A], because I'd like to document in code the fact that only a decoder or only an encoder exists for a certain type. Could that be a solution?
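A rough sketch of what that proposed split might look like (these trait definitions are guesses at the idea, not the library's actual API; the JsonValueEncoder name here is the proposed trait, not the workaround class above):
import com.github.plokhotnyuk.jsoniter_scala.core.{JsonReader, JsonWriter}

// Hypothetical split: encoding and decoding as separate single-direction traits,
// with the existing two-way codec simply extending both.
trait JsonValueEncoder[A] {
  def encodeValue(x: A, out: JsonWriter): Unit
}

trait JsonValueDecoder[A] {
  def decodeValue(in: JsonReader, default: A): A
  def nullValue: A
}

trait JsonValueCodec[A] extends JsonValueEncoder[A] with JsonValueDecoder[A]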
@martinpallmann Hi, Martin! Unfortunately, there is no solution yet... Initially it looks like a pain, but it pays off greatly if you provide both encoding and decoding implementations for all custom codecs and test them with round-trip conversions.
@plokhotnyuk yes. And I'm not challenging that. I would just like to have for some cases the possibility to have only one of them where I may have already a different test strategy. The reason is this PR for caliban: https://github.com/ghostdogpr/caliban/pull/471. There are no decoders needed for some of the values and they are also very difficult if not impossible to implement.
@plokhotnyuk Wow!
Will just having abstract classes or traits with ??? as the implementation of the redundant methods be acceptable for your case? It is just an internal API, isn't it?
I mean it's probably acceptable but the skin on my back crawls up when I do that. But if you're not willing to change the API of jsoniter then I have to do it. I'm not happy about it but life will go on. :)
This is an old issue, but as far as I can tell, the status quo is still the same?
For me the main problem is that the generated codecs are non-trivial in terms of generated code size. This matters a lot in Scala.js, because all that code needs to be downloaded and parsed even if it's never executed. That takes time, especially on mobile devices, as our JS code size goes further and further into the megabytes.
If it's not possible to derive just a Decoder, it means that for every JSON API response type that we decode on the frontend, we also have to include an Encoder in the generated JS, which will never be used, because we only ever need to encode this JSON on the backend, not on the frontend. And similarly for every JSON API request type, we will need to include a decoder in the JS bundle, even though that decoder will only be used on the backend.
In other JSON libraries like upickle, I can derive encoders and decoders separately, and put them in two separate lazy vals, and even though the code for both is generated, their definitions are entirely separate, and so encoders and decoders that are never used / referenced in frontend code will be dead-code-eliminated by Scala.js. However, if I understand correctly, it is not possible to do this with jsoniter-scala?
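For comparison, the upickle pattern being described is roughly this (a sketch; the ApiResponse type is illustrative, and it assumes upickle's macroR / macroW derivation):
import upickle.default._

case class ApiResponse(items: Seq[String], total: Int)

object ApiResponse {
  // Separate lazy definitions: frontend code that only references `reader`
  // lets the Scala.js optimizer dead-code-eliminate the unused `writer`.
  implicit lazy val reader: Reader[ApiResponse] = macroR
  implicit lazy val writer: Writer[ApiResponse] = macroW
}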
This is a bummer, because for cases when both encoding and decoding are needed, jsoniter generally generates more compact codecs than circe or upickle, although my testing of that has been very limited so far. I'm actually not sure how refactoring jsoniter to separate the encoding and decoding functionality would affect the code size of the derived codecs in cases where you need both an encoder and a decoder.
Earlier in this thread you asked if using ??? as a no-op would suffice. I'm not sure what you meant, because I don't know how to derive such a codec for e.g. a case class or a GADT without having to write it entirely by hand. But if jsoniter had makeEncoder / makeDecoder functions that generated an encode-only or decode-only codec with ??? as the no-op implementation, that would be fine by me. I don't know anything about how macros work, so I'm not sure how hard it is to avoid generating some of the code like this.
@raquo Thanks for your feedback!
I can add a configuration option for make to not generate an implementation for the encoder or the decoder.
Also, I'm going to update the documentation to point out that for cases like yours users can save on generated code by using codec generation for intermediate data structures (like a codec per class, defined in the companion object).
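A sketch of that suggestion, assuming the make macro reuses implicit codecs already in scope for nested types (the Address and User classes are illustrative):
import com.github.plokhotnyuk.jsoniter_scala.core._
import com.github.plokhotnyuk.jsoniter_scala.macros._

case class Address(street: String, city: String)

object Address {
  // Derived once here and found via the implicit scope by any enclosing codec,
  // instead of being re-generated inline for every type that contains an Address.
  implicit val codec: JsonValueCodec[Address] = JsonCodecMaker.make
}

case class User(name: String, address: Address)

object User {
  implicit val codec: JsonValueCodec[User] = JsonCodecMaker.make
}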
That would be great, I think, thanks! I just want to underscore that I don't know what the technical solution should be, because I don't understand macros at all, I just wanted to convey the need. As far as I can tell, what you said sounds like it should help.
I do know how to check the bundle size difference using the source-map-explorer npm package, though it's a manual process; I've never tried automating it.
And yes, such a doc as you mentioned would be great 👍
@raquo I've tried the commit above with the jsoniter-scala-benchmarkJS module, and using the withEncodingOnly(true) option saves ~100K in the resulting fully optimized *.js file for more than 100 codecs that are mostly small, though some of them, like those for OpenRTBBenchmark and TwitterAPIBenchmark, are quite big.
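Usage would presumably look like this (a sketch based only on the option name mentioned in this comment, with a hypothetical SearchRequest type; not verified against a specific release):
import com.github.plokhotnyuk.jsoniter_scala.core._
import com.github.plokhotnyuk.jsoniter_scala.macros._

case class SearchRequest(query: String, limit: Int)

object SearchRequest {
  // Built and serialized only on the backend, so skip generating the decoder body;
  // the mirror option (withDecodingOnly) would fit response types that are only parsed.
  implicit val codec: JsonValueCodec[SearchRequest] =
    JsonCodecMaker.make(CodecMakerConfig.withEncodingOnly(true))
}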
Nice, thank you! That's a good shave.
May I suggest, instead of a literal ???, throwing an exception with a message like: "jsoniter-scala: codec encoder is missing because decodingOnly = true". I'm assuming that no info about the codec (name / type / etc.) is available at this point in the code, but if I'm wrong, including that would help debugging too.
The problem with a literal ??? is that people don't expect it from libraries, so they will be confused when they go looking for it in their own code.
@raquo Thanks for the review!
The goal of these options is to minimize the code of generated codecs, and I hope they will be used wisely, only in rare use cases like yours.
Please pick up the latest release; it is already available on Maven Central: https://github.com/plokhotnyuk/jsoniter-scala/releases/tag/v2.24.0
It would be nice if the encoder and decoder were separated at the type level. The workaround for custom codecs is unsafe in that the compiler cannot detect that a decode-only codec is being used in an encoding scenario.
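A small illustration of that unsafety (a sketch reusing the Bar shape from the earlier example; for brevity the decoder just delegates to a derived codec, which is only to keep the snippet short):
import com.github.plokhotnyuk.jsoniter_scala.core._
import com.github.plokhotnyuk.jsoniter_scala.macros._

object DecodeOnlyUnsafety {
  case class Bar(a: Int, b: Float)

  // The encoder is a stub, yet the value still type-checks as a full JsonValueCodec,
  // so the compiler cannot flag it when it is used in an encoding position.
  private val derived: JsonValueCodec[Bar] = JsonCodecMaker.make
  implicit val barDecodeOnly: JsonValueCodec[Bar] = new JsonValueCodec[Bar] {
    def decodeValue(in: JsonReader, default: Bar): Bar = derived.decodeValue(in, default)
    def encodeValue(x: Bar, out: JsonWriter): Unit = ??? // decode-only
    def nullValue: Bar = derived.nullValue
  }

  def main(args: Array[String]): Unit = {
    readFromArray[Bar]("""{"a":1,"b":2.0}""".getBytes("UTF-8")) // works as intended
    writeToArray(Bar(1, 2.0f)) // also compiles, but throws scala.NotImplementedError at runtime
  }
}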