Bidirectional marshalling between Haskell and Dhall
In https://github.com/dhall-lang/dhall-haskell/pull/1485#discussion_r339856234 @jiegillet brought up the idea of representing decoders and encoders in a single type inspired by tomland's Codec:
About module for encoding and decoding, it's an out-there suggestion but, the way tomland encodes and decodes toml is with a bi-directional codec, something like
data Codec a = Codec
  { fromDhall :: Expr Src Void -> Extractor Src Void a
  , toDhall :: a -> Expr Src Void
  , dhallType :: Expr Src Void
  }
It's pretty nice to keep the symmetry, and this way you always have to implement both encoders and decoders at the same time.
What I'm wondering right now is what we would do about types where there is no bijection between Dhall and Haskell, like Word.
I also wonder how the types would work out for the combined record and union de-and-encoders and their helpers like constructor and encodeConstructor.
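To make the proposed shape concrete, here is a minimal, self-contained sketch of such a codec record. It does not use the dhall library: the `Expr` type below is a hypothetical toy stand-in for `Expr Src Void`, `Maybe` stands in for `Extractor Src Void`, and `boolCodec`, `naturalCodec` and `roundTrips` are illustrative names, not existing API.

```haskell
import Numeric.Natural (Natural)

-- Hypothetical toy stand-in for dhall's `Expr Src Void` (assumption).
data Expr
  = BoolLit Bool
  | NaturalLit Natural
  | BoolType
  | NaturalType
  deriving (Eq, Show)

-- Same three fields as the suggested Codec, with Maybe replacing Extractor.
data Codec a = Codec
  { fromDhall :: Expr -> Maybe a
  , toDhall   :: a -> Expr
  , dhallType :: Expr
  }

boolCodec :: Codec Bool
boolCodec = Codec
  { fromDhall = \e -> case e of
      BoolLit b -> Just b
      _         -> Nothing
  , toDhall   = BoolLit
  , dhallType = BoolType
  }

naturalCodec :: Codec Natural
naturalCodec = Codec
  { fromDhall = \e -> case e of
      NaturalLit n -> Just n
      _            -> Nothing
  , toDhall   = NaturalLit
  , dhallType = NaturalType
  }

-- Encoding and then decoding should give back the original value.
roundTrips :: Eq a => Codec a -> a -> Bool
roundTrips codec x = fromDhall codec (toDhall codec x) == Just x
```

Because both directions live in one value, a property like `roundTrips` can be stated once for any codec.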
What is the motivation for having a bi-directional Codec type?
@jiegillet has more experience with this, but from my perspective there would be the following advantages:
- If you need both directions, you don't have to define both the Encoder and the Decoder. A single Codec is enough.
- The dhall library wouldn't have to export as many separate utilities for encoding and decoding, making it easier to navigate and to use.
- The "asymmetrical" naming of field and encodeField, constructor and encodeConstructor would no longer keep me awake at night. ;) (There are other potential solutions for this particular problem though.)
What I'm wondering right now is what we would do about types where there is no bijection between Dhall and Haskell, like Word.
My impression is that when using tomland you tend to write these codecs by themselves, not as part of a typeclass. This in turn means you don't actually need a proper bijection. You could have both Haskell's Int and Word map to Dhall's Integer for example by simply having two different codecs. I quite like this solution but I don't know how suitable that is for Dhall's users. Maybe it could be a second separate API on top of the current one?
@basile-henry I wonder whether we're talking about the same problem with bijections:
In the case of e.g. Word, we currently offer only a ToDhall instance because given a Dhall Natural, you don't know whether it's < 2^64. So it seems that we couldn't offer a "total" Codec Word.
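The asymmetry can be seen with plain base types. In this small sketch, `wordToNatural` and `naturalToWord` are hypothetical helper names, and the bound is taken from `maxBound :: Word` (2^64 - 1 on a 64-bit platform) rather than hard-coded:

```haskell
import Numeric.Natural (Natural)

-- Encoding direction: every Word fits into a Natural, so this is total.
wordToNatural :: Word -> Natural
wordToNatural = fromIntegral

-- Decoding direction: a Natural may exceed the range of Word,
-- so the conversion has to be able to fail.
naturalToWord :: Natural -> Maybe Word
naturalToWord n
  | n <= fromIntegral (maxBound :: Word) = Just (fromIntegral n)
  | otherwise                            = Nothing
```

A codec for Word would therefore need a decoder that can report an out-of-range value instead of silently wrapping around.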
Right, in tomland the conversion can fail in both directions:
data BiMap e a b = BiMap
{ forward :: a -> Either e b
, backward :: b -> Either e a
}
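As a rough illustration of how this type addresses the Int/Word question raised earlier, the sketch below defines two separate values that both target Integer. The `BiMap` declaration is repeated so the example is self-contained; `intInteger` and `wordInteger` are hypothetical names, not taken from tomland, and the error type is simply `String`.

```haskell
data BiMap e a b = BiMap
  { forward  :: a -> Either e b
  , backward :: b -> Either e a
  }

-- Int <-> Integer: widening always succeeds, narrowing is range-checked.
intInteger :: BiMap String Int Integer
intInteger = BiMap
  { forward  = Right . toInteger
  , backward = \i ->
      if i >= toInteger (minBound :: Int) && i <= toInteger (maxBound :: Int)
        then Right (fromInteger i)
        else Left ("out of range for Int: " ++ show i)
  }

-- Word <-> Integer: same target type, different range check.
wordInteger :: BiMap String Word Integer
wordInteger = BiMap
  { forward  = Right . toInteger
  , backward = \i ->
      if i >= 0 && i <= toInteger (maxBound :: Word)
        then Right (fromInteger i)
        else Left ("out of range for Word: " ++ show i)
  }
```

Since these are ordinary values rather than typeclass instances, both can coexist even though they share the Integer side. In this sketch only `backward` can actually fail; the type leaves room for failure in either direction.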
I'm still concerned that bundling the two directions will introduce more complexity to the API. We'd need to introduce a large new family of "bi-" combinators and types for working with codecs like tomland does. I'm more comfortable following the pattern from the aeson package (e.g. {From,To}Dhall) since that's a well-trodden path.
Coincidentally I just came across a thread about bidirectional JSON serialization. Seems like the topic is en vogue! ;)
But I don't want to push the issue – I just wanted to give the idea some space for discussion. Shall we close the issue?
@sjakobi: I don't think we need to close this yet. I'm still not convinced that I'm right and maybe some other people might want to chime in
I stumbled on this issue and decided to provide some input as the primary author of the bidirectional serialisation in tomland. I've been using this approach for more than a year, and I'm quite happy with it. In a few projects I'm working on I need to not only parse config from a file but also create textual configs from Haskell values. So the approach with bidirectional codecs is convenient here.
In addition to the fact that it solves a real use case, I can name several more benefits of this solution:
- Encoder and decoder at the same time out-of-the-box in a single place. They are correct by construction, so usually, you don't need to have roundtrip property-based tests to make sure your decoders and encoders are consistent with each other.
- Naming problem is resolved. Instead of encodeInt/decodeInt or toInt/fromInt there's just the int codec.
- In tomland we're using the value-based solution for codecs instead of typeclass-based, so it's effortless to have multiple codecs for a data type like ByteString.
- You have a clean separation between parsing/pretty-printing and encoding/decoding. These stages can be modified independently.
- Bidirectional approach is based on Monads and Profunctors, which allows reusing the rest of the ecosystem. This is just a nice feature since you can build abstract codecs and use them in multiple places.
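The points above can be sketched with a deliberately simplified codec. This is not tomland's actual definition: it is an Applicative-only variant over a toy key/value `Config`, and the names `Config`, `string`, `int`, `Server` and `serverCodec` are assumptions made for illustration. The `.=` operator is modelled loosely on the idea of mapping the "write" side through a field getter, which is the profunctor half of the design.

```haskell
import Text.Read (readMaybe)

-- Toy configuration format: a list of key/value pairs (assumption).
type Config = [(String, String)]

-- One value bundles a reader for `a` and a writer for `object`.
data Codec object a = Codec
  { codecRead  :: Config -> Maybe a
  , codecWrite :: object -> Config
  }

instance Functor (Codec object) where
  fmap f (Codec r w) = Codec (fmap f . r) w

instance Applicative (Codec object) where
  pure x = Codec (const (Just x)) (const [])
  Codec rf wf <*> Codec rx wx =
    Codec (\cfg -> rf cfg <*> rx cfg) (\obj -> wf obj ++ wx obj)

-- Contravariant mapping of the write side: focus a field codec on a record.
infixl 5 .=
(.=) :: Codec field a -> (object -> field) -> Codec object a
codec .= getter = Codec (codecRead codec) (codecWrite codec . getter)

-- Primitive codecs are plain values, so there is a single name per type.
string :: String -> Codec String String
string key = Codec (lookup key) (\v -> [(key, v)])

int :: String -> Codec Int Int
int key = Codec (\cfg -> lookup key cfg >>= readMaybe) (\n -> [(key, show n)])

data Server = Server { host :: String, port :: Int }
  deriving (Eq, Show)

-- Decoder and encoder for Server are defined together, field by field.
serverCodec :: Codec Server Server
serverCodec = Server
  <$> string "host" .= host
  <*> int "port" .= port
```

Here `codecWrite serverCodec` produces the key/value pairs and `codecRead serverCodec` reads them back, so both directions come from the single `serverCodec` definition.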
There were some challenges down the road (like Generics, decoding of sum types, decoding of Map-like data structures, TOML-specific difficulties), since not a lot of people have been using this approach, and there were no ready solutions. But in @kowainik we solved all these problems, and now bidirectional codecs are as powerful and convenient as classic ToJSON/FromJSON typeclasses.
Would love to see a blogpost on this!
It's here: https://kowainik.github.io/posts/2019-01-14-tomland
FYI, in @kowainik we have plans to play with bidirectional Dhall serialization in a separate package, like dhall-codec. This might be a good opportunity to test how viable the approach is and experiment in a separate independent package before changing the internals of the main library dramatically.