dhall-haskell

Bidirectional marshalling between Haskell and Dhall

Open sjakobi opened this issue 6 years ago • 11 comments

In https://github.com/dhall-lang/dhall-haskell/pull/1485#discussion_r339856234 @jiegillet brought up the idea of representing decoders and encoders in a single type inspired by tomland's Codec:

About the module for encoding and decoding: it's an out-there suggestion, but the way tomland encodes and decodes TOML is with a bidirectional codec, something like

data Codec a = Codec
    { fromDhall :: Expr Src Void -> Extractor Src Void a
    , toDhall   :: a -> Expr Src Void
    , dhallType :: Expr Src Void
    }

It's pretty nice to keep the symmetry, and this way you always have to implement both encoders and decoders at the same time.

What I'm wondering right now is what we would do about types where there is no bijection between Dhall and Haskell, like Word.

I also wonder how the types would work out for the combined record and union de-and-encoders and their helpers like constructor and encodeConstructor.
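To make the proposed shape concrete, here is a minimal sketch of such a Codec. It is self-contained and does not use the real dhall library: the `Expr` type below is a hypothetical stand-in for `Expr Src Void`, and `Either String` stands in for `Extractor Src Void`. The names `natural`, `bool`, and `roundTrip` are illustrative assumptions, not existing exports.

```haskell
import Numeric.Natural (Natural)

-- Hypothetical stand-in for Dhall's `Expr Src Void`; NOT the real AST.
data Expr
  = NaturalT
  | BoolT
  | NaturalLit Natural
  | BoolLit Bool
  deriving (Eq, Show)

-- Simplified codec: decoding may fail, encoding is total.
-- `Either String` replaces the real `Extractor Src Void` (assumption).
data Codec a = Codec
  { fromDhall :: Expr -> Either String a
  , toDhall   :: a -> Expr
  , dhallType :: Expr
  }

natural :: Codec Natural
natural = Codec
  { fromDhall = decodeNatural
  , toDhall   = NaturalLit
  , dhallType = NaturalT
  }
  where
    decodeNatural (NaturalLit n) = Right n
    decodeNatural other = Left ("expected a Natural literal, got " ++ show other)

bool :: Codec Bool
bool = Codec
  { fromDhall = decodeBool
  , toDhall   = BoolLit
  , dhallType = BoolT
  }
  where
    decodeBool (BoolLit b) = Right b
    decodeBool other = Left ("expected a Bool literal, got " ++ show other)

-- Encode a value and decode it again with the same codec.
roundTrip :: Codec a -> a -> Either String a
roundTrip codec = fromDhall codec . toDhall codec
```

Because both directions live in one value, a round-trip check such as `roundTrip natural 42` needs only the codec itself.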

sjakobi avatar Oct 30 '19 14:10 sjakobi

What is the motivation for having a bi-directional Codec type?

Gabriella439 avatar Oct 30 '19 14:10 Gabriella439

What is the motivation for having a bi-directional Codec type?

@jiegillet has more experience with this, but from my perspective there would be the following advantages:

  • If you need both directions, you don't have to define both the Encoder and the Decoder. A single Codec is enough.
  • The dhall library wouldn't have to export as many separate utilities for encoding and decoding, making it easier to navigate and to use.
  • The "asymmetrical" naming of field and encodeField, constructor and encodeConstructor would no longer keep me awake at night. ;) (There are other potential solutions for this particular problem though.)

sjakobi avatar Oct 30 '19 15:10 sjakobi

What I'm wondering right now is what we would do about types where there is no bijection between Dhall and Haskell, like Word.

My impression is that when using tomland you tend to write these codecs by themselves, not as part of a typeclass. This in turn means you don't actually need a proper bijection. You could have both Haskell's Int and Word map to Dhall's Integer, for example, by simply having two different codecs. I quite like this solution, but I don't know how suitable it is for Dhall's users. Maybe it could be a second, separate API on top of the current one?
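A minimal sketch of that idea, assuming value-level codecs rather than typeclass instances: two independent codecs, `int` and `word`, both target the same Integer representation. `DhallInteger`, `Codec`, and `checked` are hypothetical names for illustration and are not part of dhall or tomland.

```haskell
-- Hypothetical stand-in for a Dhall Integer literal.
newtype DhallInteger = DhallInteger Integer
  deriving (Eq, Show)

-- Value-level codec: decoding may fail, encoding is total.
data Codec a = Codec
  { decode :: DhallInteger -> Either String a
  , encode :: a -> DhallInteger
  }

-- Build a codec for an integral type from its name and inclusive bounds.
checked :: Integral a => String -> a -> a -> Codec a
checked name lo hi = Codec dec (DhallInteger . toInteger)
  where
    dec (DhallInteger i)
      | toInteger lo <= i && i <= toInteger hi = Right (fromInteger i)
      | otherwise = Left (show i ++ " does not fit into " ++ name)

-- Two different codecs that share one Dhall-side type.
int :: Codec Int
int = checked "Int" minBound maxBound

word :: Codec Word
word = checked "Word" minBound maxBound
```

With this shape neither codec has to be a bijection: `decode word (encode int (-1))` fails with an error message, while `decode int (encode word 7)` succeeds.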

basile-henry avatar Oct 30 '19 16:10 basile-henry

@basile-henry I wonder whether we're talking about the same problem with bijections:

In the case of e.g. Word, we currently offer only a ToDhall instance because given a Dhall Natural, you don't know whether it's < 2^64. So it seems that we couldn't offer a "total" Codec Word.
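The asymmetry can be seen in plain Haskell, independent of dhall. In this small sketch the function names are illustrative assumptions; `toIntegralSized` is the checked conversion from `Data.Bits` in base.

```haskell
import Data.Bits (toIntegralSized)
import Numeric.Natural (Natural)

-- Word -> Natural never fails: every Word is a valid Natural.
wordToNatural :: Word -> Natural
wordToNatural = fromIntegral

-- Natural -> Word is partial: values above `maxBound :: Word` have no image.
naturalToWord :: Natural -> Maybe Word
naturalToWord = toIntegralSized
```

Encoding corresponds to the total direction and decoding to the partial one, so a codec for Word needs a decoder that is allowed to fail.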

sjakobi avatar Oct 30 '19 18:10 sjakobi

Right, in tomland the conversion can fail in both directions:

data BiMap e a b = BiMap
    { forward  :: a -> Either e b
    , backward :: b -> Either e a
    }
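As a sketch of how this shape could handle the Word case, the block below repeats the BiMap definition and adds a composition helper. `andThen`, `wordNatural`, `naturalInteger`, and `wordInteger` are hypothetical names chosen for illustration, and the error type is fixed to String for simplicity.

```haskell
import Numeric.Natural (Natural)

data BiMap e a b = BiMap
  { forward  :: a -> Either e b
  , backward :: b -> Either e a
  }

-- Chain two BiMaps; a failure in either step aborts the conversion.
andThen :: BiMap e a b -> BiMap e b c -> BiMap e a c
andThen f g = BiMap
  { forward  = \a -> forward f a >>= forward g
  , backward = \c -> backward g c >>= backward f
  }

-- Word to Natural always succeeds; the reverse is range-checked.
wordNatural :: BiMap String Word Natural
wordNatural = BiMap (Right . fromIntegral) toWord
  where
    toWord n
      | n <= fromIntegral (maxBound :: Word) = Right (fromIntegral n)
      | otherwise = Left (show n ++ " does not fit into Word")

-- Natural to Integer always succeeds; the reverse rejects negatives.
naturalInteger :: BiMap String Natural Integer
naturalInteger = BiMap (Right . toInteger) toNatural
  where
    toNatural i
      | i >= 0    = Right (fromInteger i)
      | otherwise = Left (show i ++ " is negative")

-- Composition gives a Word/Integer mapping without new checks.
wordInteger :: BiMap String Word Integer
wordInteger = wordNatural `andThen` naturalInteger
```

Here `backward wordInteger (-1)` reports a negative value, and an input larger than `maxBound :: Word` is rejected by the Word step.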

jiegillet avatar Oct 31 '19 00:10 jiegillet

I'm still concerned that bundling the two directions will introduce more complexity to the API. We'd need to introduce a large new family of "bi-" combinators and types for working with codecs, like tomland does. I'm more comfortable following the pattern from the aeson package (e.g. {From,To}Dhall) since that's a well-trodden path.

Gabriella439 avatar Nov 01 '19 04:11 Gabriella439

Coincidentally I just came across a thread about bidirectional JSON serialization. Seems like the topic is en vogue! ;)

But I don't want to push the issue – I just wanted to give the idea some space for discussion. Shall we close the issue?

sjakobi avatar Nov 02 '19 05:11 sjakobi

@sjakobi: I don't think we need to close this yet. I'm still not convinced that I'm right and maybe some other people might want to chime in

Gabriella439 avatar Nov 02 '19 18:11 Gabriella439

I stumbled on this issue and decided to provide some input as the primary author of the bidirectional serialisation in tomland. I've been using this approach for more than a year, and I'm quite happy with it. In a few projects I'm working on, I need to not only parse config from a file but also create textual configs from Haskell values. So the approach with bidirectional codecs is convenient here.

In addition to the fact that it solves a real use case, I can name several more benefits of this solution:

  1. Encoder and decoder at the same time out-of-the-box in a single place. They are correct by construction, so usually you don't need roundtrip property-based tests to make sure your decoders and encoders are consistent with each other.
  2. The naming problem is resolved. Instead of encodeInt/decodeInt or toInt/fromInt there's just an int codec.
  3. In tomland we're using the value-based solution for codecs instead of typeclass-based, so it's effortless to have multiple codecs for a data type like ByteString.
  4. You have a clean separation between parsing/pretty-printing and encoding/decoding. These stages can be modified independently.
  5. The bidirectional approach is based on Monads and Profunctors, which allows reusing the rest of the ecosystem. This is just a nice feature since you can build abstract codecs and use them in multiple places.
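The points above can be illustrated with a much simplified sketch of a codec that composes applicatively. It is loosely modelled on the tomland design but is not its actual API: the key-value target `KV`, the `Decoder` type, the `text` and `int` codecs, and the `User` record are all assumptions made for this example, and only base is used.

```haskell
-- Hypothetical serialisation target: a flat list of key-value pairs.
type KV = [(String, String)]

newtype Decoder a = Decoder { runDecoder :: KV -> Either String a }

instance Functor Decoder where
  fmap f (Decoder g) = Decoder (fmap f . g)

instance Applicative Decoder where
  pure = Decoder . const . Right
  Decoder f <*> Decoder x = Decoder (\kv -> f kv <*> x kv)

-- `c` is the value being written, `a` the value being read.
data Codec c a = Codec
  { codecRead  :: Decoder a
  , codecWrite :: c -> (KV, a)
  }

instance Functor (Codec c) where
  fmap f (Codec r w) = Codec (fmap f r) (fmap f . w)

instance Applicative (Codec c) where
  pure a = Codec (pure a) (const (mempty, a))
  Codec rf wf <*> Codec ra wa = Codec (rf <*> ra) (\c -> wf c <*> wa c)

-- Point a field codec at a getter of the enclosing record.
infixl 5 .=
(.=) :: Codec field a -> (object -> field) -> Codec object a
codec .= getter = Codec (codecRead codec) (codecWrite codec . getter)

text :: String -> Codec String String
text key = Codec (Decoder readText) (\s -> ([(key, s)], s))
  where
    readText kv = maybe (Left ("missing key: " ++ key)) Right (lookup key kv)

int :: String -> Codec Int Int
int key = Codec (Decoder readInt) (\n -> ([(key, show n)], n))
  where
    readInt kv = case lookup key kv of
      Nothing -> Left ("missing key: " ++ key)
      Just s  -> case reads s of
        [(n, "")] -> Right n
        _         -> Left ("not an Int: " ++ s)

data User = User { userName :: String, userAge :: Int }
  deriving (Eq, Show)

-- One definition yields both the encoder and the decoder.
user :: Codec User User
user = User <$> text "name" .= userName <*> int "age" .= userAge

encode :: Codec a a -> a -> KV
encode codec = fst . codecWrite codec

decode :: Codec a a -> KV -> Either String a
decode codec = runDecoder (codecRead codec)
```

In this sketch, `decode user (encode user (User "Ada" 36))` returns the original value. The field names and their order are stated once, which is the sense in which the two directions stay consistent by construction.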

There were some challenges down the road (like Generics, decoding of sum types, decoding of Map-like data structures, TOML-specific difficulties), since not a lot of people have been using this approach, and there were no ready solutions. But in @kowainik we solved all these problems, and now bidirectional codecs are as powerful and convenient as classic ToJSON/FromJSON typeclasses.

chshersh avatar Dec 30 '19 11:12 chshersh

But in @kowainik we solved all these problems, and now bidirectional codecs are as powerful and convenient enough as classic ToJSON/FromJSON typeclasses.

Would love to see a blogpost on this! It's here: https://kowainik.github.io/posts/2019-01-14-tomland

Profpatsch avatar Jan 19 '20 00:01 Profpatsch

FYI, in @kowainik we have plans to play with bidirectional Dhall serialization in a separate package, like dhall-codec. This might be a good opportunity to test how viable the approach is and experiment in a separate independent package before changing the internals of the main library dramatically.

chshersh avatar Feb 07 '20 13:02 chshersh