json-to-haskell
Support for sum types
I know of two different ways to encode sum types in JSON.
First encoding:
{
  "values": [
    {"type": "left", "value": 1},
    {"type": "right", "value": "foo"}
  ]
}
json-to-haskell's output for the first encoding:
data Model = Model
  { values :: [Values]
  } deriving (Show, Eq, Ord)
data Values = Values
  { type :: Text
  , value :: Int
  } deriving (Show, Eq, Ord)
Second encoding:
{
  "values": [
    {"left": 1},
    {"right": "foo"}
  ]
}
json-to-haskell's output for the second encoding:
data Model = Model
  { values :: [Values]
  } deriving (Show, Eq, Ord)
data Values = Values
  { left :: Int
  } deriving (Show, Eq, Ord)
Here's the output I would have liked to see instead:
data Model = Model
  { values :: [Values]
  } deriving (Show, Eq, Ord)
data Values
  = Left Int
  | Right Text
  deriving (Show, Eq, Ord)
Or at least a warning that the generated parser cannot parse the entirety of the given JSON example.
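For what it's worth, both encodings can be decoded into a sum type of that shape with Aeson's generic options. The following is only a sketch of what hand-written (or generated) code might look like, not what json-to-haskell emits. The constructor names `VLeft`/`VRight` are my own choice (plain `Left`/`Right` become ambiguous with the Prelude's `Either` constructors wherever they are used), and `parseTagged`/`parseSingleField` are hypothetical helper names.

```haskell
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson
import Data.Aeson.Types (Parser, parseMaybe)
import Data.Char (toLower)
import Data.Text (Text)
import GHC.Generics (Generic)

-- The sum type we would like to end up with.
data Values
  = VLeft Int
  | VRight Text
  deriving (Show, Eq, Ord, Generic)

-- Map constructor names to JSON tags: "VLeft" -> "left", "VRight" -> "right".
baseOptions :: Options
baseOptions = defaultOptions { constructorTagModifier = map toLower . drop 1 }

-- First encoding: {"type": "left", "value": 1}
taggedOptions :: Options
taggedOptions = baseOptions
  { sumEncoding = TaggedObject { tagFieldName = "type", contentsFieldName = "value" } }

-- Second encoding: {"left": 1}
singleFieldOptions :: Options
singleFieldOptions = baseOptions { sumEncoding = ObjectWithSingleField }

parseTagged :: Value -> Parser Values
parseTagged = genericParseJSON taggedOptions

parseSingleField :: Value -> Parser Values
parseSingleField = genericParseJSON singleFieldOptions
```

Either parser can be run with `parseMaybe`, e.g. `parseMaybe parseSingleField (object ["left" .= (1 :: Int)])`. A real `FromJSON` instance would have to commit to one of the two option sets.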
I have a plan for implementing this, but there are open questions.
Yeah; I have my own ideas of how to implement it, but the tricky part is deciding when to unify vs when to split into a sum; e.g. what should each of the following arrays result in?
[1, "string", {"name": "bob"}]
[{"a": 1}, {"b": 2}]
[{"name": "bob", "age": 23}, {"name": "Alice"}]
There's the question of whether to unify non-intersecting objects into a single record type with "Maybe" for potentially missing fields, or whether to split them into completely different records using a Sum type. A Sum type is required for incompatible types like Number and String, but records are the tricky case.
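To make the trade-off concrete, here is a rough sketch of one possible policy, not how json-to-haskell works today. Everything in it is hypothetical (`Ty`, `Field`, `unify`, `alternatives`), and the rule it uses is just one assumption among several reasonable ones: two records are merged only if they share at least one key, with fields missing from either side becoming optional; anything else is kept as a separate alternative of a sum.

```haskell
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- A deliberately tiny model of inferred JSON shapes.
data Ty
  = TNumber
  | TString
  | TRecord (Map String Field)
  deriving (Show, Eq)

-- A record field, and whether every sample seen so far contained it.
data Field = Field
  { required :: Bool
  , fieldTy  :: Ty
  } deriving (Show, Eq)

-- Build a record shape in which every listed field is required.
record :: [(String, Ty)] -> Ty
record kvs = TRecord (Map.fromList [(k, Field True t) | (k, t) <- kvs])

-- Try to unify two shapes into one; Nothing means "needs a sum type".
unify :: Ty -> Ty -> Maybe Ty
unify a b | a == b = Just a
unify (TRecord a) (TRecord b) = TRecord <$> unifyRecords a b
unify _ _ = Nothing

-- Assumed policy: records with no key in common are never merged.
unifyRecords :: Map String Field -> Map String Field -> Maybe (Map String Field)
unifyRecords a b
  | Map.null shared = Nothing
  | otherwise = do
      merged <- sequenceA (Map.intersectionWith unifyField a b)
      pure (Map.unions [ merged
                       , optional <$> Map.difference a b
                       , optional <$> Map.difference b a ])
  where
    shared = Map.intersection a b
    optional f = f { required = False }

-- A shared field stays required only if both sides required it.
unifyField :: Field -> Field -> Maybe Field
unifyField x y = do
  t <- unify (fieldTy x) (fieldTy y)
  pure (Field (required x && required y) t)

-- Fold the element shapes of an array into a list of alternatives:
-- one alternative means a plain type, several mean a sum type.
alternatives :: [Ty] -> [Ty]
alternatives = foldl insertTy []
  where
    insertTy [] t = [t]
    insertTy (x : xs) t = case unify x t of
      Just u  -> u : xs
      Nothing -> x : insertTy xs t
```

Under this particular policy, the first array above yields three alternatives (a sum), the second yields two (a sum of two records), and the third yields a single record in which `name` is required and `age` is optional, i.e. a "Maybe" field. Changing the shared-key rule in `unifyRecords` is exactly where a configuration option would plug in.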
I'm open to ideas here; there doesn't seem to be "one clear way". I can make it a configuration option, but it's a bit tough to communicate.
For now, for simplicity's sake, it just looks at the first element of the list, which should be good enough for most applications. For others, perhaps a simple record unification with "Maybe" types would cover the other 90%. I'm sure there are APIs out there that accept completely arbitrary list objects (though I'd like someone to provide an example if they have one).
Maybe I just "bail" and default to Value in that case?
I'm happy to implement one of these options, but would love to have some discussion about the right path to take first.
I would love to see this functionality and appreciate the difficulty of giving a generic solution.
You've thought about this problem a lot more than I have, so take this with a grain of salt. What about adding a config option sum-encoding-field-name, which corresponds to the tagFieldName of Aeson's TaggedObject SumEncoding?
In my head, this would look something like:
{
  "actions": [
    { "tag": "SendAction", "to": "Someone", "from": "Someone else", "text": "hello" },
    { "tag": "MarkUnreadAction", "id": "Some ID" }
  ]
}
json-to-haskell --sum-encoding-field-name=tag < actions.json
And this would result in the final Haskell output creating something along the lines of:
data Model = Model { actions :: [Action] }
data Action = SendAction { to :: String, from :: String, text :: String } | MarkUnreadAction { id :: String }
... according to the normal rules. (I know arrays are handled a little differently than the above.)
Because Aeson has this behavior baked in by default for "tag", if you used "tag" explicitly, then perhaps this could work even without specifying a name.
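To illustrate the default: Aeson's `defaultOptions` use `TaggedObject` with `tagFieldName = "tag"`, and for record constructors the fields sit next to the tag in the same object. So, as a sketch (using `Text` rather than `String`, and hiding the Prelude's `id` so the record field of that name is unambiguous, both my own choices), generically derived instances should accept the example JSON as-is:

```haskell
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE OverloadedStrings #-}

import Prelude hiding (id)

import Data.Aeson
import Data.Text (Text)
import GHC.Generics (Generic)

-- One constructor per value of the "tag" field.
data Action
  = SendAction { to :: Text, from :: Text, text :: Text }
  | MarkUnreadAction { id :: Text }
  deriving (Show, Eq, Generic)

-- No options given: the default method uses genericParseJSON defaultOptions,
-- whose sum encoding is TaggedObject with the tag stored under "tag".
instance FromJSON Action

data Model = Model { actions :: [Action] }
  deriving (Show, Eq, Generic)

instance FromJSON Model
```

With a different field name (say `"type"`), the instance would instead pass `defaultOptions { sumEncoding = TaggedObject { tagFieldName = "type", contentsFieldName = "contents" } }` to `genericParseJSON`, which is roughly what the proposed flag would need to generate.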
Is that a workable / interesting option?