aeson icon indicating copy to clipboard operation
aeson copied to clipboard

Instance for Maybe loses information

Open cgibbard opened this issue 9 years ago • 21 comments

Prelude Data.Aeson> encode (Just Nothing :: Maybe (Maybe Integer))
"null"
Prelude Data.Aeson> encode (Nothing :: Maybe (Maybe Integer))
"null"

This just cost me way more time than I'd care to admit. Personally, I think it'd be worth fixing it by using an instance similar to the one used for Either or other sum types. Just letting TH generate the instance would be fine I think. You might want to do a major version increment to prepare people for the fact that the representation is different, but I think it's important not to have default ToJSON/FromJSON instances losing information like this.

cgibbard avatar Mar 20 '16 00:03 cgibbard

This instance has been in aeson for over 4 years (since 0.6.0.0). Changing it would silently break all usages and require a new type to restore the old behavior.

I think a first step towards pushing this change through can be to release a library that introduces a new type with an appropriate instance. If it becomes popular we could move towards a migration plan for changing the instance.

bergmark avatar Mar 21 '16 14:03 bergmark

It is possible to write an instance for Maybe which ensures its argument is not Maybe: https://github.com/ghcjs/ghcjs-base/pull/34/commits/6bf43075849fdd551f04a5b9d8a36e4d769e6fc2#diff-e09b67a14d374b52d6408c4ebcb9b630R7 . There, I'm using closed type families, but I've just realized associated types with defaults would be better:

class FromJSON a where
    type FromJSONUsesNull = 'False
    ...

type FromJSONDoesn'tUseNull a = (FromJSONUsesNull a ~ 'False)

instance FromJSONDoesn'tUseNull a => FromJSON (Maybe a) where
...

-- Same thing for ToJSON

class ToJSON a where
    type ToJSONUsesNull = 'False
    ...

type ToJSONDoesn'tUseNull a = (ToJSONUsesNull a ~ 'False)

instance ToJSONDoesn'tUseNull a => ToJSON (Maybe a) where

This is better because types other than Maybe might use null to mean something. Also, Here I'm using DataKinds, but just ConstraintKinds could be used instead, like in the ghcjs code I linked earlier. The DataKinds approach is more flexible, but I'm not sure what use the (ToJSONUsesNull a ~ 'True) constraint would have.

So, it may be possible to keep the behavior but statically ensure that there isn't aliased encoding of null. Yes, this could cause compilation errors, but these error may reflect actual potential errors.

mgsloan avatar Mar 21 '16 20:03 mgsloan

Yeah, making things fail to compile in cases where this loss of information occurs is probably a good first step -- it would have saved me a fair amount of trouble tracking down where some subtle glitchy behaviour at runtime was coming from.

In our case, we tend to use Aeson on both the backend and frontend of our applications and just use TH-generated instances without much thought about the precise representation of everything in terms of JSON, so changing the instances would break none of our code, but I realise a lot of people using it wouldn't have that luxury. I'd sort of come to expect all communication of Haskell data structures between our frontend and backend to be totally transparent, so it took me a while before I even considered that it might be the JSON encoding and decoding being unfaithful. It showed up as elements being deleted from some Map data structures which encoded patches -- roughly, patches to something of type Map k (Maybe v) were being represented as Map k (Maybe (Maybe v)) and that's what was being transmitted over the wire, so I had a bit of patching/diffing stuff to wonder about on both ends, and then realised that the patches weren't actually arriving intact. Anyway, for now we're using Either () in place of Maybe, and it's okay, if a bit ugly.

cgibbard avatar Mar 21 '16 21:03 cgibbard

This sounds like a good approach!

bergmark avatar Mar 24 '16 11:03 bergmark

Problem reoccurring all over: https://github.com/bos/aeson/issues/323 (https://github.com/bos/aeson/compare/master...tolysz:Missing_v10 ) really is the only sane answer :) Basically whan we have ommit nothing Nothing -> Ommited otherwise Nothing -> Null And it simply works with any user types...

tolysz avatar May 24 '16 09:05 tolysz

@tolysz No. There is already a suggested solution here. You also never explained why the suggestions in #323 are not sufficient.

bergmark avatar May 24 '16 09:05 bergmark

Sorry, I answered it there (now). I simply lost entropy to produce more constructive arguments. But the basic idea is to always omit the new constructor and always map Null -> null. This solution will compose nicely in generic way.

tolysz avatar May 24 '16 10:05 tolysz

With the proposed additional value

Nothing -> Ommited
Just Just a -> toJson a
Just Nothing -> Null

But I guess a newtype wrapper would be better (eg. `Possible`)

tolysz avatar May 24 '16 10:05 tolysz

Maybe you'd think I'm blaming the victim here, wouldn't it be better to encode a Maybe (Maybe a)) as something like:

data MyValue a = OK a | NotOK | SomethingMissing

Maybe then, you'd be able to pattern-match and make it easier to deal with than nested Maybes.

ardamose123 avatar Jan 09 '17 13:01 ardamose123

Well, in the original setting where it came up, we didn't explicitly have Maybe (Maybe a) appearing as such in the user written code.

It simply came up as a consequence of some type family instances which used Maybe in an entirely appropriate way.

Of course an even simpler thing would also work: data Maybe' a = Nothing' | Just' a (And then derive the relevant aeson instances using TH) But there are advantages to using Prelude types sometimes.

Our actual solution was just to stick an Identity wrapper in the way.

The main thing which would make me want to change this instance isn't that it's really hard to work around. It's that it is a case which can show up and quietly lose data without anyone knowing about it.

On Mon, Jan 9, 2017, 08:56 Ariel D. Moya Sequeira [email protected] wrote:

Maybe you'd think I'm blaming the victim here, wouldn't it be better to encode a Maybe (Maybe a)) as something like:

data MyValue a = OK a | NotOK | SomethingMissing

Maybe then, you'd be able to pattern-match and make it easier to deal with than nested Maybes.

— You are receiving this because you authored the thread.

Reply to this email directly, view it on GitHub https://github.com/bos/aeson/issues/376#issuecomment-271289970, or mute the thread https://github.com/notifications/unsubscribe-auth/AGtSbKHIwyWD1Ak4vtsRmJX-0a2ZaDq0ks5rQjyOgaJpZM4H0kLy .

cgibbard avatar Jan 09 '17 14:01 cgibbard

You are right: Possible is the exact type you say.

My only problem is: Maybe is special in aeson. Records (Maps) are removing Nothing for all keys if some template settings are set.

tolysz avatar Jan 09 '17 14:01 tolysz

Of course, it's quite possible that my position on this is a little unique, since both the producer and consumer of all the JSON I'm generating are always Haskell programs using Aeson, so practically the only thing I care about is that the representation is faithful (that what I put in on one end is what I get out on the other).

On Mon, Jan 9, 2017, 09:11 Marcin Tolysz [email protected] wrote:

You are right: Possible is the exact type you say.

My only problem is: Maybe is special in aeson. Records (Maps) are removing Nothing for all keys if some template settings are set.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/bos/aeson/issues/376#issuecomment-271293340, or mute the thread https://github.com/notifications/unsubscribe-auth/AGtSbF58Mc5TjyTFa9p3CajRo-EWUU2Rks5rQj_6gaJpZM4H0kLy .

cgibbard avatar Jan 09 '17 14:01 cgibbard

I kind of like @mgsloan approach. One possibilty is to make Value gadt, so we can control whether it has Null constructor. And change ToJSON to have functional dependency:

data NotNull

data Value t where
  Bool :: Bool -> Value t
  Null :: Value Null

class ToJSON t a | a -> t where
    toJSON :: a -> Value t

instance ToJSON t a => ToJSON t [a] where
   ...

instance ToJSON NotNull Bool where
    toJSON = Bool

instance ToJSON NotNull a => ToJSON Null (Maybe a) where
    toJSON Nothing = Null
    toJSON (Just v)  = retag (toJSON v)

I''m not sure whether this is worth the trouble.


Alternatively in encoding, one can use Fix (Compose ValueF Maybe) (ValueF, so on getsObject :: HashMap Text (Maybe Value)like structure, thenNothing` on that level can be omitted.

I don't like having top-level Omitted, it makes sense only inside Object (and Array).


Another alternative is to have:

Then one could have class ToJSONField a where toJSONField :: a -> Maybe Value and use it for Map / record serialisation. It would be cool to have IntristicSuperClasses or something, to provide default implementation ToJSONField = Just . toJSON.


In summary: there are ideas, but one have to try them (and benchmark!). The second option makes possible to build on existing stuff, so one could explore that.

phadej avatar Jan 09 '17 14:01 phadej

Note: remind me, as at work we use own generics for ToJSON, so we could use own class all together exploring ValueF. I have not-so-nice fix to make omittableNothing with generics-sop, and I don't really like it.

(more important stuff to do atm)

phadej avatar Jan 09 '17 14:01 phadej

Yeah, just having our own classes as a replacement for Aeson's ToJSON and FromJSON would work as well, seeing as we're also using Generics or TH to generate essentially every instance. But if we ever go to the trouble of doing that, it practically amounts to forking aeson in a way, which would be a bit of a shame over such a small thing. It also wouldn't do much to protect people who are unaware of this sneaky case of data loss. Still, there are some other cases where the instances aren't quite uniform where I'd prefer something different. I also found myself wanting to sidestep the instances for Map to effectively encode Maps the same way as lists of pairs. (We already have our own wrapper around Map to fix the Monoid/Semigroup instances and so doing that isn't so hard in our case.)

On Mon, Jan 9, 2017, 09:21 Oleg Grenrus [email protected] wrote:

Note: remind me, as at work we use own generics for ToJSON, so we could use own class all together exploring ValueF. I have not-so-nice fix to make omittableNothing with generics-sop, and I don't really like it.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/bos/aeson/issues/376#issuecomment-271295628, or mute the thread https://github.com/notifications/unsubscribe-auth/AGtSbB-BzoFkuE8DeeJq3LlgtP2LZDa8ks5rQkJfgaJpZM4H0kLy .

cgibbard avatar Jan 09 '17 14:01 cgibbard

FYI: You can encode maps as a list of pairs if you use wrapper key type with ToJSONKey functionality (the default instance btw uses ToJSON)

I see aeson using Maybe as a control structure. If you want no-data-loss, you'd need newtype around. That's the problem with not obviously better (or general) instance.

Imho it's important that aeson is convenient, therefore "as a control structure" makes sense as a default

Sent from my iPhone

On 9 Jan 2017, at 16.41, cgibbard [email protected] wrote:

Yeah, just having our own classes as a replacement for Aeson's ToJSON and FromJSON would work as well, seeing as we're also using Generics or TH to generate essentially every instance. But if we ever go to the trouble of doing that, it practically amounts to forking aeson in a way, which would be a bit of a shame over such a small thing. It also wouldn't do much to protect people who are unaware of this sneaky case of data loss. Still, there are some other cases where the instances aren't quite uniform where I'd prefer something different. I also found myself wanting to sidestep the instances for Map to effectively encode Maps the same way as lists of pairs. (We already have our own wrapper around Map to fix the Monoid/Semigroup instances and so doing that isn't so hard in our case.)

On Mon, Jan 9, 2017, 09:21 Oleg Grenrus [email protected] wrote:

Note: remind me, as at work we use own generics for ToJSON, so we could use own class all together exploring ValueF. I have not-so-nice fix to make omittableNothing with generics-sop, and I don't really like it.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/bos/aeson/issues/376#issuecomment-271295628, or mute the thread https://github.com/notifications/unsubscribe-auth/AGtSbB-BzoFkuE8DeeJq3LlgtP2LZDa8ks5rQkJfgaJpZM4H0kLy .

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or mute the thread.

phadej avatar Jan 09 '17 14:01 phadej

Even if I make a newtype I will still have to deal with omittableNothing i.e. I will have to modify the type which imports my new one. Either way this is a bit accademic at this point. However I have a working (0.8-is) implementation with a ritcher internal type where all becomes simple. I will not make a promise, but I can estimate it is about 2-3 evenings of porting... and the change is compatibile with virtually all packages. The idea is to have an additional value which always gets removed from objects. Thus allows you to construct your own instnces with 3state logic. I used Possible for that.

For me this is a highly usefull feature, e.g. when comunicating with APIs where one can request a subset of fields, or send update only for subset of them (and we have null as removing property), especially with GoogleApis (eg. https://github.com/tolysz/google-api/blob/master/src/Network/Google/Api/Types/GoogleUser.hs )

tolysz avatar Jan 09 '17 16:01 tolysz

To repeat myself: I'm strongly against, :-1: adding Omitted or Missing to Value. There is no such constructor in JSON,

It's an encoding detail for narrow use case. Especially I dislike the idea of decode . encode being not Just, as it will lose information.

One can add handling of Possible to own generic implementation of record serialisation.

phadej avatar Jan 09 '17 19:01 phadej

@phadej Agreed: it is extremely surprising if decode . encode is not Just.

ryantrinkle avatar Jan 09 '17 19:01 ryantrinkle

Generic or not I will not be able to use "toJSON", but I will be able to do it via toEncoding, thus it is a partial answer. I.e. the way we serialize "Null" depends on some additional setting thus sometimes it is encoded as "" or "null", And you can not possibly know how your instance will be encoded (all that is depending on setting is in use on the outside record) and this worries me ;) And all that strong opposition to not encode this very information inside type itself (rather than in the function) is really surprising, and we only need one extra constructor (used only internally)

The specification does not say that some arbitrary key alway has to be removed or always has to be serialized with "null"... it does not say it can not be present for the same key for different cases, to possibly mean something different. The specification only says how all possible strings have to look like after they are encoded, not how the internal structure should look like. Because of the omittableNothing one can not possibly know, by looking at the type how Maybe will behave, as one can set it to some arbitrary value - we need to know this setting. Thus you need to know: Null Bool. I only propose to split this into 2 constructors, transparent and without that extra setting.

decode . encode is not Just.

I am lost here Maybe will have exactly the same properties as it has now(i.e. settable via ommitNothing), the only difference will be: one could create newtypes AlwaysNullMaybe, AllwaysMissingMaybe with appropriate Value representation and rendering.

The Change I propose does not change anything in the current behavior, it simplifies things, it is not really visible, and one can be happily ignoring its existence... Exactly like Imaginary part of complex numbers: the world description does not show imaginary numbers, to paraphrase the JSON RFC argument.

type Bla = Maybe Int

How will It be serialized? If used on its own and if used inside a record.

Foo = Foo { bla1 :: Bla, bla2 :: Bla}
Bar = Bar { bla1 :: Bla, bla2 :: Bla}

VS.

Foo = Foo { bla1:: AlwaysNullMaybe Int, bla2 :: AllwaysMissingMaybe Int}
Foo = Foo { bla1:: Possible Int, bla2:: Possible Int}

tolysz avatar Jan 10 '17 00:01 tolysz

I'll just add that it's not only a problem with nested Maybes. It also breaks when using Maybe Value:

fromJSON . toJSON $ Just Null :: Result (Maybe Value) = Success Nothing

mskiptr avatar May 20 '21 20:05 mskiptr