argonaut icon indicating copy to clipboard operation
argonaut copied to clipboard

Strictly decode to case class disallowing extraneous fields in JSON

Open LeifW opened this issue 11 years ago • 17 comments

So that

case class Person(name: String)
Parse.decode[Person]("""{"name": "Bob", "foo": "dunno"}""")

will yield something about "Error: unexpected field "foo"".

Is this feasible?

Example use case: When people call our API, it would be helpful to tell them if the misspelled a key or added junk that doesn't do anything - tell them that's not key that means anything to us and it'll be ignored / rejected as invalid.

LeifW avatar May 20 '14 23:05 LeifW

@LeifW You can use validate to do that: https://github.com/argonaut-io/argonaut/blob/master/src/main/scala/argonaut/DecodeJson.scala#L55-L65

markhibberd avatar May 20 '14 23:05 markhibberd

I will leave this open, as a reminder to add an example in the docs. Let me know if this doesn't address what you need.

From your example:

  implicit def PersonDecodeJson: DecodeJson[Person] = 
   casecodec1(Person.apply, Person.unapply).validate(....)

Also let me know if there is a general formulation that would let you not specify the validation function explicitly. It is worth noting that it used to be the default, but it was changed because some found it not desirable - but also because it generated fairly poor error messages - if we could address the poor errors part, I would be happy to add some more combinators to make it easy to validate without specifying a custom error message.

markhibberd avatar May 20 '14 23:05 markhibberd

@markhibberd The URL you pasted is dead. Do you have any examples on how to use validate?

lukasz-golebiewski avatar Jan 29 '16 09:01 lukasz-golebiewski

To make sure that I understand this problem, @LeifW, let's please consider an example.

Given:

import argonaut._, Argonaut._

case class Person(name: String)

implicit def decode: DecodeJson[Person] =
  DecodeJson ( c => 
    for {
      name <- (c --\ "name").as[String]
    } yield Person(name)
  )

scala> Parse.decode[Person]("""{"name": "Bob", "foo": "dunno"}""")
res5: Either[Either[String,(String, argonaut.CursorHistory)],Person] = Right(Person(Bob))

you would expect, either by default in Parse#decode or an optional Parse#decodeExactly, that the extra foo field should result in a Left rather than Right?

kevinmeredith avatar Apr 02 '17 17:04 kevinmeredith

Exactly, yes, thanks for the example. I guess I'd lean towards the optional decodeExactly; I think people are used to the present behavior. A use case I was thinking of was better error messages. For large complex structures it can be easy to misspell a field, and the "missing field foo" would be easier to figure out when accompanied by a "unexpected field foob". Also sometimes people pass extra fields in their JSON API calls, and then they just get cargo culted in successive generations, even though they never did anything - people are afraid to remove them for fear of breaking something. I'd like to head that kind of thing off explicitly at the beginning.

On Apr 2, 2017 10:35 AM, "Kevin Meredith" [email protected] wrote:

To make sure that I understand this problem, @LeifW https://github.com/LeifW, let's consider an example.

Given:

import argonaut., Argonaut.

case class Person(name: String)

implicit def decode: DecodeJson[Person] = DecodeJson ( c => for { name <- (c --\ "name").as[String] } yield Person(name) )

scala> Parse.decode[Person]("""{"name": "Bob", "foo": "dunno"}""") res5: Either[Either[String,(String, argonaut.CursorHistory)],Person] = Right(Person(Bob))

you would expect, either by default in Parse#decode or an optional Parse#decodeExactly, that the extra foo field should result in a Left rather than Right?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/argonaut-io/argonaut/issues/104#issuecomment-291001334, or mute the thread https://github.com/notifications/unsubscribe-auth/AABwJyntLvpVufcOEZ05MRnMO0f0xLsrks5rr9xIgaJpZM4B8veS .

LeifW avatar Apr 02 '17 20:04 LeifW

DecodeJson.validate should be able to achieve that (albeit at a cost to the caller). It does chime slightly with an idea I've had in the past for a different way to handle parsing completely but behaviour like this would sit weirdly in our current model for decoding.

seanparsons avatar Apr 02 '17 21:04 seanparsons

Thanks, @seanparsons, for pointing me to that method!

Locally, I added the following:

case Some(w) => println(a); DecodeResult.ok(w)

to this line.

Then, I re-ran the above example via sbt console:

JSON contains exact # of fields, one, as case class to which it's decoded

scala> Parse.decode[Person]("""{"name": "Bob"}""")
HCursor(CObject(CJson({"name":"Bob"}),false,object[(name,"Bob")],(name,"Bob")),CursorHistory(List(El(CursorOpDownField(name),true))))
res0: Either[Either[String,(String, argonaut.CursorHistory)],Person] = Right(Person(Bob))

JSON has one extra field, i.e. not in the Person

scala> Parse.decode[Person]("""{"name": "Bob", "foo": "dunno"}""")
HCursor(CObject(CJson({"name":"Bob","foo":"dunno"}),false,object[(name,"Bob"),(foo,"dunno")],(name,"Bob")),CursorHistory(List(El(CursorOpDownField(name),true))))
res1: Either[Either[String,(String, argonaut.CursorHistory)],Person] = Right(Person(Bob))

After decoding, i.e. JSON => Person, is it possible to figure out if the *Cursor has any extra, i.e. not walked, fields?

I looked at the scaladocs for HCursor and ACursor, but I'm not sure. Please help me out?

kevinmeredith avatar Apr 06 '17 13:04 kevinmeredith

@kevinmeredith Not walked? I think you'd be better off just checking that fieldSet matches what you expect.

seanparsons avatar Apr 09 '17 12:04 seanparsons

matches what you expect

Could you please say more?

Taking your reply, @seanparsons, I posted this StackOverflow question - http://stackoverflow.com/questions/43433054/retrieving-traversed-json-fields.

I appreciate any help, please.

kevinmeredith avatar Apr 16 '17 23:04 kevinmeredith

@kevinmeredith This is what you want to check against correct?

scala> json.right.get.hcursor.fieldSet
res1: Option[Set[argonaut.Json.JsonField]] = Some(Set(foo))

seanparsons avatar Apr 17 '17 12:04 seanparsons

@seanparsons Given:

case class Person(name: String)

and:

scala> Parse.parse("""{"name": "Bob", "foo": "dunno"}""").toEither.right.get.hcursor.fieldSet
res6: Option[Set[argonaut.Json.JsonField]] = Some(Set(name, foo))

How would you use fieldSet to determine if there's any extraneous/extra fields in the input JSON? As far as I understand, there's no getFields on a case class, but please correct me if there is.

Also, my original thought was add an def decodeExactly function that did:

  1. Get list of fields not walked/traversed - http://stackoverflow.com/questions/43433054/retrieving-traversed-json-fields
  2. Decode JSON to case class
  3. if step 2 succeeds, check if 1 is empty. If it's non-empty, then return an error that there's extra fields

Thanks for your continued help.

kevinmeredith avatar Apr 18 '17 17:04 kevinmeredith

Well surely the fieldSet you want is Some(Set("name")) right?

Unless I'm mistaken that means in the DecodeJson.validate call the function HCursor => Boolean would look something like _.fieldSet == Some(Set("name")).

To support a decodeExactly we'd need to do a non-trivial amount of changes because of the way the existing code is written.

seanparsons avatar Apr 18 '17 23:04 seanparsons

Given all of the fields of a case class A and all of the Json's keys, assuming we decode the Json to an A, we can compare A's fields to Json's keys to implement this decodeExactly behavior, no?

https://stackoverflow.com/a/27445097/409976 shows how, using shapeless, we can get all of a case classes' fields w/ LabelledGeneric.

With respect to the second problem, i.e. getting all keys from a Json, I'm not sure if such a method exists. I don't believe that Cursor#fields would work, as is:

scala> Json.obj(
     "a" -> jString(""), 
     "b" -> jBool(false), 
     "c" -> Json.obj("a" -> jNull) 
).cursor.fields
res27: Option[List[argonaut.Json.JsonField]] = Some(List(a, b, c)) // where's `c`'s a?

Does a allFields method, which would find all fields from the Json recursively, exist? If not, perhaps a PR could implement that method, and then, using LabelledGeneric and the aforementioned method, we could address this issue?

Finally, if you all find this approach to be technically sound, then perhaps it belongs in https://github.com/alexarchambault/argonaut-shapeless?

kevinmeredith avatar Jul 08 '17 22:07 kevinmeredith

@kevinmeredith The fields and fieldsSet methods gives you the fields of the object currently in focus. So if you want to check that it has only fields a, b and c you can validate against that.

Hence this:

scala> val testDecoder: DecodeJson[(String, Boolean)] = jdecode2L((a: String, b: Boolean) => (a, b))("a", "b")
testDecoder: argonaut.DecodeJson[(String, Boolean)] = argonaut.DecodeJson$$anon$4@3f59fa4e

scala> val improvedTestDecoder = testDecoder.validate(_.fieldSet == Some(Set("a", "b")), "Wrong fields")
improvedTestDecoder: argonaut.DecodeJson[(String, Boolean)] = argonaut.DecodeJson$$anon$4@45ed447c

scala> improvedTestDecoder.decodeJson(Json("a" := "q", "b" := true))
res7: argonaut.DecodeResult[(String, Boolean)] = DecodeResult(Right((q,true)))

scala> improvedTestDecoder.decodeJson(Json("a" := "q", "b" := true, "c" := 1))
res8: argonaut.DecodeResult[(String, Boolean)] = DecodeResult(Left((Wrong fields,CursorHistory(List()))))

seanparsons avatar Jul 09 '17 07:07 seanparsons

As I understand your reply, @seanparsons, you believe the LabelledGeneric approach that I proposed in https://github.com/argonaut-io/argonaut/issues/104#issuecomment-313884079 to be unnecessary given the solution that you've provided with DecodeJson#validate?

If so, perhaps I should ask if the application of LabelledGeneric would be appropriate to https://github.com/alexarchambault/argonaut-shapeless, i.e. to derive such a DecodeJson without the need to call DecodeJson#validate manually?

kevinmeredith avatar Jul 09 '17 12:07 kevinmeredith

@kevinmeredith I was pointing out the manual solution, a generic solution would likely want to invoke the same kind of code. It would definitely be a nice thing to have in argonaut-shapeless but that's not really a question for me! :)

seanparsons avatar Jul 09 '17 12:07 seanparsons

That's definitely do-able with argonaut-shapeless. It has some JsonProductCodec, that can be customized to check for any non-decoded field. One should just roughly adapt back these lines to argonaut-shapeless, and ensure the implicit they define is in scope when the DecodeJson are derived. Then decoding would fail if any extra field is present.

alexarchambault avatar Jul 11 '17 17:07 alexarchambault