chromiumoxide icon indicating copy to clipboard operation
chromiumoxide copied to clipboard

Stop using `#[serde(untagged)]` for Message enum

Open ryo33 opened this issue 1 year ago • 5 comments

~~This is a draft because this uses not merged change from #204, which parses only textual message.~~ it's merged and now this is ready for reviews

Related to:

  • #167 and #194
  • https://github.com/serde-rs/serde/issues/2661

#[serde(untagged)] does not work correctly with serde_json and some numeric types in the schema, without activating the arbitrary_precision feature of serde_json and using serde_json::Number instead of all usage of f64 or else. In this change, I replaced the derived Deserialize with an implementation of FromStr by using serde_json in it.

Also, with this change, the error message will be more detailed compared to just "did not match any variant of untagged enum". Currently, it reports an error only about T for Message::Event.

ryo33 avatar Feb 14 '24 15:02 ryo33

Why not implement a custom serde deserializer instead? It would be more idiomatic than going through a FromStr. Granted this is currently annoying to do without a clone because ContentRefDeserializer is private (https://github.com/serde-rs/serde/issues/1947).

I would maybe throw the first error that is recorded. If we hit an "invalid" Response, it would be nicer to have a serde error of it instead of the a generic error because it tried to deserialize a Response into an Event (but that would mean an "invalid" event would get a generic error, so not super either). Something like:

let mut error = None;
match serde_json::from_str::<Response>(s) {
  Ok(v) => return Ok(v),
  Err(e) => { error = Some(e); }
}
match serde_json::from_str::<Event>(s) {
  Ok(v) => Ok(v),
  Err(e) => Err(error.unwrap_or(e))
}

Sytten avatar Feb 14 '24 20:02 Sytten

As the mention on the issue, I should provide a test data that emits error without this changes and ok with this change. I prefer the FromStr one, because Message never be parsed as other format than JSON in real usage, and the implementation way that uses serde_json::Value supports only JSON while it looks like supporting many formats. But I'm not sure it's whether a conventional design choice in Rust or not. Also, I've not considered using ContentRefDeserializer or something works like it yet, and I may need to learn more about it, but I believe this change is sufficient as a permanent fix.

ryo33 avatar Feb 15 '24 00:02 ryo33

I see this state as the result of the following potential process:

  1. Implement the deserializer
  2. Se it clones, see issues with ContentRefDeserializer
  3. Implement FromStr as a workaround
  4. Now the deserializer is not used, drop it

So, it looks fine to me, although some additional TODOs and comments explaining this design might be useful.

The proper implementation, in my view, should use the deserializer, because then it is simply a lower level compared to the FromStr - but the FromStr could still be present and implemented atop the deserializer. But this is the ideal world so to speak.

MOZGIII avatar Feb 15 '24 00:02 MOZGIII

I've marked this as non-draft since the base change is merged now.

I've tried to reproduce https://github.com/mattsse/chromiumoxide/issues/167#issuecomment-1676157559 but failed, and I've not found any specific test case that ensures that this change fixes it yet.

I would maybe throw the first error that is recorded. If we hit an "invalid" Response, it would be nicer to have a serde error of it instead of the generic error because it tried to deserialize a Response into an Event (but that would mean an "invalid" event would get a generic error, so not super either).

I forgot to answer that. IMO there are many options for reporting an error.

  1. Returns both with custom error type.
  2. Always return the Response one (proposed one)
  3. Always return the Event one (the current implementation in this pull request)
  4. Return a custom error says "did not match any variant of untagged enum" (like the derived deserializer on the main branch)
  5. Return one that reached to end of the input more than another, using line() and column() method on serde_json::Error.
  6. Use heuristics like returning if s.contains("\"method\"") { event_error } else { response_error }

I prefer the 3 of always returning the Event error, as I suppose it never fails to parse a JSON that is sent from Chromium and intended as a response, not an event, that means I can approximate "failure on parsing a Message equals to failure on parsing an Event". The reasoning is that the schema of Response is simple and clear, so it seems not to have any fragile part that source of unexpected compatibility issues.

The main branch is almost the same as the 4, so it should return a more detailed error after this was merged.

ryo33 avatar Feb 15 '24 02:02 ryo33

I've implemented Deserialize without additional allocation using the RawValue feature in serde_json. I left the FromStr implementation because it's better in performance and error reporting than the Deserialize impl.

ryo33 avatar Feb 15 '24 08:02 ryo33

This is not needed, I did some tests. Please close @mattsse

Sytten avatar Oct 24 '24 18:10 Sytten