num-bigint

Deserializing BigInt with serde_json

Open entropidelic opened this issue 2 years ago • 7 comments

Hi guys, I am using serde_json to deserialize a JSON file into a struct, MyStruct:

use std::{io::BufReader, fs::File};
use serde::Deserialize;
use num_bigint::BigInt;

#[derive(Deserialize, Debug)]
struct MyStruct {
    num: BigInt,
}

fn main() {
    let file = File::open("example.json").unwrap();
    let mut reader = BufReader::new(file);

    let my_struct: MyStruct = serde_json::from_reader(&mut reader).unwrap();

    println!("{:?}", my_struct);
}

The JSON file looks like this,

{
    "num": 0 
}

and I want the value associated with the "num" key to be deserialized as a BigInt. I am using the serde feature of the crate.

When I run this program, it panics with the following error

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error("invalid type: integer `0`, expected a tuple of size 2", line: 2, column: 13)', src/main.rs:15:68

I assume that the deserializer is expecting a tuple with the sign and the magnitude of the number? Is this the expected behaviour? How should I make this work?

Thank you very much in advance

entropidelic, Aug 18 '22 15:08

The provided serialization of BigInt is a pretty raw format -- essentially (/*sign: */ i8, /* magnitude: */ [u32]). If you want to use a more "natural" number format in JSON, you could use the field attribute deserialize_with.

I think serde_json may limit you to u64, i64, or f64 (with precision loss from the mantissa), according to its Number type, but I'm not sure about that. If your JSON data contains larger numbers as a string type, then you could parse them however you wish. Maybe we could provide a helper module for custom with formats like this.

cuviper, Aug 18 '22 17:08
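To make the "tuple of size 2" in the error message more concrete, here is a minimal std-only sketch of a (sign, magnitude) representation like the one described above. It is not num-bigint's code: the function name to_sign_magnitude is hypothetical, and the sketch assumes the magnitude is stored as base-2^32 limbs, least significant first, with zero encoded as sign 0 and an empty magnitude.

```rust
/// Convert a decimal string into an illustrative (sign, magnitude) pair.
/// Assumption: limbs are base 2^32, least significant first.
fn to_sign_magnitude(decimal: &str) -> Option<(i8, Vec<u32>)> {
    let (negative, digits) = match decimal.strip_prefix('-') {
        Some(rest) => (true, rest),
        None => (false, decimal),
    };
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }

    let mut magnitude: Vec<u32> = Vec::new();
    for b in digits.bytes() {
        // magnitude = magnitude * 10 + digit, carrying across 32-bit limbs.
        let mut carry = u64::from(b - b'0');
        for limb in magnitude.iter_mut() {
            let v = u64::from(*limb) * 10 + carry;
            *limb = v as u32; // keep the low 32 bits
            carry = v >> 32; // pass the high bits to the next limb
        }
        if carry != 0 {
            magnitude.push(carry as u32);
        }
    }

    let sign = if magnitude.is_empty() {
        0
    } else if negative {
        -1
    } else {
        1
    };
    Some((sign, magnitude))
}

fn main() {
    println!("{:?}", to_sign_magnitude("0")); // Some((0, []))
    println!("{:?}", to_sign_magnitude("-5")); // Some((-1, [5]))
    println!("{:?}", to_sign_magnitude("4294967296")); // Some((1, [0, 1]))
}
```

Under these assumptions, a bare JSON integer such as 0 does not match the two-element shape, which is consistent with the error reported in the original post.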

Thank you so much for your quick response @cuviper :). The JSONs I am working with at the moment do contain large integer literals and I am not able to change this format. If they were strings this could be easily solved by parsing as you mentioned, but that is not the case. I think I will have to find another solution. Anyway, thanks again!

entropidelic, Aug 18 '22 17:08

FWIW, here's an example parsing via Number: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=47b1118ed9db1f235d92e3e7a671a42e

If I change that input to a very large integer, it looks like Number does parse it as an imprecise f64. I don't know if you can implement your own number parsing for large numbers via serde_json.

cuviper, Aug 18 '22 17:08
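The precision loss mentioned above can be reproduced with the standard library alone. The sketch below does not go through serde_json at all; it only shows what happens when an integer wider than 64 bits is routed through an f64. The helper name round_trip_via_f64 is made up for this illustration.

```rust
/// Parse a decimal integer as f64, then convert it back to an integer.
/// Any digits beyond f64's 53 bits of precision are rounded away.
fn round_trip_via_f64(text: &str) -> u128 {
    text.parse::<f64>().expect("valid decimal number") as u128
}

fn main() {
    let text = "18446744073709551617"; // 2^64 + 1, too large for u64
    let exact: u128 = text.parse().expect("fits in u128");
    let lossy = round_trip_via_f64(text);

    println!("exact:   {exact}"); // 18446744073709551617
    println!("via f64: {lossy}"); // 18446744073709551616
    assert_ne!(exact, lossy);
}
```

The nearest representable f64 to 2^64 + 1 is 2^64 itself, so the trailing digit is lost.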

Hi @cuviper! I'm a coworker of @entropidelic. We ended up enabling the serde_json arbitrary_precision feature, so that let n = Number::deserialize(deserializer)? no longer truncates the number.

With this feature enabled, the code linked below deserializes the number without losing precision:

[dependencies]
serde_json = { version = "1.0", features = ["arbitrary_precision"] }

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=fb1217ab5df630c3cfdf9d82c7d212c0

pefontana, Sep 14 '22 15:09

Great! That doesn't need any changes in num-bigint, right?

Maybe we could provide a similar serde helper though, which does something like what Number is doing internally, without actually using/depending on serde_json here.

cuviper, Sep 14 '22 15:09

I didn't make any changes in num-bigint. Yes, that would be really great!

pefontana, Sep 14 '22 18:09

I looked into that, but serde_json uses a bit of an internal hack for "arbitrary_precision", parsing numbers into a map with a private key, which its Number knows how to recover. That's not something we can really bypass from num-bigint.

cuviper, Sep 16 '22 21:09