go-toml icon indicating copy to clipboard operation
go-toml copied to clipboard

feat(unmarshaler): add support for fields to unmarshal themselves

Open pkarakal opened this issue 1 year ago • 3 comments

This adds the Unmarshaler interface to the v2 package. This way types that implement this interface can define the way they are unmarshalled.

This is a first attempt to fix #873


goos: linux
goarch: amd64
pkg: github.com/pelletier/go-toml/v2/benchmark
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
                                  │ /tmp/tmp.HX8I2C7g6n-v2 │       /tmp/tmp.4y6TDyGRqa-HEAD       │
                                  │         sec/op         │    sec/op     vs base                │
UnmarshalDataset/config-2                     26.76m ±  3%   29.36m ± 18%   +9.71% (p=0.000 n=10)
UnmarshalDataset/canada-2                     105.2m ±  3%   115.2m ±  2%   +9.54% (p=0.000 n=10)
UnmarshalDataset/citm_catalog-2               35.86m ± 11%   38.82m ±  8%   +8.27% (p=0.009 n=10)
UnmarshalDataset/twitter-2                    15.44m ±  4%   16.22m ±  3%   +5.01% (p=0.005 n=10)
UnmarshalDataset/code-2                       117.0m ±  1%   119.3m ±  2%   +1.98% (p=0.000 n=10)
UnmarshalDataset/example-2                    223.7µ ±  5%   236.9µ ± 10%   +5.91% (p=0.009 n=10)
Unmarshal/SimpleDocument/struct-2             776.9n ±  1%   806.3n ±  2%   +3.78% (p=0.000 n=10)
Unmarshal/SimpleDocument/map-2                1.089µ ± 13%   1.105µ ± 18%        ~ (p=0.101 n=10)
Unmarshal/ReferenceFile/struct-2              62.64µ ±  1%   66.18µ ±  1%   +5.65% (p=0.002 n=10)
Unmarshal/ReferenceFile/map-2                 97.20µ ±  5%   99.87µ ± 13%        ~ (p=0.089 n=10)
Unmarshal/HugoFrontMatter-2                   15.98µ ±  1%   16.67µ ±  1%   +4.32% (p=0.002 n=10)
Marshal/SimpleDocument/struct-2               551.2n ±  3%   548.4n ± 17%        ~ (p=0.810 n=10)
Marshal/SimpleDocument/map-2                  735.4n ±  9%   737.9n ±  1%        ~ (p=0.218 n=10)
Marshal/ReferenceFile/struct-2                48.14µ ±  2%   51.41µ ± 12%        ~ (p=0.247 n=10)
Marshal/ReferenceFile/map-2                   62.74µ ± 11%   69.28µ ± 46%  +10.43% (p=0.009 n=10)
Marshal/HugoFrontMatter-2                     11.70µ ±  1%   11.78µ ± 22%        ~ (p=0.353 n=10)
geomean                                       147.1µ         154.0µ         +4.70%

                                  │ /tmp/tmp.HX8I2C7g6n-v2 │       /tmp/tmp.4y6TDyGRqa-HEAD       │
                                  │          B/s           │      B/s       vs base               │
UnmarshalDataset/config-2                    37.37Mi ±  3%   34.07Mi ± 15%  -8.85% (p=0.000 n=10)
UnmarshalDataset/canada-2                    19.97Mi ±  3%   18.22Mi ±  2%  -8.72% (p=0.000 n=10)
UnmarshalDataset/citm_catalog-2              14.84Mi ± 10%   13.71Mi ±  7%  -7.62% (p=0.009 n=10)
UnmarshalDataset/twitter-2                   27.29Mi ±  4%   25.99Mi ±  3%  -4.77% (p=0.005 n=10)
UnmarshalDataset/code-2                      21.88Mi ±  1%   21.45Mi ±  2%  -1.96% (p=0.000 n=10)
UnmarshalDataset/example-2                   34.54Mi ±  4%   32.62Mi ±  9%  -5.56% (p=0.008 n=10)
Unmarshal/SimpleDocument/struct-2            13.50Mi ±  1%   13.01Mi ±  2%  -3.64% (p=0.000 n=10)
Unmarshal/SimpleDocument/map-2               9.637Mi ± 12%   9.494Mi ± 15%       ~ (p=0.101 n=10)
Unmarshal/ReferenceFile/struct-2             79.80Mi ±  1%   75.53Mi ±  1%  -5.35% (p=0.002 n=10)
Unmarshal/ReferenceFile/map-2                51.42Mi ±  5%   50.05Mi ± 12%       ~ (p=0.089 n=10)
Unmarshal/HugoFrontMatter-2                  32.58Mi ±  1%   31.24Mi ±  1%  -4.13% (p=0.001 n=10)
Marshal/SimpleDocument/struct-2              20.76Mi ±  3%   20.87Mi ± 14%       ~ (p=0.810 n=10)
Marshal/SimpleDocument/map-2                 15.56Mi ±  8%   15.51Mi ±  1%       ~ (p=0.224 n=10)
Marshal/ReferenceFile/struct-2               40.75Mi ±  2%   38.17Mi ± 11%       ~ (p=0.247 n=10)
Marshal/ReferenceFile/map-2                  30.49Mi ± 10%   27.61Mi ± 31%  -9.43% (p=0.007 n=10)
Marshal/HugoFrontMatter-2                    42.64Mi ±  1%   42.36Mi ± 18%       ~ (p=0.353 n=10)
geomean                                      26.73Mi         25.53Mi        -4.49%

                                  │ /tmp/tmp.HX8I2C7g6n-v2 │       /tmp/tmp.4y6TDyGRqa-HEAD        │
                                  │          B/op          │     B/op      vs base                 │
UnmarshalDataset/config-2                     5.502Mi ± 0%   5.502Mi ± 0%       ~ (p=0.912 n=10)
UnmarshalDataset/canada-2                     76.31Mi ± 0%   76.31Mi ± 0%       ~ (p=0.567 n=10)
UnmarshalDataset/citm_catalog-2               32.90Mi ± 0%   32.90Mi ± 0%  +0.00% (p=0.019 n=10)
UnmarshalDataset/twitter-2                    12.05Mi ± 0%   12.05Mi ± 0%  +0.01% (p=0.009 n=10)
UnmarshalDataset/code-2                       20.29Mi ± 0%   20.29Mi ± 0%       ~ (p=0.445 n=10)
UnmarshalDataset/example-2                    180.8Ki ± 0%   180.8Ki ± 0%       ~ (p=0.541 n=10)
Unmarshal/SimpleDocument/struct-2               805.0 ± 0%     805.0 ± 0%       ~ (p=1.000 n=10) ¹
Unmarshal/SimpleDocument/map-2                1.106Ki ± 0%   1.106Ki ± 0%       ~ (p=1.000 n=10) ¹
Unmarshal/ReferenceFile/struct-2              19.99Ki ± 0%   19.99Ki ± 0%       ~ (p=1.000 n=10)
Unmarshal/ReferenceFile/map-2                 36.86Ki ± 0%   36.86Ki ± 0%       ~ (p=0.381 n=10)
Unmarshal/HugoFrontMatter-2                   7.267Ki ± 0%   7.267Ki ± 0%       ~ (p=1.000 n=10) ¹
Marshal/SimpleDocument/struct-2                 240.0 ± 0%     240.0 ± 0%       ~ (p=1.000 n=10) ¹
Marshal/SimpleDocument/map-2                    272.0 ± 0%     272.0 ± 0%       ~ (p=1.000 n=10) ¹
Marshal/ReferenceFile/struct-2                27.64Ki ± 0%   27.64Ki ± 0%       ~ (p=1.000 n=10) ¹
Marshal/ReferenceFile/map-2                   27.28Ki ± 0%   27.28Ki ± 0%       ~ (p=1.000 n=10) ¹
Marshal/HugoFrontMatter-2                     5.672Ki ± 0%   5.672Ki ± 0%       ~ (p=1.000 n=10) ¹
geomean                                       74.24Ki        74.24Ki       +0.00%
¹ all samples are equal

                                  │ /tmp/tmp.HX8I2C7g6n-v2 │       /tmp/tmp.4y6TDyGRqa-HEAD       │
                                  │       allocs/op        │  allocs/op   vs base                 │
UnmarshalDataset/config-2                      219.2k ± 0%   219.2k ± 0%       ~ (p=1.000 n=10)
UnmarshalDataset/canada-2                      670.4k ± 0%   670.4k ± 0%       ~ (p=1.000 n=10) ¹
UnmarshalDataset/citm_catalog-2                178.0k ± 0%   178.0k ± 0%  +0.00% (p=0.005 n=10)
UnmarshalDataset/twitter-2                     54.75k ± 0%   54.76k ± 0%  +0.00% (p=0.006 n=10)
UnmarshalDataset/code-2                        1.001M ± 0%   1.001M ± 0%       ~ (p=1.000 n=10) ¹
UnmarshalDataset/example-2                     1.305k ± 0%   1.305k ± 0%       ~ (p=1.000 n=10) ¹
Unmarshal/SimpleDocument/struct-2               9.000 ± 0%    9.000 ± 0%       ~ (p=1.000 n=10) ¹
Unmarshal/SimpleDocument/map-2                  13.00 ± 0%    13.00 ± 0%       ~ (p=1.000 n=10) ¹
Unmarshal/ReferenceFile/struct-2                160.0 ± 0%    160.0 ± 0%       ~ (p=1.000 n=10) ¹
Unmarshal/ReferenceFile/map-2                   619.0 ± 0%    619.0 ± 0%       ~ (p=1.000 n=10) ¹
Unmarshal/HugoFrontMatter-2                     141.0 ± 0%    141.0 ± 0%       ~ (p=1.000 n=10) ¹
Marshal/SimpleDocument/struct-2                 8.000 ± 0%    8.000 ± 0%       ~ (p=1.000 n=10) ¹
Marshal/SimpleDocument/map-2                    9.000 ± 0%    9.000 ± 0%       ~ (p=1.000 n=10) ¹
Marshal/ReferenceFile/struct-2                  317.0 ± 0%    317.0 ± 0%       ~ (p=1.000 n=10) ¹
Marshal/ReferenceFile/map-2                     429.0 ± 0%    429.0 ± 0%       ~ (p=1.000 n=10) ¹
Marshal/HugoFrontMatter-2                       85.00 ± 0%    85.00 ± 0%       ~ (p=1.000 n=10) ¹
geomean                                        1.060k        1.060k       +0.00%
¹ all samples are equal

pkarakal avatar Dec 31 '23 20:12 pkarakal

Thank you for the pull request!

At the moment I think this interface should live in unstable. The argument to UnmarshalTOML is a *unstable.Node, which would bring unstable into the main (stable) toml package. This is also a partial implementation: it does not work compound values like tables and array tables. To support it, it would need to coalesce multiple nodes.

I'm not comfortable bringing in a partial feature with a ~4-5% impact on speed. I'd rather see this behavior as opt-in until it is fully implemented and the overall performance impact is close to <1% when not using the interface. Maybe an unstable.Decoder, which has the same API as toml.Decoder but with an extra EnableUnmarshaler() function or something like that would allow people to opt-in?

pelletier avatar Jan 09 '24 17:01 pelletier

Thank you for the pull request!

At the moment I think this interface should live in unstable. The argument to UnmarshalTOML is a *unstable.Node, which would bring unstable into the main (stable) toml package. This is also a partial implementation: it does not work compound values like tables and array tables. To support it, it would need to coalesce multiple nodes.

I'm not comfortable bringing in a partial feature with a ~4-5% impact on speed. I'd rather see this behavior as opt-in until it is fully implemented and the overall performance impact is close to <1% when not using the interface. Maybe an unstable.Decoder, which has the same API as toml.Decoder but with an extra EnableUnmarshaler() function or something like that would allow people to opt-in?

Thanks for the feedback. I will try to scratch something up, along these lines. In the meantime I will mark this as draft

pkarakal avatar Jan 20 '24 15:01 pkarakal

Re: making the behavior opt-in:

Maybe an unstable.Decoder, which has the same API as toml.Decoder but with an extra EnableUnmarshaler() function or something like that would allow people to opt-in?

As of #923 I've added a precedent to put knobs that should not be considered stable as part of the public v2 API: https://github.com/pelletier/go-toml/blob/2e087bdf5f676bd8c3cf0ea9ed3e8bf1fdc6a72c/marshaler.go#L95-L97

I suspect you would be facing similar issues I did trying to create a unmarshal.Decoder (it would involve moving half the lib to internal/). Don't worry about this, feel free to make it opt-in with a new method on the toml.Decoder to turn the behavior on/off. As long as the method has a disclaimer like SetMarshalJsonNumbers does, I'm happy with it.

pelletier avatar Jan 31 '24 00:01 pelletier

any progress on this?

rszyma avatar Mar 14 '24 21:03 rszyma

I made a PR continues work on this. Addressed the concerns about using unstable.Node #940

rszyma avatar Mar 14 '24 22:03 rszyma

Closing because #940 has been merged. Thank you for your help!

pelletier avatar Mar 19 '24 16:03 pelletier