encoding/protojson: poor serialization performance
Benchmark from https://github.com/tonybase/benchmark/tree/main/benchmark-protobuf-json:
➜ benchmark-protobuf-json git:(main) go test -bench=. -benchmem
goos: darwin
goarch: amd64
pkg: github.com/tonybase/benchmark/benchmark-protobuf-json
cpu: Intel(R) Core(TM) i7-8557U CPU @ 1.70GHz
BenchmarkOrderProto3Marshal-8     3230240     352.5 ns/op    112 B/op     1 allocs/op
BenchmarkOrderJSONMarshal-8       1411366     858.9 ns/op    240 B/op     1 allocs/op
BenchmarkOrderJSONPBMarshal-8      259860      4432 ns/op   1016 B/op    29 allocs/op
BenchmarkOrderProto3Unmarshal-8   1879772     630.8 ns/op    368 B/op    11 allocs/op
BenchmarkOrderJSONUnmarshal-8      288832      3992 ns/op    696 B/op    20 allocs/op
BenchmarkOrderJSONPBUnmarshal-8    181722      6217 ns/op   1088 B/op    56 allocs/op
PASS
ok github.com/tonybase/benchmark/benchmark-protobuf-json 9.430s
Serialization performance doesn't appear to be well optimized, and the Encoder doesn't use anything like the standard library's encodeStatePool to reduce object allocations.
The protojson package uses its own implementation of JSON since it needs to comply with a stricter version of JSON (RFC 7493) than what encoding/json supports (RFC 7159). The internal implementation does not benefit from years of optimizations that have been added to the standard library. There's some work going on to investigate major changes to the encoding/json package which may allow us to have native RFC 7159 support, in which case we will delete the internal library and use the standard one instead (and benefit from its optimizations).
The main reason encoding/json is not used is that google/protobuf/duration.proto, int64 (serialized as a string), and oneof marshaling are inconsistent with standard library JSON, and there is no good workaround.
https://developers.google.com/protocol-buffers/docs/proto3#json
I want to use encoding/json. Can I implement the MarshalJSON and UnmarshalJSON interfaces for the well-known types?
https://github.com/protocolbuffers/protobuf-go/tree/master/types/known
The main reason encoding/json is not used is that google/protobuf/duration.proto, int64 (serialized as a string), and oneof marshaling are inconsistent with standard library JSON, and there is no good workaround.
That's not true. There are many reasons why encoding/json is not used, including the representation of oneofs, well-known types, and so forth.
related: protocolbuffers/protobuf#8331
Let's keep this topic about improving performance. Trying to serialize int64 as a JSON number is an unrelated concern.
Another potential performance optimization is to allow disabling field sorting. Any/random order is fine in most situations.
Are there any current plans for this optimization?
Hi, as far as I know the issue tracker for both this and the newer protobuf-go repo continues to be this one; let me know if that's not the case.
I've submitted a PR to improve marshaling performance at https://go-review.googlesource.com/c/protobuf/+/373356
name                         old time/op    new time/op    delta
Marshal_ScalarsUnordered-8   3.49µs ± 8%    2.06µs ± 1%    -40.84%  (p=0.000 n=29+27)
Marshal_Scalars-8            4.18µs ± 7%    2.80µs ± 1%    -32.94%  (p=0.000 n=29+30)

name                         old alloc/op   new alloc/op   delta
Marshal_ScalarsUnordered-8   1.47kB ± 0%    0.14kB ± 0%    -90.76%  (p=0.000 n=30+30)
Marshal_Scalars-8            1.63kB ± 0%    0.30kB ± 0%    -81.87%  (p=0.000 n=30+30)

name                         old allocs/op  new allocs/op  delta
Marshal_ScalarsUnordered-8   29.0 ± 0%      4.0 ± 0%       -86.21%  (p=0.000 n=30+30)
Marshal_Scalars-8            34.0 ± 0%      9.0 ± 0%       -73.53%  (p=0.000 n=30+30)
Boring details: I added an 'Unordered' marshal option to the old implementation so I could sample both unordered and ordered benchmarks.
With benchstat, try running with -count=10 to get ten samples each; the delta column will then tell you how much better or worse each metric is, and how confident that estimate is. You'll notice all of these have a p-value of 1, which is not particularly convincing on its own.
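For reference, the suggested workflow could look like the following; the file names old.txt/new.txt are illustrative, and benchstat comes from golang.org/x/perf/cmd/benchstat:

```shell
# On the baseline (before the CL): collect ten samples per benchmark.
go test -bench=Marshal -count=10 . | tee old.txt

# Apply the CL, then collect ten samples of the new code.
go test -bench=Marshal -count=10 . | tee new.txt

# Compare; benchstat reports deltas with p-values and sample counts.
benchstat old.txt new.txt
```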
updated
Going back to this in a bit; on vacation right now.