encoding/{protojson,prototext}: add MarshalWriter/UnmarshalReader functionality
JSON serialization is very commonly used in HTTP servers, where use of net/http means that the user typically starts with an io.Reader and io.Writer on hand. As such, it is slightly inconvenient to use protojson, since it operates primarily on []byte.
Perhaps we should add a MarshalWriter that takes an io.Writer as input and an UnmarshalReader that takes an io.Reader as input.
Consideration needs to be given to:
- #609, which considers adding APIs to the proto package to support streaming read/write and/or scatter/gather functionality.
- https://github.com/golang/go/issues/36225, where the typical pattern of doing json.NewDecoder(r).Decode(...) is subtly buggy since it permits trailing trash data to go undetected.
(We would probably do the equivalent change to prototext, since the two packages have nearly identical API surfaces despite having obviously different semantic behavior.)
Yeah, the subtly buggy behavior has come up a couple of times in some of the code I've maintained. One test even required the buggy behavior in order to pass, because of a typoed extra %s at the end of the format string in the test, resulting in something like {"valid":"json"}%!s(MISSING) being the test input.
So, it would be better if we didn't repeat this behavior; that is, we would want to avoid using a parallel json.NewDecoder(r).Decode(...) type of form. Or maybe it is better to match the form but not the buggy unexpected behavior, because it's so unexpected? But then, maybe we end up making the unexpected behavior all the more unexpected?
:thinking: I think the best choice might be to stay away from the pattern after all, so that we reinforce neither the unexpected behavior nor the unexpectedness of the behavior.
It seems that part of this is figuring out what we want to do with encoding/json related to this issue. It's been on my TODO list to think more heavily about golang/go#36225 (and many other encoding/json issues discovered over the years).
> One test even required the buggy behavior in order to pass, because of a typoed extra %s at the end of the format string in the test, resulting in something like {"valid":"json"}%!s(MISSING) being the test input.
😲
How is it suggested to use protojson with streaming data? Use encoding/json, like:

var data interface{}
d := json.NewDecoder(os.Stdin)
_ = d.Decode(&data)
b, _ := json.Marshal(data)
_ = protojson.Unmarshal(b, &m)
It would be simpler as:
b, err := io.ReadAll(os.Stdin)
... // handle err
err = protojson.Unmarshal(b, &m)
... // handle err
The problem with ioutil.ReadAll is that it doesn't allow for a stream, and requires an EOF to stop reading:

> ReadAll reads from r until an error or EOF and returns the data it read. A successful call returns err == nil, not err == EOF. Because ReadAll is defined to read from src until EOF, it does not treat an EOF from Read as an error to be reported.
Whereas previously I could decode from a stream, or even consecutive messages like:
{}
{}
If you need to decode a JSON stream, then something similar to what you suggested is the right way to go. You can actually do it more efficiently with json.RawMessage:
var b json.RawMessage
dec := json.NewDecoder(os.Stdin)
err := dec.Decode(&b)
... // check err
err = protojson.Unmarshal(b, &m)
... // check err
BTW, parsing a JSON stream is actually different from what this issue is about, which is adding support for parsing JSON from an I/O stream.