protobuf icon indicating copy to clipboard operation
protobuf copied to clipboard

protoc-gen-go: generate MarshalBinary/UnmarshalBinary methods for messages

Open phifty opened this issue 9 years ago • 19 comments

If the generated code would include the methods MarshalBinary and UnmarshalBinary from Go's stdlib, the corresponding interfaces from the encoding package would be satisfied. Storage/Transmitting-Code could be unified and a dependency to the protobuf package could be removed. The methods would look like:

func (m *Model) MarshalBinary() ([]byte, error) {
    return proto.Marshal(m)
}

func (m *Model) UnmarshalBinary(data []byte) error {
    return proto.Unmarshal(data, m)
}

phifty avatar Aug 02 '16 15:08 phifty

Hi, may we get more information about the use case for this? The encoding.BinaryMarshaler and encoding.BinaryUnmarshaler were intended to be used by data serialization packages. At least in the standard library, only gob uses that interface. Even without Model implementing those interfaces, gob should be able serialize them just fine. Is there some other serialization package that we're not aware of?

dsnet avatar Aug 02 '16 16:08 dsnet

Wow, thanks for the prompt reply. My use case is pretty simple. Let's assume, there is function that stores a given model somewhere. Since we want to separate concerns, it should only store the model and not serialize it. The signature would look like:

func store(key string, model encoding.BinaryMarshaler) error { ... }

This way it takes a model that somehow serialize to a binary format and the function does not have to worry about how this is done. Further the storage-package doesn't have any dependency on the proto, gob or json package, so if the serialization format changes, the storage code stay the same.

Currently, I'm using Protocol Buffers for the models and manually implement the MarshalBinary and UnmarshalBinary methods on them. It would easier to have an option in the .proto file to make the generator create them.

phifty avatar Aug 03 '16 06:08 phifty

Just wondering, wouldn't store be more versatile with the following?

func store(key string, model interface{}) error { ... }

And under the hood, store uses gob? It does add a dependency on gob, but gob is fairly versatile in being able to serialize to binary format nearly any Go data structure, regardless of whether is a BinaryMarshaler or not.

I'm not opposed to adding MarshalBinary/UnmarshalBinary, but I would like to explore all options first.

dsnet avatar Aug 04 '16 18:08 dsnet

I agree that this would work on 90% of the use cases. But what if I have a model that should serialize all of it's fields, except it's Password field? A per-model decision how the serialization is done seems more convenient here. Or if all the existing models has been serialized with gob and the new ones should use the proto package.

To me it's seem like a better design if a store function just stores a thing that can be serialized and don't have an opinion about how this is done.

My other argument is, that I find

func store(key string, model encoding.BinaryMarshaler) error { ... }

a bit more expressive. I can tell what's going to happend just by looking at the function's signature. If interface{} would be allowed, I would not know what happens if a chan or func is passed. I would have to read the function's statements and the gob documentation.

I think implicit implemented interfaces is the most powerful feature of Go and as a consequence the stdlib-interfaces became kind of a common language for libraries. io.Reader had it's success-story and encoding.BinaryMarshaler could have one too ;-)

phifty avatar Aug 05 '16 06:08 phifty

I think there's a valid use case for add encoding.BinaryMarshaler and encoding.BinaryUnmarshaler.

dsnet avatar Aug 05 '16 17:08 dsnet

Great! Thanks for considering the proposal. This would help me a lot.

I've started to implement some helpers for BoltDB that implement such interfaces: https://github.com/simia-tech/boltx

phifty avatar Aug 06 '16 07:08 phifty

To provide further motivation for this 4.5-year-old feature request, gocloud.dev/docstore is another example of a package which is aware of MarshalBinary and UnmarshalBinary methods. Fields on types passed to that API must either be of a primitive type or must implement encoding.BinaryMarshaler or encoding.TextMarshaler.

Note that I am not suggesting that messages should also implement encoding.TextMarshaler (with e.g. textproto), as there's probably a bunch of cases where that would or might result in undesirable behavior, the most obvious being json (unless it also implemented json.Marshaler). I don't believe that encoding.BinaryMarshaler poses similar risks, however.

adam-azarchs avatar Jan 22 '21 03:01 adam-azarchs

This would also benefit Go-Redis, which relies on BinaryMarshaler to marshal values.

atombender avatar Aug 03 '21 13:08 atombender

Came here for the go-redis usecase. We're caching grpc responses in the raw and this would save us some boilerplate code.

Alternatively this could be done trivially w/ a protoc plugin, like this one that implement the encoding/json iface: https://github.com/mitchellh/protoc-gen-go-json

sgtsquiggs avatar Dec 20 '21 15:12 sgtsquiggs

Is this just waiting for a PR, or does an official decision need to be reached?

derekperkins avatar Aug 08 '23 17:08 derekperkins

I went ahead and mailed a commit to add this, open to feedback, but it's pretty simple https://go-review.googlesource.com/c/protobuf/+/517315

derekperkins avatar Aug 08 '23 19:08 derekperkins

Please keep in mind that those methods can easily be provided by an extra protoc plugin and may also provide a different a binary serialisation that isn't compatible with proto3 wire format at all. That would also allow steering via extra options designed for that usecase (e.g. declaring password immission as mentioned above).

Thinking further by relying on encoding.MarshalBinary for protobuf, encoding may accidentally generate a binary serialisation that isn't compatible with the wire format or accidentally omit valuable data like the password in aboves example.

So I am not sure that is needed for the task of generating models usable for protobuf serialisation/deserialisation.

So I have a hard time understanding why that change is in the scope of this tool.

nightlyone avatar Aug 09 '23 06:08 nightlyone

I would strongly suggest that this be performed by an additional protoc plugin, rather than working it into the official library. As mentioned in the review of the change request, “The problem is that this leads to a non-trivial binary size increase.”

puellanivis avatar Aug 09 '23 10:08 puellanivis

This adds a dependency on the proto package from the generated packages. I don't think we currently have that dependency link. I'm not sure what the implications of adding it are. Might be fine, anything that imports generated proto packages probably imports proto somewhere.

This could also break existing code that depends on these methods not existing. I'm not sure how plausible that is.

Neither point is necessarily an argument against the change, but something to consider.

You could probably use m.ProtoReflect().ProtoMethods().Marshal(...) instead of proto.Marshal (since the generated message knows that it supports the fast path), which would avoid the proto dependency and might be a bit faster. Not sure if that's worth it.

neild avatar Aug 09 '23 15:08 neild