Extend the canonical json encoding spec for maps to allow the "non-shortcut" structure
Protobuf allows for defining map fields using a shortcut notation: map<key_type, value_type> map_field = N;. It is mentioned in the language guide that the on-wire format for this is designed in such a way that it is backwards compatible with the following structure:
message MapFieldEntry {
key_type key = 1;
value_type value = 2;
}
repeated MapFieldEntry map_field = N;
It is explicitly mentioned that: "Any protocol buffers implementation that supports maps must both produce and accept data that can be accepted by the above definition."
Unfortunately this is currently not the case when using the canonical json encoding. That is: if an old service sends the non-shortcut map structure to a service that uses the shortcut notation instead, it will break.
This problem is best described with some code. It is tested in Go, not sure about other languages. Given the following proto definitions:
syntax = "proto3";
package examples.v1;
// ShortcutMap message use the map shortcut for our map field
message ShortcutMap {
map<string, string> map_field = 1;
}
// LegacyMap uses the legacy setup to describe a map
message LegacyMap {
// We repeat entries to make a map
message MapFieldEntry {
string key = 1;
string value = 2;
}
repeated MapFieldEntry map_field = 1;
}
And the following Go code:
package main
import (
"fmt"
examplesv1 "github.com/crewlinker/some-internal-protobuf-go-dir/v1"
"google.golang.org/protobuf/encoding/protojson"
"google.golang.org/protobuf/proto"
)
func main() {
// send messages from our legacy code
send1 := &examplesv1.LegacyMap{MapField: []*examplesv1.LegacyMap_MapFieldEntry{{Key: "foo", Value: "bar"}}}
json1, _ := protojson.Marshal(send1)
fmt.Printf("legacy json: %s\n", json1)
proto1, _ := proto.Marshal(send1)
fmt.Printf("legacy wire: %x\n", proto1)
// receive messages in our new code
var rcv1, rcv2 examplesv1.ShortcutMap
proto.Unmarshal(proto1, &rcv1)
fmt.Println("receive 1:", rcv1.MapField) // prints the map as expected
protojson.Unmarshal(json1, &rcv2) // proto: syntax error (line 1:13): unexpected token [
fmt.Println("receive 2:", rcv2.MapField) // prints an empty map (because it couldn't parse)
}
It shows that the Json encoded message will fail to be received while the binary wire format is received successfully.
What language does this apply to?
This applies to any version of the proto spec that supports the shorthand map<T,T> notation.
Describe the problem you are trying to solve. The summary above describes the problem shortly. For a full description of my use case please check the following issue in the Go project: https://github.com/golang/protobuf/issues/1511#issuecomment-1374423995
Describe the solution you'd like
Ideally, I would like to change the map<K,V> specification for protobuf's canonical json encoding.
Similar to how a float may be encoded as a number or a string. A map should allow encoding as {"k":"v"} OR as [{"key": "k", "value":"v"}].
I'm aware that this is a big change. But as far as I understand it can be added transparently without breaking existing messages. Json decoders in various languages can first check for the current encoding, if it fails it can check for the "non-shortcut" encoding format.
Describe alternatives you've considered Please see the alternatives in the issue linked above. They are specific to the go implementation.