protobuf.js

De-serialization of unknown enum values in toObject is inconsistent

Open henrinormak opened this issue 3 years ago • 0 comments

protobuf.js version: 6.11.2

When the enums: String option is provided, generated toObject functions should output the string name of enum value 0 whenever they encounter an unknown enum value.

Given this proto3 type:

syntax = "proto3";

package company;

message User {
  enum Role {
    UNKNOWN_ROLE = 0;
    ADMIN = 1;
    REGULAR_USER = 2;
  }

  uint32 user_id = 1;
  Role role = 2;
}

The generated toObject looks something like this:

User.toObject = function toObject(message, options) {
    if (!options)
        options = {};
    var object = {};
    if (options.defaults) {
        object.user_id = 0;
        object.role = options.enums === String ? "UNKNOWN_ROLE" : 0;
    }
    if (message.user_id != null && message.hasOwnProperty("user_id"))
        object.user_id = message.user_id;
    if (message.role != null && message.hasOwnProperty("role"))
        object.role = options.enums === String ? $root.company.User.Role[message.role] : message.role;
    return object;
};

The following holds:

  • If no value is sent for role at all, UNKNOWN_ROLE is set when options.defaults is true. This is correct and expected.
  • If a value is sent for role but is not present in the enum, the output depends on options.enums. With options.enums === String the output is undefined; with options.enums !== String the raw unknown value is returned. For example, if the value on the wire is 3, the output is undefined with options.enums === String and 3 otherwise.
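The second case can be reproduced without the library. The following is a minimal sketch: Role is a hand-written stand-in that mimics the shape of the generated enum object (names map to numbers, and numbers map back to names), and roleToObject is a hypothetical function that copies only the enum branch of the generated toObject shown above.

```javascript
// Hand-written stand-in for the generated enum (an assumption for this
// sketch): names map to numbers and, via the prototype, numbers map to names.
var Role = (function () {
    var valuesById = {}, values = Object.create(valuesById);
    values[valuesById[0] = "UNKNOWN_ROLE"] = 0;
    values[valuesById[1] = "ADMIN"] = 1;
    values[valuesById[2] = "REGULAR_USER"] = 2;
    return values;
})();

// Hypothetical helper that mirrors only the enum branch of User.toObject.
function roleToObject(message, options) {
    options = options || {};
    var object = {};
    if (message.role != null && Object.prototype.hasOwnProperty.call(message, "role"))
        object.role = options.enums === String ? Role[message.role] : message.role;
    return object;
}

// A decoded message whose role value (3) is not declared in Role.
var decoded = { user_id: 1, role: 3 };

var asString = roleToObject(decoded, { enums: String });
var asNumber = roleToObject(decoded, {});

console.log(asString.role); // undefined: Role[3] has no name
console.log(asNumber.role); // 3: the raw value passes through
```

The lookup Role[3] finds no name, so the string branch yields undefined, while the numeric branch never consults the enum at all.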

This suggests that, in the terms of the proto3 docs, JS is treated as a language with open enum types when options.enums !== String. When options.enums === String, however, the behaviour is neither that of a language with open enums nor that of one with closed enums.

During deserialization, unrecognized enum values will be preserved in the message, though how this is represented when the message is deserialized is language-dependent. In languages that support open enum types with values outside the range of specified symbols, such as C++ and Go, the unknown enum value is simply stored as its underlying integer representation. In languages with closed enum types such as Java, a case in the enum is used to represent an unrecognized value, and the underlying integer can be accessed with special accessors.

Excerpt from the proto3 docs

The expected outcome is that when options.enums === String is passed to toObject, an unknown value falls back to the name of the 0 case, ideally with a way to read the raw value as well (via some special accessor).
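A rough sketch of that fallback, written against a hand-written stand-in for the generated enum rather than the library itself. The names enumName and roleToObjectProposed are hypothetical and are not part of protobuf.js; they only illustrate the requested behaviour.

```javascript
// Hand-written stand-in for the generated enum (an assumption for this sketch).
var Role = (function () {
    var valuesById = {}, values = Object.create(valuesById);
    values[valuesById[0] = "UNKNOWN_ROLE"] = 0;
    values[valuesById[1] = "ADMIN"] = 1;
    values[valuesById[2] = "REGULAR_USER"] = 2;
    return values;
})();

// Hypothetical helper: resolve a numeric value to its enum name and fall
// back to the name of value 0 when the number is not declared.
function enumName(enumObject, value) {
    var name = enumObject[value];
    return name !== undefined ? name : enumObject[0];
}

// Hypothetical variant of the enum branch of toObject with the fallback applied.
function roleToObjectProposed(message, options) {
    options = options || {};
    var object = {};
    if (message.role != null && Object.prototype.hasOwnProperty.call(message, "role"))
        object.role = options.enums === String ? enumName(Role, message.role) : message.role;
    return object;
}

var decoded = { user_id: 1, role: 3 }; // 3 is not declared in Role

var fallback = roleToObjectProposed(decoded, { enums: String });
var known = roleToObjectProposed({ user_id: 2, role: 1 }, { enums: String });

console.log(fallback.role); // "UNKNOWN_ROLE"
console.log(known.role);    // "ADMIN"
console.log(decoded.role);  // 3
```

In this sketch the raw value stays readable on the message object itself, which is one possible reading of the "special accessor" idea; the issue does not specify what such an accessor should look like.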

This change could be considered breaking: with options.enums === String, it changes what toObject returns for unknown enum values from undefined to the name of the default (0) case.

henrinormak, Mar 17 '22 08:03