protovalidate icon indicating copy to clipboard operation
protovalidate copied to clipboard

[BUG] Groups (proto2-only feature) are not consistently handled

Open jhump opened this issue 1 year ago • 3 comments

Groups are not handled uniformly in the various implementations of protovalidate.

In particular, it looks like the Go and Java implementations do not correctly support groups and will skip validating the data inside a group field. However, Python and C++ implementations do not appear to have this defect.

Test cases should likely be added to the conformance tests to verify they work correctly across all implementations.

In particular, both Go and Java implementations only do "embedded message" validation when the field's type is "message". However, group fields will have a type of "group". The Go code uses an thin abstraction over google.protobuf.protobuf.FieldDescriptorProto.Type named protoreflect.Kind. Similarly, the Java code uses a thin abstraction in the Java enum Descriptor.FieldDescriptor.Type. But they both map to the type enum in descriptor.proto, where message and group each have a distinct value (necessary because though they are structurally the same, representing nested messages, they use different wire encoding formats).

  • Relevant Go code: https://github.com/bufbuild/protovalidate-go/blob/main/internal/evaluator/builder.go#L303
  • Relevant Java code: https://github.com/bufbuild/protovalidate-java/blob/main/src/main/java/build/buf/protovalidate/internal/evaluator/EvaluatorBuilder.java#L242

The Python code, however, simply asks if the field descriptor has an associated message type, which it will for both groups and messages. And the C++ code instead looks at the field descriptor's cpp_type(), which will be CPPTYPE_MESSAGE for both messages and groups. So these two implementations appear to correctly support groups. (A new conformance test would be the best way to verify and assert this to be true.)

  • Relevant Python code: https://github.com/bufbuild/protovalidate-python/blob/main/protovalidate/internal/constraints.py#L713
  • Relevant C++ code: https://github.com/bufbuild/protovalidate-cc/blob/main/buf/validate/validator.cc#L44

jhump avatar Oct 24 '23 19:10 jhump

Like PGV before it, PV does not officially support validation against groups (mainly due to their deprecation status that predates both of these projects). We could add support (or document/test for non-support), but over the past 5 years of these two projects, this is actually the first time they've been mentioned 😅.

rodaine avatar Oct 30 '23 17:10 rodaine

I suspect that we'll need to revisit this. While groups are long deprecated, "delimited encoding" is a thing that can be done in Editions files, and I think such fields may look like groups when inspected via the reflection APIs.

jhump avatar Jun 20 '24 15:06 jhump

Shouldn't be too big of a lift! The evaluator will likely not be too different from message fields

rodaine avatar Jun 20 '24 18:06 rodaine