protovalidate icon indicating copy to clipboard operation
protovalidate copied to clipboard

[Question] required behavioral differences between PGV and PV

Open ash2k opened this issue 1 year ago • 13 comments

Description

I'm migrating from PGV and used the migration tool. Source proto:

message AccessAsProxyAuthorization {
  oneof access_as {
    option (validate.required) = true;
    AccessAsAgentAuthorization agent = 1 [(validate.rules).message.required = true];
    AccessAsUserAuthorization user = 2 [(validate.rules).message.required = true];
  }
}

After automated migration:

message AccessAsProxyAuthorization {
  oneof access_as {
    option (buf.validate.oneof).required = true;
    AccessAsAgentAuthorization agent = 1 [(buf.validate.field).required = true];
    AccessAsUserAuthorization user = 2 [(buf.validate.field).required = true];
  }
}

protovalidate seem to validate field rules of all fields inside of oneof. What's the point? Only the set field should be validated (if one is set, which may not be required). Currently I cannot express what was expressed before, can I?

Steps to Reproduce

This is Go-specific, as that's what I'm using. Try validating the following message:

	x := &AccessAsProxyAuthorization{
		AccessAs: &AccessAsProxyAuthorization_User{
			User: &AccessAsUserAuthorization{},
		},
	}

	v, err := protovalidate.New()
	require.NoError(t, err)

	err = v.Validate(x)
	require.NoError(t, err)

This works fine/as expected with PGV.

Expected Behavior

No error.

Actual Behavior

validation error:
- agent: value is required [required]

Screenshots/Logs

N/A

Environment

  • Operating System: macOS
  • Version: 14.5
  • Compiler/Toolchain: Go 1.22.5
  • Protobuf Compiler & Version: protoc v5.27.2, protoc-gen-go v1.34.2
  • Protoc-gen-validate Version: N/A
  • Protovalidate Version: protovalidate-go v0.6.3

Possible Solution

Only validate the field that is set inside of oneof, don't validate other fields (they are not set, should not be used anyway).

Additional Context

I found this proto message in the conformance suite:

https://github.com/bufbuild/protovalidate/blob/cbbac69a78bee5821eddc3299f55513dd842429f/proto/protovalidate-testing/buf/validate/conformance/cases/oneofs.proto#L51-L58

And this test:

https://github.com/bufbuild/protovalidate/blob/cbbac69a78bee5821eddc3299f55513dd842429f/tools/protovalidate-conformance/internal/cases/cases_oneof.go#L86-L91

I wonder what is the point here? Current implementation/semantics make it impossible to use a oneof with any other field, apart from the one that is required=true. I don't see how this is useful at all.

I could remove required = true from both fields, but then the test below would pass, but that's not what I want. I want the validator to catch the missing field!

	x := &AccessAsProxyAuthorization{
		AccessAs: &AccessAsProxyAuthorization_User{
			// User field is not set, code will panic with NPE
		},
	}

	v, err := protovalidate.New()
	require.NoError(t, err)

	err = v.Validate(x)
	require.NoError(t, err)

ash2k avatar Jul 28 '24 07:07 ash2k

Actually, this is not limited to required. It applies to all validation in oneof as all fields are validated, even the unset ones. It's impossible to have validation on fields in oneof where the default value does not pass validation requirements.

ash2k avatar Jul 28 '24 11:07 ash2k

Hi @ash2k! Thanks for the detailed report. You're running up against a few pathological cases here.

I think you're adding required to the fields within your oneof because you want the semantics "if this field is the set field in the oneof, it cannot be nil." This will be true regardless in a serialized proto; said another way, there is no way to have a "set" nil field in a oneof. Now, Go might allow you to construct a proto message with the oneof set to a nil value, but will be treated as the empty value regardless passing through serialization:

msg := &AccessAsProxyAuthorization{ 
  AccessAs: &AccessAsProxyAuthorization_AccessAsAgentAuthorization{ AccessAsAgentAuthorization: nil } }, 
}
data, _ := proto.Marshal(msg)
out := &cases.Oneof{}
_ = proto.Unmarshal(data, out)
fmt.Println(prototext.Format(out))

// output:
// z: {}

I recommend removing the required's on the field unless you actually want to require one of those fields (more on that below).

Next, due to inconsistencies of PGV's behaviors around required (and ignore_empty), we have explicitly defined the semantics of it:

  // If `required` is true, the field must be populated. A populated field can be
  // described as "serialized in the wire format," which includes:
  //
  // - the following "nullable" fields must be explicitly set to be considered populated:
  //   - singular message fields (whose fields may be unpopulated/default values)
  //   - member fields of a oneof (may be their default value)
  //   - proto3 optional fields (may be their default value)
  //   - proto2 scalar fields (both optional and required)
  // - proto3 scalar fields must be non-zero to be considered populated
  // - repeated and map fields must be non-empty to be considered populated

Your claim "all fields in a oneof are validated" is not true. There are plenty of tests in the conformance suite that would fail if that were the case. It is just behaviorally different than other rules since it applies regardless of if the field is set or not (i.e., it would be contradictory to ignore a required rule if the field is unset).

Now, regarding the oneof conformance test labeled required/required_field/wrong_field, you are asking what the point is. There could be (and have been) cases where a oneof has existed, but only one of the choices is viable for whatever reason. Without making a breaking change to the proto, one can add required to the only viable field in the oneof and enforce that at validation time.

For your example, I'd suggest removing both required annotations from the fields, keeping the required on the oneof. Constructing a oneof message field with a nil value is technically an invalid construction (and a nuanced limitation of Go's type system).


All that said, I do think you've surfaces two interesting things we can make a bit better:

  • Verify that buf's lint rules for protovalidate catch the case where two fields in a oneof have required (cc @oliversun9). We already have a check that considers any required on oneof fields as bad, but if that's overridden, we should definitely still error if more than one field has required.

  • ~~We can make protovalidate-go not panic for the misconstructed oneof field. I'll open an issue to ameliorate that.~~ See https://github.com/bufbuild/protovalidate/issues/228#issuecomment-2254574189

rodaine avatar Jul 28 '24 16:07 rodaine

We can make protovalidate-go not panic for the misconstructed oneof field. I'll open an issue to ameliorate that.

I actually cannot reproduce this. Using the OneOf from the conformance tests, this test passes on protovalidate HEAD:

func TestValidator_MisconstructedOneof(t *testing.T) {
	t.Parallel()
	val, err := New()
	require.NoError(t, err)
	msg := &cases.Oneof{
		// explicitly set the message field to nil
		// message requires that its field is set to true
		O: &cases.Oneof_Z{Z: nil},
	}
	require.NotPanics(t, func() { err = val.Validate(msg) })
	valErr := &ValidationError{}
	require.ErrorAs(t, err, &valErr)
	assert.Equal(t, "bool.const", valErr.Violations[0].GetConstraintId())
}

rodaine avatar Jul 28 '24 16:07 rodaine

This will be true regardless in a serialized proto; said another way, there is no way to have a "set" nil field in a oneof. Now, Go might allow you to construct a proto message with the oneof set to a nil value, but will be treated as the empty value regardless passing through serialization:

Ok, that's what I suspected. However, I think it might still be valuable to somehow support this. I don't insist though, it's fine if that's not supported. The use case for this is validating constructed messages e.g. before marshaling them and sending/persisting/etc them. This is probably Go-specific (I don't know about other languages).

I recommend removing the required's on the field

Yes, I'm going to do that. Thanks!

Your claim "all fields in a oneof are validated" is not true.

Yes, sorry. I must have misread the code. I've just tested two int64 fields with validation in a oneof and it works as expected 👍

Verify that buf's lint rules for protovalidate catch the case where two fields in a oneof have required

Yes, that would have helped a lot. Perhaps the migration tool should print a link to a doc or explain what's going on and what is the difference vs PGV.

Thank you!

ash2k avatar Jul 29 '24 03:07 ash2k

This is probably Go-specific (I don't know about other languages).

This doesn't apply in most other languages, which use an "opaque" API for setting fields via setter methods instead of direct struct value manipulation. In these cases, you cannot set a oneof to an incomplete value like in your example. (It's possible that protobuf-es has similar issues since it also allows direct object access instead of requiring the use of accessor/mutator methods.)

There are related issues that might apply to some other languages though, such as a map<string,Message> field where the corresponding Go map[string]*Message map has a nil message value in it or a repeated Message field where the corresponding Go []*Message contains a nil element. Other languages (like Python I think) allow for using list and dictionary literals for such fields, which could lead to similar issues as the one you mention with oneofs.

We could consider adding add'l validation rules that are always enforced, i.e. not triggered via a custom option, because they are always indicative of a malformed message. Such messages can't round trip correctly, meaning that serializing them to bytes and then deserializing those bytes will result in a different message (with non-nil, empty messages in place of the nils).

jhump avatar Jul 29 '24 14:07 jhump

If a nil message is ignored by proto.Marshal and protojson.Marshal, maybe it would be least surprising to also ignore in it in protovalidate?

timostamm avatar Jul 29 '24 15:07 timostamm

maybe it would be least surprising to also ignore in it in protovalidate

It's not that it's ignored by serialization but that it gets quietly transformed into an empty message. This could be a source of surprise/confusion and is very likely to be a bug in the code that constructed the message. Perhaps protovalidate isn't the right place for that sort of check, but it seems like it would be nice to provide, at least on an opt-in basis, and also seems like what the user was trying to achieve 🤷

jhump avatar Jul 29 '24 16:07 jhump

However, I think it might still be valuable to somehow support this.

@ash2k For all the protovalidate implementations so far, we operate on the proto reflection representation of the message which hides details about misconstructions. The state of the original concrete message is opaque to protovalidate; in particular, protovalidate-go is not presented a message field in a oneof "set to nil," it's presented as set to the empty message.

What this means is, without using Go's reflection utilities prior to the rest of the protovalidate logic, we'd currently have no feasible way of checking for these misconstructions that the protobuf library handles fine without panicking, though user-level code might still panic if not using the Get* accessors diligently. I don't think the juice is worth the proverbial squeeze, particularly when I cannot reproduce any real misbehavior, panics or otherwise.

Perhaps the migration tool should print a link to a doc or explain what's going on and what is the difference vs PGV.

@ash2k The migration tool operates at the AST level, so has no sense of messages as a whole, but we can expand the existing migration tool docs to at least point folks to review the required documentation to make sure the semantics match expectations.

There are related issues that might apply to some other languages though

@jhump Testing at least repeated fields with nil values in Go, protovalidate handles this case fine, as the proto reflect library also treats these as the empty messages. I suspect maps would behave the same, too. Ultimately, it's up to the protobuf library to decide what is considered valid, and if it's not resulting in a panic or some other unrecoverable error in the reflect library, protovalidate is bound to follow. And if there's any question, the following passes (and passes through protovalidate cleanly):

	msg := &pb.CelMapOnARepeated{Values: []*pb.CelMapOnARepeated_Value{
		{Name: "foo"},
		nil,
	}}
	require.True(t, msg.ProtoReflect().IsValid())

That said, I do think the libraries can be explicit about what happens in these cases on a per implementation basis. Go's behavior feels consistent to me, and I suspect all the upb based runtimes also share similar behavior.

If a nil message is ignored by proto.Marshal and protojson.Marshal, maybe it would be least surprising to also ignore in it in protovalidate?

@timostamm Not sure what you mean by ignored. A nil message in a set oneof field, repeated, and (assuming) map value are treated by all the marshaling code in Go as an empty message. It's only "ignored" on singular message fields (which is the expected behavior). Protovalidate cannot even differentiate in the former cases as it's operating against the proto reflect representation of the message (which shows these fields as empty and not nil).

rodaine avatar Jul 29 '24 16:07 rodaine

The state of the original concrete message is opaque to protovalidate

@rodaine, this isn't entirely true: https://go.dev/play/p/OeTPMpkWgz_p

the following passes (and passes through protovalidate cleanly):

The IsValid() method does not perform a deep validity check. In practice, it generally only checks that the underlying pointer is not nil. If you take that example a step further, you'll see that the nil element inside the slice is invalid:

	msg := &pb.CelMapOnARepeated{Values: []*pb.CelMapOnARepeated_Value{
		{Name: "foo"},
		nil,
	}}
	repeatedFieldDescriptor := msg.ProtoReflect().Descriptor().Fields().ByName("values")
	require.True(t, msg.ProtoReflect().
		Get(repeatedFieldDescriptor).
		List().
		Get(1).
		Message().
		IsValid()) // Boom

(related: https://go.dev/play/p/OeTPMpkWgz_p)

we'd currently have no feasible way of checking for these misconstructions

As demonstrated above, at least in Go, it is feasible.

I don't think the juice is worth the proverbial squeeze

This seems like a reasonable take, too. While I can see value in some sort of validation to check for these sorts of misconstructed messages, I'm also happy to agree that protovalidate may not be the place for it, since it is really intended to provide custom validation. A check for a misconstructed message (with invalid/nil interior values that should not be nil) probably instead belongs in the core runtime.

jhump avatar Jul 29 '24 17:07 jhump

Oh, good catch! Interestingly it's opaque with get access (which is the only way we interact with it in protovalidate), even if it is invalid. Querying every field and value would be silly though. I agree this is better handled by the runtime library.

That said, adding the IsValid check everywhere wouldn't be too difficult, just adds overhead that in the vast majority of cases is not relevant.

rodaine avatar Jul 29 '24 17:07 rodaine

BTW, I just filed https://github.com/golang/protobuf/issues/1634

jhump avatar Jul 30 '24 00:07 jhump

Verify that buf's lint rules for protovalidate catch the case where two fields in a oneof have required

Yep, I just tested this. Running buf lint with the DEFAULT category with a relatively recent buf (v1.28.0 or newer) will catch these, regardless of how many fields in the oneof have required set.

With the access authorization example, you would get something like:

proto/a.proto:16:43:Field "agent" has (buf.validate.field).required but is in a oneof (access_as). Oneof fields must not have (buf.validate.field).required.
proto/a.proto:17:41:Field "user" has (buf.validate.field).required but is in a oneof (access_as). Oneof fields must not have (buf.validate.field).required.

oliversun9 avatar Jul 30 '24 20:07 oliversun9

It's not that it's ignored by serialization but that it gets quietly transformed into an empty message.

A nil message in a set oneof field, repeated, and (assuming) map value are treated by all the marshaling code in Go as an empty message.

Thanks for the context, I misremembered how oneof is represented :facepalm:

We will run into related situations with Protobuf-ES. For example, ECMAScript Number is used to represent several numeric Protobuf types, so it's possible to set the value -1 for a uint32 field. It's impossible to catch at compile time, and raises an error when serializing. This is similar to protobuf-go with invalid UTF-8 in a string field.

I guess most implementations will have similar cases. Maybe it should be left to the protovalidate implementation to offer additional validation for misconstructed messages at runtime (provided that the Protobuf runtime makes this feasible)?

timostamm avatar Jul 31 '24 15:07 timostamm