protoc-gen-validate Python: Support for `validate

This change implements the validate_all function through the ast module in the python standard library, which allows PGV to return all validate error messages at once.

//entities.proto
message Person {
  uint64 id    = 1 [(validate.rules).uint64.gt    = 999];

  string email = 2 [(validate.rules).string.email = true];

  string name  = 3 [(validate.rules).string = {
                      pattern:   "^[^[0-9]A-Za-z]+( [^[0-9]A-Za-z]+)*$",
                      max_bytes: 256,
                   }];

  Location home = 4 [(validate.rules).message.required = true];

  message Location {
    double lat = 1 [(validate.rules).double = { gte: -90,  lte: 90 }];
    double lng = 2 [(validate.rules).double = { gte: -180, lte: 180 }];
  }
}

from entities_pb2 import Person
from protoc_gen_validate.validator import validate, ValidationFailed, print_validate, validate_all


p = Person(id=1000, name="Foo", email="[email protected]", home=Person.Location(lat=0, lng=-999))
try:
    validate(p)
except ValidationFailed as err:
    print("validate(p):", err)
print("=" * 79)
try:
    validate_all(p)
except ValidationFailed as err:
    print("validate_all(p):", err)

print("=" * 79)
print(print_validate())

Result:

validate(p): p.email is not a valid email
===============================================================================
validate_all(p): 
p.email is not a valid email
p.name pattern does not match ^[^[0-9]A-Za-z]+( [^[0-9]A-Za-z]+)*$
p.lng is not in range (180.0, -180.0)
===============================================================================
# Validates Person
def generate_validate(p):
    if p.id <= 999:
        raise ValidationFailed("p.id is not greater than 999")
    if not _validateEmail(p.email):
        raise ValidationFailed("p.email is not a valid email")
    if byte_len(p.name) > 256:
        raise ValidationFailed("p.name length is greater than 256")
    if re.search(r'^[^[0-9]A-Za-z]+( [^[0-9]A-Za-z]+)*$', p.name) is None:
        raise ValidationFailed("p.name pattern does not match ^[^[0-9]A-Za-z]+( [^[0-9]A-Za-z]+)*$")
    if not _has_field(p, "home"):
        raise ValidationFailed("home is required.")
    if _has_field(p, "home"):
        embedded = validate(p.home)
        if embedded is not None:
            return embedded
    return None
# Validates Person All
def generate_validate_all(p):
    err = ''
    if p.id <= 999:
        err += '\np.id is not greater than 999'
    if not _validateEmail(p.email):
        err += '\np.email is not a valid email'
    if byte_len(p.name) > 256:
        err += '\np.name length is greater than 256'
    if re.search('^[^[0-9]A-Za-z]+( [^[0-9]A-Za-z]+)*$', p.name) is None:
        err += '\np.name pattern does not match ^[^[0-9]A-Za-z]+( [^[0-9]A-Za-z]+)*$'
    if not _has_field(p, 'home'):
        err += '\nhome is required.'
    if _has_field(p, 'home'):
        err += _validate_all(p.home)
    return err
# Validates Location All
def generate_validate_all(p):
    err = ''
    if p.lat < -90.0 or p.lat > 90.0:
        err += '\np.lat is not in range (90.0, -90.0)'
    if p.lng < -180.0 or p.lng > 180.0:
        err += '\np.lng is not in range (180.0, -180.0)'
    return err

Jun 21 '22 11:06 HaloWorld

Hi team. It Looks like the May update of Protocol Buffers (Changes made in May 2022) broke python tests in ci, according to the description:

Version 4.21.0 is a new major version, following 3.20.1. The new version is based on the upb library

Python upb requires generated code that has been generated from protoc 3.19.0 or newer.

Maybe set the python protobuf dependency version to protobuf>=3.6.1,<3.20?

Jul 12 '22 04:07 HaloWorld

All committers have signed the CLA.

Oct 06 '22 02:10 CLAassistant

A check for validate_all has been added to harness, which I have tested locally and passed, and currently has two checks:

Ensure that the result of validate_all is the same as validate
When validation fails, ensure that the first error message of validate_all is the same as the error message of validate

Also, since the ast.unparse function was not officially introduced until python 3.9, I had to bring in a third-party library (astunparse) to ensure compatibility.

Oct 07 '22 03:10 HaloWorld

Here is an example of the functions currently generated by validate_all.

proto(tests/harness/cases/kitchen_sink.proto)

syntax = "proto3";

package tests.harness.cases;
option go_package = "github.com/envoyproxy/protoc-gen-validate/tests/harness/cases/go;cases";
import "validate/validate.proto";

import "google/protobuf/wrappers.proto";
import "google/protobuf/duration.proto";
import "google/protobuf/timestamp.proto";
import "google/protobuf/any.proto";

enum ComplexTestEnum {
    ComplexZero = 0;
    ComplexONE  = 1;
    ComplexTWO  = 2;
}

message ComplexTestMsg {
    string                              const  = 1 [(validate.rules).string.const = "abcd"];
    ComplexTestMsg                      nested = 2;
    int32                               int_const = 3 [(validate.rules).int32.const = 5];
    bool                                bool_const = 4 [(validate.rules).bool.const = false];
    google.protobuf.FloatValue          float_val = 5 [(validate.rules).float.gt = 0];
    google.protobuf.Duration            dur_val = 6 [(validate.rules).duration.lt = {seconds: 17}, (validate.rules).duration.required = true];
    google.protobuf.Timestamp           ts_val = 7 [(validate.rules).timestamp.gt = {seconds: 7}];
    ComplexTestMsg                      another = 8;
    float                               float_const = 9 [(validate.rules).float.lt = 8];
    double                              double_in = 10 [(validate.rules).double = {in: [456.789, 123]}];
    ComplexTestEnum                     enum_const = 11 [(validate.rules).enum.const = 2];
    google.protobuf.Any                 any_val = 12 [(validate.rules).any = {in: ["type.googleapis.com/google.protobuf.Duration"]}];
    repeated google.protobuf.Timestamp  rep_ts_val = 13 [(validate.rules).repeated = { items { timestamp { gte { nanos: 1000000}}}}];
    map<sint32, string>                 map_val = 14 [(validate.rules).map.keys.sint32.lt = 0];
    bytes                               bytes_val = 15 [(validate.rules).bytes.const = "\x00\x99"];
    oneof o {
        option (validate.required) = true;

        string       x = 16;
        int32        y = 17;
    }
}

message KitchenSinkMessage { ComplexTestMsg val = 1; }

functions genereated by validate

# Validates KitchenSinkMessage
def generate_validate(p):
    if _has_field(p, "val"):
        embedded = validate(p.val)
        if embedded is not None:
            return embedded
    return None
# Validates ComplexTestMsg
def generate_validate(p):
    present = False
    if _has_field(p, "x"):
        present = True
    if _has_field(p, "y"):
        present = True
    if not present:
        raise ValidationFailed("Oneof o is required")
    if p.const != "abcd":
        raise ValidationFailed("p.const not equal to abcd")
    if _has_field(p, "nested"):
        embedded = validate(p.nested)
        if embedded is not None:
            return embedded
    if p.int_const != 5:
        raise ValidationFailed("p.int_const not equal to 5")
    if p.bool_const != False:
        raise ValidationFailed("p.bool_const not equal to False")
    if p.HasField("float_val"):
        if p.float_val.value <= 0.0:
            raise ValidationFailed("p.float_val.value is not greater than 0.0")
    if not _has_field(p, "dur_val"):
        raise ValidationFailed("p.dur_val is required.")
    if _has_field(p, "dur_val"):
        dur = p.dur_val.seconds + round((10**-9 * p.dur_val.nanos), 9)
        lt = 17.0
        if dur >= lt:
            raise ValidationFailed("p.dur_val is not lesser than 17.0")
    if _has_field(p, "ts_val"):
        ts = p.ts_val.seconds + round((10**-9 * p.ts_val.nanos), 9)
        gt = 7.0
        if ts <= gt:
            raise ValidationFailed("p.ts_val is not greater than 7.0")
    if _has_field(p, "another"):
        embedded = validate(p.another)
        if embedded is not None:
            return embedded
    if p.float_const >= 8.0:
        raise ValidationFailed("p.float_const is not lesser than 8.0")
    if p.double_in not in [456.789, 123.0]:
        raise ValidationFailed("p.double_in not in [456.789, 123.0]")
    if p.enum_const != 2:
        raise ValidationFailed("p.enum_const not equal to 2")
    if _has_field(p, "any_val"):
        if p.any_val.type_url not in ['type.googleapis.com/google.protobuf.Duration']:
            raise ValidationFailed("p.any_val not in ['type.googleapis.com/google.protobuf.Duration']")
    for item in p.rep_ts_val:
        validate(item)
    for item in p.rep_ts_val:    
        if item:
            ts = item.seconds + round((10**-9 * item.nanos), 9)
            gte = 0.001
            if ts < gte:
                raise ValidationFailed("item is not greater than or equal to 0.001")
        pass
    for key in p.map_val:    
        if key >= 0:
            raise ValidationFailed("key is not lesser than 0")
        pass
    if p.bytes_val != b'\x00\x99':
        raise ValidationFailed("p.bytes_val not equal to b'\x00\x99'")
    return None

functions generated by validate_all

# Validates KitchenSinkMessage All
def generate_validate_all(p):
    err = []
    if _has_field(p, 'val'):
        err += _validate_all(p.val)
    return err
# Validates ComplexTestMsg All
def generate_validate_all(p):
    err = []
    present = False
    if _has_field(p, 'x'):
        present = True
    if _has_field(p, 'y'):
        present = True
    if (not present):
        err.append('Oneof o is required')
    if (p.const != 'abcd'):
        err.append('p.const not equal to abcd')
    if _has_field(p, 'nested'):
        err += _validate_all(p.nested)
    if (p.int_const != 5):
        err.append('p.int_const not equal to 5')
    if (p.bool_const != False):
        err.append('p.bool_const not equal to False')
    if p.HasField('float_val'):
        if (p.float_val.value <= 0.0):
            err.append('p.float_val.value is not greater than 0.0')
    if (not _has_field(p, 'dur_val')):
        err.append('p.dur_val is required.')
    if _has_field(p, 'dur_val'):
        dur = (p.dur_val.seconds + round(((10 ** (- 9)) * p.dur_val.nanos), 9))
        lt = 17.0
        if (dur >= lt):
            err.append('p.dur_val is not lesser than 17.0')
    if _has_field(p, 'ts_val'):
        ts = (p.ts_val.seconds + round(((10 ** (- 9)) * p.ts_val.nanos), 9))
        gt = 7.0
        if (ts <= gt):
            err.append('p.ts_val is not greater than 7.0')
    if _has_field(p, 'another'):
        err += _validate_all(p.another)
    if (p.float_const >= 8.0):
        err.append('p.float_const is not lesser than 8.0')
    if (p.double_in not in [456.789, 123.0]):
        err.append('p.double_in not in [456.789, 123.0]')
    if (p.enum_const != 2):
        err.append('p.enum_const not equal to 2')
    if _has_field(p, 'any_val'):
        if (p.any_val.type_url not in ['type.googleapis.com/google.protobuf.Duration']):
            err.append("p.any_val not in ['type.googleapis.com/google.protobuf.Duration']")
    for item in p.rep_ts_val:
        err += _validate_all(item)
    for item in p.rep_ts_val:
        if item:
            ts = (item.seconds + round(((10 ** (- 9)) * item.nanos), 9))
            gte = 0.001
            if (ts < gte):
                err.append('item is not greater than or equal to 0.001')
        pass
    for key in p.map_val:
        if (key >= 0):
            err.append('key is not lesser than 0')
        pass
    if (p.bytes_val != b'\x00\x99'):
        err.append("p.bytes_val not equal to b'\x00\x99'")
    return err

Oct 07 '22 03:10 HaloWorld

@HaloWorld, CI is unhappy with

ERROR: /go/src/github.com/envoyproxy/protoc-gen-validate/python/protoc_gen_validate/validator.py Imports are incorrectly sorted and/or formatted.

I'm interested in improving the dev experience on this repo, just out of interest did this not get picked up on your local?

Oct 18 '22 21:10 elliotmjackson

Sorry, I was previously unaware to the point that I could run ci directly locally with the make ci command, so I missed the lint check. Now I have fixed this issue and passed the ci test locally.

I'm relying on the Development section in README, I don't have bazel locally and I don't want to install bazel, so the only option for me is to build a docker image for testing.

docker build -t lyft/protoc-gen-validate .

Then I get to the container shell with the following command:

docker run --rm -ti -v $(pwd):/go/src/github.com/envoyproxy/protoc-gen-validate --entrypoint=/bin/bash lyft/protoc-gen-validate

Then I ran the make build command and got the following result: build

I followed the prompt and added the environment variable GOFLAGS="-buildvcs=false" with the following command:

docker run --rm -ti -e GOFLAGS="-buildvcs=false" -v $(pwd):/go/src/github.com/envoyproxy/protoc-gen-validate --entrypoint=/bin/bash lyft/protoc-gen-validate

Then ran make build again, after which I ran the python tests with:

bazel test //tests/harness/executor:executor_python_test

After the test passed, I committed and pushed the code.

There are two things that I find most troublesome in the whole development process.

finding the entry for python test. I am not very familiar with bazel and go syntax, so this took me some time. Maybe explain how to run different tests for developers of different languages? If I just modified the python code, I don't think running the whole test with bazel is necessary, even if I need to run the whole test, I can run it before committing the PR.
add test cases against validate_all. Honestly, the current test set is not complete, it only guarantees that validate_all behaves the same as validate, but it does not guarantee that validate_all can find all validation exceptions. I know that the go version of PGV implements validateAll, but I can only see the first error message compared in the go test code(harnesss.go):

Oct 19 '22 08:10 HaloWorld

Wanted to chime in to say thank you. This is a great change.

Nov 15 '22 20:11 dkunitsk

Python: Support for `validate_all` function