Bump cel-python version
Bump cel-python version, the latest one is 0.4.0
Hey @o-murphy, agreed, I'd like to do that, but have previously tried to upgrade to 0.4.0 and 0.3.0 and unfortunately there was a regression introduced in 0.3.0 that hasn't been fixed. Ideally we'd fix that upstream before pulling in the new version.
Going to leave this ticket open, though, so people can see this.
Is it possible to make cel-py an optional dependency, since it only applies if used in proto?
Currently, we still use https://github.com/bufbuild/protoc-gen-validate because in our testing it is still a lot faster than protovalidate-python.
@mspiller effectively all of the protovalidate rules are backed by CEL, so unfortunately we can't make it optional. And protoc-gen-validate will likely always be faster than protovalidate-python, but protovalidate's architecture provides more flexibility and extensibility (and we still intend to speed up the internals as much as possible, which tends to be upstream speedups in cel-python).
I think we're pretty close to being able to upgrade to the latest cel-python version (the one issue above still needs to be fixed), at which point we should get some noticeable performance gains.
@stefanvanburen I see and understand. Would it make sense to go with the Rust version of Google's CEL? https://github.com/cel-rust/cel-rust
This would make validation brutally fast. I see a lot of projects going in that (rust) direction (cryptography, orjson, ruff, uv, etc).
hi @mspiller, never say never, but I somewhat doubt it — adding a rust dependency would probably significantly complicate the packaging story for protovalidate-python (e.g., orjson: https://github.com/ijl/orjson?tab=readme-ov-file#packaging), and I'm not sure what other limitations it might impose.
Yes, it complicates packaging a bit, but for a good benefit. Right now we have a fast C++ protobuf parser that is then slowed down by pure Python validation (at least for our use case with tons of messages) :D
I did a bit of reading on what the folder structure looks like and how to build it. It's really interesting. https://www.maturin.rs/
We actually did wrap one Rust package to get it into Python, and it wasn't that hard, but it needs Rust programming knowledge (that I don't have).
Maybe one day for 2.0 :D
> Is it possible to make cel-py an optional dependency, since it only applies if used in proto?
> Currently, we still use https://github.com/bufbuild/protoc-gen-validate because in our testing it is still a lot faster than protovalidate-python.
In fact, almost any other spec-based validator, including Pydantic, will be several times faster than protovalidate-python.
protovalidate-python is the slowest of all the validators I tested. What is the real advantage of using protovalidate-python or protoc-gen-validate?
@o-murphy, consistency of validation across your stack is a big win we see with protobuf + protovalidate — you're defining your validation rules in a language-agnostic format. Otherwise, you'd have e.g. pydantic for python services, zod for your typescript frontend, etc.
> In fact, almost any other spec-based validator, including Pydantic, will be several times faster than protovalidate-python. protovalidate-python is the slowest of all the validators I tested.
Would love to see your approach / benchmarks if you have them. We do want to eke out as much performance as possible for protovalidate-python.
Right now I'm using yupy, a simple self-written validator based on a chain of callbacks. Here are its results. My utility validates the same list of files with different validators: 487 files containing 1-2 KB of data each (MD5 hash + protobuf-encoded data). They contain only simple structures with data like lists, doubles, and strings. The validation criteria are the same for every validator: value limits, string and list lengths, some regexps, type checks, nullable, required, etc.
| validator | speed, it/s | time, s | avg, s/it |
|---|---|---|---|
| protovalidate-python | 19.69 | 24.74 | 0.0508 |
| yupy (self-written) | 1771.27 | 0.27 | 0.0005 |
| my old dumb validator | 228.70 | 2.13 | 0.0043 |
I know that my solution is not ideal and has a lot of bugs, but it works for me in my projects, so I still use it.
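For readers unfamiliar with the approach, here is a minimal sketch of what a callback-chain validator can look like. It is not yupy's actual API, which is not shown in this thread; every name below (`is_type`, `max_len`, `in_range`, `chain`, `validate`, `RULES`) is hypothetical and only illustrates the general idea of composing small check functions per field.

```python
from typing import Any, Callable, Dict, List, Optional

# A check takes a value and returns an error message, or None if the value passes.
Check = Callable[[Any], Optional[str]]


def is_type(expected: type) -> Check:
    """Hypothetical check: the value must be an instance of `expected`."""
    def check(value: Any) -> Optional[str]:
        if isinstance(value, expected):
            return None
        return f"expected {expected.__name__}, got {type(value).__name__}"
    return check


def max_len(limit: int) -> Check:
    """Hypothetical check: strings and lists must not exceed `limit` items."""
    def check(value: Any) -> Optional[str]:
        if len(value) <= limit:
            return None
        return f"length {len(value)} exceeds {limit}"
    return check


def in_range(low: float, high: float) -> Check:
    """Hypothetical check: numeric value must lie in [low, high]."""
    def check(value: Any) -> Optional[str]:
        if low <= value <= high:
            return None
        return f"{value} not in [{low}, {high}]"
    return check


def chain(*checks: Check) -> Check:
    """Run checks in order and stop at the first failure."""
    def run(value: Any) -> Optional[str]:
        for check in checks:
            error = check(value)
            if error is not None:
                return error
        return None
    return run


def validate(message: Dict[str, Any], rules: Dict[str, Check]) -> List[str]:
    """Apply one chained check per field; a missing field counts as 'required'."""
    errors = []
    for field, check in rules.items():
        if field not in message:
            errors.append(f"{field}: required")
            continue
        error = check(message[field])
        if error is not None:
            errors.append(f"{field}: {error}")
    return errors


# Example rule set for a made-up message with two fields.
RULES = {
    "name": chain(is_type(str), max_len(8)),
    "weight": chain(is_type(float), in_range(0.0, 100.0)),
}
```

Calling `validate({"name": "abc", "weight": 1.5}, RULES)` returns an empty list, while a message that omits `name` produces a `name: required` entry. Plain function calls like these avoid interpreting an expression language at runtime, which is one plausible reason such a validator benchmarks faster, though it gives up the language-agnostic rule definitions discussed above.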
I can't test Pydantic now but it's much faster than protovalidate-python
Maybe you could use an existing fast validator for JSON-like data as a backend, one that supports JSON validation schemas. You would generate a validation schema for it from the proto file declarations once and then reuse it.
You can also use orjson for fast JSON data conversions.
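To make the suggestion concrete, here is a rough sketch of the schema-generation step. It assumes the proto declarations have already been reduced to a simple per-field description; the `FieldRule` class and `to_json_schema` function are hypothetical and not part of protovalidate-python or any existing tool. Only the emitted keywords (`type`, `minLength`, `maxLength`, `minItems`, `maxItems`, `pattern`, `minimum`, `maximum`, `required`) come from JSON Schema itself.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class FieldRule:
    """Hypothetical stand-in for one proto field declaration plus its rules."""
    name: str
    json_type: str  # JSON Schema type name: "string", "number", "array", ...
    required: bool = False
    min_len: Optional[int] = None
    max_len: Optional[int] = None
    pattern: Optional[str] = None
    minimum: Optional[float] = None
    maximum: Optional[float] = None


def to_json_schema(fields: List[FieldRule]) -> Dict[str, Any]:
    """Translate the field descriptions into a JSON Schema object definition."""
    properties: Dict[str, Any] = {}
    required: List[str] = []
    for f in fields:
        prop: Dict[str, Any] = {"type": f.json_type}
        # JSON Schema uses different length keywords for arrays and strings.
        if f.json_type == "array":
            min_key, max_key = "minItems", "maxItems"
        else:
            min_key, max_key = "minLength", "maxLength"
        if f.min_len is not None:
            prop[min_key] = f.min_len
        if f.max_len is not None:
            prop[max_key] = f.max_len
        if f.pattern is not None:
            prop["pattern"] = f.pattern
        if f.minimum is not None:
            prop["minimum"] = f.minimum
        if f.maximum is not None:
            prop["maximum"] = f.maximum
        properties[f.name] = prop
        if f.required:
            required.append(f.name)
    return {"type": "object", "properties": properties, "required": required}


# Build the schema once, then hand it to whichever schema validator is chosen.
schema = to_json_schema([
    FieldRule("name", "string", required=True, max_len=8, pattern="^[a-z]+$"),
    FieldRule("values", "array", max_len=3),
])
schema_text = json.dumps(schema, indent=2)
```

The standard library `json` module is used here to keep the sketch dependency-free; orjson could be swapped in for serialization. Whether the resulting schema can express every protovalidate rule is a separate question, since rules written as arbitrary CEL expressions have no direct JSON Schema equivalent.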