protovalidate-python

Bump cel-python version

Open o-murphy opened this issue 2 months ago • 1 comments

Bump cel-python version, the latest one is 0.4.0

o-murphy avatar Oct 07 '25 21:10 o-murphy

Hey @o-murphy, agreed, I'd like to do that, but have previously tried to upgrade to 0.4.0 and 0.3.0 and unfortunately there was a regression introduced in 0.3.0 that hasn't been fixed. Ideally we'd fix that upstream before pulling in the new version.

Going to leave this ticket open, though, so people can see this.

stefanvanburen avatar Oct 08 '25 13:10 stefanvanburen

Is it possible to make cel-py an optional dependency, since it only applies if used in the proto?

Currently, we still use https://github.com/bufbuild/protoc-gen-validate because in our testing it is still a lot faster than protovalidate-python.

mspiller avatar Dec 03 '25 10:12 mspiller

@mspiller effectively all of the protovalidate rules are backed by CEL, so unfortunately we can't make it optional. And protoc-gen-validate will likely always be faster than protovalidate-python, but protovalidate's architecture provides more flexibility and extensibility (and we still intend to speed up the internals as much as possible, which tends to be upstream speedups in cel-python).

I think we're pretty close to being able to upgrade to the latest cel-python version (the one issue above still needs to be fixed), at which point we should get some noticeable performance gains.

stefanvanburen avatar Dec 03 '25 13:12 stefanvanburen

@stefanvanburen I see and understand. Would it make sense to go with the Rust version of Google CEL? https://github.com/cel-rust/cel-rust

This would make validation brutally fast. I see a lot of projects going in that (rust) direction (cryptography, orjson, ruff, uv, etc).

mspiller avatar Dec 03 '25 16:12 mspiller

hi @mspiller, never say never, but I somewhat doubt it — adding a rust dependency would probably significantly complicate the packaging story for protovalidate-python (e.g., orjson: https://github.com/ijl/orjson?tab=readme-ov-file#packaging), and I'm not sure what other limitations it might impose.

stefanvanburen avatar Dec 03 '25 17:12 stefanvanburen

Yes, it complicates packaging a bit, but for good benefit. We have a fast C++ protobuf parser, but then it gets slowed down by pure-Python validation (at least for our use case with tons of messages) :D

Here is a bit of reading on what the folder structure looks like and how to build it. It's really interesting: https://www.maturin.rs/

We actually did wrap one rust package to get it into python and it wasn't that hard but needs rust programming knowledge (that I don't have).

Maybe one day for 2.0 :D

mspiller avatar Dec 03 '25 17:12 mspiller

Is it possible to make cel-py an optional dependency, since it only applies if used in the proto?

Currently, we still use https://github.com/bufbuild/protoc-gen-validate because in our testing it is still a lot faster than protovalidate-python.

In fact, almost any other spec-based validator, including Pydantic, will be several times faster than protovalidate-python. protovalidate-python is the slowest of all the ones I tested. What is the real advantage of using protovalidate-python or protoc-gen-validate?

o-murphy avatar Dec 04 '25 08:12 o-murphy

@o-murphy, consistency of validation across your stack is a big win we see with protobuf + protovalidate — you're defining your validation rules in a language-agnostic format. Otherwise, you'd have e.g. pydantic for python services, zod for your typescript frontend, etc.

In fact, almost any other spec-based validator, including Pydantic, will be several times faster than protovalidate-python. protovalidate-python is the slowest of all the ones I tested.

Would love to see your approach / benchmarks if you have them. We do want to eke out as much performance as possible for protovalidate-python.

stefanvanburen avatar Dec 04 '25 14:12 stefanvanburen

Now I'm using a simple self-written validator, yupy, based on a chain of callbacks. Here are its results: my utility validates the same list of files with different validators.

It validates 487 files containing 1-2 KB of data each (MD5 hash + protobuf-encoded data). They contain only simple structures with data like lists, doubles, and strings. The validation criteria are the same for every validator: value limits, string and list lengths, some regexps, type checks, nullable, required, etc.

validator               speed (it/s)   time (s)   avg (s/it)
protovalidate-python           19.69      24.74       0.0508
yupy (self-written)          1771.27       0.27       0.0005
my old dumb validator         228.70       2.13       0.0043
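
For anyone who wants to reproduce a comparison like this, the three columns relate as speed = items / time and avg = time / items. The following is only a minimal sketch of such a timing harness, not the script used for the numbers above; `benchmark` and `dummy_validate` are hypothetical names, and the payloads are placeholder bytes rather than real protobuf messages.

```python
import time
from typing import Callable, Dict, Iterable


def benchmark(validate: Callable[[bytes], None], payloads: Iterable[bytes]) -> Dict[str, float]:
    """Time one validator over a fixed list of payloads."""
    items = list(payloads)
    start = time.perf_counter()
    for payload in items:
        validate(payload)
    elapsed = time.perf_counter() - start
    return {
        "items": len(items),
        "time_s": elapsed,
        # Guard against a zero reading on very coarse clocks.
        "speed_it_per_s": len(items) / elapsed if elapsed > 0 else float("inf"),
        "avg_s_per_it": elapsed / len(items) if items else 0.0,
    }


def dummy_validate(payload: bytes) -> None:
    # Hypothetical stand-in: a real run would decode the message
    # and call the validator under test here.
    if len(payload) == 0:
        raise ValueError("empty payload")


# 487 placeholder payloads of about 1.5 KB each, mirroring the sizes described above.
result = benchmark(dummy_validate, [b"x" * 1500] * 487)
```

Running every validator through the same function on the same list keeps file loading and decoding costs out of the comparison, provided the decode step happens before the timed loop.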

I know that my solution is not ideal and has lots of bugs, but it works for me in my projects, so I still use it.
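
To illustrate what "based on a chain of callbacks" can mean in general, here is a minimal sketch. It is not yupy's actual API, which is not shown in this thread; every name below (`required`, `of_type`, `length_between`, `matches`, `chain`) is hypothetical. Each check is a plain callable that raises `ValueError`, and the chain stops at the first failure.

```python
import re
from typing import Any, Callable, List

Check = Callable[[Any], None]  # a check raises ValueError on failure


def required(value: Any) -> None:
    if value is None:
        raise ValueError("value is required")


def of_type(expected: type) -> Check:
    def check(value: Any) -> None:
        if not isinstance(value, expected):
            raise ValueError(f"expected {expected.__name__}")
    return check


def length_between(lo: int, hi: int) -> Check:
    def check(value: Any) -> None:
        if not lo <= len(value) <= hi:
            raise ValueError(f"length must be in [{lo}, {hi}]")
    return check


def matches(pattern: str) -> Check:
    compiled = re.compile(pattern)  # compiled once, reused for every value
    def check(value: Any) -> None:
        if compiled.fullmatch(value) is None:
            raise ValueError(f"does not match {pattern!r}")
    return check


def chain(*checks: Check) -> Callable[[Any], List[str]]:
    """Build a validator that runs checks in order and stops at the first failure."""
    def run(value: Any) -> List[str]:
        for check in checks:
            try:
                check(value)
            except ValueError as exc:
                return [str(exc)]
        return []
    return run


# Hypothetical field validator: a required lowercase identifier of 1 to 16 characters.
validate_name = chain(required, of_type(str), length_between(1, 16), matches(r"[a-z_]+"))
```

A design like this is fast largely because the chain is built once and each call is a handful of direct Python function calls, with no expression language to interpret per message.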

o-murphy avatar Dec 09 '25 16:12 o-murphy

I can't test Pydantic right now, but it's much faster than protovalidate-python.

Maybe you could use an existing fast validator for JSON-like data as a backend, one with JSON Schema support, then generate a validation schema for it from the proto file declarations and reuse that schema. You could also use orjson for fast JSON conversions.
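
As a rough sketch of the "generate a schema once, then reuse it" idea: the code below translates a hand-written, hypothetical rule table into a JSON Schema dict. The `FIELD_RULES` layout and the helper names are assumptions made for illustration and are not protovalidate's rule format; only the emitted keywords (`minLength`, `maxLength`, `pattern`, `minimum`, `maximum`, `maxItems`, and so on) are standard JSON Schema.

```python
from typing import Any, Dict

# Hypothetical rule table, standing in for constraints read from proto declarations.
FIELD_RULES: Dict[str, Dict[str, Any]] = {
    "name": {"type": "string", "required": True, "min_len": 1, "max_len": 64, "pattern": "^[a-z_]+$"},
    "weight": {"type": "double", "gte": 0.0, "lte": 1000.0},
    "tags": {"type": "string", "repeated": True, "max_items": 8},
}

# Proto scalar type -> JSON Schema type (only the few types this sketch handles).
_JSON_TYPES = {"string": "string", "double": "number", "int32": "integer", "bool": "boolean"}

# Rule key -> JSON Schema keyword for scalar constraints.
_KEYWORDS = {
    "min_len": "minLength",
    "max_len": "maxLength",
    "pattern": "pattern",
    "gte": "minimum",
    "lte": "maximum",
}


def field_schema(rules: Dict[str, Any]) -> Dict[str, Any]:
    """Translate one field's rules into a JSON Schema fragment."""
    scalar: Dict[str, Any] = {"type": _JSON_TYPES[rules["type"]]}
    for key, keyword in _KEYWORDS.items():
        if key in rules:
            scalar[keyword] = rules[key]
    if not rules.get("repeated"):
        return scalar
    # Repeated fields become arrays whose items carry the scalar constraints.
    array: Dict[str, Any] = {"type": "array", "items": scalar}
    if "min_items" in rules:
        array["minItems"] = rules["min_items"]
    if "max_items" in rules:
        array["maxItems"] = rules["max_items"]
    return array


def message_schema(fields: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Build the object-level schema; do this once and reuse the result."""
    return {
        "type": "object",
        "properties": {name: field_schema(rules) for name, rules in fields.items()},
        "required": sorted(name for name, rules in fields.items() if rules.get("required")),
    }


schema = message_schema(FIELD_RULES)
```

The resulting dict could then be handed to a JSON Schema validator library and applied to messages converted to JSON-like dicts. Whether that ends up faster overall would depend on the conversion cost, and rules that need cross-field expressions would not map onto plain schema keywords this way.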

o-murphy avatar Dec 09 '25 16:12 o-murphy