cel-python
cel-python copied to clipboard
Implementation of `has` macro is incomplete
The has
macro is implemented by visiting the child nodes and returning true if there was not a CELEvalError
. However, this makes it impossible for Protobuf messages to be handled properly.
The docstring describes the algorithm:
If e evaluates to a protocol buffers version 2 message and f is a defined field:
If f is a repeated field or map field, has(e.f) indicates whether the field is non-empty.
If f is a singular or oneof field, has(e.f) indicates whether the field is set.
If e evaluates to a protocol buffers version 3 message and f is a defined field:
If f is a repeated field or map field, has(e.f) indicates whether the field is non-empty.
If f is a oneof or singular message field, has(e.f) indicates whether the field is set.
If f is some other singular field, has(e.f) indicates whether the field's value is its default value (zero for numeric fields, false for booleans, empty for strings and bytes).
However, as far as I can tell, this is impossible to implement correctly no matter what. If you have a MessageType
populated by omitting fields that are not present, you get incorrect behavior since unset fields are null instead of their default values. If you have a MessageType
populated by setting default values, you get incorrect behavior because has
returns true for fields that it shouldn't. What is needed is for has
to return false while still populating the default value.
I don't currently have a minimal reproduction, in part because the library I am currently working on does not actually use the cel-python MessageType
, but instead uses its own (though I did try to use cel-python MessageType
and ran into basically the same issue, so I am pretty sure, as long as I am not missing anything, that it is similarly applicable.) That said, I hope this explanation is enough to either point me in the right direction: it seems like I am either misunderstanding how to use cel-python, or it is simply missing support for this. I think that there needs to be a way for has
to be specialized for a given type; idiomatically it'd be best, in my opinion, if has
could somehow call __contains__
; then Python types, including MessageType
, could just implement __contains__
and __getitem__
to do the right thing.
I would be willing to try to submit a PR to rectify this in the future if this sounds like a real issue and my solution would be OK, but I don't currently have a good enough understanding of how lark-parser works, so it may not be something I have time to approach for the moment.