cel-python icon indicating copy to clipboard operation
cel-python copied to clipboard

Implementation of `has` macro is incomplete

Open jchadwick-buf opened this issue 7 months ago • 0 comments

The has macro is implemented by visiting the child nodes and returning true if there was not a CELEvalError. However, this makes it impossible for Protobuf messages to be handled properly.

The docstring describes the algorithm:

  1. If e evaluates to a protocol buffers version 2 message and f is a defined field:

    • If f is a repeated field or map field, has(e.f) indicates whether the field is non-empty.

    • If f is a singular or oneof field, has(e.f) indicates whether the field is set.

  2. If e evaluates to a protocol buffers version 3 message and f is a defined field:

    • If f is a repeated field or map field, has(e.f) indicates whether the field is non-empty.

    • If f is a oneof or singular message field, has(e.f) indicates whether the field is set.

    • If f is some other singular field, has(e.f) indicates whether the field's value is its default value (zero for numeric fields, false for booleans, empty for strings and bytes).

However, as far as I can tell, this is impossible to implement correctly no matter what. If you have a MessageType populated by omitting fields that are not present, you get incorrect behavior since unset fields are null instead of their default values. If you have a MessageType populated by setting default values, you get incorrect behavior because has returns true for fields that it shouldn't. What is needed is for has to return false while still populating the default value.

I don't currently have a minimal reproduction, in part because the library I am currently working on does not actually use the cel-python MessageType, but instead uses its own (though I did try to use cel-python MessageType and ran into basically the same issue, so I am pretty sure, as long as I am not missing anything, that it is similarly applicable.) That said, I hope this explanation is enough to either point me in the right direction: it seems like I am either misunderstanding how to use cel-python, or it is simply missing support for this. I think that there needs to be a way for has to be specialized for a given type; idiomatically it'd be best, in my opinion, if has could somehow call __contains__; then Python types, including MessageType, could just implement __contains__ and __getitem__ to do the right thing.

I would be willing to try to submit a PR to rectify this in the future if this sounds like a real issue and my solution would be OK, but I don't currently have a good enough understanding of how lark-parser works, so it may not be something I have time to approach for the moment.

jchadwick-buf avatar Jul 30 '24 22:07 jchadwick-buf