cel-rust feat(opaque): Adds support for `OpaqueValue`s

This adds an Opaque variant to the Value enum. Add the matching kind. An opaque value holds a struct with the name of the opaque value's type, along a Arc<dyn Any> to the actual rust value.

There's a few things that still need to happen here:

[x] ~~Fix the interpreter to properly deal with these opaque values (e.g. currently they are == to anything!)~~
[x] Handle == contract for Value::Opaques
[x] Add an example to use these which adds methods to these
[x] binding for json feature support

This came out of a chat with @howardjohn earlier this week. I got really nerd snipped on that, cause it would let me add support for Optional much faster than my initial plan. The one downside to this faster approach tho, is that we can't overload methods as of now really, without having the function used for a named method (or function, passing the target in, which is not really what the CEL spec says, but what we have today anyways), know about all the target types, as it would need to do the dynamic dispatch based of the runtime_type_name.

/cc @clarkmcc

This PR provides support for injecting Opaque values in the interpreter. There is nothing that can be done with Opaque values, but adding overloads on them (which we don't really support just now), e.g.:

#[derive(Debug, Serialize)]
struct MyStruct {
    field: String,
}

impl OpaqueValue for MyStruct {

    /// this is the only fn of the trait that has no default impl
    fn runtime_type_name(&self) -> &str {
        "my_struct"
    }

    fn eq(&self, other: &dyn OpaqueValue) -> bool {
        if self.runtime_type_name() == other.runtime_type_name() {
            if let Some(other) = other.downcast_ref::<MyStruct>() {
                return self.field.eq(&other.field);
            }
        }
        false
    }

    fn as_debug(&self) -> Option<&dyn Debug> {
        Some(self)
    }

    #[cfg(feature = "json")]
    fn json(&self) -> Option<serde_json::Value> {
        Some(serde_json::to_value(self).unwrap())
    }
}

Using Value::Opaque, an instance of MyStruct be injected as a binding in the ctx and used, e.g. thru a function myFn(), as then so mine.myFn():

  pub fn my_fn(ftx: &FunctionContext) -> Result<Value, ExecutionError> {
    if let Some(Value::Opaque(opaque)) = &ftx.this {
        if opaque.runtime_type_name() == "my_struct" {
            Ok(opaque
                .deref()
                .downcast_ref::<MyStruct>()
                .unwrap()
                .field
                .clone()
                .into())
        } else {
            Err(ExecutionError::UnexpectedType {
                got: opaque.runtime_type_name().to_string(),
                want: "my_struct".to_string(),
            })
        }
    } else {
        Err(ExecutionError::UnexpectedType {
            got: format!("{:?}", ftx.this),
            want: "Value::Opaque".to_string(),
        })
    }
}



let value = Arc::new(MyStruct {
    field: String::from("value"),
});

let mut ctx = Context::default();
ctx.add_variable_from_value("mine", Value::Opaque(value.clone()));
ctx.add_function("myFn", my_fn);
let prog = Program::compile("mine.myFn()").unwrap();
assert_eq!(
    Ok(Value::String(Arc::new("value".into()))),
    prog.execute(&ctx)
);

Nov 14 '25 15:11 alexsnaps

Nice! I ended up playing around with this on my flight and took a slightly different approach. What I ran into was that there was no way to have things like json() etc work on it without a trait (I chose to make 1 primary function to turn the Opaque into another Value (presumably, not another Opaque!)) but other approaches could work. The as_any is annoying though.

https://github.com/howardjohn/cel-rust/tree/values/opaque

Nov 14 '25 16:11 howardjohn

@howardjohn see the example I just added and let me know what you would still be missing here. As I mentioned, dynamic dispatch of a method "name" to multiple types (whether Opaque or other) requires the fn to know about them all. Also... still pending is solving OpaqueValue equality... they are currently all not equal.

Nov 14 '25 17:11 alexsnaps

The main question for me is if we want to allow turning an opaque value into something. I am not actually 100% sure how this works in Go, what happens if you return a opaque type then marshal that to json or equivilent? for example, does the ip() type in k8s return "1.2.3.4" or something.. opaque?

If we do want this, we probably need a non-Any trait to allow defining these types of conversion on the type like in https://github.com/howardjohn/cel-rust/tree/values/opaque. This probably is how we would do equality as well I would imagine?

Nov 15 '25 02:11 howardjohn

Note https://github.com/cel-rust/cel-rust/pull/96 is a prior attempt at this. I am not 100% sure why the need for the unsafe stuff

Nov 15 '25 02:11 howardjohn

This probably is how we would do equality as well I would imagine?

Yes. Tho it'll have to be some Any to be able to cast back down... I'm thinking of composing the traits (the traits themselves are yet to be defined) as what I'm doing for the CEL traits in my WIP big refactor branch...

So an OpaqueValue would probably delegate these to it's "internals" or just return None for things like e.g. JSON output. Thinking about what Optional does, which is an Opaque type, since it decorates an actual CEL Value in that case, it delegates down, i.e. the return JSON is 500, not an optional.of(500).

Hope this clarifies somewhat.

Nov 15 '25 15:11 alexsnaps

Ideally we wouldn't need OpaqueValue and have a straight Opaque(Arc<dyn ValueHolder>), but the arbitrary makes that impossible I think, as implementing that make it a non trait object (fn return Self).

Nov 17 '25 19:11 alexsnaps

So this is pending decision on #223 as this works branches off it. I think that otherwise this covers it. @howardjohn added json support with this commit Anything missing otherwise?

Nov 18 '25 19:11 alexsnaps

Did a bit of playing around to make defining functions more ergonomic:

https://github.com/howardjohn/cel-rust/blob/b0c90986b6d75f29b2976b850af9ffa0b7534f13/cel/src/functionsx.rs#L60-L74

Then my functions can just be like

    impl IpAddr {
        pub fn is_localhost(self: Arc<Self>) -> Result {
            Ok(self.0.is_loopback().into())
        }
    }

which is pretty nice. Probably we can make more of the magic function handling to handle other cases like a.b(args) etc in the future

Nov 19 '25 18:11 howardjohn

Did a bit of playing around to make defining functions more ergonomic:

https://github.com/howardjohn/cel-rust/blob/b0c90986b6d75f29b2976b850af9ffa0b7534f13/cel/src/functionsx.rs#L60-L74

Then my functions can just be like
    impl IpAddr {
        pub fn is_localhost(self: Arc<Self>) -> Result {
            Ok(self.0.is_loopback().into())
        }
    }
which is pretty nice. Probably we can make more of the magic function handling to handle other cases like a.b(args) etc in the future

Hm interesting... I think it would be really complicated, but the holy grail would be the ability to call the impl block's method directly from CEL, like ip.is_localhost() in CEL calls IpAddr::is_localhost on Rust. This would be a piece of cake in many other languages, I've never explored whether I can dynamically call a function in Rust since there's no real reflection.

Nov 19 '25 19:11 clarkmcc

Also, remember that overloads/dynamic dispatch will need to happen very differently moving forward. Again, if the goal is still to be spec compliant. See here: https://github.com/google/cel-go/blob/master/common/stdlib/standard.go#L249

Nov 19 '25 19:11 alexsnaps

Also, remember that overloads/dynamic dispatch will need to happen very differently moving forward. Again, if the goal is still to be spec compliant.

I'm assuming you're referring to your comment here about the check phase specifically needing to know all the function overloads ahead of time. I'm interested to see what API proposals you have in mind for this, but am fairly opposed to ruining the very easy-to-use API that we have right now, at least for evaluation-only.

If we could come up with a way to make users define these overloads only for the check phase, then I've got no concerns there. But the CEL-spec even says that users can run without the check phase, it's more of an opt-in feature (for users, not the spec I mean).

Type-checking is an optional, but strongly encouraged step that can reject some semantically invalid expressions using static analysis. Additionally, the check produces metadata which can improve function invocation performance and object field selection at evaluation-time.

Nov 19 '25 19:11 clarkmcc

I'm assuming you're referring to your comment here about the check phase specifically needing to know all the function overloads ahead of time.

Well that too, but mostly that overloads are, well, dynamic dispatches. Which are currently impossible with this interpreter. functions are not "just" to be named, they need to declare what type they operate on, so that I can overload a method, e.g. .size() would invoke a different target Rust fn depending on the target's type.

Today, all this magic of inferring types is not good enough, as we need to be the ones doing the dynamic dispatch, not the compiler. More so if we want composable "extensions" or libraries.

but am fairly opposed to ruining the very easy-to-use API that we have right now

So while I'm all in for "easy", I don't think it should come before correctness... So, again, what's the goal? Don't get me wrong, I hope we can get the most easy to use yet compliant solution, but if compliance forces us given in on ease of use, then I think that's what we should do.

The blanket trait implementation discussed above tho I think is a good example, as they are just, well, blanket implementation that make the life of users easier. We should provide those whenever we can, no discussion around that. And I hope we can come up with APIs that are spec compliant and, even if only eventually, possibly thru maturing, permit for "easy to use APIs"... but, again, just like wrt performance, I'd first aim for spec compliance.

Nov 19 '25 19:11 alexsnaps

I guess the tl;dr of what I'm saying is along the side of the say:

Make it Work (correct, here spec compliant), then make it Beautiful (which lead to easy to use APIs for users), finally make it Fast

Nov 19 '25 20:11 alexsnaps

@howardjohn wrt to this

practically speaking, am I missing anything if I just blindly attempt downcast vs checking the name?

You wouldn't have to I guess if you know what you are downcasting to is only every present in a single opaque type. eg. if you have Opaque<MyStruct>(foo) & Opaque<MyStruct>(bar), they'd both give you a Some(&dyn MyStruct) back, yet they'd be different types and as such should e.g. always be ne.

Nov 20 '25 11:11 alexsnaps

but am fairly opposed to ruining the very easy-to-use API that we have right now

I've been thinking about this somewhat, and the only way I think we can implement proper overloads while keeping the API "easy" would be a procedural macro that would generate the code require to allow the interpreter to do proper dynamic dispatches... Couldn't think of another way.

So I guess the question becomes, what is the scope of this PR? It probably makes sense to keep it close to the "look & feel" of the current API (including for function calls as they exist today). I'm mostly doing this in order to be able to support Optional in the interpreter and turn optional syntax in the parser (e.g. .?).

But I'm very hesitant now, as:

looking up the necessary overloads, there are some that I probably can't implement until more alignment as been done in the interpreter;
the optional.of notation isn't even possible right now;
operators.Index handing would need special casing in the current resolve, maybe not "too bad" tho;
given the current fn call resolution, it'd be yet more "reserved" function names in the stdlib.

So let me ask this,

do we want to implement Optional support right now?
are we saying that the current mean of registering "custom function" is the API that we absolutely want for the interpreter?
if the former, then how is one to overload an already registered method/function name? that would need addressing if we want to eventually have a spec compliant interpreter
or are we saying that the current API stays, with it's current flaws, but there is another way to achieve overloads?
if indeed we want proper overloads, do we also need an easier that the current proposed API?

I'm a little lost as to what the goals here are. My position has always been that we were aiming at providing a spec compliant cel interpreter, which might still be the case... but I'm hesitant about that too now tbh.

That being said, I think there still might be some confusion around overloads/dynamic dispatching, here is another example, where the findGateways is declared to exist on two different types. It's important to understand tho, that any one could add a different findGateways method on different target types and/or different args... where then the interpreter is responsible for resolving the dispatch of the overloaded method or function, a mechanism Rust doesn't support btw. And I can't see how to make the compiler resolve that for us, as it is typically a runtime decision (more so when we'll introduce DynTyped values. So that API will need to be explicit. And the only way I can think of to eventually make it "easier" on the user would be to provide a procedural macro to annotate the fns, I think that should be feasible.

Nov 20 '25 12:11 alexsnaps

I don't have a strong opinion on many of the discussion points but would say that overall this PR, as is, unblocks some things for me (at first, implementing a IP and CIDR library mirroring the k8s one) and would be useful despite the limitations. So I would be happy for this to merge even with the gaps, though can live without it

Nov 20 '25 17:11 howardjohn

@howardjohn wrt to this

practically speaking, am I missing anything if I just blindly attempt downcast vs checking the name?

You wouldn't have to I guess if you know what you are downcasting to is only every present in a single opaque type. eg. if you have Opaque<MyStruct>(foo) & Opaque<MyStruct>(bar), they'd both give you a Some(&dyn MyStruct) back, yet they'd be different types and as such should e.g. always be ne.

Also, note that this a very instance of the problem with the current approach, your function should only ever be invoked against your OpaqueType, not any type. Which is why we need to find a way to make the interpreter know about that: target: Option<Type>, args: &[Type] along side how the function is referred to in CEL (i.e. it's name) and what Rust fn to dispatch it too. That would give us proper overloads for functions & methods, e.g. ctx.add_function("size", Some(OpaqueType("Optional"), &[], optional::fn::size), to then be used on optional.of(1).size() in cel, a size method on OpaqueType("Optional") that takes no arguments.

Nov 20 '25 23:11 alexsnaps

I'm a little lost as to what the goals here are. My position has always been that we were aiming at providing a spec compliant cel interpreter, which might still be the case... but I'm hesitant about that too now tbh.

I think that is the goal. Prior to our conversation last week, I was not fully grasping the fact that you cannot overload a CEL function under the current implementation without controlling the Rust function (you can overload CEL functions if you control the single Rust function, but you can't overload with multiple Rust functions -- this was the piece I was mentally not grasping until last week).

With this new understanding, I agree that spec compliance is the goal even if developer experience suffers.

Where I think there might be some disagreement is how we get there. My opinions on this:

Any changes to the way functions work are distinct enough from the opaque value support (this PR), that they should be a separate PR and discussion from this one.
I would normally agree with the nail-it-then-scale it approach where we just get overloads working properly, regardless of how ugly, and then make it beautiful later, but there are people using this library so I'm much more inclined to get closer to the "nailing it" side of things when we release proper support for overloads. I would like to talk through possible designs for this that could simplify developer experience (maybe we have an Overload trait that is default implemented for functions, and users can override default methods on that trait to specify types, and args, or something -- totally spitballing).

@alexsnaps what are your opinions on these two points ^?

Nov 24 '25 16:11 clarkmcc

Where I think there might be some disagreement is how we get there. My opinions on this:
1. Any changes to the way functions work are distinct enough from the opaque value support (this PR), that they should be a separate PR and discussion from this one.

Totally! No disagreement there. I won't bolt on proper overloads to this work. The reason I brought it up, was that all "extended sugarcoating" we might do as part of this, we'll probably break soon after, as it won't solve the actual problem, but would make the life of users of the current approach easier.

So to be clear, I'm all fine adding magic. Certainly the Into traits as per the diff above. I would maybe stop there for now tho and avoid further logic downcasting the &dyn Any automatically, as we probably won't be able to keep that anyways. Now I also folded the runtime_type_name check into the interpreter, so arguably at least that downcast would at least always be "correct".

2. I would normally agree with the nail-it-then-scale it approach where we just get overloads working properly, regardless of how ugly, and then make it beautiful later, but there are people using this library so I'm much more inclined to get closer to the "nailing it" side of things when we release proper support for overloads. I would like to talk through possible designs for this that could simplify developer experience (maybe we have an `Overload` trait that is default implemented for functions, and users can override default methods on that trait to specify types, and args, or something -- totally spitballing).

Actually I agree with that too... I guess I don't want to invest days or weeks into making something like this "more user friendly" in a way that I know will not port to that future. The mistake was probably on my side in opening this PR to begin with. I think the other work in #177 is actually more important and foundational: open the type system up. Then, follow up to that would be proper overloads. And finally adding the Optional support (and as such Opaque).

I got sidetracked on that work because it would have helped @howardjohn and us, at Kuadrant, in enabling Optional support... which I believe is really the feature the interpreter misses the most as of today, when it comes to "user friendliness"... certainly for CEL expr. authors.

So this raises the question now, what do we do? Do we want this "minimal" Opaque support as is? If you ask me, I think it's good enough. We could do better on the Context::add_function, but otoh the experience hasn't worsen and we make more things possible with the support as is. Or should we close this and pun on this until I am done with #177 (which probably should be renamed to "Open the type system up"), followed by a new PR adding proper overloads, which we could even have while keeping add_function btw... And then come back around to this?

Nov 24 '25 18:11 alexsnaps