cel-rust

Emerge a new public API

Open alexsnaps opened this issue 4 months ago • 8 comments

Starting this as a place to discuss how to slowly emerge an API that lets users easily configure an Environment to use with their CEL expressions, so they can parse, check and eventually evaluate them.

See the current doc in lib.rs:

use cel::common::value::CelVal;
let opaque_type = cel::common::types::Type::new_opaque_type("foo");

let env = cel::Env::builder()
    .add_type(opaque_type)
    .add_variable("answer", cel::common::types::UINT_TYPE)
    .add_overload("is_it", "is_it_uint", &[&cel::common::types::UINT_TYPE], &cel::common::types::BOOL_TYPE, is_it)
    .add_member_overload("is_it", "is_it_on_uint", &cel::common::types::UINT_TYPE, &[], &cel::common::types::BOOL_TYPE, is_it)
    .build();

let mut ast = env.parse("(answer == 42) == is_it(answer) && answer.is_it()")?;
ast = env.check(ast)?;

//let prog = Program::new(ast);

fn is_it(val: &CelVal) -> CelVal {
     CelVal::Boolean(true)
}

heavily inspired by the golang impl

PS: This is currently broken, I know

alexsnaps avatar Aug 02 '25 12:08 alexsnaps

/cc'ing @howardjohn @cgettys-microsoft - mostly as an FYI of what's in the pipeline, as I see work & time invested in things that I expect to go away or at least change heavily in the next few weeks (🤞)

alexsnaps avatar Aug 02 '25 12:08 alexsnaps

Also, I thought I'd give some insight into what happens when check is performed, with respect to function call resolution. In the example above, is_it will be the function resolved on an "only parsed" expression, while a checked one would resolve to is_it_uint and is_it_on_uint. In the former case, a "synthetic" overload will be resolved from is_it and the dynamic dispatch is resolved at runtime. In the latter case, since all args (and the possible target type) are known, the IdedExpr.id will be used to look up the actual overload to use.

This is true for all operators, e.g. in the case of _+_, a uint addition would go straight to _+_uint. Today we always go the dyn route, as per here. This code would eventually disappear completely, in favor of multiple overloads, with one synthetic overload being able to do "dynamic dispatch".
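To make the two resolution paths concrete, here is a minimal, self-contained sketch. It is not the cel-rust implementation: `Val`, `Dispatcher`, `build_dispatcher` and the function-pointer overload table are hypothetical names invented for illustration, and only the overload names `_+_` and `_+_uint` come from the discussion above (`_+_int` is assumed by analogy). A checked expression carries an entry from expression id to a concrete overload, while a parsed-only expression falls back to the synthetic overload that inspects the runtime values.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a CEL runtime value.
#[derive(Debug, Clone, PartialEq)]
enum Val {
    UInt(u64),
    Int(i64),
}

type Overload = fn(&[Val]) -> Result<Val, String>;

// Concrete overload: only valid for (uint, uint).
fn add_uint(args: &[Val]) -> Result<Val, String> {
    match args {
        [Val::UInt(a), Val::UInt(b)] => a
            .checked_add(*b)
            .map(Val::UInt)
            .ok_or_else(|| "overflow".to_string()),
        _ => Err("no such overload".to_string()),
    }
}

// Concrete overload: only valid for (int, int).
fn add_int(args: &[Val]) -> Result<Val, String> {
    match args {
        [Val::Int(a), Val::Int(b)] => a
            .checked_add(*b)
            .map(Val::Int)
            .ok_or_else(|| "overflow".to_string()),
        _ => Err("no such overload".to_string()),
    }
}

// Synthetic overload: picks a concrete overload by looking at runtime values.
fn add_dyn(args: &[Val]) -> Result<Val, String> {
    match args {
        [Val::UInt(_), Val::UInt(_)] => add_uint(args),
        [Val::Int(_), Val::Int(_)] => add_int(args),
        _ => Err("no such overload".to_string()),
    }
}

struct Dispatcher {
    // What a checker would record: expression id -> concrete overload id.
    resolved: HashMap<u64, &'static str>,
    // Every registered overload, including the synthetic one under the function name.
    overloads: HashMap<&'static str, Overload>,
}

impl Dispatcher {
    fn call(&self, expr_id: u64, function: &str, args: &[Val]) -> Result<Val, String> {
        // Checked AST: use the overload recorded for this expression id.
        // Parsed-only AST: fall back to the synthetic overload named after the function.
        let overload_id: &str = match self.resolved.get(&expr_id) {
            Some(id) => *id,
            None => function,
        };
        let overload = self
            .overloads
            .get(overload_id)
            .ok_or_else(|| format!("unknown function: {overload_id}"))?;
        overload(args)
    }
}

// Builds a dispatcher for a single `_+_` call with expression id 1.
// With `checked == true`, the call is pre-resolved to `_+_uint`.
fn build_dispatcher(checked: bool) -> Dispatcher {
    let mut overloads: HashMap<&'static str, Overload> = HashMap::new();
    overloads.insert("_+_", add_dyn as Overload);
    overloads.insert("_+_uint", add_uint as Overload);
    overloads.insert("_+_int", add_int as Overload);

    let mut resolved = HashMap::new();
    if checked {
        resolved.insert(1, "_+_uint");
    }
    Dispatcher { resolved, overloads }
}
```

In this sketch, `build_dispatcher(false).call(1, "_+_", ...)` goes through `add_dyn`, whereas `build_dispatcher(true)` skips the runtime type inspection and calls `add_uint` directly, which is the saving the check phase is meant to buy.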

alexsnaps avatar Aug 02 '25 13:08 alexsnaps

Hm, since we're talking public API now, I always thought this was really clunky in cel-go

add_member_overload("is_it", "is_it_on_uint", &cel::common::types::UINT_TYPE, &[], &cel::common::types::BOOL_TYPE, is_it)

If I recall correctly, the name of the overload actually matters for determining what type to attach the method to, which means it's very easy to get it wrong. My first time using cel-go, I had to go read the source and tests to understand how to configure the overloads the way I wanted. I fully acknowledge that this might be the more performant API, but it sure is nice to be able to write

fn is_it(This(v): This<u64>) -> bool {
    v == 42
}

clarkmcc avatar Aug 02 '25 13:08 clarkmcc

Making sure I get the point right here...

fn is_it(This(v): This<u64>) -> bool {
    v == 42
}

Is fine. We need to have the OverloadDeclaration somehow populated, because we need the full signature: target: Option<Type>, args: [&Type], ret_type: &Type, where target is for member overloads. And keep in mind that the type system is open, so users could have something like:

fn is_it(This(v): This<Foo>) -> Bar {
   Bar {}
}

I guess what I'm saying is that we can provide all the best sugar coating; I just don't know right now how best to retrieve the typing information.
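One possible direction, sketched under stated assumptions: let the compiler supply the typing information through a trait, so the target and return types are read off the function signature at compile time. Everything here is hypothetical and not part of cel-rust: the `CelTyped` trait, the simplified `Type` enum, the `OverloadDecl` struct (standing in for the OverloadDeclaration mentioned above) and `declare_member` are made up for illustration, and the sketch only covers zero-argument member overloads.

```rust
// Simplified, hypothetical stand-in for the CEL type representation.
#[derive(Debug, Clone, PartialEq)]
enum Type {
    UInt,
    Bool,
    Opaque(&'static str),
}

// Hypothetical trait: maps a Rust type to its CEL type without needing a value.
trait CelTyped {
    fn cel_type() -> Type;
}

impl CelTyped for u64 {
    fn cel_type() -> Type {
        Type::UInt
    }
}

impl CelTyped for bool {
    fn cel_type() -> Type {
        Type::Bool
    }
}

// The type system is open: users can implement the trait for their own types.
struct Foo;
struct Bar;

impl CelTyped for Foo {
    fn cel_type() -> Type {
        Type::Opaque("Foo")
    }
}

impl CelTyped for Bar {
    fn cel_type() -> Type {
        Type::Opaque("Bar")
    }
}

// Marker wrapper for the receiver of a member overload, as in the snippets above.
struct This<T>(T);

// Hypothetical declaration carrying the full signature.
#[derive(Debug, PartialEq)]
struct OverloadDecl {
    name: String,
    target: Option<Type>,
    args: Vec<Type>,
    ret_type: Type,
}

// Derives the declaration of a zero-argument member overload from the
// function's Rust signature; the function itself is never called here.
fn declare_member<T, R, F>(name: &str, _f: F) -> OverloadDecl
where
    T: CelTyped,
    R: CelTyped,
    F: Fn(This<T>) -> R,
{
    OverloadDecl {
        name: name.to_string(),
        target: Some(T::cel_type()),
        args: Vec::new(),
        ret_type: R::cel_type(),
    }
}

fn is_it(This(v): This<u64>) -> bool {
    v == 42
}

fn foo_to_bar(This(_foo): This<Foo>) -> Bar {
    Bar
}
```

With this, `declare_member("is_it_on_uint", is_it)` yields a declaration with target `UInt` and return type `Bool`, and `declare_member("foo_to_bar", foo_to_bar)` works for user-defined types as long as they implement the trait. Supporting arbitrary argument counts would need one impl per arity or a macro, which this sketch leaves out.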

If I recall correctly, the name of the overload actually matters for determining what type to attach the method

No, the typing information is what's used. The name reflects it, tho I'm unsure of the value of that indirection to be honest.

alexsnaps avatar Aug 02 '25 14:08 alexsnaps

/cc'ing @howardjohn @cgettys-microsoft - mostly as an FYI of what's in the pipeline, as I see work & time invested in things that I expect to go away or at least change heavily in the next few weeks (🤞)

Looks nice. I'm not gonna be at all upset if some of the stuff I've done goes away; making little improvements / trying things is how I understand a library better when I'm using it. Throwaway work just comes with the territory :D

cgettys-microsoft avatar Aug 03 '25 01:08 cgettys-microsoft

Sorry I probably just don't have context here, but the example in the PR description seems like a very large regression in UX. Is this all required stuff now to do basic evaluation, or is this just some more advanced additional things users can do?

From my POV:

  • I don't want to provide types for variables, as I already defined a typed struct w/ Serialize, so why should I define the type again?
  • I don't want to deal with the clunky overloads; the current add_function is already a great UX

howardjohn avatar Aug 05 '25 15:08 howardjohn

  • I don't want to provide types for variables, as I already defined a typed struct w/ Serialize, so why should I define the type again?

Correct me if I'm wrong, but you're providing the type alongside providing the Value itself in that case, right? Which we can absolutely keep on doing. But you'd be in "parse mode", i.e. not use a "checked AST".

  • I don't want to deal with the clunky overloads; the current add_function is already a great UX

This one is more interesting. I'd indeed like to keep it as "simple" as add_function, if possible. Sadly, that won't suffice for all cases, e.g. "overloads" such as a member overload on both UInt & Int, combined with wanting to support checking the AST. There might be an easier way than how this works in the golang implementation to automatically "infer" types, tho that'd probably need to happen at compile time. I've been exploring that space for 2 days already, but Rust only provides "so much" in terms of reflection at runtime, which makes this all... interesting.


To be clear, declaring all that is only required for the check phase, which is optional. This means that if you're fine with resolving bindings, types (and form) directly at evaluation time, you will always be able to call eval on an expression as before... tho it may result in an error like "unknown identifier" or "no such overload" et al... and at ~some performance cost obviously too~ the expense of not being able to benefit from the performance optimizations that having this information upfront brings.

alexsnaps avatar Aug 05 '25 15:08 alexsnaps

Correct me if I'm wrong, but you're providing the type alongside providing the Value itself in that case, right? Which we can absolutely keep on doing. But you'd be in "parse mode", i.e. not use a "checked AST".

Ah right, to have a Value we need the actual data, and here it's just the parsing. I do wonder if, given a struct which implements Serialize, which we will later use to produce the Value, we can derive all the types anyway. Which I guess is likely the reflection limitation you were referring to.

howardjohn avatar Aug 05 '25 16:08 howardjohn