Amber [BUG] Context values not reset on failure

Describe the bug There are places in the code (current master branch) where we temporarily set a new context value, call the syntax parser for some child object, and then reset the context value. This example is from IterLoop::parse(), but there are others:

// Save loop context state and set it to true
let mut new_is_loop_ctx = true;
swap(&mut new_is_loop_ctx, &mut meta.context.is_loop_ctx);
// Parse loop
syntax(meta, &mut self.block)?; // <=============================== If this returns Err...
token(meta, "}")?;
// Restore loop context state
swap(&mut new_is_loop_ctx, &mut meta.context.is_loop_ctx); // <==== ...this is never called
meta.pop_scope();
Ok(())

The problem is that if (in this case) syntax() or token() should fail, we will not reset the context value. As consequence, a failure in the syntax parser for one object can cause syntax errors in unrelated objects, leading to hard-to-track-down bugs.

To Reproduce N/A

Expected behavior N/A

Additional context N/A

Sep 22 '24 20:09 hdwalters

@hdwalters it would be nice to create a rust macro for this swap mechanism:

// Save loop context state and set it to true
let mut new_is_loop_ctx = true;
// Parse loop
safe_swap(&mut new_is_loop_ctx, &mut meta.context.is_loop_ctx, {
    syntax(meta, &mut self.block)?;
    token(meta, "}")?;
});
meta.pop_scope();
Ok(())

Proposed sketch of such macro

The implementation that I see is that we could run the contenxt for the swap in a function and check for what value has been returned:

Ok -> Then it will swap back the values and propagateOk() (by returning it)
Err -> Then it will swap back the values and propagate the Err(err) (by returning it)
Pass -> Then it will swap back the values and not return anything

Example implementation

This may not be a correct Rust code. Please adjust the implementation to the contextual requirements

struct Pass;
type Passable = SyntaxResult | Pass

macro_rules! safe_swap {
    (a:expr, b:expr, ctx:tt) => {{
        swap($a, $b);
        let foo = || {
            $ctx
            Pass
        };
        let result: Passable = foo();
        // Swap back
        swap($a, $b);
        if let Pass = result {
            result
        } else {
            return result
        }
    }}
}

Oct 04 '24 11:10 Ph0enixKM

I've been working on a procedural macro, which would allow us to annotate the struct itself with #[derive(WithContext)], and the struct members with #[context]. I didn't want to advertise this until it was a bit more stable, but I'll post an example after work this evening.

Oct 04 '24 13:10 hdwalters

This has forced me to read up on Rust macros, a subject I've always been a bit shaky on!

Oct 04 '24 13:10 hdwalters

I searched the source code for all occurrences of mem::swap() where a Result::Err could be returned before the second occurrence. I found references to ParserMetadata::binop_border, and various fields inside ParserMetadata::context. I've written a "derive" procedural macro ContextManager, which should be applied to the ParserMetadata struct. This generates three additional ParserMetadata functions for each field annotated with the context helper attribute:

#[derive(Debug, ContextManager)]
pub struct ParserMetadata {
    ...
    /// Determines where the binary operator should end
    #[context]
    pub binop_border: Option<usize>,
    ...
    /// Context of the parser
    #[context]
    pub context: Context,
    ...
}

The first function passes the new temporary value with ownership, in this case an Option<usize>. This is intended to be called for primitives and small structs, and the generated function calls clone():

pub fn with_binop_border<B>(&mut self, binop_border: Option<usize>, mut body: B) -> SyntaxResult
where
    B: FnMut(&mut Self) -> SyntaxResult,
{
    ...
}

The second function passes the new temporary value by mutable reference, in this case a Context. This is intended to be called with values which would be more expensive to clone, or whose contents are required after the context manager has returned; the generated function calls mem::swap():

pub fn with_context_ref<B>(&mut self, context: &mut Context, mut body: B) -> SyntaxResult
where
    B: FnMut(&mut Self) -> SyntaxResult,
{
    ...
}

The third function passes a function defined on a member struct, in this case a Context, and a value. This is necessary because the "derive" macro only has access to the abstract syntax tree it's attached to, and is therefore unable to inspect the definition of Context:

pub fn with_context_fn<V, S, B>(&mut self, mut setter: S, value: V, mut body: B) -> SyntaxResult
where
    S: FnMut(&mut Context, V) -> V,
    B: FnMut(&mut Self) -> SyntaxResult,
{
    ...
}

All three functions accept a lambda whose single parameter is a reference to the SyntaxParser they're called on. This is necessary because the borrow checker does not allow more than one mutable reference to any object. Here are simplified examples of how these functions would be used:

meta.with_binop_border(None, |meta| {
    syntax(meta, &mut block)?;
    Ok(())
})?;

meta.with_context_ref(&mut context, |meta| {
    syntax(meta, &mut block)?;
    Ok(())
})?;

meta.with_context_fn(Context::set_loop_ctx, true, |meta| {
    syntax(meta, &mut block)?;
    Ok(())
})?;

Oct 05 '24 14:10 hdwalters

@hdwalters great job doing the research! I think that this is a great idea that we can implement. Let's remove binop_border field as it's actually not being used anywhere anymore. Also... do we need to maintain this macro as a separate package or repository? As far as I know this is something that Rust encourages.

Are there any modifiers that we will have to pass to generate 1st, 2nd or 3rd version or do we generate all at once? If all at once then it would be nice to have a control for not generating a clone version for complex data types.

No I just realized that we probably don't need the third function at all. We could just implement ContextManager on the Context itself, right? And then we could derive #[context] on the Context::set_loop_ctx field. Am I missing something?

Oct 06 '24 11:10 Ph0enixKM

@hdwalters great job doing the research! I think that this is a great idea that we can implement.

Thanks, it's been an interesting learning experience.

Let's remove binop_border field as it's actually not being used anywhere anymore.

Great, will do that when I create the PR.

Also... do we need to maintain this macro as a separate package or repository? As far as I know this is something that Rust encourages.

Unlike declarative macros with macro_rules!, procedural macros must be defined in a separate crate, with crate type proc-macro; but it can be in the same Git repository. I've created meta/Cargo.toml (directory structure not set in stone) and referenced it in the existing Cargo.toml.

However, I know practically nothing about CI systems; can anyone suggest what changes we would need to support this?

Are there any modifiers that we will have to pass to generate 1st, 2nd or 3rd version or do we generate all at once? If all at once then it would be nice to have a control for not generating a clone version for complex data types.

I think it would cause fewer headaches to just create all three, instead of enabling or disabling them individually. If a struct has the Clone trait, it's reasonable to allow it to be cloned via the generated function; if it doesn't, the project will fail to compile. However, remember this will only happen if the non-cloneable field is marked with #[context]; and we can revisit this if it causes problems in the future. We could also warn the developer by renaming generated function with_foo() as with_foo_clone().

Additionally, I believe the linker does not include unused functions in the final executable (and I would be amazed if this did not also apply to functions created by macros).

No I just realized that we probably don't need the third function at all. We could just implement ContextManager on the Context itself, right? And then we could derive #[context] on the Context::set_loop_ctx field. Am I missing something?

We can't do that, unfortunately. We would have to write something like the following, calling with_is_loop_ctx() on the Context object. This would be disallowed by the borrow checker, because there would be two mutable references to the parent SyntaxParser:

meta.context.with_is_loop_ctx(true, |context| {
    syntax(meta, &mut block)?;
    Ok(())
})?;

Oct 06 '24 13:10 hdwalters