wasm-tools icon indicating copy to clipboard operation
wasm-tools copied to clipboard

`wasmparser`: Make `VisitOperator` more user friendly

Open Robbepop opened this issue 3 years ago • 10 comments

At the moment the new VisitOperator trait requires users to implement each and every of the over 500 variants. This quickly becomes a mundane task, especially if there are only a few of the visit methods that are of importance to a user.

Idea 1

One idea is to introduce another default implemented method visit_fallback or visit_default (name not final) which acts similar to a default arm in a match expression. All visit_* methods would be default implemented to forward to this new visit_fallback method and the visit_fallback itself would be default implemented with something generic such as unimplemented!().

pub trait VisitOperator {
    type Output;

    fn visit_fallback(&mut self, offset: usize) -> Self::Output {
        unimplemented!()
    }
    
    fn visit_unreachable(&mut self, offset: usize) -> Self::Output {
        self.visit_fallback(offset)
    }
    
    fn visit_nop(&mut self, offset: usize) -> Self::Output {
        self.visit_fallback(offset)
    }
    
    // etc...
}

Advantages

This design would allow users to incrementally implement the VisitOperator trait and also would allow to only define small subsets of the VisitOperator trait API surface. An example where this could be useful is the wasmprinter crate which simply prints newline for most of the operators and could make heavy use of the visit_fallback method there.

Downsides

The huge downside to this design is that it is no longer clear to users if they have forgotten to implement a variant. Also if wasmparser is updated an added new variants users are no longer informed about this. I think this is a rather big disadvantage and was wondering how we could maybe find a solution that still informs users that want to take care about each and every visit_* method themselves.


Idea 2

We could introduce yet another VisitOperator-like trait. Let's call it VisitOperatorFallback. It is very similar to the original VisitOperator trait in that it houses a visit method for each Wasm operator. The difference to the original VisitOperator trait is that it has this new visit_fallback method and all the aforementioned default implementations. Users can decide to either implement the original VisitOperator or VisitOperatorFallback trait. In case they implement the VisitOperatorFallback (or just subsets of it due to its default impls) then the type also receive an auto implementation of VisitOperator that simply forwards to the methods of the VisitOperatorFallback.

pub trait VisitOperatorFallback {
    type Output;

    fn visit_fallback(&mut self, offset: usize) -> Self::Output {
        unimplemented!()
    }
    
    fn visit_unreachable(&mut self, offset: usize) -> Self::Output {
        self.visit_fallback(offset)
    }
    
    fn visit_nop(&mut self, offset: usize) -> Self::Output {
        self.visit_fallback(offset)
    }
    
    // etc...
}

impl<T> VisitOperator for T
where
    T: VisitOperatorFallback,
{
    type Output = <T as VisitOperatorFallback>::Output;
    
    fn visit_unreachable(&mut self, offset: usize) -> Self::Output {
        <T as VisitOperatorFallback>::visit_unreachable(self, offset)
    }
    
    fn visit_nop(&mut self, offset: usize) -> Self::Output {
        <T as VisitOperatorFallback>::visit_nop(self, offset)
    }
    
    // etc...
}

Advantages

With this design we have the same set of advantages as with solution 1 if users decide to implement VisitOperatorFallback trait. If users only care about a subset of the visit methods they can implement VisitOperatorFallback. If they care about all the visit methods they can implement VisitOperator trait and still be updated about changes properly as with the original design.

Disadvantages

We now require 2 massive traits instead of just 1. However, we could use macro_rules or proc. macros to generate them to keep them in sync.

Robbepop avatar Aug 12 '22 07:08 Robbepop

I don't have ideas myself on how to make this better. I would rule out your "idea 1" though because for many contexts it's too easy to forget to handle new instructions in an update. Otherwise the *Fallback trait idea (or some wrapper structure perhaps) is all I could think of which still isn't much.

alexcrichton avatar Aug 15 '22 14:08 alexcrichton

I think it is not important to fix this urgently but it might be really nice to have a more user friendly API in the future and until then we have this issue as a place where we can collect and share ideas. I have the same feeling as you about "idea 1" and I am also not 100% sold on "idea 2" but cannot come up with something better, yet.

Robbepop avatar Aug 15 '22 14:08 Robbepop

One possible solution is to leave Operator-the-type as well. That's pretty easy to work with using fall-throughs and such. The main problem with it is that it's not really all that high performance, so care needs to be taken that that it's only ever needed in "convenience" situations and in general it's not necessary.

alexcrichton avatar Aug 15 '22 14:08 alexcrichton

The reason why I am not 100% sold on the *Fallback trait idea is that we'd still have lots of code dupe which we should at some point auto generate using a macro. The same is true for the current codebase with the VisitOperator trait and the Operator enum. I don't think we should upkeep this code dupe in the long term of the project. An advantage of the *Fallback trait idea over keeping the Operator enum is that it neatly fits into the VisitOperator API whereas Operator kinda conflicts on performance matters. Also keeping the Operator enum that is probably simpler to use than the VisitOperator trait albeit being the non-preferred solution is contradicting a design principle of making it simple to use an API correctly (or as recommended) and hard to use incorrectly.

Robbepop avatar Aug 15 '22 14:08 Robbepop

Looking at the for_each_operator macro I see some potential in improving user experience further for using the VisitOperator API by splitting up the macro into smaller pieces.

For example for wasmi it would be really useful to additionally have for_each_simd_operator and for_each_threads_operator since on the one hand-side we want to treat operators generally in a way that is not possible to reflect via macro but those groups of operators coming from optional Wasm proposals can be implemented via macro. I would only introduce such macros for operators of Wasm proposals that are supposed to stay optional in the Wasm spec and encompass lots of operators at the same time.

The cool thing is that the general for_each_operator macro can itself be partially defined via those 2 other macros.

@alexcrichton (or anyone else with opinions) if this is something you'd like to support I will happily open a PR for it.

I use a custom (copy'n pasted) macro to represent the unsupported subset in wasmparser for the wasmi integration of the VisitOperator API that can be found here.

Robbepop avatar Aug 29 '22 15:08 Robbepop

Sounds reasonable to me!

Another alternative would be to update the syntax passed to the provided macro where each instruction is prefixed with something like @threads or @mvp where the receiving macro can special-case particular proposals either with an allow-list or a deny-list. For simplicity though I think it's best to start with the multi-macro approach you suggest.

alexcrichton avatar Aug 30 '22 14:08 alexcrichton

I actually like your alternative idea more than mine since it minimizes the API surface instead of bloating it up with more macros. Would you be fine with me going for your alternative design idea first?

Robbepop avatar Aug 30 '22 15:08 Robbepop

Sure yeah, dealer's choice 👍

alexcrichton avatar Aug 30 '22 16:08 alexcrichton