wasm-tools icon indicating copy to clipboard operation
wasm-tools copied to clipboard

`wasm-mutate`: List of mutations we should support

Open fitzgen opened this issue 2 years ago • 9 comments

We should have an add, edit, reorder, and remove mutator for each Wasm section (where it makes sense). This is a tracking issue for each of these things.

In general, removing an entity should be something done optimistically, but which can fail if the entity is referenced in other sections (e.g. if we are trying to remove a global, but a function uses it via global.get or global.set).

Custom Sections

  • [x] Add new custom section
  • [x] Edit existing custom section
    • [x] add bytes
    • [x] remove bytes
    • [x] edit bytes
  • [x] Remove custom section
  • [x] Move existing custom section

Type Section

  • [x] Add new type
  • [ ] Edit type (if it isn't used)
  • [x] Remove type
  • [ ] Reorder types

Import Section

  • [ ] Add new import
  • [ ] Edit import name/module
  • [ ] Edit import kind (if it isn't used)
  • [x] Remove import
  • [ ] Reorder imports

Function Section

  • [x] Add new function
  • [ ] Add a parameter (and update the code section as necessary)
  • [ ] Remove a parameter (and update the code section as necessary)
  • [ ] Turn a parameter into a local (and update calls and type section as necessary)
  • [x] Remove function
  • [ ] Reorder functions

Table Section

  • [ ] Add a new table (if there aren't any tables or bulk memory is enabled)
  • [ ] Edit an existing table
  • [x] Remove a table
  • [ ] Reorder tables

Memory Section

  • [ ] Add a new memory (if there aren't any memories or multi-memory is enabled)
  • [ ] Edit an existing memory
  • [x] Remove a memory
  • [ ] Reorder memories

Global Section

  • [ ] Add a new global
  • [ ] Edit an existing global
  • [x] Remove a global
  • [ ] Reorder globals

Export Section

  • [ ] Add a new export
  • [X] Rename an existing export
  • [ ] Edit an existing export to export a different entity
  • [X] Remove an export

Start Section

  • [ ] Add a start section (if there isn't one, and there is a function of the right type)
  • [ ] Change which function is the start function in an existing start section (if there is another of the right type)
  • [x] Remove the start section

Element Section

  • [ ] Add a new element segment
  • [ ] Edit an existing element segment
    • [ ] Reorder functions in the segment
    • [ ] Add a new function to the segment
    • [ ] Remove a function from the segment (needs to be careful of ref.funcs referencing this function)
  • [x] Remove an element segment
  • [ ] Reorder element segments

Code Section

  • [ ] Remove unused locals
  • [ ] Outline an expression into a new function
  • [ ] Inline a function call
  • [ ] Swap two expressions that don't have any ordering between each other

Peephole Mutations

  • [ ] Allow writing rules with wildcard immediates (e.g. (elem.drop.*) => (container) instead of (elem.drop.0) => (container) and (elem.drop.1) => (container), etc...)
  • [ ] Replace an f64 or f32 value with NaN (when not preserving semantics)
  • [ ] Replace a value with a use of an existing local of the same type (when not preserving semantics)
  • [ ] Replace a value with a use of an existing global of the same type (when not preserving semantics)
  • [ ] Swap operands of the same arity, e.g. turn (i32.add ?x ?y) into (i32.mul ?x ?y), etc... (when not preserving semantics)
  • [ ] Completely remove expressions that don't produce any result values, e.g. stores and void calls (when not preserving semantics)
  • [ ] Insert expressions that don't produce any result in arbitrary places, e.g. stores and void calls (when not preserving semantics)

Code Motion Mutations

  • [ ] Replace the body of an if/else/loop/block with unreachable (if not preserving semantics)
  • [ ] Replace if .. else .. end with just the consequent or just the alternative as a block (if not preserving semantics)
  • [ ] Replace loop with block and vice versa (if not preserving semantics)
  • [ ] Delete a whole if/else, block, or loop, dropping parameters and producing dummy value results as necessary (if not preserving semantics)
  • [ ] Inline a block .. end as just the .. part (will require checking that there are no br instructions targeting the block, as well as renumbering nested brs that jump to some label outside this block)

Data Section

  • [ ] Add a new data segment
  • [x] Edit an existing data segment
    • [x] Reorder bytes within the segment
    • [x] Add bytes to the segment
    • [x] Remove bytes from the segment
  • [x] Remove a data segment
  • [ ] Reorder data segments

Data Count Section

Not really applicable. More of something we just have to keep up to date based on mutations of the data section.

fitzgen avatar Dec 07 '21 19:12 fitzgen

Most of these removals (and additions) require modifications to the index space which in general can require a whole rewrite of the wasm module, so I think some of those may at least be nontrivial. I've got a branch I started awhile back which implements removing almost all the items here and I just need to finish up the giant match statement to rewrite the code section. I hope to push that up in the near future.

alexcrichton avatar Dec 13 '21 15:12 alexcrichton

Yes, I was hoping that we could develop a generic function to rewrite the code section with the given renamings and then reuse that function throughout most of these mutators.

fitzgen avatar Dec 13 '21 17:12 fitzgen

Hi all and happy new year !

I have a doubt on the ItemRemoveMutator and I did not find a better way to write it down :(.

Is there a reason to think that every item removal, excepting for Data or Element, is semantically equivalent ?

https://github.com/bytecodealliance/wasm-tools/blob/6b8da0ec8a283fe04a8a4d012fedc1abe526bac1/crates/wasm-mutate/src/mutators/remove_item.rs#L37

 match self {
            Item::Function => info.num_functions() > 0,
            Item::Table => info.num_tables() > 0,
            Item::Memory => info.num_memories() > 0,
            Item::Global => info.num_globals() > 0,
            Item::Tag => info.num_tags() > 0,
            Item::Type => info.num_types() > 0,

            // Note that data/elements can lead to traps and side-effectful
            // initialization of imported tables/memories, so these are only
            // considered for removal if we're not preserving semantics.
            Item::Data => !config.preserve_semantics && info.num_data() > 0,
            Item::Element => !config.preserve_semantics && info.num_elements() > 0,
        }

Thanks in advance

Jacarte avatar Jan 20 '22 18:01 Jacarte

Happy new year! :)

If the other entities are used (e.g. a function is called or exported) then the mutator will return an error. If they are not used, on the other hand, then removing them still preserves semantics.

fitzgen avatar Jan 20 '22 19:01 fitzgen

Add a parameter (and update the code section as necessary) Remove a parameter (and update the code section as necessary)

This sort of transformation may be made easier if it is preceded by duplicating the type definitions so that each operation and item that refers to a type refers to an unique type definition.

nagisa avatar Feb 04 '22 12:02 nagisa

Another transformation that I think could be interesting and which I'm not sure is covered by the issue text here is import-ifying and local-ifying items such as globals and functions. That is replacing something like (import "" "" (global i32)) with (global i32 (i32.const 0)) and vice-versa.

Implementing both transformations could also pose a risk of making it difficult to reach a fixpoint too, though.

nagisa avatar Feb 04 '22 13:02 nagisa

Implementing both transformations could also pose a risk of making it difficult to reach a fixpoint too, though.

We only care about fixpoints when reducing, and in that situation we can avoid the problem by only taking mutations that actually reduce the size of the resulting Wasm binary.

fitzgen avatar Feb 04 '22 17:02 fitzgen

(Added a bunch more code-related mutation ideas, in case anyone is following along at home, and is interested in taking a look.)

fitzgen avatar Apr 13 '22 18:04 fitzgen

Not sure if I wrote this anywhere else, but it keeps staying on top of my mind and I while I did start a prototype, I never really finished it…

We should implement a mutation that replaces an encoding of a construct with an a different encoding of the same construct. Like, for example there are multiple different ways to LEB-encode a specific number, and because typically one encoding is canonical, those other encodings are most likely to cause problems...

nagisa avatar Nov 09 '22 15:11 nagisa