Amber icon indicating copy to clipboard operation
Amber copied to clipboard

Introduce optimizer - remove unused variables

Open Ph0enixKM opened this issue 7 months ago • 9 comments

A new layer between Translate and Render layer called "optimizer" that optimises resulting Bash code. In this PR I'm introducing a new variable optimiser that removes the "hanging" variable declarations (the ones that are never used after all).

Optimizer

Optimizer currently has one sublayer that removes unused variables but we can easily extend it with more sublayers that alter the AST accordingly:

pub fn optimize_fragments(ast: &mut FragmentKind) {
    remove_unused_variables(ast);
    // Add new sublayer here
}

It works in tandem with FragmentKinds that are returned by translation layer a couple of lines above. It's goal is to modify the FragmentKind structure (the AST) in a more efficient way without modifying the logic behind the code.

I've introduced a new compiler flag called AMBER_NO_OPTIMIZE that can disable optimizer in any time (we can extend it's behavior later to turn off certain sublayers for instance).

New optimize_unused field in VarStmt

There are cases when we really don't want a variable to be optimized and to potentially disappear if it's being overwritten later on in the code. It's currently used for the return statement as it's translated to a global variable assign (because Bash's return is an exit code return).

Unused variables sublayer

This optimizer sublayer removes unnecessary or redundant instances of VarStmtFragment. It's broken down to 2 phases:

  1. find_unused_variables that gathers information about what variables are being used in an unbounded context (an expression), context bounded to a variable (a variable assign), and tracks if we enter or leave block of code that could theoretically not run at all (a conditional block) like if statement, loop (particularly when it's being looped on an empty array).
  2. remove_non_existing_variables that tracks the aggregated data from the previous phase and decides whether or not to remove currently inspected variable statement.

There are also two other important functions:

  1. UnusedVariablesMetadata::is_var_used determines if provided variable is being used later on. Collects all the transitive relations between variables and removes them or add them accordingly.
  2. should_optimize_var_stmt determines if we even want to optimize given variable. Variable that aren't fully overwriting state (like a += b, a[0] = b), are explicitly marked as variables that should never be optimized (like function return), and references shouldn't be optimized by this system.

Ph0enixKM avatar May 20 '25 17:05 Ph0enixKM

I was so happy to try it but the there isn't any code right now 😂

Mte90 avatar May 21 '25 08:05 Mte90

@Mte90 I'm still working on this - I had to push the changes because I started coding this on my iPad on GitHub Codespaces and then I came back home and I wanted to continue on my laptop 😆

Ph0enixKM avatar May 21 '25 09:05 Ph0enixKM

shouldn't this be implemented as a part of a broader code scanning solution, rather than a separate construct?

b1ek avatar May 21 '25 09:05 b1ek

shouldn't this be implemented as a part of a broader code scanning solution, rather than a separate construct?

I'm not sure what do you mean. Every compiler has some optimizer backend integrated. Do you mean like a separate binary? I believe that this would only make things harder to install, debug and we'd have to introduce a new API for the binaries to communicate.

Ph0enixKM avatar May 21 '25 13:05 Ph0enixKM

This PR works for simple variables but I'm still working on clearing up transitive variable relations like:

// These all should be cleared in bash:
let a = 12
let b = a
let c = b

Ph0enixKM avatar May 21 '25 13:05 Ph0enixKM

@Mte90 yes, thanks for spotting this!

Ph0enixKM avatar May 21 '25 14:05 Ph0enixKM

shouldn't this be implemented as a part of a broader code scanning solution, rather than a separate construct?

I'm not sure what do you mean. Every compiler has some optimizer backend integrated. Do you mean like a separate binary? I believe that this would only make things harder to install, debug and we'd have to introduce a new API for the binaries to communicate.

no, i mean to make this current code extendable to implement things like union types in the future. it could be implemented as a separate binary but i see no reason to do that.

my point is that it should be extendable to things beyond merely removing unused variables

b1ek avatar May 22 '25 08:05 b1ek

It will be... this PR introduces a function that with given AST it just removes unused variables. We can always add more modules to optimizer/ and extend it to do all sorts of stuff. We can call them in the mod.rs where is the main opimizing function

Ph0enixKM avatar May 22 '25 08:05 Ph0enixKM

It works finally!

https://github.com/Mte90/My-Scripts/commit/5550f9db926f412e015aab02bf14d01c26eb2884#diff-002ea630f348d5925b58584d294d3ba0dfcd196abafc4a0598ad2a00b579564b

Mte90 avatar May 22 '25 09:05 Mte90

I've added a simple macro to make the tests more readable

Ph0enixKM avatar Jun 02 '25 19:06 Ph0enixKM

I'll write more tests and then we can merge

Ph0enixKM avatar Jun 20 '25 09:06 Ph0enixKM