
Tree shaking snapshot

AaronGoldman opened this issue on Nov 03, 2021 · 0 comments

Re-entrant snapshots

Use case: preprocessing. Some processing can be broken into parts that make sense to divide up over time and space. For example, if I am making a game with lots of game assets, those assets must be preprocessed from the form the artists work with into the densely packed forms the game needs to run efficiently.

Use case: fast startup. If the VM needs to compile large parts of the standard library and the runtime itself before calling main(), that compilation puts a lower bound on startup time. We may want to AoT-compile the code needed to get to main() fast. For AA to be a candidate for replacing programs that run often for short periods of time, like ls, cp, mv, or cloud lambda functions, startup time can become the limiting lower bound.

Use case: small size. If we want to use AA in a resource-constrained embedded environment, like a driver, we may want to produce artifacts that are smaller than the runtime plus the code. To get a small artifact we may want to distribute an artifact from which the unused parts of the standard library and runtime have been removed.

Approach: In languages with a hard build-time/run-time distinction this is often done with a CI/CD build system that produces an artifact that can be deployed to production or released to customers.

In AA we have a REPL and dynamic code loading. This makes a pure AoT compile impossible, since new code can arrive at any time. If we want to get back some of the advantages of AoT preprocessing, we need a way to produce an artifact that is suitable to deploy to production or release to customers.

We may want to solve problems like this by doing some amount of build-time processing and then taking a snapshot. This would involve closing all stateful connections to the environment and encoding the running system as an artifact.
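As a rough sketch of that lifecycle (written in Go only for illustration; Resource, Image, and Take are hypothetical names, not existing aa APIs), quiescing connections around the serialization might look like:

```go
package snapshot

import (
	"encoding/gob"
	"io"
	"os"
)

// Resource is anything holding a stateful connection to the environment
// (sockets, file handles, GPU contexts) that must be closed before a
// snapshot and reopened when execution continues or the snapshot is re-entered.
type Resource interface {
	io.Closer
	Reopen() error
}

// Image is a stand-in for the encoded running system: code plus application state.
type Image struct {
	Code [][]byte
	Heap []byte
}

// Take closes every open resource, encodes the image to path, and then
// reopens the resources so the original process can keep running.
func Take(path string, img *Image, open []Resource) error {
	for _, r := range open {
		if err := r.Close(); err != nil {
			return err
		}
	}
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	if err := gob.NewEncoder(f).Encode(img); err != nil {
		return err
	}
	for _, r := range open {
		if err := r.Reopen(); err != nil {
			return err
		}
	}
	return nil
}
```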

VM + Code: Go has made deployment easier by producing a single statically linked binary that doesn't even need a libc: just put the binary on the server and run it. The simplest way for AA to do the same is to collect all the code and append it to the end of the interpreter binary. The interpreter could then search its own binary for the standard library and a main() function. This would let us deploy an artifact without worrying about questions like which VM is present in the target environment; just drop the ELF on the target and execute it. The advantage of this model is that the code can be recovered from the artifact, and we keep the full power of the runtime and the standard library.

aa pack my_program/ > my_artifact
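One minimal sketch of that pack/load round trip, in Go for illustration; the footer layout and the PackCode/LoadPackedCode names are assumptions, not existing aa tooling:

```go
package pack

import (
	"bytes"
	"encoding/binary"
	"os"
)

// magic marks a binary that carries packed code at its tail.
var magic = []byte("AAPACK01")

// PackCode appends code to a copy of the interpreter binary and records
// the payload length plus a magic marker in a small footer.
func PackCode(interpreterPath, outPath string, code []byte) error {
	bin, err := os.ReadFile(interpreterPath)
	if err != nil {
		return err
	}
	var footer [8]byte
	binary.LittleEndian.PutUint64(footer[:], uint64(len(code)))
	out := append(bin, code...)
	out = append(out, footer[:]...)
	out = append(out, magic...)
	return os.WriteFile(outPath, out, 0o755)
}

// LoadPackedCode lets a running interpreter search its own binary for the
// payload; it returns nil if no footer is present (plain interpreter).
func LoadPackedCode() ([]byte, error) {
	self, err := os.Executable()
	if err != nil {
		return nil, err
	}
	bin, err := os.ReadFile(self)
	if err != nil {
		return nil, err
	}
	tail := len(bin) - len(magic)
	if tail < 8 || !bytes.Equal(bin[tail:], magic) {
		return nil, nil // no packed payload
	}
	n := binary.LittleEndian.Uint64(bin[tail-8 : tail])
	start := tail - 8 - int(n)
	if start < 0 {
		return nil, nil // corrupt footer
	}
	return bin[start : tail-8], nil
}
```

The interpreter would call LoadPackedCode() at startup and fall back to the REPL when no payload is found, so the same binary serves as both tool and artifact.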

VM + preprocessing: We could let the runtime run some function and, once it has run, snapshot the state of the runtime. We can save out the compiled code as well as the runtime's counters. This would give us faster startup and faster convergence to steady state. If the VM can also emit a packed snapshot and continue running, we can read and aggregate the counters for the next AoT build. These snapshots may be larger than the original code, since they hold source code, compiled code, counters, and application state. A snapshot could act as a more modern core dump, and potentially a more useful one, since we could either reflect against the snapshotted state or re-enter it.

aa build my_program/ > my_artifact
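A sketch of what such a snapshot might carry and how counters from successive runs could be folded into the next build's profile (Go for illustration; every field and function name here is hypothetical):

```go
package snapshot

// BuildSnapshot carries more than the source: the compiled code and the
// runtime's profiling counters, so the next AoT build can start from
// observed behaviour, plus the application state itself.
type BuildSnapshot struct {
	Source   map[string][]byte // original source files
	Compiled map[string][]byte // AoT/JIT output per function
	Counters map[string]uint64 // call/branch counts per site
	Heap     []byte            // serialized application state
}

// MergeCounters folds the counters emitted by a running VM into the
// profile used for the next AoT build, so repeated snapshots converge
// on the program's steady-state behaviour.
func MergeCounters(profile map[string]uint64, snaps ...*BuildSnapshot) {
	for _, s := range snaps {
		for site, n := range s.Counters {
			profile[site] += n
		}
	}
}
```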

VM + preprocessing - whatever we can remove: If we want to make tiny binaries, there are many things we may want to pull out of the artifact. We can go in two directions. The first is to ship only a minimal interpreter and store only bytecode compiled from the original source. The second is to aggressively compile and tree-shake. If the user deletes the functions they no longer need, including the eval command, we know that no new code will be loaded after that point. We can then start aggressively removing all code in the application that is not reachable, prune the unreachable parts of the libraries the application loads, including the standard library, and from there remove parts of the runtime itself. If we could prove that every remaining allocation site can be reference counted, we could remove the garbage collector. If we know we have AoT-compiled all paths, we can remove the JIT. The more we know about the code that survives the tree shaking, the smaller the output can be made. Even when we are trying to get a tiny binary, we may still want the full power of the language at build time and then drop to a less feature-full set of constraints for the final form: garbage collected > reference counted > stack only (no runtime allocation) > no stack (fully static memory allocation).

aa build my_program/ > my_artifact, but with code in the build() function indicating the minimisation
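The reachability pass at the heart of the tree shaking could be a plain graph walk from the surviving roots; a small Go sketch, with CallGraph and Reachable as made-up names:

```go
package shake

// CallGraph maps each function name to the functions it may call.
type CallGraph map[string][]string

// Reachable walks the call graph from the surviving roots (typically
// just main once eval has been deleted) and returns the set of
// functions that must stay in the artifact; everything else, including
// unused parts of the standard library and runtime, can be dropped.
func Reachable(g CallGraph, roots ...string) map[string]bool {
	keep := make(map[string]bool)
	work := append([]string(nil), roots...)
	for len(work) > 0 {
		fn := work[len(work)-1]
		work = work[:len(work)-1]
		if keep[fn] {
			continue
		}
		keep[fn] = true
		work = append(work, g[fn]...)
	}
	return keep
}
```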

Users may want hooks for the application life cycle:

Build - runs in the dev or CI/CD environment
Start - runs every time a snapshot is re-entered
Main - runs when a snapshot is started with run
Test - runs when aa is started with the test command

These hooks would probably not be specified as part of the language or runtime, but as part of the standard library's default main() parsing the command-line arguments: pack, build, run, test, and repl could all just be commands recognized by the standard library's main(), as in the sketch below.
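A sketch of how that default main() could dispatch the commands (Go for illustration; the hook wiring described in the strings is hypothetical, not decided anywhere):

```go
package main

import (
	"fmt"
	"os"
)

// The standard library's default main recognises the life-cycle commands
// directly, rather than baking them into the language or runtime.
func main() {
	cmd := "repl"
	if len(os.Args) > 1 {
		cmd = os.Args[1]
	}
	switch cmd {
	case "pack":
		fmt.Println("collect code and append it to the interpreter binary")
	case "build":
		fmt.Println("run build(), then snapshot the resulting state")
	case "run":
		fmt.Println("re-enter the snapshot: run start(), then main()")
	case "test":
		fmt.Println("re-enter the snapshot and run test()")
	case "repl":
		fmt.Println("start an interactive session")
	default:
		fmt.Fprintf(os.Stderr, "unknown command %q\n", cmd)
		os.Exit(2)
	}
}
```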

There are lots of open questions about how to handle closing all connections with the environment. What if the snapshot wakes up on a different OS, CPU architecture, VM, ...? In the event of a panic, should the runtime drop a snapshot that could be debugged later? What would it take to snapshot a potentially inconsistent state during a panic?

AaronGoldman · Nov 03 '21 22:11