Add module AST cache.
Description
This adds a module AST cache which speeds up the parsing and type checking stages for modules that have not changed. To make this work, when first parsing, we save the tree of dependencies and a hash for each module. Then, when compiling again, we check the cache and re-use the cached module if possible.
To make this work for LSP, I've had to revert the earlier change that re-created the engines for each compilation step. This may lead to memory bloat, which IIRC was the reason that change was made in the first place, so I'll be investigating a garbage collection step that LSP can run after each compilation.
On the LSP `did_change` test, which compiles a single file, makes a change, and re-parses, this drops the time from 6s (3s + 3s for the second parse) to 3s, as expected, since the core and standard libraries are re-used instead of being parsed and analyzed again.
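To make the mechanism concrete, here is a minimal sketch of the idea (the names `ModuleCacheEntry`, `ModuleCache`, and `is_up_to_date` are illustrative, not the actual types in this PR): each module is keyed by its path and stores its own content hash plus the paths of its dependencies, and an entry is only re-used if its own hash and all of its dependencies' hashes are unchanged.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Illustrative only: what a cached entry per module might hold.
struct ModuleCacheEntry {
    /// Hash of the module's own source text.
    hash: u64,
    /// Paths of the modules this one depends on.
    dependencies: Vec<PathBuf>,
    // ... plus the cached parsed/typed AST for the module.
}

/// Hypothetical cache keyed by module path.
struct ModuleCache {
    entries: HashMap<PathBuf, ModuleCacheEntry>,
}

impl ModuleCache {
    /// A module can be re-used only if its own hash and the hashes of all
    /// of its (transitive) dependencies are unchanged. Module dependencies
    /// form a DAG, so the recursion terminates.
    fn is_up_to_date(&self, path: &Path, current_hashes: &HashMap<PathBuf, u64>) -> bool {
        let Some(entry) = self.entries.get(path) else {
            return false;
        };
        if current_hashes.get(path) != Some(&entry.hash) {
            return false;
        }
        entry
            .dependencies
            .iter()
            .all(|dep| self.is_up_to_date(dep, current_hashes))
    }
}
```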
Checklist
- [ ] I have linked to any relevant issues.
- [ ] I have commented my code, particularly in hard-to-understand areas.
- [ ] I have updated the documentation where relevant (API docs, the reference, and the Sway book).
- [ ] I have added tests that prove my fix is effective or that my feature works.
- [ ] I have added (or requested a maintainer to add) the necessary `Breaking*` or `New Feature` labels where relevant.
- [ ] I have done my best to ensure that my PR adheres to the Fuel Labs Code Review Standards.
- [ ] I have requested a review from the relevant team or maintainers.
This new update adds a work-in-progress GC implementation (under an experimental flag in LSP), tests for the module cache under the LSP test suite, and a new API to measure and print slab stats.
During the GC implementation, I reimplemented the concurrent slab memory representation on top of slotmap, but it ended up making the compiler roughly 20% slower. I also realized that the way the slab works right now is probably not optimal for GC-aware compilation; after some thought, I think we should have a slab per source module.
Also noticed that with the new module cache, the memory bloating problem we had, where the engines grew unreasonably large very quickly, is not as bad, since most modules now get re-used on each successive parse. I've therefore deleted the engines re-creation code path, though we probably want to bring it back soon.
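For context, the slotmap experiment boiled down to something like the sketch below (not the real `ConcurrentSlab` code; it assumes the `slotmap` crate): a `SlotMap` behind a lock, where removal actually frees and re-uses the slot, at the cost of a generation check on every access.

```rust
// Sketch of a slotmap-backed "concurrent slab"; assumes the `slotmap` crate.
// Not the real ConcurrentSlab implementation, just an illustration.
use std::sync::RwLock;

use slotmap::{DefaultKey, SlotMap};

struct ConcurrentSlotMap<T> {
    inner: RwLock<SlotMap<DefaultKey, T>>,
}

impl<T: Clone> ConcurrentSlotMap<T> {
    fn new() -> Self {
        Self { inner: RwLock::new(SlotMap::new()) }
    }

    /// Insert a value and get a generational key back.
    fn insert(&self, value: T) -> DefaultKey {
        self.inner.write().unwrap().insert(value)
    }

    /// Clone the value out; the generation stored in the key catches
    /// accesses to a slot that has since been freed and re-used.
    fn get(&self, key: DefaultKey) -> Option<T> {
        self.inner.read().unwrap().get(key).cloned()
    }

    /// Removing actually frees the slot for re-use, which is what a
    /// sweep phase would need.
    fn remove(&self, key: DefaultKey) -> Option<T> {
        self.inner.write().unwrap().remove(key)
    }
}
```

The generational keys are what make freed slots safe to re-use, but every lookup pays for that bookkeeping, which would be consistent with the slowdown observed above.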
So after some discussion with @IGI-111, we decided to leave the GC and slab architecture rework for the future, and take the pragmatic approach that gets most of the speed-up benefits in the meantime.
@JoshuaBatty Interested in hearing your thoughts about this, is this OK? I also added a new LSP request to be able to get at the metrics. It may be nice in the future to have a VS Code extension command to surface this information (as well as slab stats reports; that one still needs an LSP request).
Hey @tritao, just pulled down this branch and gave it a quick play with GC turned on. You can see in the video below that we start off with typed information in the language server. But if we introduce some malformed code we lose access to it (see issue #4866). However, if we then remove the line causing the error, we never get the typed tokens back and the language server is unable to work. See below.
https://github.com/FuelLabs/sway/assets/1289413/0ad640a8-c829-4a9f-a80a-cb4e28b9440c
Did you check (or can you check) whether this issue also happens with the GC turned off? Do you think it's related to this PR?
It seems to work as expected when garbage collection is turned off.
Would be interesting to see numbers on `cargo r -p test --release` for both CPU and mem.
> It seems to work as expected when garbage collection is turned off.
Ok, that's expected, the GC feature itself is not fully working in this PR by design, I'll need to spend some time re-designing some things for it to work reliably and efficiently.
> Would be interesting to see numbers on `cargo r -p test --release` for both CPU and mem.
Numbers will be the same with and without this PR, because caching currently only works when the engines are re-used across parsing sessions, and that is only the case for LSP right now.
For `test`, I think we may be able to re-use the engines across tests, but I'm not really sure we want to do that, given it means each individual test would not have a fresh engines state. Long-term we will need to serialize/deserialize the state of the type and decl engines to be able to speed up compilations, basically have the cache fully primed on the first run.
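For the serialize/deserialize idea, something along these lines could serve as a starting point (purely hypothetical; `EngineSnapshot` and its fields are placeholders, and the real type and decl engines would need their own serde support):

```rust
// Hypothetical sketch of persisting engine state between runs with serde;
// assumes the `serde` (with derive) and `serde_json` crates.
use std::fs;

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Default)]
struct EngineSnapshot {
    // Placeholder fields: a stable, flattened representation of the
    // type engine and decl engine contents would go here.
    types: Vec<String>,
    decls: Vec<String>,
}

/// Write the snapshot to disk so a later compilation can start with a
/// primed cache instead of rebuilding everything from scratch.
fn save_snapshot(path: &str, snapshot: &EngineSnapshot) -> std::io::Result<()> {
    let bytes = serde_json::to_vec(snapshot).expect("snapshot should serialize");
    fs::write(path, bytes)
}

/// Load a previously saved snapshot, falling back to `None` if the file
/// is missing or unreadable.
fn load_snapshot(path: &str) -> Option<EngineSnapshot> {
    let bytes = fs::read(path).ok()?;
    serde_json::from_slice(&bytes).ok()
}
```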
> For `test`, I think we may be able to re-use the engines across tests, but I'm not really sure we want to do that, given it means each individual test would not have a fresh engines state. Long-term we will need to serialize/deserialize the state of the type and decl engines to be able to speed up compilations, basically have the cache fully primed on the first run.
I also think our tests should run on a clean engine set.
Though what this is missing is a validation test that checks that subsequent compilations of some Sway code exercising a bunch of language features still work. We'll want to be able to detect if we break this.
Rebased to latest.
There were some significant changes to the codebase recently which impacted this.
First, the handler refactoring, which required some changes to its interface to allow consuming its data.
Storing the handler itself didn't work due to its usage of `RefCell` and because LSP now stores its `Engines` in a thread-aware way. @IGI-111 I think the changes should be OK, but please take a look and let me know.
LSP was also significantly refactored recently, so all my previous changes to enable the sharing of engines are now broken; it looks like the new system uses threads and is not really designed to allow sharing of the engines. As such, I've split off my LSP changes into https://github.com/FuelLabs/sway/pull/4988 until we figure out what to do about it (cc @JoshuaBatty).
As for testing this, a unit test doesn't really work. The best way to do it needs some significant changes to `test`, which I have documented and opened an issue for in https://github.com/FuelLabs/sway/issues/4987.
So I would like to merge this work now if possible, to unblock its usage in LSP right away, as the performance improvements are considerable, and to prevent it from bitrotting in the meanwhile. I will get back to improving the testing situation as soon as some other work is finished.
Does it make sense to move all the GC changes to https://github.com/FuelLabs/sway/issues/4994? I am still confused about how we are going to implement sweep, or whether we are going to go with one-slab-per-module and drop it entirely, or even something else.
> Does it make sense to move all the GC changes to #4994? I am still confused about how we are going to implement sweep, or whether we are going to go with one-slab-per-module and drop it entirely, or even something else.
Long-term, one-slab-per-module is the way to go IMHO, as then we don't really need a fine-grained GC, but that's likely to take much more time than just reworking the slab allocator. I think we can use a variant of the buddy allocator algorithm to implement the slab with efficient removal, and that could be done in a few days of work.
So I guess it comes down to deciding whether we're going with the one-slab-per-module approach or not. I see no value in moving the GC marking changes to #4994, as that's for LSP-specific changes and the LSP team might be finishing that one. @IGI-111 what do you think? Do we drop the GC stuff and just bite the bullet and do the work to get the one-slab-per-module approach? Or keep the GC approach in the short term?
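To make the "efficient removal" point concrete, here is a minimal free-list slab sketch; it is a simpler scheme than the buddy-allocator variant mentioned above and purely illustrative, but it shows the core idea of freeing a slot on removal and handing it out again on the next insert.

```rust
// Minimal free-list slab: removal is O(1) and freed slots are re-used.
// Illustrative only; simpler than the buddy-allocator variant discussed above.
struct FreeListSlab<T> {
    items: Vec<Option<T>>,
    free: Vec<usize>,
}

impl<T> FreeListSlab<T> {
    fn new() -> Self {
        Self { items: Vec::new(), free: Vec::new() }
    }

    fn insert(&mut self, value: T) -> usize {
        match self.free.pop() {
            // Re-use a previously freed slot.
            Some(idx) => {
                self.items[idx] = Some(value);
                idx
            }
            // Otherwise grow the backing vector.
            None => {
                self.items.push(Some(value));
                self.items.len() - 1
            }
        }
    }

    fn get(&self, idx: usize) -> Option<&T> {
        self.items.get(idx).and_then(|slot| slot.as_ref())
    }

    /// Free the slot and remember it for re-use. Unlike a generational
    /// key (as in slotmap), a stale index could later alias a new value,
    /// which is part of what makes GC-aware slab design subtle.
    fn remove(&mut self, idx: usize) -> Option<T> {
        let value = self.items.get_mut(idx)?.take();
        if value.is_some() {
            self.free.push(idx);
        }
        value
    }
}
```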
This PR has been open for a long time already. I'd say we merge these changes in a stable state and move to the one-slab-per-module approach in a later change.
I just took out the GC changes. We can figure out the best approach in the meanwhile, but I'm thinking that just going with the slab-per-module approach may be the way to go.
> I just took out the GC changes. We can figure out the best approach in the meanwhile, but I'm thinking that just going with the slab-per-module approach may be the way to go.
Sounds straightforward enough to manage from the LSP side with this approach.