ts-morph
Fast Remove
As part of the future performance improvements, working on improving the #remove() method's performance might be a good start.
On remove, the following happens today:
- Source file text is changed to not include the node's text.
- Source file text is reparsed.
- Wrapped nodes are updated with new compiler nodes.
A better, but somewhat more complicated, solution would be:
- Source file text is changed to not include the node's text.
- Existing AST is manipulated to remove the removed node (ex. update #statements and #_children when removing a statement)
- Go up the parents, and down the following nodes to change any pos and end properties. Also, remove stuff like symbols and such (need to investigate more)
- Update the source file text.
- Forget the removed node.
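The steps above can be sketched on a toy tree. This is only an illustration under heavy assumptions: ToyNode, makeNode, shiftSubtree, and fastRemove are hypothetical names, not ts-morph or TypeScript compiler APIs, and real compiler nodes hold more state (symbols, cached children, and so on) that would also need cleaning up.

```typescript
// Toy model only (hypothetical): real compiler nodes carry far more state.
interface ToyNode {
  kind: string;
  pos: number;
  end: number;
  children: ToyNode[];
  parent?: ToyNode;
}

// Hypothetical helper: builds a node and wires up parent pointers.
function makeNode(kind: string, pos: number, end: number, children: ToyNode[] = []): ToyNode {
  const node: ToyNode = { kind, pos, end, children };
  for (const child of children) child.parent = node;
  return node;
}

// Shift a node and all of its descendants by `delta` characters.
function shiftSubtree(node: ToyNode, delta: number): void {
  node.pos += delta;
  node.end += delta;
  for (const child of node.children) shiftSubtree(child, delta);
}

// Remove `target` from the tree and from `text` without reparsing.
// Returns the updated text.
function fastRemove(text: string, target: ToyNode): string {
  const parent = target.parent;
  if (parent === undefined) throw new Error("Cannot remove the root node.");
  const removedLength = target.end - target.pos;

  // Go up the parents; at each level, move the following nodes back
  // and shrink the ancestor that contained the removed text.
  let child: ToyNode = target;
  let ancestor: ToyNode | undefined = parent;
  while (ancestor !== undefined) {
    const index = ancestor.children.indexOf(child);
    for (const sibling of ancestor.children.slice(index + 1)) {
      shiftSubtree(sibling, -removedLength);
    }
    ancestor.end -= removedLength;
    child = ancestor;
    ancestor = ancestor.parent;
  }

  // Detach the node so nothing in the tree references it any more.
  parent.children.splice(parent.children.indexOf(target), 1);
  target.parent = undefined;

  return text.slice(0, target.pos) + text.slice(target.end);
}

// Usage: three "statements" in a made-up source text.
const toyText = "a;bb;ccc;";
const first = makeNode("Statement", 0, 2);
const second = makeNode("Statement", 2, 5);
const third = makeNode("Statement", 5, 9);
const root = makeNode("SourceFile", 0, 9, [first, second, third]);

const toyResult = fastRemove(toyText, second);
console.log(toyResult, third.pos, third.end, root.end);
```

Only the nodes after the removed one and its ancestors are touched, so the cost depends on how much of the tree follows the removal point rather than on reparsing the whole file.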
Not having to reparse and fill the wrapped nodes with new nodes should improve the performance big time. This change will have no effect on the public API.
Note: The tests for these should do a deep search of the final AST to ensure the previous compiler node isn't remaining inside the AST somewhere and that no symbols exist hidden within the AST. I'll need to edit private and internal data in the nodes to do this.
When adding this, be sure to add some metrics to the project specifically for this scenario (need line graph showing performance over every commit... do worst case and best case on some generated data).
I hacked together something for this today and this was 10x faster than the current way on the 002_Removing.ts performance test. So this is a must do, but will be a bit challenging.
I was talking with Titian at tsconf and it seems I could maybe take advantage of the incremental parser here. The function of interest seems to be updateLanguageServiceSourceFile.
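A rough sketch of what going through the incremental parser might look like, using the public ts.updateSourceFile function from the TypeScript compiler API (as far as I can tell, updateLanguageServiceSourceFile ends up calling it when it is given a change range). The file name and contents are made up for illustration, and this says nothing about how ts-morph would need to re-wrap the resulting nodes.

```typescript
import * as ts from "typescript";

// Made-up example file with three statements.
const oldText = "const a = 1;\nconst b = 2;\nconst c = 3;\n";
const oldFile = ts.createSourceFile("example.ts", oldText, ts.ScriptTarget.Latest, true);

// Pick the statement to remove, including its leading trivia.
const target = oldFile.statements[1];
const start = target.getFullStart();
const end = target.getEnd();

// Build the new text and describe the edit: `end - start` characters
// starting at `start` were replaced by 0 characters.
const newText = oldText.slice(0, start) + oldText.slice(end);
const change = ts.createTextChangeRange(ts.createTextSpan(start, end - start), 0);

// Let the incremental parser reuse whatever it can from the old tree.
// Assumption: the old tree should be treated as consumed after this call.
const newFile = ts.updateSourceFile(oldFile, newText, change);

console.log(newFile.statements.length);
console.log(JSON.stringify(newFile.text));
```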
Hey, we have an autogenerated file with over 200,000 lines of code, and removing nodes from this file has become impossible due to heap limit issues... :/ I tried forgetting nodes, batching node removal, etc., but nothing works. I'm thinking I will have to resort to straight up string manipulation :/
For more context, it doesn't fail outright. I iterate over various nodes (both while analyzing and after filtering them into an array), and after around 40 node removals the heap limit is reached. I also tried re-reading the file and project and reassigning them to the variable, hoping the JS garbage collector would clear some space, but it didn't work.
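For anyone considering the string manipulation route, here is a minimal sketch of one way to do it: collect the text ranges of the nodes to drop first (for example from each node's getPos() and getEnd() in ts-morph), then rebuild the text once. RemovalRange and removeRanges are hypothetical names, not part of ts-morph, and the sample text is made up.

```typescript
// Hypothetical shape: a half-open [start, end) range of characters to drop.
interface RemovalRange {
  start: number;
  end: number;
}

// Remove all ranges in one pass instead of editing the text once per node.
// Overlapping or out-of-order ranges are tolerated.
function removeRanges(text: string, ranges: readonly RemovalRange[]): string {
  const sorted = [...ranges].sort((a, b) => a.start - b.start);
  const kept: string[] = [];
  let cursor = 0;
  for (const { start, end } of sorted) {
    // Keep the text between the previous removal and this one.
    if (start > cursor) kept.push(text.slice(cursor, start));
    cursor = Math.max(cursor, end);
  }
  kept.push(text.slice(cursor));
  return kept.join("");
}

// Usage with made-up content: drop the first two statements.
const generated = "const a = 1;\nconst b = 2;\nconst c = 3;\n";
const trimmed = removeRanges(generated, [
  { start: 13, end: 26 },
  { start: 0, end: 13 },
]);
console.log(JSON.stringify(trimmed));
```

The trimmed text can then be written back or loaded into a fresh source file, so the file is only parsed once after all removals.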