Speed optimizations
NOTE: This branch is based on #17 and depends on it being merged first, thus the draft status.
We've had to deal with huge scripts (9.5 MB) that use lots of arithmetic, duplicated variables, etc. These scripts have millions of nodes, which proved really challenging for deobshell to handle. Initially, I stopped the process after multiple hours. I made quite a few changes to the code that brought the runtime down to ~17 minutes for 3 million nodes (excluding AST generation, which is also very resource- and time-intensive).
In particular:
- Generating the parent map gets very expensive when you have that many nodes. I changed it so it's generated only once and then updated whenever nodes are changed, replaced, or deleted. This complicates the logic slightly in places, but overall it's much, much faster.
- Iterating over the entire tree is expensive if you do it over and over, and if the order of optimization calls is not thought through, you get many useless iterations that change nothing at all in the end. I've reordered the functions to better suit the use case of "large arithmetic script". I don't know if there are other huge scripts that use different techniques - for those it might have to be re-evaluated or an entirely different approach for invoking optimizations would have to be chosen. For small scripts of a couple dozen kilobytes or so, it literally doesn't matter.
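To illustrate the parent map point, here is a minimal sketch of building the map once and patching it on replace/delete. It assumes the AST is an `xml.etree.ElementTree` tree; the helper names `build_parent_map`, `replace_node`, and `delete_node` are hypothetical and not necessarily what this branch uses.

```python
import xml.etree.ElementTree as ET


def build_parent_map(root):
    """Build the child -> parent mapping in a single pass over the tree."""
    return {child: parent for parent in root.iter() for child in parent}


def _forget_descendants(parent_map, subtree_root):
    """Drop map entries for every node below subtree_root."""
    for node in subtree_root.iter():
        for child in node:
            parent_map.pop(child, None)


def replace_node(parent_map, old, new):
    """Replace `old` with `new` and patch the map instead of rebuilding it."""
    parent = parent_map.pop(old)
    index = list(parent).index(old)
    # Forget the old subtree first, so that `new` may be a descendant of `old`.
    _forget_descendants(parent_map, old)
    parent[index] = new
    parent_map[new] = parent
    # Register everything below the new node.
    for node in new.iter():
        for child in node:
            parent_map[child] = node


def delete_node(parent_map, node):
    """Remove `node` from its parent and drop its subtree from the map."""
    parent = parent_map.pop(node)
    parent.remove(node)
    _forget_descendants(parent_map, node)
```

The trade-off is the one described above: every mutation has to go through helpers like these so the map never goes stale, but no pass over the full tree is needed after the initial build.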
Smaller/micro optimizations:
- Use `iter("node tag")` instead of tag comparisons in the loops. This pushes the comparison into native code, which is quite a bit faster.
- Replace `in` expressions with a single string on the right-hand side with `==`.
- Avoid list creation for `in` checks, use tuples instead.
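The three patterns above look roughly like this. This is an illustrative sketch, assuming an `xml.etree.ElementTree` tree; the sample XML, the tag names, and the function names are made up for the example and are not taken from the changeset.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample tree; the tag names are illustrative only.
root = ET.fromstring(
    "<Root>"
    "<BinaryExpressionAst/>"
    "<Other><BinaryExpressionAst/></Other>"
    "<ParenExpressionAst/>"
    "</Root>"
)

# Before: visit every node and compare the tag in Python code.
by_comparison = [n for n in root.iter() if n.tag == "BinaryExpressionAst"]

# After: pass the tag to iter() so ElementTree does the filtering.
by_iter = list(root.iter("BinaryExpressionAst"))


def is_binary_before(node):
    # Membership test against a one-element list.
    return node.tag in ["BinaryExpressionAst"]


def is_binary_after(node):
    # Same result with a plain equality check.
    return node.tag == "BinaryExpressionAst"


# A tuple defined once, instead of a list created at the call site.
WANTED_TAGS = ("BinaryExpressionAst", "ParenExpressionAst")


def is_wanted(node):
    return node.tag in WANTED_TAGS
```

Both forms select the same nodes in the same document order, so the change should be behavior-preserving; how much each one gains will depend on the interpreter and the tree size.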
I also added some more barewords and operators, along with a new test script.
I understand that this changeset is pretty big. If you review the commits, I recommend using "Hide whitespace", especially for "Code speed optimizations". I ran the updated code against all scripts in the data folder and there were no changes in the output.
If you wish, I can add a (malware) script that exercises these changes and ones I've made in the past. I have one that is 1.1 MB, so not outrageously large. I won't add its AST because that's over 40 MB, but the .deob.ps1 and deob.xml are OK. Let me know if you'd like me to do that.