fuzzilli icon indicating copy to clipboard operation
fuzzilli copied to clipboard

Improve loops in FuzzIL

Open saelo opened this issue 5 years ago • 5 comments

FuzzIL's representation of loops is oversimplified and cannot express the fact that more or less arbitrary computations can be performed in the loop header. This might, however, be interesting for things like JIT compiler fuzzing. As such it might make sense to split loops into constructs such as:

BeginLoopHeader
...
BeginLoopBody
...
EndLoop

However, the loop header has special requirements as only expressions are valid in there but not additional statements (so e.g. you cannot put more control flow operations into it). This would have to be expressed somehow (e.g. as a new context).

Additionally, it might make sense to define something like a "loop variable", which could be something as simple as a Phi that is emitted by the LoopHead. This would simplify the current code generators for loops.

saelo avatar Mar 21 '19 10:03 saelo

Thought: What optimizations trigger based off of the "ideal" for loop, something of the form "for(let i = 0; i < 10000; i++)"? I don't have a good handle on how v8's optimizations work under the hood, but I'd expect it can optimize off of both the comparison and increment being constant values.

The current setup uses variables for both the comparison and increment, but there could be value in generating code with hardcoded constants in each position to compare against. That pattern shows up frequently in bug POCs and regression tests, across all the major engines.

What do you think?

WilliamParks avatar Aug 21 '20 22:08 WilliamParks

Basically, JITing is just based on a simple counter: every unit of code (function or possibly loop body) has a counter counting how often it has been executed. Once the counter reaches a certain value, the code is JIT compiled. JSC and Spidermonkey work that way afaik. V8 does it slightly differently: there the JS execution stack is regularly inspected and the counter of every function on the stack is incremented (this approach probably favours long-running functions). In both cases, it's basically enough to just execute the function multiple times. This is easiest achieved through some form of loop, but the concrete form of that loop (a for loop, a while loop, etc.) shouldn't matter.

That said, it might still be interesting to have a JITCall instruction which behaves like a function call but executes the given function a specific number of time to force compilation. It's not clear if that would be superior to the current approach but could be something to experiment with.

saelo avatar Aug 26 '20 12:08 saelo

Do you have any further thought about the problem? it is not easy to compile JS with expressions in loop header to Fuzzil OP. Thanks!

oicu0619 avatar Aug 03 '21 08:08 oicu0619

So once https://github.com/googleprojectzero/fuzzilli/pull/256 lands, we should be able to properly implement loops, in particular loop headers.

For that, we first need to split what is currently the .script context into two new contexts, e.g. .statement and .expression, and use the two everywhere where we currently use .script.

Then, the general approach would be to split loops into three operations: BeginXLoopHeader, BeginXLoopBody, EndXLoop. The BeginXLoopHeader would open [.expression] context but not .statement. As such, everything that is an expression is allowed inside the header, but nothing else. The BeginXLoopBody would then open [.loop, .statement, .expression] context, and so statements etc. are allowed in the body.

Then, we'd probably just use the output of the last instruction in the loop header as loop condition (this basically happens automatically if we lift them as while (expr1, expr2, expr3) { ... }. (Note, there are some operations, e.g. StoreProperty, that don't have outputs in FuzzIL, but would still work here, e.g. while (a[42] = 1337) { ... }. Not sure if we want to allow that.)

One issue is that we can't define new variables inside the loop header. So if we don't find a good way to work around that, we might need to do something like that, which is probably not great...

let i = ...;
...
let loopV7, loopV8;
while (loopV7 = foo(), loopV8 = i++, i < loopV7) {
    ...;
}

Any thoughts welcome!

saelo avatar Sep 19 '21 15:09 saelo

(Note, there are some operations, e.g. StoreProperty, that don't have outputs in FuzzIL, but would still work here, e.g. while (a[42] = 1337) { ... }. Not sure if we want to allow that.)

Yes definitely a good idea to have this ability. We should also allow inlining of Unary Operations. For example we should now be able to compile the following programs to FuzzIL without having to resort to workarounds that require generating temp variables to store intermediate values. Example of loop with UpdateExpression:

let i = 10;
while(--i) { <-- UpdateExpression
    print(i)
}

and an example of a loop that uses MemberExpressions:

let arr = [1,2,3,4]
while(--arr.length) { <-- Unary operator applied to a MemberExpression
    print(arr)
}

amarekano avatar Oct 11 '21 17:10 amarekano

This is now fixed with https://github.com/googleprojectzero/fuzzilli/commit/dd75a782b14d111a37021faf10f5a3448d63a7e6 and https://github.com/googleprojectzero/fuzzilli/commit/c76ea0edfb5a59e0e762a2c1a01351f4604b3935

saelo avatar Mar 15 '23 12:03 saelo