Differential fuzzing of Neo Smart-Contract VMs (including neo-go)
Recently, I stumbled upon the LibAFL paper, which includes a reimplementation of another fuzzer called NeoDiff. The goal is to mutate smart contract bytecode and look for differences in the resulting VM state, which could lead to chain splits. The original work targeted the Neo v2 Python and C# implementations (though it seems the LibAFL version only fuzzed EVM implementations, 'go-ethereum' and 'openethereum', and not Neo, unlike the original work).
So the suggestion is to build a differential fuzzer for Neo v3 VMs and try to find (potentially harmful) behavior differences between the current VM implementations: 'Neo' (C#) / 'neo-go' (Go) / 'mamba' (Python).
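To illustrate the core idea, here is a minimal sketch of a differential check in Go. The two "VMs" are toy stack machines with invented opcodes (0x01 = push next byte, 0x02 = add), not real Neo VM implementations; the point is only the NeoDiff-style technique of hashing the resulting state and comparing the hashes across implementations:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Two toy stack-machine interpreters standing in for the C# and Go VMs.
// Opcodes are invented for illustration: 0x01 = push next byte, 0x02 = add.
func runA(code []byte) []int {
	var st []int
	for i := 0; i < len(code); i++ {
		switch code[i] {
		case 0x01:
			if i+1 < len(code) {
				i++
				st = append(st, int(code[i]))
			}
		case 0x02:
			if len(st) >= 2 {
				a, b := st[len(st)-1], st[len(st)-2]
				st = st[:len(st)-2]
				st = append(st, a+b)
			}
		}
	}
	return st
}

// runB has a deliberate divergence: its add wraps at 256, emulating an
// implementation that disagrees only on overflow edge cases.
func runB(code []byte) []int {
	var st []int
	for i := 0; i < len(code); i++ {
		switch code[i] {
		case 0x01:
			if i+1 < len(code) {
				i++
				st = append(st, int(code[i]))
			}
		case 0x02:
			if len(st) >= 2 {
				a, b := st[len(st)-1], st[len(st)-2]
				st = st[:len(st)-2]
				st = append(st, (a+b)%256)
			}
		}
	}
	return st
}

// stateHash condenses the final stack into a comparable fingerprint,
// the same trick NeoDiff applies to the full VM state.
func stateHash(st []int) string {
	h := sha256.New()
	for _, v := range st {
		fmt.Fprintf(h, "%d,", v)
	}
	return hex.EncodeToString(h.Sum(nil))
}

// diverges reports whether the two implementations disagree on this input.
func diverges(code []byte) bool {
	return stateHash(runA(code)) != stateHash(runB(code))
}

func main() {
	same := []byte{0x01, 1, 0x01, 2, 0x02}      // 1+2 = 3 in both VMs
	split := []byte{0x01, 200, 0x01, 100, 0x02} // 300 vs 300%256 = 44
	fmt.Println(diverges(same), diverges(split))
}
```

A real harness would of course drive the actual VMs as subprocesses and hash much richer state (stack, notifications, fault state), but the comparison loop stays this simple.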
I haven't started working on this yet, wanted to get some opinions.
Such research would be helpful for us, especially for the Go/C# implementations and various edge cases. We have a set of VM compatibility tests against the C# node (integrated as a VM git submodule), and currently there are no known incompatibility issues between our VMs, but maybe you'll manage to find something.
In general, some unexpected bugs may be found by fuzzing, so vote up from my side for the proposed experiment.
Upd: I've made a repo https://github.com/Slava0135/N3onDiff with some setup. It uses a custom harness for executing tests (the vm-harness branch in both neo and neo-go). Now I want to use LibAFL for the actual fuzz testing, but this can take a while.
Also, I took a look at Mamba and Neon: they don't have their own VM implementations, so we'll stick with C# and Go.
Found 1 issue so far: #3598
> Also, I took a look at Mamba and Neon - they don't have VM implementation, so we just stick with C# and Go.
FYI: mamba used to have a VM, but the project pivoted to a light SDK. Neon never had one.
Looking forward to what you'll find :)
Found 2 more issues: #3612 #3613
I've been fuzzing the VM again, with the fixes merged. Haven't found anything in 100M executions (4 days), except for #3701, which is not a bug (I guess).
That doesn't mean there are no bugs left, obviously. I would need to do smarter things to be more confident (specifically, generating bytecode-aware inputs; it's hard to generate valid JMP instructions right now).
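One cheap way to make inputs bytecode-aware is a post-mutation fixup pass that rewrites out-of-range jump targets instead of discarding the input. A minimal sketch (assuming the N3 JMP opcode 0x22 with a signed 8-bit offset relative to the opcode position; a real pass would also have to track instruction boundaries and handle JMP_L, JMPIF, TRY, etc.):

```go
package main

import "fmt"

const opJMP = 0x22 // assumed: Neo N3 JMP, signed 8-bit relative offset

// fixJumps is a post-mutation pass: wherever the randomly mutated buffer
// contains a JMP, clamp its operand so the target stays inside the script.
// This sketch only keeps targets in range (it does not align them to
// instruction boundaries), which is enough to stop the VM from faulting
// on every jump and lets the fuzzer reach code behind branches.
func fixJumps(code []byte) []byte {
	out := append([]byte(nil), code...)
	for i := 0; i+1 < len(out); i++ {
		if out[i] != opJMP {
			continue
		}
		off := int(int8(out[i+1]))
		target := i + off // offset relative to the opcode position (assumed)
		if target < 0 || target >= len(out) {
			out[i+1] = 2 // rewrite to fall through to the next instruction
		}
		i++ // skip the operand byte
	}
	return out
}

func main() {
	raw := []byte{opJMP, 0x80, 0x40, opJMP, 0x7F} // both targets out of range
	fmt.Printf("% x\n", fixJumps(raw))
}
```

LibAFL supports plugging this in as a custom mutator stage, so the fixup runs on every candidate before execution.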
> Haven't found anything in 100M executions
Do you have any estimation of how much code has been covered? JMP is hard, but TRY is even harder while being trickier in the implementation.
> Haven't found anything in 100M executions
> Do you have any estimation of how much code has been covered? JMP is hard, but TRY is even harder while being trickier in the implementation.
Good question. I haven't implemented an option to save full coverage (yet), but it should be trivial enough.
Should definitely try it later.
Ran it again, with coverage saved:
Commit: 2b1b9a4fcaaf8181fa8ba532f2897ce5e928df78
profile.txt
HTML (remove .txt): cover.html.txt
Don't know how to get the profile in count mode from the `covdata` tool, would be nicer to have.
> Don't know how to get profile in count mode from `covdata` tool, would be nicer to have
I guess I should have compiled the harness with `go build -cover -covermode=count`.
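For reference, the full count-mode flow with the Go 1.20+ coverage toolchain looks roughly like this (the `./cmd/harness` path and input file are placeholders, not the actual repo layout):

```shell
# Build the harness with counter-mode coverage instrumentation (Go 1.20+).
go build -cover -covermode=count -o harness ./cmd/harness

# Instrumented binaries write raw counter files into GOCOVERDIR on exit.
mkdir -p covdata
GOCOVERDIR=covdata ./harness < corpus/input.bin

# Convert raw counters into a text profile; in count mode each line
# carries an execution count instead of a 0/1 flag.
go tool covdata textfmt -i=covdata -o=profile.txt
go tool cover -html=profile.txt -o cover.html
```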
Seems to be done; the issues found have been closed.