unicorn
unicorn copied to clipboard
Optimize memory handling
By using the memory snapshots with real examples we found some bottlenecks. Therefor I have implemented some optimizations.
We found that the flatview creation is quite expensive (around 1/4 of all runtime including emulation) when you have a lot of memory regions. By permanently creating new regions with CoW we have a lot of regions and rebuild for each cow the flatview again. I have added a update function does only change affected ranges.
Also we found that the find_ram_offset_last optimization doesn't work after a restore. To fix this the context now includes also the ramblock_freed bool.
This also includes two smaller optimizations: only clear the affected tlb on memory_cow and save the last ramblock of the ram_list
It's currently only a draft, because we need to do a few more tests to see if the optimization works as expected.
I have additional added the flatview to the context. This way flatview doesn't need to be rebuild on context_restore.
I have tested and profiled this. It does in fact save us relevant time.
@wtdcode have you overseen this PR? It would be nice to get some feedback.
@wtdcode have you overseen this PR? It would be nice to get some feedback.
Sorry for not leaving a comment. I had a quick review on it and didn't find any obvious problems. I planned to have further debugging to have a better understanding before commenting further.
I ported d01035767e47f69b4e545d1843dbaf08e6a74752 which generally allows unicorn to go fast qemu_st path (not going through store_helper again) to have better writing performance. This shall have side effects on your memory_cow which relies on store_helper.
I have twinkled a bit on the code to ensure that when snapshot is enabled, no TLB is marked clean. However, I'm not sure if that works for you, you may have a look.
Regarding this PR, I locally tested and think it is good to go if you don't have more to add.
Thanks for the update. As far as I understand this patch allows faster writes, if the tlb cache entry exists. So for snapshots it might be enough to clear the tlb cache. This way the store_helper would be called again and everything works as expected. I'll test this.
think it is good to go if you don't have more to add
I would like to rebase this to the dev branch to remove the "fixup" commits. Also because I haven't looked at this for a few weeks, I would like to test this again with the current HEAD. I'll ping you when I'm finished testing.
@wtdcode I have tested the code again and removed the fixup commits so for me this is ready to merge.
For the notdirty write I'm working on a optimization, but I'll open another PR for it.
@wtdcode I have tested the code again and removed the fixup commits so for me this is ready to merge.
For the notdirty write I'm working on a optimization, but I'll open another PR for it.
Will check today or tomorrow, thanks for your work!
Okay, I finished the review, and it's safe to go now with minor fixes and a few document updates. Thanks again for your brilliant work.
I have added the last fixes and rebased it to the dev branch.
The failed test doesn't look like it's a bug by my changes.
I have added the last fixes and rebased it to the dev branch.
The failed test doesn't look like it's a bug by my changes.
I will rerun and it seems Github Action bug.
Thanks!