Just in time OBJ/IR Caching

Open skmp opened this issue 3 years ago • 13 comments

Overview

Caches IR and OBJ as it gets compiled, in a multithreaded and multi process safe way.

This started as a cleanup of the loading interface and, as such, has a lot of cruft. I then cherry-picked changes from #1548.

Reading from the caches takes a shared_lock, and that's it.

Writing to the caches takes a unique_lock, and also an fcntl advisory lock when the index or the data files are resized.
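A minimal sketch of the locking scheme described above, assuming a `std::shared_mutex` for the in-process lock and a whole-file `fcntl()` write lock for cross-process exclusion during resizes. The class `CacheFileGuard` and its method names are hypothetical and are not taken from the PR.

```cpp
#include <cassert>
#include <cstdio>
#include <fcntl.h>
#include <mutex>
#include <shared_mutex>
#include <unistd.h>

// Hypothetical wrapper around one cache file; names are illustrative only.
class CacheFileGuard {
public:
  explicit CacheFileGuard(int fd) : Fd(fd) {}

  // Readers only take the in-process shared lock.
  template <typename Fn>
  auto Read(Fn&& fn) {
    std::shared_lock<std::shared_mutex> lk(Mutex);
    return fn();
  }

  // Writers take the exclusive in-process lock. For a resize they also hold
  // a whole-file fcntl() advisory write lock, so that another process does
  // not resize the same file at the same time.
  template <typename Fn>
  bool WriteWithResize(Fn&& fn) {
    std::unique_lock<std::shared_mutex> lk(Mutex);
    if (!SetFileLock(F_WRLCK)) return false;
    fn();
    return SetFileLock(F_UNLCK);
  }

private:
  bool SetFileLock(short type) {
    struct flock fl {};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;  // 0 means "up to end of file", i.e. the whole file
    return fcntl(Fd, F_SETLKW, &fl) == 0;  // blocks until the lock is granted
  }

  int Fd;
  std::shared_mutex Mutex;
};
```

Note that `fcntl()` locks are advisory and per-process, which is why the in-process mutex is still needed for threads of the same process.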

Index files are re-mapped as they grow; they grow in 64 KB chunks.

Data files are mapped in 'chunks' of up to 16 MB.
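The sizing arithmetic implied by those two rules can be sketched as follows. The constants come from the description above; the helper names (`IndexFileSizeFor`, `LocateInDataFile`) are hypothetical, and the sketch assumes binary units (64 KiB, 16 MiB) and fixed-size data chunks.

```cpp
#include <cassert>
#include <cstdint>

using u64 = std::uint64_t;

constexpr u64 kIndexGrowChunk = 64 * 1024;        // index grows in 64 KB steps
constexpr u64 kDataMapChunk = 16 * 1024 * 1024;   // data is mapped in chunks of up to 16 MB

// Smallest multiple of the 64 KB growth step that can hold `needed` bytes.
constexpr u64 IndexFileSizeFor(u64 needed) {
  return (needed + kIndexGrowChunk - 1) / kIndexGrowChunk * kIndexGrowChunk;
}

// Which data chunk a file offset falls into, and the offset inside that chunk.
struct ChunkPos {
  u64 Chunk;
  u64 Offset;
};

constexpr ChunkPos LocateInDataFile(u64 file_offset) {
  return {file_offset / kDataMapChunk, file_offset % kDataMapChunk};
}
```

Growing in fixed steps keeps the number of re-maps low, and mapping the data file in bounded chunks avoids having to re-map one large region every time the file grows.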

Settings have been cleaned up. AOTIRLoad and AOTIRCapture have been merged to IRCache. AOTIRGenerate has been completely removed, and a separate executable, FEXAOTGen is used for that.

To enable caching, use

"IRCache": "auto"/"disabled"/"read"/"write"/"readwrite",
"ObjCache": "auto"/"disabled"/"read"/"write"/"readwrite"
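As a rough illustration of how the five setting strings could be mapped to an internal mode, here is a hypothetical parser. The enum and function names are assumptions, not FEX's actual configuration code, and the thread does not say what "auto" resolves to, so the sketch leaves that to the caller.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical enum mirroring the five documented values.
enum class CacheMode { Auto, Disabled, Read, Write, ReadWrite };

// Returns std::nullopt for any string that is not one of the documented values.
// What "auto" ultimately resolves to is not stated in the thread, so it is
// kept as its own value here.
std::optional<CacheMode> ParseCacheMode(std::string_view s) {
  if (s == "auto") return CacheMode::Auto;
  if (s == "disabled") return CacheMode::Disabled;
  if (s == "read") return CacheMode::Read;
  if (s == "write") return CacheMode::Write;
  if (s == "readwrite") return CacheMode::ReadWrite;
  return std::nullopt;
}
```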

Todo

  • [x] Rework in a generic index file
  • [x] Cleanup & Implement configuration
  • [x] Cleanup object cache / debug changes
  • [x] See what broke tests
  • [x] Update Summary
  • [ ] Investigate glmark2 rare crash
  • [ ] gimp also crashes

Follow Up

  • [ ] Investigate adding some form of balancing to the BSTs?
  • [ ] Investigate adding a sorted array mode if the BSTs are not dirty

skmp avatar Jul 10 '22 08:07 skmp

Among other things, this also hashes the actual block ranges, not (start, end). It also limits the blocks to a single vma mapping, so we don't compile code from multiple files into one cached block.
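To illustrate the difference between hashing the actual ranges and hashing everything between (start, end), here is a small sketch. FNV-1a is used purely as a stand-in, since the thread does not say which hash function is used, and `CodeRange` / `HashBlockRanges` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One contiguous run of guest code bytes that belongs to a block.
struct CodeRange {
  const std::uint8_t* Start;
  std::size_t Length;
};

// Hashes only the bytes inside the listed ranges, in order.
// FNV-1a is a stand-in here, not necessarily the hash used by the PR.
std::uint64_t HashBlockRanges(const std::vector<CodeRange>& ranges) {
  std::uint64_t h = 0xcbf29ce484222325ULL;  // FNV-1a 64-bit offset basis
  for (const auto& r : ranges) {
    for (std::size_t i = 0; i < r.Length; ++i) {
      h ^= r.Start[i];
      h *= 0x100000001b3ULL;  // FNV-1a 64-bit prime
    }
  }
  return h;
}
```

With a block made of two ranges separated by a gap, a change to the bytes in the gap leaves the hash untouched, while a change inside either range alters it. A hash over the whole (start, end) span would be invalidated by both.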

Implements #798 with a brute force approach

skmp avatar Jul 10 '22 08:07 skmp

Also, it inserts ranges for cached blocks for SMC invalidation, though only after hashing them (Follow up: #1963)
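One way to picture the range tracking used for SMC invalidation is an ordered map of code ranges that is queried on guest writes. This is only a sketch under the assumption that tracked ranges do not overlap each other; `CodeRangeTracker` is a hypothetical name and not the data structure used in the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical tracker: maps the start of each guest code range to its
// exclusive end. Assumes tracked ranges do not overlap each other.
class CodeRangeTracker {
public:
  void Insert(std::uint64_t start, std::uint64_t end) {
    auto& stored_end = Ranges[start];
    stored_end = std::max(stored_end, end);
  }

  // True if a write to [addr, addr + len), with len > 0, touches a tracked range.
  bool WriteHitsCode(std::uint64_t addr, std::uint64_t len) const {
    // First range that starts at or after the end of the write.
    auto it = Ranges.lower_bound(addr + len);
    if (it == Ranges.begin()) return false;
    --it;  // last range that starts before the end of the write
    return it->second > addr;
  }

private:
  std::map<std::uint64_t, std::uint64_t> Ranges;
};
```

A write that hits a tracked range would then trigger invalidation of the blocks compiled from it.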

skmp avatar Jul 10 '22 08:07 skmp

(Refactor broke this branch, will fix tomorrow)

skmp avatar Jul 11 '22 00:07 skmp

Steam is stable again :)

skmp avatar Jul 11 '22 01:07 skmp

(now with OBJ support for x86 and arm64, always enabled for now)

skmp avatar Jul 27 '22 04:07 skmp

This has been rebased on top of main, and most bugs have been fixed. I plan to spend tomorrow on cleanups as well.

Would be good to get some feedback as-is @Sonicadvance1 @neobrain ~

skmp avatar Jul 27 '22 23:07 skmp

Would be nice if this passed CI before reviewing thoroughly.

Sonicadvance1 avatar Jul 28 '22 00:07 Sonicadvance1

Would be nice if this passed CI before reviewing thoroughly.

The CI failures are unrelated to the main logic; the initialization for the TestHarness just hasn't been written yet. I'll be pushing a fix in a bit.

skmp avatar Jul 28 '22 16:07 skmp

This has been cleaned up, with most dead/stale code removed.

Follow up

  • [x] Fully postfix cache folders with compilation modal options

skmp avatar Jul 28 '22 19:07 skmp

  • [x] Make AOTGenerate a fexloader-only option

skmp avatar Jul 28 '22 19:07 skmp

  • [x] Fix FEXUpdateIRCache.sh, add FEXUpdateObjCache.sh, cleanup scripts?

skmp avatar Jul 28 '22 19:07 skmp

  • [x] Compact relocations to actual size

Follow Up:

  • [x] Generate relocations in relocation pools, import cached files directly to code cache (#1939, #1332)

skmp avatar Jul 28 '22 19:07 skmp

All tasks completed on this? Merge conflict is still here.

Sonicadvance1 avatar Aug 10 '22 11:08 Sonicadvance1

(This has been squashed and rebased to main)

skmp avatar Aug 22 '22 12:08 skmp

After much, much debugging and fixing several other bugs, it looks like this is a race condition that goes away when code isn't compiled 'too fast'.

So far I've verified that it is not

  • Bad code stored in cache
  • Bad code read from cache
  • Cache index/data corruption

skmp avatar Aug 29 '22 05:08 skmp

@Sonicadvance1 thoughts on disabling/ignoring the Visual Debugger for now? It is largely broken at this point; all of the APIs I've disabled here were already broken before.

Otherwise, I'll do another bug hunting spree for this tonight, and if the bug is not found I think it's best to merge and keep experimental til 2210.

skmp avatar Aug 29 '22 18:08 skmp

Bug fixed, ranges were not serialized correctly.

Follow Up

  • Performance drops a little with that, though we can work around it in different ways in the future, such as whole page hashing (#1961)

Now investigating the interpreter failures.

skmp avatar Aug 29 '22 21:08 skmp

With the last round of fixes this passes all asm tests for me locally.

I'll take another look tomorrow to clean things up and get everything ready for merge.

IR tests will have to be updated for new OP_BREAK semantics

skmp avatar Aug 29 '22 21:08 skmp

Looks like all the fixes have managed to round up smc issues outlined in #1754 at last.

Based on my testing, parallels/m1 is ~ 60% likely to fail on that test, with stale code running forever.

Oddly enough, it doesn't repro in orion

skmp avatar Aug 30 '22 18:08 skmp

The smc issues don't repro with gdb attached, so likely yet another race somewhere between multi threaded invalidation and translation.

Also, the smc test crashes with objc enabled, possibly with irc as well. While not a blocker for now, possibly follow up?

skmp avatar Aug 31 '22 08:08 skmp

Hmm, looks like that resolved the SMC issue, though we got another spurious failure in pthread_cancel in the 8.4 runner. I've seen this one before, though I haven't investigated it. Logged as follow up for https://github.com/FEX-Emu/FEX/issues/1754#issuecomment-1232677973 and will re-run CI here.

skmp avatar Aug 31 '22 09:08 skmp

Also, the smc test crashes with objc enabled, possibly with irc as well. While not a blocker for now, possibly follow up?

Generated #1958

skmp avatar Aug 31 '22 09:08 skmp

Some performance numbers from Orion

No Cache

skmp@ornio:~/projects/FEX/build$ time Bin/FEXLoader /bin/ls > /dev/null

real	0m0,277s
user	0m0,249s
sys	0m0,025s

OBJCache

skmp@ornio:~/projects/FEX/build$ time Bin/FEXLoader /bin/ls > /dev/null
[Info] Warning: OBJ/IR Caches are experimental, and might lead to crashes.

real	0m0,029s
user	0m0,012s
sys	0m0,015s

Native

skmp@ornio:~/projects/FEX/build$ time /bin/ls > /dev/null

real	0m0,007s
user	0m0,000s
sys	0m0,007s

Best OBJCache result so far

skmp@ornio:~/projects/FEX/build$ time Bin/FEXLoader /bin/true > /dev/null
[Info] Warning: OBJ/IR Caches are experimental, and might lead to crashes.

real	0m0,014s
user	0m0,009s
sys	0m0,005s

skmp avatar Aug 31 '22 14:08 skmp

(Closing this as there is an ongoing powergrab by @Sonicadvance1, I will migrate my work to https://github.com/skmp/fex-emu-ng.git)

skmp avatar Sep 01 '22 14:09 skmp