n64: implement accurate timer interrupt in recompiler
Previously, the timer interrupt was only being checked at the end of each recompiler blocks, so each interrupt was potentially delayed until next branch. Now, it is checked as part of CPU::step so it is cycle-accurate. If the timer interrupt is triggered, the recompiler is early-exited so that the interrupt can actually occur precisely at the right moment.
On top of that, the previous code had also a overflow-checking bug, so it failed to trigger an interrupt when executing a block in which the counter overflown and compare was reached at the same time.
Nice one! Thank you 😊
Yesterday I got confused about all the cycles shift. I revisited the code today and realized that it's still necessary to keep a 33-bit compare / count now that we increment count for each cycle (as count counts half-cycles).
I changed the implementation to do the check in CPU::step so that it should be cycle-accurate in any case, including stalls.
I happened across another bit of code that further complicates things. The recompiler can emit code that directly modifies the clock counter: https://github.com/higan-emu/ares/blob/ed4d21d7a35ee1f1e3e4ba44f40781d2026bffb3/ares/n64/cpu/recompiler.cpp#L39-L44 The existing implementation handles these cycles in synchronize(), but the code in this PR currently does not.
I've pushed a commit that switches to doing the timer interrupt check only in instructionEpilogue as requested by the review. I can't personally vet it's correct. It does pass the libdragon testsuite which includes a stress test of the timer interrupt, but:
- I can't personally vet that all cases where an instruction causes the timer interrupt to occur now triggers. For instance, I don't know if there are (or there will be in the future) calls to
Cpu::stepbetweeninstructionEpilogue()andsynchronize(), which would make the thread clock advance "too late". We could enforce this by making sure thatprevCountis equal toThread::clockwhensynchronizeis called. I can add an assert there, though there are no other asserts in the code base (?). - There are surely race conditions if a timer interrupt occurs mid-instruction (eg: during RDRAM stalls on instruction fetch) on instructions that modify
countorcompare(MTC0). This could be fixed with a call tocheckTimerInterruptin the right places, but I don't have good tests for this.
In general, I feel this version is harder to verify that it is correct and possibly more fragile to future modifications of the code base. I can't deny that it must be faster than the previous one, though I don't know how much and I don't have means to do correct timings.
I happened across another bit of code that further complicates things. The recompiler can emit code that directly modifies the clock counter:
https://github.com/higan-emu/ares/blob/ed4d21d7a35ee1f1e3e4ba44f40781d2026bffb3/ares/n64/cpu/recompiler.cpp#L39-L44
The existing implementation handles these cycles in synchronize(), but the code in this PR currently does not.
We can fix this by changing the idle skip to a call to Cpu::step(64). I can do that if we decide to go with the first version of the PR