GBADotnet

Improve prefetch unit until it passes all mgba timing tests

Open DaveTCode opened this issue 2 years ago • 12 comments

DaveTCode avatar Mar 20 '22 20:03 DaveTCode

image

1234/2020 passing as of today, with timers working fairly well. The IWRAM tests pass every time, so the instructions themselves are about right. The remaining failures are likely down to the prefetch unit, plus other issues that I don't know about yet.

DaveTCode avatar Apr 01 '22 16:04 DaveTCode

The fixes that make DMA share wait states with the CPU have massively broken the DMA tests:

image

Needs investigation.

DaveTCode avatar Apr 01 '22 16:04 DaveTCode

image

I've fiddled around with this a bunch tonight and got to 1341 (I've had it higher, but probably by accident). Specifically, the standard NOP tests now all time correctly with prefetch enabled (they shouldn't activate it, because there are no spare bus cycles).

The LDRH tests come out two cycles high in the prefetch-enabled variants, and that offset is fairly consistent across them.

image

DaveTCode avatar Apr 11 '22 20:04 DaveTCode

image

The key learning here that fixes prefetch is that prefetching is always sequential (obviously) and pays no attention to the state of the SEQ signal from the CPU if a prefetch is already occurring. Not at all surprising but needed fixing nonetheless.
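As a rough illustration of that rule, here is a minimal Python sketch rather than the emulator's actual code. The class name, the wait-state values, and the assumption that the first access of a burst follows the CPU's SEQ signal are all hypothetical; only the "ignore SEQ while a prefetch is already running" behaviour comes from the comment above.

```python
from dataclasses import dataclass


@dataclass
class PrefetchUnit:
    """Toy model; every name and wait-state value here is an illustrative assumption."""

    seq_wait: int = 1      # hypothetical S wait states for the ROM region
    nonseq_wait: int = 3   # hypothetical N wait states for the ROM region
    active: bool = False   # True while a prefetch burst is in progress

    def fetch_cycles(self, cpu_seq: bool) -> int:
        """Return the cycle cost of the next ROM halfword fetch."""
        if self.active:
            # A prefetch is already occurring: always use sequential timing,
            # whatever the CPU's SEQ signal says.
            return 1 + self.seq_wait
        # Assumption: the access that starts a burst follows the CPU's SEQ signal.
        self.active = True
        return 1 + (self.seq_wait if cpu_seq else self.nonseq_wait)

    def stop(self) -> None:
        """End the current burst, e.g. after a branch."""
        self.active = False
```

With these made-up wait states, the first fetch of a burst with SEQ low costs the non-sequential price, and every later fetch in the same burst costs the sequential price even if SEQ stays low.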

The remaining issues can be grouped as:

  1. Thumb multiplication (and the follow-on Thumb BIOS timings) - I suspect a straightforward bug in the Thumb MUL's use of SEQ.
  2. DMA - screw DMA timings
  3. LDMIA across boundary of ROM - edge case which I would have expected to fall out but definitely doesn't

DaveTCode avatar Apr 12 '22 15:04 DaveTCode

image

That's an example of the DMA timings.

DaveTCode avatar Apr 12 '22 15:04 DaveTCode

image

The multiplication failures imply that the timing is off entirely, and that it has nothing to do with prefetch, but only for Thumb. Which is odd, since Thumb and ARM use the same code!

DaveTCode avatar Apr 12 '22 15:04 DaveTCode

Multiplication issues are resolved, they were caused by incorrect operand ordering for masks.
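For context, the masks in question are the kind used to work out how many internal cycles a multiply takes. The sketch below shows the usual ARM7TDMI rule for MUL in Python; it is not the code from this repository, and the function name is hypothetical. Which register supplies the operand in a given ARM or Thumb encoding is deliberately left to the caller, since that is exactly the sort of detail that is easy to get backwards.

```python
def multiply_internal_cycles(multiplier: int) -> int:
    """Internal (I) cycles for MUL, derived from the multiplier operand.

    The multiply terminates early when the upper bits of the operand are
    all zeros or all ones:
      bits 31..8  -> 1 cycle
      bits 31..16 -> 2 cycles
      bits 31..24 -> 3 cycles
      otherwise   -> 4 cycles
    """
    multiplier &= 0xFFFFFFFF  # treat the operand as a 32-bit register value
    for cycles, mask in ((1, 0xFFFFFF00), (2, 0xFFFF0000), (3, 0xFF000000)):
        upper = multiplier & mask
        if upper == 0 or upper == mask:
            return cycles
    return 4
```

Checking the masks from widest to narrowest matters here: testing them in the opposite order would report 3 cycles for small operands that should only take 1.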

BIOS calls from Thumb are still not resolved, though, so presumably something goes wrong when switching from Thumb to ARM and back.

DaveTCode avatar Apr 12 '22 15:04 DaveTCode

Fixed the BIOS timings by clearing the prefetch unit whenever the pipeline is cleared.
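In outline, the fix ties the prefetch state to the pipeline flush. A minimal Python sketch of that idea follows; the class and method names are hypothetical and this is not the emulator's actual code.

```python
class Prefetcher:
    """Hypothetical prefetch buffer holding halfwords fetched ahead of the CPU."""

    def __init__(self) -> None:
        self.buffer: list[int] = []
        self.next_address = 0
        self.active = False

    def reset(self, address: int) -> None:
        """Drop everything fetched so far and restart from the given address."""
        self.buffer.clear()
        self.next_address = address
        self.active = False


class Cpu:
    """Hypothetical CPU shell showing only the flush path."""

    def __init__(self) -> None:
        self.pipeline: list[int] = []
        self.prefetcher = Prefetcher()

    def flush_pipeline(self, new_pc: int) -> None:
        """Called on branches, mode switches and SWIs."""
        self.pipeline.clear()
        # The stale prefetched halfwords belong to the old instruction stream,
        # so discard them together with the pipeline.
        self.prefetcher.reset(new_pc)
```

Keeping the reset inside `flush_pipeline` means every path that clears the pipeline also clears the prefetch unit, rather than relying on each caller to remember.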

DaveTCode avatar Apr 12 '22 15:04 DaveTCode

image

118 tests left to go. I haven't counted, but I think the two cases left are:

  1. LDMIA across the ROM boundary (maybe just a case of checking when SEQ is set during LDMIA? Or possibly the order the LDMIA loads happen in, although I'd have hoped that was already right)
  2. DMA - likely two issues here:
    • It should take 2 cycles in certain cases (particularly around IWRAM?) - egregious timing differences!
    • At various times I take fewer cycles than expected, but it's hard to pin down exactly when.

DaveTCode avatar Apr 12 '22 15:04 DaveTCode

The large number of "2 cycle" failures arise because the CPU should have managed to read the timer value before getting blocked by DMA: the code is in IWRAM, so it takes very little time to run.

Trouble is that I thought that I'd nailed down the exact blocking behaviour for other tests!

DaveTCode avatar Apr 17 '22 20:04 DaveTCode

The rough plan to solve the DMA/CPU issue is to properly emulate the bus owner at any given cycle. That means a new flag on the bus, with DMA setting and unsetting it depending on what state it's in. I'm not doing it now because it's late, but at worst it's a slightly indirect way of writing the same code I have now. Hopefully it fixes this without breaking any other tests!
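A minimal Python sketch of what such a bus-owner flag could look like is below. Everything in it is an assumption made for illustration: the names are hypothetical, DMA moves one unit per cycle with no wait states, and every CPU step is treated as needing the bus. It only shows the shape of the idea, not the real blocking rules.

```python
from enum import Enum, auto


class BusOwner(Enum):
    CPU = auto()
    DMA = auto()


class Bus:
    def __init__(self) -> None:
        self.owner = BusOwner.CPU  # the new flag: who drives the bus this cycle


class Dma:
    def __init__(self, bus: Bus) -> None:
        self.bus = bus
        self.remaining = 0

    def start(self, units: int) -> None:
        """Claim the bus when a transfer begins."""
        if units > 0:
            self.remaining = units
            self.bus.owner = BusOwner.DMA

    def step(self) -> None:
        """Advance one cycle and hand the bus back after the last unit."""
        if self.bus.owner is not BusOwner.DMA:
            return
        self.remaining -= 1
        if self.remaining == 0:
            self.bus.owner = BusOwner.CPU


class Cpu:
    def __init__(self, bus: Bus) -> None:
        self.bus = bus
        self.stalled_cycles = 0
        self.accesses = 0

    def step(self) -> None:
        """Stall while DMA owns the bus, otherwise perform one access."""
        if self.bus.owner is BusOwner.DMA:
            self.stalled_cycles += 1
            return
        self.accesses += 1
```

Stepping the CPU and then the DMA once per cycle makes the CPU stall for exactly as many cycles as the DMA holds the flag, which is the property the timing tests would then be checking.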

DaveTCode avatar Apr 17 '22 20:04 DaveTCode

image

Hot damn was that ever really tedious to figure out.

DaveTCode avatar Apr 18 '22 20:04 DaveTCode