ostis
Clocked CPU
@stefanberndtsson is planning to convert the CPU component into a clocked design.
Would it be possible to jot down a few notes here every now and then to get a feel for what's going to happen?
Would it be possible to have the old implementation in parallel with the new, as a compile-time option?
Yes, jotting will be done. :)
As for having a compile option, not so sure. Even the first iteration, which should be fairly kind to the structure still requires changing at least one actual instruction. That would mean pretty much keeping separate compilable instances of cpu_run(), cpu_step_instr(), cpu_do_cycle() and any changed instructions. I'm not sure this is manageable for more than one or two instructions, unless the whole cpu/* is preserved and clocked cpu is done alongside it in a new directory of course.
Yes, I was thinking new directory.
Ok, I'll start that way. I won't maintain two different models forever though :)
Ok, plans (everything based off the same 8MHz clock):
First iteration
- Create a new cpu_clock() which will do single clock cycles for the cpu.
- Each clock will run the glue/mmu/shifter_clock functions that run from cpu_do_cycle() today
- Each clock will run a new version of cpu_step_instr() that does the following:
  - If cpu->icycle is larger than 0, do nothing, just return; the previous instruction hasn't "finished" yet.
  - Fetch instruction (or use a previously prefetched one) if the previous instruction has consumed all its cycles
  - Run the new instruction (which is the exact same code as today). This instruction will run everything on the first clock cycle, but will report how many cycles there should be in total (cpu->icycle) before the next one is to be run
  - Handle exceptions if at the appropriate cycle
  - Decrement cpu->icycle
  - Return from function
- Rewrite movem to work with the above scheme (it does not give a proper cpu->icycle, all other instructions do)
This should take care of the basic bits, and the next code iteration can begin
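A rough sketch of how the first iteration could look, just to make the list above concrete. Only cpu_clock(), cpu_step_instr() and cpu->icycle come from the plan; the stripped-down struct, the run_instr() stand-in and the fixed 4-cycle count are assumptions for illustration. It also assumes that icycle is decremented on the idle clocks, which is one reading of the list.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, stripped-down CPU state; only icycle is taken from the plan. */
struct cpu {
  int icycle;            /* clocks left before the next instruction may start */
  uint32_t pc;           /* program counter, advanced by the stand-in below */
  unsigned long clocks;  /* total clocks seen */
  unsigned long instrs;  /* total instructions started */
};

/* Stand-in for the existing instruction code: does all its work at once
 * and reports the total cycle count (assumed to be 4 here). */
int run_instr(struct cpu *cpu)
{
  cpu->pc += 2;
  cpu->instrs++;
  return 4;
}

/* Per-clock instruction step. */
void cpu_step_instr(struct cpu *cpu)
{
  assert(cpu->icycle >= 0);
  if (cpu->icycle > 0) {
    /* Previous instruction hasn't "finished" yet: just burn one clock. */
    cpu->icycle--;
    return;
  }
  /* Fetch and run everything on the first clock of the instruction. */
  cpu->icycle = run_instr(cpu);
  /* Exception handling would go here. */
  cpu->icycle--;
}

/* One tick of the 8MHz clock. */
void cpu_clock(struct cpu *cpu)
{
  /* The glue/mmu/shifter clock functions would be called here. */
  cpu_step_instr(cpu);
  cpu->clocks++;
}
```

With a 4-cycle instruction, every fourth call to cpu_clock() starts a new instruction and the three calls in between only count down.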
Second iteration
- Rewrite one instruction at a time to be reentrant and be called for every cycle. The instruction should then handle its read/write and the prefetch at the proper cycle points. This will remove the need for a bus_read/write_long() since the words will be read separately. It will not handle waitstates.
- Either here or the next iteration, move the new cpu_clock() code into something more of a system clock, and call cpu_clock() from there, just as it'll call mmu/glue/shifter as well.
Third iteration
- Make bus_read/write into a split call, by requesting the action and handling the result. This will make it possible for an external part to insert waitstates onto the cpu.
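To illustrate the request/result split, here is a minimal sketch. All names in it (struct bus_req, bus_request_read(), bus_add_waitstates(), bus_clock(), bus_result()) are made up for the example and are not existing ostis functions; the 4-word memory is a toy stand-in, and the base delay assumes the usual 4-clock 68000 bus cycle.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 4  /* toy memory size, for illustration only */

enum bus_state { BUS_IDLE, BUS_PENDING, BUS_DONE };

/* Hypothetical pending bus transaction. */
struct bus_req {
  enum bus_state state;
  uint32_t addr;
  uint16_t data;
  int wait;  /* clocks left until the result is available */
};

/* CPU side: request a word read and return immediately. */
void bus_request_read(struct bus_req *req, uint32_t addr)
{
  assert(req->state == BUS_IDLE);
  req->state = BUS_PENDING;
  req->addr = addr;
  req->wait = 4;  /* assumed base length of a bus cycle */
}

/* Device side: an external part stretches the pending cycle. */
void bus_add_waitstates(struct bus_req *req, int clocks)
{
  req->wait += clocks;
}

/* System side: called once per clock to advance the pending cycle. */
void bus_clock(struct bus_req *req, const uint16_t *mem)
{
  if (req->state != BUS_PENDING)
    return;
  if (--req->wait > 0)
    return;
  req->data = mem[(req->addr >> 1) % MEM_WORDS];
  req->state = BUS_DONE;
}

/* CPU side: returns 1 and stores the word once the cycle has completed,
 * returns 0 while the CPU still has to wait. */
int bus_result(struct bus_req *req, uint16_t *out)
{
  if (req->state != BUS_DONE)
    return 0;
  *out = req->data;
  req->state = BUS_IDLE;
  return 1;
}
```

The point of the split is that the CPU never needs to know why a cycle takes longer; it just keeps polling bus_result() on each clock until the device lets the cycle finish.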
Thanks; seems like a level-headed approach.
Temporarily handling wait states is easy, if you choose to do that. Share a clock counter between the CPU and MMU. When you want to start a bus cycle, delay until (clock & 3) == 0.
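A small sketch of that alignment rule, assuming a free-running clock counter shared by the CPU and MMU. The function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Clocks to wait so that a bus cycle starts when (clock & 3) == 0. */
unsigned bus_align_delay(uint32_t clock)
{
  return (4u - (clock & 3u)) & 3u;
}

/* Clock at which a bus cycle requested at `clock` would actually start. */
uint32_t bus_start_clock(uint32_t clock)
{
  uint32_t start = clock + bus_align_delay(clock);
  assert((start & 3u) == 0);
  return start;
}
```

A request at clock 5 would be held for 3 clocks and start at clock 8; a request that is already on a boundary starts right away.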
Those are not the waitstates I'm referring to. They can be simulated within the cpu. The 2/6-cycle (yes, I know it's not exactly 6c, but dependent on the E-clock) delays from the PSG/ACIA are not something the cpu itself can really predict without too much hacky code. Those could be handled with the split of request/result on the bus.
In the meantime, the devices will simulate the extra cycles just like they do today (by increasing icycle)
Ah, those waitstates.
There might be a small addition to the second iteration here.
The instruction sets icycle to however many cycles need to pass before entering the instruction again. Normally 0. It also sets a (new) flag to signal that the instruction is done.
This means that the current external waitstate simulation can run as it does now, since that read will just increase the icycle to 2 (or 6 or so), and thereby delay the instruction a bit. Yes, it will technically be delayed in the wrong way (it will get the result first, then wait), but in the second iteration this is good enough and won't interfere with any existing code.
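Roughly what that could look like, as a sketch only. The flag name instr_done, the step counter, the latch field and the two-step toy instruction are all assumptions for the example; only icycle and the idea of devices increasing it come from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, stripped-down CPU state for a re-entrant instruction. */
struct cpu {
  int icycle;            /* clocks to skip before re-entering the instruction */
  int instr_done;        /* assumed name for the new "done" flag */
  int step;              /* position inside the current instruction */
  uint16_t latch;        /* last word read from the bus */
  unsigned long instrs;  /* instructions completed */
};

/* Stand-in for a PSG/ACIA-style read: it simulates external waitstates
 * the way devices do today, by increasing icycle. */
uint16_t slow_device_read(struct cpu *cpu)
{
  cpu->icycle += 2;
  return 0x00ff;
}

/* Toy re-entrant instruction, entered once per step. */
void instr_clock(struct cpu *cpu)
{
  cpu->icycle = 0;  /* normally no extra delay before the next entry */
  switch (cpu->step++) {
  case 0:
    /* The result arrives first, then the wait happens. */
    cpu->latch = slow_device_read(cpu);
    break;
  default:
    cpu->instr_done = 1;
    cpu->instrs++;
    break;
  }
}

/* Per-clock step: wait out icycle, otherwise (re)enter the instruction. */
void cpu_step_instr(struct cpu *cpu)
{
  assert(cpu->icycle >= 0);
  if (cpu->icycle > 0) {
    cpu->icycle--;
    return;
  }
  if (cpu->instr_done) {
    /* The next instruction would be fetched here. */
    cpu->instr_done = 0;
    cpu->step = 0;
  }
  instr_clock(cpu);
}
```

Without the device delay this toy instruction would finish on its second clock; with the two extra cycles it finishes on the fourth.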
The clocked-cpu branch now contains most of the concepts listed in the first iteration. Movem still does the wrong thing, but it works somewhat at least. The first iteration is not finished yet though.
Helpful hint: Don't run along too far on the branch. I suggest either merging it into master occasionally, or rebasing it to master frequently. I do the latter with my branches.
Once I get the debugger bits done, I think I'll just merge it for now, since it doesn't affect the old system anyway. Otherwise I would've rebased a lot.
"68000 Undocumented Behavior Notes" http://web.archive.org/web/20091015132715/http://www.trzy.org/files/68knotes.txt