ostis Clocked CPU

@stefanberndtsson is planning to convert the CPU component into a clocked design.

Would it be possible to jot down a few notes here every now and then to get a feel for what's going to happen?

Mar 23 '16 07:03 larsbrinkhoff

Would it be possible to have the old implementation in parallel with the new, as a compile-time option?

Mar 23 '16 07:03 larsbrinkhoff

Yes, jotting will be done. :)

As for having a compile option, not so sure. Even the first iteration, which should be fairly kind to the structure, still requires changing at least one actual instruction. That would mean pretty much keeping separate compilable instances of cpu_run(), cpu_step_instr(), cpu_do_cycle() and any changed instructions. I'm not sure this is manageable for more than one or two instructions, unless of course the whole cpu/* is preserved and the clocked cpu is done alongside it in a new directory.

Mar 23 '16 07:03 stefanberndtsson

Yes, I was thinking new directory.

Mar 23 '16 07:03 larsbrinkhoff

Ok, I'll start that way. I won't maintain two different models forever though :)

Mar 23 '16 10:03 stefanberndtsson

Ok, plans (everything based off the same 8MHz clock):

First iteration

Create a new cpu_clock() which will do single clock cycles for the cpu.
Each clock will run the glue/mmu/shifter_clock functions that run from cpu_do_cycle() today
Each clock will run a new version of cpu_step_instr() that does the following:

If cpu->icycle is larger than 0, do nothing, just return; the previous instruction hasn't "finished" yet.
Fetch instruction (or use a previously prefetched one) if the previous instruction has consumed all its cycles
Run the new instruction (which is the exact same code as today). This instruction will run everything on the first clock cycle, but will report how many cycles there should be in total (cpu->icycle) before the next one is to be run
Handle exceptions if at the appropriate cycle
Decrement cpu->icycle
Return from function

Rewrite movem to work with the above scheme (it does not give a proper cpu->icycle, all other instructions do)

This should take care of the basic bits, and the next code iteration can begin
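A minimal sketch of how the first iteration could look in C. The struct layout, the fixed 4-cycle instruction and the instr_started counter are made up for illustration, and the sketch assumes the decrement of icycle happens on every clock, including the clocks where the previous instruction is still "running". The real cpu_clock() would also call the glue/mmu/shifter clock functions.

```c
#include <stdint.h>

/* Hypothetical, stripped-down CPU state; the real struct has far more fields. */
struct cpu {
  int icycle;        /* clocks left before the next instruction may start */
  uint64_t clock;    /* total clocks seen so far */
  int instr_started; /* illustration only: number of instructions started */
};

/* Stand-in for today's instruction code: it does all its work at once
 * and reports how many cycles it should take in total. */
static int run_instruction(struct cpu *cpu)
{
  cpu->instr_started++;
  return 4; /* assumed cycle count, for illustration */
}

static void cpu_step_instr(struct cpu *cpu)
{
  if (cpu->icycle == 0) {
    /* Previous instruction has consumed all its cycles:
     * fetch (or use the prefetched word) and run the next one. */
    cpu->icycle = run_instruction(cpu);
  }
  /* Assumption: one cycle is consumed on every clock. */
  cpu->icycle--;
}

/* One tick of the 8MHz clock. */
static void cpu_clock(struct cpu *cpu)
{
  /* The glue/mmu/shifter clock calls would go here. */
  cpu_step_instr(cpu);
  cpu->clock++;
}
```

With this shape a 4-cycle instruction starts on clock 0, 4, 8 and so on, while everything else in the system still gets a call on every single clock.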

Second iteration

Rewrite one instruction at a time to be reentrant and be called for every cycle. The instruction should then handle its read/write and the prefetch at the proper cycle points. This will remove the need for a bus_read/write_long() since the words will be read separately. It will not handle waitstates.
Either here or the next iteration, move the new cpu_clock() code into something more of a system clock, and call cpu_clock() from there, just as it'll call mmu/glue/shifter as well.
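To illustrate what "reentrant" could mean here, a rough sketch of a long read done as two separate word reads. Everything in it is hypothetical: the instr_state struct, the tiny memory array, and the assumption that each word becomes available on the last clock of a 4-clock bus cycle.

```c
#include <stdint.h>

/* Hypothetical 16-bit bus read, backed by a tiny word array. */
static uint16_t mem[4] = { 0x1234, 0x5678, 0, 0 };

static uint16_t bus_read_word(uint32_t addr)
{
  return mem[(addr >> 1) & 3];
}

/* Hypothetical per-instruction state kept between clocks. */
struct instr_state {
  int cycle;      /* clock index within the current instruction */
  uint32_t addr;  /* address of the long to read */
  uint32_t value; /* result, built up one word at a time */
  int done;       /* set when the instruction has finished */
};

/* Called once per clock. The long is read as two words, each assumed
 * to arrive on the last clock of a 4-clock bus cycle. */
static void instr_read_long_step(struct instr_state *s)
{
  switch (s->cycle) {
  case 3: /* high word */
    s->value = (uint32_t)bus_read_word(s->addr) << 16;
    break;
  case 7: /* low word, instruction finished */
    s->value |= bus_read_word(s->addr + 2);
    s->done = 1;
    break;
  default: /* nothing visible happens on the other clocks */
    break;
  }
  s->cycle++;
}
```

Since the two words are fetched at separate cycle points, nothing in this sketch needs a bus_read_long().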

Third iteration

Make bus_read/write into a split call, by requesting the action and handling the result. This will make it possible for an external part to insert waitstates onto the cpu.
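A possible shape for the split call, as a sketch only. The names bus_request_read(), bus_clock() and bus_result(), the SLOW_DEVICE_BASE range and the latency numbers are all invented for the example and are not taken from the existing code.

```c
#include <stdint.h>

enum bus_status { BUS_IDLE, BUS_PENDING, BUS_READY };

struct bus_request {
  enum bus_status status;
  uint32_t addr;
  uint16_t data;
  int wait; /* clocks left until the data is valid */
};

/* Hypothetical address range of a device that inserts waitstates. */
#define SLOW_DEVICE_BASE 0x00f00000u

/* The device side, not the cpu, decides how many extra clocks to add. */
static int device_extra_wait(uint32_t addr)
{
  return (addr >= SLOW_DEVICE_BASE && addr < SLOW_DEVICE_BASE + 0x100u) ? 2 : 0;
}

/* Step 1: the cpu requests the read and carries on being clocked. */
static void bus_request_read(struct bus_request *r, uint32_t addr)
{
  r->status = BUS_PENDING;
  r->addr = addr;
  r->wait = 4 + device_extra_wait(addr); /* 4 = assumed plain bus cycle */
}

/* Called once per clock from the system clock. */
static void bus_clock(struct bus_request *r)
{
  if (r->status != BUS_PENDING)
    return;
  if (--r->wait == 0) {
    r->data = 0xBEEF; /* stand-in for the real device or memory data */
    r->status = BUS_READY;
  }
}

/* Step 2: the cpu polls for the result; returns 1 once data is there. */
static int bus_result(struct bus_request *r, uint16_t *out)
{
  if (r->status != BUS_READY)
    return 0;
  *out = r->data;
  r->status = BUS_IDLE;
  return 1;
}
```

The instruction would call bus_request_read() at the cycle where the bus cycle starts and then do nothing until bus_result() returns 1, so the waitstates never have to be predicted by the cpu.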

Mar 23 '16 11:03 stefanberndtsson

Thanks; seems like a level-headed approach.

Temporarily handling wait states is easy, if you choose to do that. Share a clock counter between the CPU and MMU. When you want to start a bus cycle, delay until (clock & 3) == 0.
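As a small sketch of that idea, assuming a free-running clock counter shared by the CPU and MMU (the function names are made up for the example):

```c
#include <stdint.h>

/* Returns 1 if a bus cycle may start on this clock. */
static int bus_cycle_may_start(uint64_t clock)
{
  return (clock & 3) == 0;
}

/* Number of clocks to wait until the next point where (clock & 3) == 0. */
static unsigned bus_align_delay(uint64_t clock)
{
  return (unsigned)((4 - (clock & 3)) & 3);
}
```

A cpu that wants the bus at clock 5 would wait 3 clocks and start at clock 8; at clock 8 itself the delay is 0.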

Mar 23 '16 11:03 larsbrinkhoff

Those are not the waitstates I'm referring to. They can be simulated within the cpu. The 2/6-cycle (yes, I know it's not exactly 6c, but dependent on the E-clock) delays from the PSG/ACIA are not something the cpu itself can really predict without too much hacky code. Those could be handled with the split of request/result on the bus.

In the meantime, the devices will simulate the extra cycles just like they do today (by increasing icycle)

Mar 23 '16 11:03 stefanberndtsson

Ah, those waitstates.

Mar 23 '16 11:03 larsbrinkhoff

There might be a small addition to the second iteration here.

The instruction sets icycle to however many cycles need to pass before entering the instruction again. Normally 0. It also sets a (new) flag to signal that the instruction is done.

This means that the current external waitstate simulation can run as it does now, since that read will just increase the icycle to 2 (or 6 or so), and thereby delay the instruction a bit. Yes, it will technically be delayed in the wrong way (it will get the result first, then wait), but in the second iteration this is good enough and won't interfere with any existing code.
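Roughly, as a sketch: the field names instr_done, step, entered and last, the two-step instruction and the 2-cycle device delay are only there for illustration.

```c
/* Hypothetical, stripped-down CPU state. */
struct cpu {
  int icycle;     /* clocks to skip before re-entering the instruction */
  int instr_done; /* the new flag: the instruction has finished */
  int step;       /* which step of the instruction comes next */
  int entered;    /* illustration only: how often the instruction ran */
  int last;       /* last value read from the device */
};

/* Device read in today's style: the data comes back immediately and
 * the waitstate is simulated by increasing icycle afterwards. */
static int slow_device_read(struct cpu *cpu)
{
  cpu->icycle += 2; /* assumed delay; could just as well be 6 or so */
  return 0x42;
}

/* Reentrant instruction: called again whenever icycle has run out. */
static void instr_step(struct cpu *cpu)
{
  cpu->entered++;
  cpu->icycle = 0; /* normally: come back on the very next clock */
  switch (cpu->step++) {
  case 0:
    cpu->last = slow_device_read(cpu); /* bumps icycle to 2 */
    break;
  case 1:
    cpu->instr_done = 1; /* signal that the instruction is finished */
    break;
  default:
    break;
  }
}

static void cpu_clock(struct cpu *cpu)
{
  if (cpu->icycle > 0) { /* still waiting, e.g. on a simulated waitstate */
    cpu->icycle--;
    return;
  }
  if (cpu->instr_done) { /* previous instruction finished: start over */
    cpu->instr_done = 0;
    cpu->step = 0;
  }
  instr_step(cpu);
}
```

Here the read on the first clock returns its value at once, the instruction is then skipped for two clocks, and it finishes on the fourth. That is the "result first, then wait" order described above.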

Mar 23 '16 12:03 stefanberndtsson

The clocked-cpu branch now contains most of the concepts listed in the first iteration. Movem still does the wrong thing, but it works somewhat at least. The first iteration is not finished yet though.

Mar 23 '16 20:03 stefanberndtsson

Helpful hint: Don't run along too far on the branch. I suggest either merging it into master occasionally, or rebasing it to master frequently. I do the latter with my branches.

Mar 24 '16 14:03 larsbrinkhoff

Once I get the debugger bits done, I think I'll just merge it for now, since it doesn't affect the old system anyway. Otherwise I would've rebased a lot.

Mar 24 '16 14:03 stefanberndtsson

"68000 Undocumented Behavior Notes" http://web.archive.org/web/20091015132715/http://www.trzy.org/files/68knotes.txt

Mar 31 '16 05:03 larsbrinkhoff
