amaranth
amaranth copied to clipboard
Strange simulator behavior: clock signal pausing while sync blocks are speeding up
I have the following simulator configuration:
receiver = ADATReceiver()
clk_freq = 120e6
# 24 bit plus the 6 nibble separator bits for eight channel
# then 1 separator, 10 sync bits (zero), 1 separator and 4 user bits
adat_freq = 48000 * ((24 + 6) * 8 + 1 + 10 + 1 + 4)
clockratio = clk_freq / adat_freq
sim = Simulator(receiver)
sim.add_clock(1.0/clk_freq, domain="sync")
sim.add_clock(1.0/adat_freq, domain="adat")
print(f"clock ratio: {clockratio}")
cycles = 10
def sync_process():
for _ in range(int(clockratio) * cycles):
yield Tick("sync")
def adat_process():
testdata = one_empty_adat_frame() + generate_sixteen_frames_with_channel_numbers_in_most_significant_nibble_and_sample_numbers_in_sample()
for bit in testdata[224:512 * 2]:
yield receiver.adat_in.eq(bit)
yield Tick("adat") # if I don't put this here, the adat signal is constant zero
sim.add_sync_process(sync_process, domain="sync")
sim.add_sync_process(adat_process, domain="adat")
with sim.write_vcd('receiver-smoke-test.vcd', traces=[receiver.adat_in, receiver.clk_in, receiver.adat_clk_in]):
sim.run()
When I simulate this, I have strange parts, where the clock of domain sync seems to pause, but at the same time a counter in that domain seems to speed up:
What is going on here? Am I using the simulator in a wrong way, or is this a bug in the simulator?
I have absolutely no idea how you managed to get this behavior; I'm assuming a simulator bug.
Please attach the full code necessary to reproduce this issue.
@whitequark I just pushed my code. It is my first nontrivial FPGA project, so I am pretty much a newbie here. It is also my first nmigen project. https://github.com/hansfbaier/adat-fpga
@whitequark Could you reproduce the behavior with the code I pushed above?
Sorry, I have only got around to it now. I get this:
Traceback (most recent call last):
File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/whitequark/adat-fpga/adat/cli.py", line 62, in <module>
main()
File "/home/whitequark/adat-fpga/adat/cli.py", line 29, in main
args.with_icache, args.icache_nways, args.icache_nlines, args.icache_nwords,
AttributeError: 'Namespace' object has no attribute 'with_icache'
@whitequark : cli.py is not yet working, I have not come that far yet. While developing, I run the simulator in the main of: ./adat/sources/receiver/receiver.py I can run it from its directory with: /usr/bin/env /usr/bin/python3 receiver.py
Am Mo., 14. Dez. 2020 um 09:39 Uhr schrieb whitequark < [email protected]>:
Sorry, I have only got around to it now. I get this:
Traceback (most recent call last): File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main "main", mod_spec) File "/usr/lib/python3.7/runpy.py", line 85, in _run_code exec(code, run_globals) File "/home/whitequark/adat-fpga/adat/cli.py", line 62, in
main() File "/home/whitequark/adat-fpga/adat/cli.py", line 29, in main args.with_icache, args.icache_nways, args.icache_nlines, args.icache_nwords, AttributeError: 'Namespace' object has no attribute 'with_icache' — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/nmigen/nmigen/issues/554#issuecomment-744130083, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABEI72V354A637BAULFMFLSUV3FNANCNFSM4UMVEV6Q .
The behavior seems to happen whenever combinatorial logic operates. Then the clock takes a gasp,
but synchronous changes happen anyway, but with "squeezed" timings.
Fortunately that did not hinder me completing the receiver. Just pushed all the changes. The design probably can be much simplified, but it is my first non trivial FPGA project, and it just passed the smoke test, wohoo!
The design probably can be much simplified, but it is my first non trivial FPGA project, and it just passed the smoke test, wohoo!
Congratulations! I haven't been able to look into this issue yet as I've been quite busy working on CXXRTL, but it's something that I think is very serious, so I'll make sure to address it before the next release.
As the repository mentioned above is a moving target, here is a snapshot to reproduce the issue:
Steps to reproduce:
-
cd adat
-
python3 receiver.py
- open the
receiver-smoke-test.vcd
with gtkwave and look whereclk_in
drops out
I also noticed this. In my case, only the next clock edge is executed (no more than one extra transition).
The code in the zip is quite straight forward: There are two clocks that go through a mux, and for a couple of cycles the output clock is inactive when switching clocks. At the beginning of this period the "off-by-one" error is clearly visible.
I added markers to the following image:
- A -> Clock 1 stops
- B -> Extra transition
- C -> Clock 2 stops
- D -> Extra transition
Same as @hansfbaier , run as python clock_bug_test.py
. Tested with nmigen 0.3.dev261+g9f78ac0
.
I have debugged both examples. For the original bug report, the issue comes from the following interaction:
- the clock signal of
sync
domain issync_time_detector.clk_in
- the clock signal of
sync
is driven by the simulator byadd_clock
-
sync_time_detector.clk_in
is also combinationally driven fromtop.clk_in
(a const-0 signal due to never being assigned) - the driver conflict is not detected (which is a bug tracked as #557)
- the two drivers fight unpredictably, with the comb driver winning whenever comb logic in
top
runs for unrelated reasons, sometimes changing value multiple times in 0 simulation time
The other report (from @andresmanelli) has a completely different failure mode, and is instead caused by the glitchy way in which the ClockMux
is written:
- some time after the selector is switched from clka to clkb, clka edge happens while
and_1_reg_3
is still high -
o_clk
immediately goes high (being combinationally set toand_1_reg_3 & clka
) -
clka
edge triggers logic inclka
domain and makesand_1_reg_3
transition low -
o_clk
immediately goes back low
The above happens all within 0 simulation time, making it invisible in VCD. It can be seen using the recently added fs_per_delta
option.
In both cases the solution is to be more careful when driving your clock signals.
Thank you for making us aware of fs_per_delta
I can see us getting a lot of use out of it