Write data before address is lost
UPDATE: When I changed DEFAULT_DELAY => 1 PS, the problem was resolved. If a delay of 0 is not supported, I suggest an assert (at least) in Axi4MemoryVTI.vhd.
I'm using Axi4/src/Axi4MemoryVTI.vhd. When I enable SetUseRandomDelays(TransRec, TRUE) in my test bench, I see that the memory component accepts the write burst data (AxiBus.WriteData.Ready = '1') before it accepts the write burst data (AxiBus.WriteAddress.Ready). In this case, the data for the next write transaction is stored to this address. Note I also have DEFAULT_DELAY => 0 ps.
Here is a waveform using my code's AXI bus structure, sorry. The cursor is at the point where the write address is accepted. When I subsequently read from address 0x012, I get 0xECE3 (first data from next transaction, lower right) instead of 0x75C8 (first data from matching transaction, lower left).
Following is the OSVVM transcript for this waveform. I see two "Write Address" transactions at 64 ns and 66 ns. (My clock period is 2 ns.)
%% 56 ns Log DEBUG in u_mem, Write Data. WData: 75C8 WStrb: 11 Operation# 1
%% 58 ns Log DEBUG in u_mem, Write Data. WData: D6CD WStrb: 11 Operation# 2
%% 64 ns Log DEBUG in u_mem, Write Address. AWAddr: 012 AWProt: 000 AWLen: 00000001 AWSize: 001 AWBurst: 01 AWID: 00 AWUser: 0 Operation# 1
%% 64 ns Log INFO in u_mem, Memory Write. AWAddr: 012 AWProt: 000 WData: 75C8 WStrb: 11 Operation# 0
%% 66 ns Log DEBUG in u_mem, Write Address. AWAddr: 012 AWProt: 000 AWLen: 00000001 AWSize: 001 AWBurst: 01 AWID: 00 AWUser: 0 Operation# 2
%% 72 ns Log DEBUG in u_mem, Write Response. BResp: 0 BID: 0 BUser: 0 Operation# 1
%% 80 ns Log DEBUG in u_mem, Write Data. WData: ECE3 WStrb: 11 Operation# 3
%% 80 ns Log DEBUG in u_mem, Write Address. AWAddr: 051 AWProt: 000 AWLen: 00000110 AWSize: 001 AWBurst: 01 AWID: 00 AWUser: 0 Operation# 3
%% 80 ns Log INFO in u_mem, Memory Write. AWAddr: 012 AWProt: 000 WData: ECE3 WStrb: 11 Operation# 1
%% 82 ns Log DEBUG in u_mem, Write Data. WData: C598 WStrb: 11 Operation# 4
%% 82 ns Log DEBUG in u_mem, Write Response. BResp: 0 BID: 0 BUser: 0 Operation# 2
%% 84 ns Log DEBUG in u_mem, Write Data. WData: ADDF WStrb: 11 Operation# 5
%% 84 ns Log INFO in u_mem, Memory Write. AWAddr: 051 AWProt: 000 WData: ADDF WStrb: 11 Operation# 2
I found it. Currently if the delay is 0, in the procedure DoAxiReadyHandshake, there is a sequence of code that
if Valid = '1' then
-- Proper handling
if not ReadyBeforeValid then
Ready <= '1' after ReadyDelayCycles + tpd_Clk_Ready ;
end if ;
-- If ready not signaled yet, find ready at a rising edge of clk
if Ready /= '1' then
wait on Clk until Clk = '1' and (Ready = '1' or Valid /= '1') ;
AlertIf(
AlertLogID,
Valid /= '1',
TimeOutMessage &
" Valid (" & to_string(Valid) & ") " &
"deasserted before Ready asserted (" & to_string(Ready) & ") ",
FAILURE
) ;
end if ;
else
-- TimeOut handling
Alert(
AlertLogID,
TimeOutMessage & " Valid: " & to_string(Valid) & " Expected: 1",
FAILURE
) ;
end if ;
-- End of operation
Ready <= '0' after tpd_Clk_Ready ;
So when ReadyBeforeValid is FALSE, tpd_Clk_Ready = 0 ps, ReadyDelayCycles = 2 * tperiod_Clk, the code execution path is:
Ready <= '1' after ReadyDelayCycles + tpd_Clk_Ready ;
wait on Clk until Clk = '1' and (Ready = '1' or Valid /= '1') ;
Ready <= '0' after tpd_Clk_Ready ;
As a result, this projects a waveform on to Ready for it to change to '1' at delta cycle 0 of in 2 clocks from now. When Clock rises at this time, Ready just became '1', the wait wakes up and immediately assigns Ready back to 0.
There are a number of potential fixes:
- Make sure no one sets the delay to 0 ns.
- Change the wait statement to:
wait on Clk until Clk = '1' and ((Ready = '1' and Ready'last_event > 0 sec) or Valid /= '1') - Reassign the generics to internal constants: constant tpd_Clk_Ready_norm : time := maximum(tpd_Clk_Ready, std.env.resolution_limit) ;
- Change the assignment to Ready to WaitForClock(Clk, ReadyDelayCycles) ; -- Make ReadyDelayCycles an integer Ready <= '1' after tpd_Clk_Ready ;
Analysis:
- Given that two people have already set the delay to 0, I doubt even if I document it that I can prevent this from happening.
- This gives me concern about the overhead of Ready'last_event > 0 sec
- This gives me concern that it must be tediously done for every AXI VC. It would be easy to miss one.
- I think I like this. While it probably has more overhead than solution 2, it leads to stability. tperiod_Clk no longer needs to be entirely accurate as it is just used for the timeout values, hence, tperiod_Clk could be measured in the VC rather than specified as a generic. This will require an interface change to DoAxiReadyHandshake which can at least temporarily be handled by overloading.
Good find!
I'm not a fan of enforced delays - independent of 1 ns, 1 ps or 1 fs - converting behavioral simulations to "gate" simulations. Aa a result, all checks in a testbench would need to be changed to a timing aware check. Simple checks at rising_edge() aren't possible anymore.
AMD does it in some of there primitives like BlockRAMs or DSPs. It opens just a can of worms and causes many many problems in simulation as well as it increases overall testbench complexity.