OpenHBMC
WRAP burst sometimes reads wrong data
I sometimes see wrong data being read by WRAP bursts. The last one or two words of a 16-word burst return data from an address 16 words higher than expected.
On the AXI bus it looks like this: the last word of the burst is delayed considerably and contains wrong data.
This happens because the burst towards the HyperRAM is terminated too early and another burst is started by the hbmc_ctrl state machine:
After the red trigger line there are two more RWDS transitions with data != 0x0, but the hb_recov_data_vld (dbg_dru_valid in the screenshot) and hb_recov_data (dbg_dru_data) signals don't output them, because the DRU has already been reset.
A new burst is then started by the logic meant to continue long bursts after reaching the maximum CS low time, but since that logic isn't designed to handle WRAP bursts, it uses the wrong address. (With INCR bursts I think this won't read wrong data, only reduce performance.)
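As a rough illustration of why a continuation that simply increments the address would land 16 words too high, here is a minimal Verilog sketch of the address step for WRAP and INCR bursts. The names wrap_addr_demo and next_addr and the example address are made up for illustration and are not part of OpenHBMC. Addresses are counted in words and the burst length is assumed to be a power of two.

```verilog
// Sketch only: module and function names are hypothetical, not OpenHBMC code.
module wrap_addr_demo;

    // Next word address inside a burst.
    // Assumption: 'words' (burst length in words) is a power of two.
    function [31:0] next_addr;
        input [31:0] addr;     // current word address
        input        is_wrap;  // 1 = WRAP burst, 0 = INCR burst
        input [31:0] words;    // burst length in words
        reg   [31:0] mask;
        begin
            mask = words - 1;
            if (is_wrap)
                // Keep the window base, wrap only the offset inside the window.
                next_addr = (addr & ~mask) | ((addr + 1) & mask);
            else
                // Plain increment, crosses the window boundary.
                next_addr = addr + 1;
        end
    endfunction

    initial begin
        // 16-word burst, window 0x100..0x10F, currently at the last word of the window.
        $display("WRAP next: %h", next_addr(32'h0000_010F, 1'b1, 32'd16)); // 00000100
        $display("INCR next: %h", next_addr(32'h0000_010F, 1'b0, 32'd16)); // 00000110
        $finish;
    end

endmodule
```

The two results differ by exactly 16 words, which matches the offset seen in the wrongly read data. This is only one plausible reading of the symptom, not a confirmed description of the continuation logic.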
The early DRU reset appears to be caused by a different RWDS waveform at this sample: hb_recov_data_vld goes low because the DRU needs another bit before it can recover the data. When this happens close to the end of a burst, the ST_RD_8 state resets the DRU even though there is more data to recover.
To verify this, I added an additional state, ST_RD_9, so that the state machine only advances once hb_recov_data_vld has been low for two consecutive cycles. With this change, a WRAP burst in a similar situation is read correctly and the signals look like this:
Notice the one low dru_valid cycle before the last words.
I'm not sure whether this is a good general fix for the issue, but with this small change the MicroBlaze using OpenHBMC has now run for three nights without any error, whereas before it crashed after 1-2 hours:
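    ST_RD_8: begin
        // First cycle without recovered data: do not reset the DRU yet.
        if (~hb_recov_data_vld) begin
            rd_state <= ST_RD_9;
        end
    end

    ST_RD_9: begin
        if (~hb_recov_data_vld) begin
            // Second consecutive low cycle: the burst has really ended.
            dru_iserdes_rst <= 1'b1;
            rd_state        <= ST_RD_DONE;
        end else begin
            // Valid came back after a one-cycle gap: keep receiving.
            rd_state <= ST_RD_8;
        end
    end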
Tested with 166.6 MHz HyperRAM clock, BUFIO/BUFR mode, 100 MHz AXI clock, on a TE0725.
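For anyone who wants to try the two-cycle check in simulation, here is a small self-contained sketch. It is not the OpenHBMC source: the module rd_tail_fsm, its ports, and the done flag are hypothetical, and only the ST_RD_8 / ST_RD_9 behaviour is modelled. data_vld stands in for hb_recov_data_vld and dru_rst for dru_iserdes_rst.

```verilog
`timescale 1ns/1ps
// Sketch only: hypothetical module, models just the ST_RD_8 / ST_RD_9 check.
module rd_tail_fsm (
    input  wire clk,
    input  wire rst,
    input  wire data_vld,  // stands in for hb_recov_data_vld
    output reg  dru_rst,   // stands in for dru_iserdes_rst
    output reg  done       // hypothetical flag: read state machine finished
);
    localparam ST_RD_8 = 2'd0, ST_RD_9 = 2'd1, ST_RD_DONE = 2'd2;
    reg [1:0] state;

    always @(posedge clk) begin
        if (rst) begin
            state   <= ST_RD_8;
            dru_rst <= 1'b0;
            done    <= 1'b0;
        end else begin
            case (state)
                ST_RD_8: if (~data_vld) state <= ST_RD_9;  // first low cycle
                ST_RD_9: if (~data_vld) begin              // second low cycle in a row
                             dru_rst <= 1'b1;
                             done    <= 1'b1;
                             state   <= ST_RD_DONE;
                         end else
                             state <= ST_RD_8;             // gap was only one cycle
                default: ;                                 // stay in ST_RD_DONE
            endcase
        end
    end
endmodule

module rd_tail_fsm_tb;
    reg  clk = 0, rst = 1, data_vld = 1;
    wire dru_rst, done;

    rd_tail_fsm dut (.clk(clk), .rst(rst), .data_vld(data_vld),
                     .dru_rst(dru_rst), .done(done));

    always #5 clk = ~clk;

    initial begin
        // Stimulus changes on falling edges to avoid races with the DUT.
        repeat (2) @(negedge clk);
        rst = 0;

        // One-cycle gap in data_vld: must NOT end the read.
        @(negedge clk) data_vld = 0;
        @(negedge clk) data_vld = 1;
        @(negedge clk);
        if (done !== 1'b0) $display("FAIL: ended on a one-cycle gap");
        else               $display("PASS: one-cycle gap ignored");

        // Two consecutive low cycles: must end the read and reset the DRU.
        data_vld = 0;
        repeat (2) @(negedge clk);
        if (done !== 1'b1 || dru_rst !== 1'b1) $display("FAIL: did not end after two low cycles");
        else                                   $display("PASS: ended after two low cycles");
        $finish;
    end
endmodule
```

This only exercises the valid-gap handling in isolation. It says nothing about whether two cycles is enough margin at other clock ratios or in other clocking modes.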