opentitan icon indicating copy to clipboard operation
opentitan copied to clipboard

[tlul] TL-UL `data` zeroing inconsistencies

Open ballifatih opened this issue 2 years ago • 9 comments

Description

This is a spin-off from #16767 concerning the inconsistencies observed at TL-UL data port. In the earlier issue, there was a consensus that, all things considered, we would like to clear the value of data after a TL-UL transaction is completed (with an exception to entropy related data/randomness, @vogelpi).

Using the same waveform examples, notice the following highlighted inconsistencies:

Expected behavior:

  • 0x0 -> data -> 0x0 on any TL-UL data port (host or client)

Unexpected observation:

  • On the keymgr side, the TL-UL output data port is not cleared.
  • Sometimes data port is cleared to all 1s (observed at Ibex TL-UL input).
  • On Ibex side, the TL-UL output port is feeding some "garbage" values to data port, even though there is no transaction.

Screenshot from 2023-02-21 17-56-50 Screenshot from 2023-02-21 17-51-47

ballifatih avatar Feb 21 '23 17:02 ballifatih

@andreaskurth As far as M2.5 is concerned, I guess here only the security sensitive data that is left on the bus could be a problem. For instance, in the first example we see the key that is being read at 32 bit granularity lives on the bus for a while.

ballifatih avatar Feb 21 '23 17:02 ballifatih

I'll take a look at what's happening on the Ibex side, I can't remember off hand how the TL-UL wrapper is implemented but it may be we need to explicitly zero out the write data from Ibex. We could it in the TL-UL wrapper itself but it would be sensible feature for Ibex secure configuration anyway (i.e. for users outside of OpenTitan).

GregAC avatar Feb 24 '23 17:02 GregAC

I think setting TL-UL output explicitly to zero when not used is a good idea.

From the first waveform (highlighted with light blue), it seems like even when Ibex TL-UL output moves to another value, the value seen by keymgr's TL-UL input remains same. I couldn't dig into it, but maybe there is also a generic TL-UL primitive that is feeding this value from xbar.

ballifatih avatar Feb 24 '23 18:02 ballifatih

As discussed in Security WG (16/03/2023), this is not a necessary fix for M2.5, therefore I will mark it as FutureRelease. I will just repeat some notes from this meeting:

As pointed out by @zi-v and @moidx, if we force this behavior (i.e. always zero after transaction) at HW level, then this is not something we can revert at SW. On the other hand, in the case data is left on the bus, we can prevent this by dummy TL-UL transactions. The latter option is allows more flexibility in the face of risks posed by certification process.

Reconciling this with the contradictory outcome of #16767, I think what really remains for the FutureRelease is to ensure consistency regardless of which of the two strategies we choose.

ballifatih avatar Mar 16 '23 21:03 ballifatih

We have discussed the security implications in the previous issue #16767 and concluded on no HW change for security, but needs to be considered for cryptolib SW dev. Removing Hostlist:Security hence.

This issue here is meant to discuss TL-UL data zeroing from a design consistency stand-point. @msfschaffner and @GregAC PTAL, might be in the nice to have, hence, close as not planned for now territory.

johannheyszl avatar Jan 15 '24 08:01 johannheyszl

Thanks @johannheyszl, considering that we would like to minimize changes at this point, I advocate close as not planned for now if this is not required from a security standpoint.

@GregAC @vogelpi @andreaskurth WDYT?

msfschaffner avatar Feb 20 '24 20:02 msfschaffner

I agree that this isn't needed for the current release. For a future release, however, we may want to consider an enhancement where SW can configure tlul_adapter_host to 'wipe' the data bus to zero or with randomness between valid flits. (Rationale: Easier programming model and less opportunities to make mistakes, and fewer cycles during which confidential data would be 'exposed' on the data bus.) I thus wouldn't close it completely but move to Backlog. (The parent issue #16767 gets closed as soon as programmer guidance for the current behavior is in place.)

andreaskurth avatar Feb 23 '24 09:02 andreaskurth

Agreed with @andreaskurth

GregAC avatar Feb 23 '24 11:02 GregAC

Leaving this as sounds good to me for Eearlgrey-PROD.

There are two separate things to take care of, both wouldn't be hard but I agree to minimize changes at this point. For future releases, the following needs to be taken care of:

  1. Ibex: Updating the data output without doing load/store instructions should be avoided.

  2. For the TL-UL adapters, there are two cases:

    • With the parameter AccessLatency == 0 (the default), devices leave the last data on the bus.
    always_ff @(posedge clk_i or negedge rst_ni) begin
      if (!rst_ni) begin
        rdata_q <= '0;
        error_q <= 1'b0;
      end else if (a_ack) begin
        rdata_q <= (error_i || err_internal || wr_req) ? '1 : rdata_i;
        error_q <= error_i || err_internal;
      end
    end
    assign rdata = rdata_q;
    

    A new case needs to be added to the flop if a_ack == 0 and d_ack == 1 to also set this to all 1. All 0 would also work but it's going to be more logic.

    • With the parameter AccessLatency == 1, we have the following:
          assign rdata = (error_i || error_q || wr_req_q) ? '1      :
                    (rd_req_q)                       ? rdata_i :
                                                       rdata_q; // backpressure case
    

    meaning the read data first propagates through and is then latched into rdata_q. Clearing to '1 after the d_ack requires more thought and is a bit more tricky.

vogelpi avatar Feb 23 '24 11:02 vogelpi