riscv-debug-spec
riscv-debug-spec copied to clipboard
How to reset the DM on the first connection on both 0.13 and 1.0 spec versions
- Assume an external debugger (ED) is connecting to a hart for the first time.
- ED wants to reset the DM in order to get it into a predictable state.
- To reset the DM the ED needs to set
dmcontrol.dmactive = 0
. - However, the spec states [3.7. Abstract Commands]:
While an abstract command is executing ({abstractcs-busy} in {dm-abstractcs} is high), a debugger must not change {hartsel}...
The ED does not know whether an abstract command is executing, so it needs to read dmcontrol
to know the value of dmcontrol.hartsel
before writing dmcontrol = dmactive=0 | <the old hartsel value>
(requesting DM reset without changing hartsel).
- Assume the read of
dmcontrol
results in an error. - Since the ED does not know anything about the state of the DM yet, there are two possibilities:
-
dmcontrol.dmactive
is low and [3.14.2. Debug Module Control ]
Any accesses to the module may fail.
-
dmcontrol.dmactive
is high and it's a legitimate DMI error (e.g. some issue with the connection).
-
- There are two possible routes:
- ED assumes
dmcontrol.dmactive
is low and writesdmcontrol = <dmcontrol.dmactive=1> | <hartsel=0>
-- may result in the change ofhartel
, so ED can't do that. - ED reads
dmcontrol
again.
- ED assumes
Now let's take into account RISC-V Debug Spec v0.13, since the ED does not yet know which version it is (dtmcs
only distinguishes between 0.11 and (0.13 or 1.0)).
In versions less then 0.13.3, according to [1.2.1.2. Incompatible Changes from 0.13 to 1.0]:
- Bump version to 3. #512 , Require debugger to poll dmactive after lowering it. #566
The HW is not required to eventually return something meaningful when dmcontroll.dmactive
was set to low. So the ED may wait forever in the loop, reading dmcontrol
and getting an error on RISC-V Debug Spec v0.13 - compliant HW.
Can this somehow be fixed?
My proposal would be to specify the external debugger is permitted to perform dmactive <-- 0
at any time, without the need to retain any other dmcontrol bits, and regardless of the DM state -- so that the external debugger can bring the DM safely and reliably to a known state (but what happens to in-flight operations like abstract commands would be undefined).
I am inclined to believe that this was the intention of the debug spec, however strict interpretation of the spec text While an abstract command is executing (...), a debugger must not change {hartsel}
(as @en-sc pointed out above) seem to contradict that.
My proposal would be to specify the external debugger is permitted to perform
dmactive <-- 0
at any time, without the need to retain any other dmcontrol bits, and regardless of the DM state...
The thing is, there may be a RISC-V Debug Spec v0.13 implemetation that complies to this strict interpretation of While an abstract command is executing (...), a debugger must not change {hartsel}
. Since the ED does not know whether the target is 0.13 or 1.0 compliant at this poin, it should go with the strictest spec of the two and still read dmcontrol.hartsel
in order not to change it while writing dmactive
to zero.
I would suggest specifying that while dmactive
is low, read of dmcontrol
should not fail (may result in a busy response). However, It maybe to strict of a requirement for the HW.
I maintain the opinion that
- if a DM that is reset (via
dmcontrol.dmactive <-- 0
), and - at the same time change of
hartsel
bits by the external debugger disrupts the operation of the DM,
then the reset implementation in the DM can be considered defective, because in that case dmactive=0 does not clear the state of the debug module, as required by any version of the debug spec.
Since the ED does not know whether the target is 0.13 or 1.0 compliant at this poin, it should go with the strictest spec of the two (...)
My recommendation for the EDs would be to go with the most robust approach in this case - that is to perform the DM reset always. If the ED detects the case when there would be concern about the hartsel bits, the ED can warn the user and possibly provide a configuration option to the user to alter the behavior.
I'we created a flowchart that would hopefully help to better describe the issue.
- Asserting dmactive should always be allowed, and reset hartsel. dmactive is a reset signal.
- Even in 0.13, dmactive is listed as R/W, so writing 0 should result in the bit becoming 0 eventually. I don't see an issue with polling it and the bit never becoming 0. But maybe I misunderstood your point.
- If there is no DM at the bus address, then the value will always read 0 and the debugger will perceive it as dmactive never goes high. That's not ideal, but I think it's good enough.
- Asserting dmactive should always be allowed, and reset hartsel. dmactive is a reset signal.
I would to clarify this. Please, take a look at #1040.
- Even in 0.13, dmactive is listed as R/W, so writing 0 should result in the bit becoming 0 eventually. I don't see an issue with polling it and the bit never becoming 0. But maybe I misunderstood your point.
Oh, I see. I didn't consider this. Thanks!
- If there is no DM at the bus address, then the value will always read 0 and the debugger will perceive it as dmactive never goes high. That's not ideal, but I think it's good enough.
Initially, this issue was the least concerning to me. I think it can be mitigated by a descriptive error message on the ED side.
Asserting dmactive should always be allowed, and reset hartsel. dmactive is a reset signal.
That would have been a better way to write the spec, but not what the spec says. Fixing this is not backwards compatible, and it's not worth it to make a backwards incompatible change to address the extreme corner case where the DM is executing an abstract command and also reading dmcontrol results in an error.