PC miscompare in single step after debug entry on illegal instruction
Template for Bug Issue
ISS miscompare
Bug Title
CSR DEPC and MEPC miscompare when external debug request is granted during execution of an illegal instruction.
This issue appears related to https://github.com/openhwgroup/cv32e40p/issues/548 but manifests the failure mechanism slightly differently.
Component
Component:Verif
The miscompares are:
UVM_ERROR @ 4921272.300 ns : uvmt_cv32_step_compare.sv(96) reporter [Step-and-Compare] PC expected=0x1a110800 and actual=0x0002f700 PC=0x1a110800
UVM_ERROR @ 4921299.300 ns : uvmt_cv32_step_compare.sv(96) reporter [Step-and-Compare] PC expected=0x1a110804 and actual=0x1a110800 PC=0x1a110804
UVM_ERROR @ 4921317.300 ns : uvmt_cv32_step_compare.sv(96) reporter [Step-and-Compare] PC expected=0x1a110808 and actual=0x1a110804 PC=0x1a110808
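For anyone triaging a longer log, the miscompare lines can be pulled out mechanically. The following is only a rough sketch, assuming the Step-and-Compare message layout is exactly as quoted in this report; the function name `parse_miscompares` and the `LOG` sample are made up for illustration and are not part of core-v-verif.

```python
import re

# Matches the Step-and-Compare PC miscompare messages as quoted in this issue.
# Assumption: other log files use the same message layout.
MISCOMPARE_RE = re.compile(
    r"UVM_ERROR @ (?P<time>[\d.]+) ns.*?"
    r"PC expected=0x(?P<expected>[0-9a-fA-F]+) and actual=0x(?P<actual>[0-9a-fA-F]+)"
)


def parse_miscompares(log_text):
    """Return a list of (time_ns, expected_pc, actual_pc) tuples."""
    return [
        (float(m["time"]), int(m["expected"], 16), int(m["actual"], 16))
        for m in MISCOMPARE_RE.finditer(log_text)
    ]


# Sample copied from the three errors reported above.
LOG = """
UVM_ERROR @ 4921272.300 ns : uvmt_cv32_step_compare.sv(96) reporter [Step-and-Compare] PC expected=0x1a110800 and actual=0x0002f700 PC=0x1a110800
UVM_ERROR @ 4921299.300 ns : uvmt_cv32_step_compare.sv(96) reporter [Step-and-Compare] PC expected=0x1a110804 and actual=0x1a110800 PC=0x1a110804
UVM_ERROR @ 4921317.300 ns : uvmt_cv32_step_compare.sv(96) reporter [Step-and-Compare] PC expected=0x1a110808 and actual=0x1a110804 PC=0x1a110808
"""

for time_ns, expected, actual in parse_miscompares(LOG):
    print(f"{time_ns:.3f} ns: expected {expected:#010x}, actual {actual:#010x}")
```

In the three errors quoted, each actual PC after the first equals the expected PC of the previous error. That pattern would be consistent with the expected stream having skipped exactly one instruction (the one at 0x2f700) rather than diverging entirely.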
The sequence of events is:
- An illegal dret is executed at 0x16690. However at nearly the same time an external debug request is asserted.
- Unlike issue 548, this time the DEPC and MEPC do not miscompare upon the next instruction. The next instruction is the debug handler which is properly compared.
- The debug handler completes with a dret, placing the core into single-step mode.
- The next instruction retired by the core (RTL) is 0x2f700 (start of mtvec) as expected. However the ISS miscompares as it expects 0x1a110800 (the debug handler start) as the next instruction.
Steps to Reproduce
Use this core-v-verif: https://github.com/silabs-oysteink/core-v-verif Branch: silabs-oysteink_random-debug
Common.mk should be changed to point to the head of the CV32E40P branch referenced:
CV32E40P_REPO ?= https://github.com/strichmo/cv32e40p
CV32E40P_BRANCH ?= strichmo/temp/tracer_with_ebrk
#2020-10-08
-CV32E40P_HASH ?= 90c23eb
+CV32E40P_HASH ?= head
Testcase executed with Xcelium 20.03.009
% makecv32 comp_corev-dv gen_corev-dv test TEST=corev_rand_debug SEED=1 CFG=no_pulp USER_RUN_FLAGS=+rand_stall_obi_all
Hi @strichmo, if I merge in core-v-verif pr #287 will it impact the Designer's ability to reproduce this error?
@MikeOpenHWGroup The above fork for core-v-verif points to a custom tracer and custom RTL which we will PR later. So no, any ongoing master pushes/PRs will not affect reproducibility for this or 548.
@strichmo @eroom1966 Hi Steve, please correct me if I am wrong here or if you have other expectations from us. The way I read the above issue is that you are saying/expecting that there is an ISS issue here. For now I am therefore assigning this one to @eroom1966. Please let me know if you think this should be handled differently.
@Silabs-ArjanB I agree with your statement. This appears to be an ISS issue.
@strichmo @Silabs-ArjanB @silabs-oysteink
I am hoping someone can direct me. Following the steps to reproduce I have an issue with this command
% makecv32 comp_corev-dv gen_corev-dv test TEST=corev_rand_debug SEED=1 CFG=no_pulp USER_RUN_FLAGS=+rand_stall_obi_all
I have no command called 'makecv32', so I thought this was a typo and that should be 'make cv32'
however issuing the command
make cv32 comp_corev-dv gen_corev-dv test TEST=corev_rand_debug SEED=1 CFG=no_pulp USER_RUN_FLAGS=+rand_stall_obi_all
gives me the error
...
DEBUG:cfgyaml2make:{'cflags': '-DNO_PULP', 'compile_flags': '+define+NO_PULP\n', 'description': 'Sets all PULP-related parameters to 0', 'name': 'no_pulp', 'ovpsim': '--override root/cpu/marchid=4 --override ' 'root/cpu/misa_Extensions=0x1104 --override ' 'root/cpu/noinhibit_mask=0xFFFFFFF0\n'}
DEBUG:cfgyaml2make:File written to /tmp/tmpkll4ok7a
make: *** No rule to make target 'cv32'. Stop.
Can anyone see what I may be doing wrong ?
thx Lee
makecv32 is a wrapper found in core-v-verif/bin. If you include that in your $PATH it should be fine. You should also be able to substitute makecv32 with make as long as you are in the cv32/sim/uvmt_cv32 folder when you run it.
I think this was a complete typo. I removed 'makecv32' and replaced it with just 'make'; I think this was the issue.
I am now seeing the SHELL interpreter issue I have reported previously
# Clean old assembler generated tests in results
for (( idx=0; idx < $((0 + 1)); idx++ )); do
rm -f /home/moore/git/openhw/iss_549/core-v-verif/cv32/sim/uvmt_cv32/xrun_results/corev-dv/corev_rand_debug/corev_rand_debug_$idx.S;
done
/bin/sh: 1: Syntax error: Bad for loop variable
I need to go and hack the makefiles to get this to work in a bash shell
@silabs-oysteink aha, we overlapped - I will go back and undo what I did - thx
Hi @eroom1966 Can you just try 'make' instead of 'makecv32'?
Hi All, well, this is disconcerting: for me it passes. This would indicate a difference in the simulator event propagation.
How do I now pass options to the ISS simulator? Previously I modified the file called ./cv32/tests/cfg/default.yaml
This no longer seems to work. Is there a bespoke yaml for this test?
I think we need to produce a full ISS trace listing from your (failing) environment, and my (working) environment, and compare the divergence
Thx Lee
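A possible way to do the comparison suggested above is to reduce each trace to its stream of retired PCs and report the first index at which the streams differ. This is a minimal sketch, assuming each trace record is one line of the form "time ns, instruction count, PC, encoding, disassembly" as in the tracer excerpts quoted elsewhere in this thread; a raw ISS trace will likely need a different pattern. `TRACE_RE`, `retired_pcs` and `first_divergence` are illustrative names only.

```python
import re
from itertools import zip_longest

# Assumed record layout: "<time> ns <icount> <pc> <encoding> <disassembly ...>"
TRACE_RE = re.compile(
    r"^\s*[\d.]+ ns\s+\d+\s+([0-9a-fA-F]{8})\s+([0-9a-fA-F]{8})\s+(.*)$"
)


def retired_pcs(lines):
    """Extract the PC of every line that looks like a retired-instruction record."""
    pcs = []
    for line in lines:
        m = TRACE_RE.match(line)
        if m:  # other messages (e.g. "Illegal instruction ...") are skipped
            pcs.append(int(m.group(1), 16))
    return pcs


def first_divergence(pcs_a, pcs_b):
    """Return (index, pc_a, pc_b) at the first mismatch, or None if identical."""
    for i, (a, b) in enumerate(zip_longest(pcs_a, pcs_b)):
        if a != b:
            return i, a, b
    return None


# Two short streams taken from the case 1 / case 2 excerpts in this thread.
CASE_1 = """\
477231.000 ns 78661 1a110b3e 7b200073 dret
477255.000 ns: Illegal instruction (core 0) at PC 0x00004594:
477273.000 ns 78672 1a110800 7b371073 csrrw x0, x14, 0x7b3 x14:00008000
""".splitlines()

CASE_2 = """\
1376109.000 ns 227006 1a110b3e 7b200073 dret
1376133.000 ns 227011 0002f700 340c1c73 csrrw x24, x24, 0x340 x24=00038ce4 x24:00038ce4
1376166.000 ns 227015 1a110800 7b371073 csrrw x0, x14, 0x7b3 x14:00008000
""".splitlines()

hit = first_divergence(retired_pcs(CASE_1), retired_pcs(CASE_2))
if hit is not None:
    index, pc_a, pc_b = hit
    print(f"first divergence at record {index}: {pc_a:#010x} vs {pc_b:#010x}")
```

Comparing PCs only, rather than whole lines, keeps timestamps and instruction counts from producing false mismatches when the two environments run at different simulation times.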
@silabs-oysteink @strichmo @Silabs-ArjanB Hi All I need a trace file from a failing simulation, as I have mentioned, mine does not fail. I am running xcelium version
xrun -version TOOL: xrun(64) 19.09-s010
Not sure how this compares to your environment. I have also noticed the version of the ISS model is
$IdVer: IMPERAS_VERSION 20200821.2 $
Is this the version in which we fixed the race? (Answering my own question: this was the version fixing the race condition.)
This is the log I have from running the make command line (this has full tracing enabled) simulate.log
This is generated by adding in the following parameters to the ISS control file --trace --tracechange --traceshowicount --monitornets --tracemode
Update, reproduced the issue and investigating .... Looks (at first glance) as though #548 is the same issue
Hi All, having made a change to the RM in order to ensure correct handling of dret in Machine mode during single step, we see unexpected behavior earlier in the execution. Here is an email I sent to @silabs-oysteink earlier, which we discussed:
The DCSR value is as follows, before each of these streams (case 1 / 2) execute:
dcsr 0x40008047
0100 0000 0000 0000 1000 0000 0100 0111
xdebugver=4 ebreakm=1 cause=1 nmip=0 step=1 prv=3
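The decode above can be cross-checked with a few lines of Python. This is a sketch only, assuming the dcsr field positions of the RISC-V External Debug Support specification v0.13 (consistent with xdebugver=4); `DCSR_FIELDS` and `decode_dcsr` are illustrative names, not taken from the testbench or the ISS.

```python
# Field layout assumed from the RISC-V debug spec v0.13: name -> (lsb, width).
DCSR_FIELDS = {
    "xdebugver": (28, 4),
    "ebreakm": (15, 1),
    "ebreaks": (13, 1),
    "ebreaku": (12, 1),
    "stepie": (11, 1),
    "stopcount": (10, 1),
    "stoptime": (9, 1),
    "cause": (6, 3),
    "mprven": (4, 1),
    "nmip": (3, 1),
    "step": (2, 1),
    "prv": (0, 2),
}


def decode_dcsr(value):
    """Split a 32-bit dcsr value into its named fields."""
    return {
        name: (value >> lsb) & ((1 << width) - 1)
        for name, (lsb, width) in DCSR_FIELDS.items()
    }


fields = decode_dcsr(0x40008047)
for name in ("xdebugver", "ebreakm", "cause", "nmip", "step", "prv"):
    print(f"{name}={fields[name]}")
```

For 0x40008047 this gives the same values as listed: step=1 (single-step enabled) and prv=3 (return to Machine mode on dret), with cause=1, which in that spec encodes a debug entry caused by ebreak.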
case 1
477180.000 ns 78656 1a110b36 340c1c73 csrrw x24, x24, 0x340 x24=00038ce4 x24:00038ce4
477192.000 ns 78657 1a110b3a 7b302773 csrrs x14, x0, 0x7b3 x14=00008000
477231.000 ns 78661 1a110b3e 7b200073 dret
**** we expect as per case 2 below, but seems to count as retired ****
477255.000 ns: Illegal instruction (core 0) at PC 0x00004594:
477273.000 ns 78672 1a110800 7b371073 csrrw x0, x14, 0x7b3 x14:00008000
477303.000 ns 78676 1a110804 7b202773 csrrs x14, x0, 0x7b2 x14=00000000
case 2
1376058.000 ns 227001 1a110b36 340c1c73 csrrw x24, x24, 0x340 x24=00038ce4 x24:00038ce4
1376070.000 ns 227002 1a110b3a 7b302773 csrrs x14, x0, 0x7b3 x14=00008000
1376109.000 ns 227006 1a110b3e 7b200073 dret
1376133.000 ns 227011 0002f700 340c1c73 csrrw x24, x24, 0x340 x24=00038ce4 x24:00038ce4
1376166.000 ns 227015 1a110800 7b371073 csrrw x0, x14, 0x7b3 x14:00008000
1376187.000 ns 227019 1a110804 7b202773 csrrs x14, x0, 0x7b2 x14=00000000
What we cannot understand is why the behavior of an executed dret in Machine mode differs in the RTL between the two cases.
Update: it would appear that the problem we are seeing is possibly due to the haltreq signal being set too early at the ISS. @silabs-oysteink and I are still investigating.
Hi @eroom1966 @strichmo Is this issue still being looked at or has it been solved?
Still open. I have not had time to investigate a good way to schedule the haltreq signal; I will get some time today to re-investigate.
@eroom1966 any update on this issue?
This seems a very old issue. Can it be reproduced on the current core-v-verif testbench? If not, should it be closed?
Wow, I wonder how we let this one get through. It is probably resolved, but I will investigate.
Hi @MikeOpenHWGroup
Any news on that issue?
Can we close it?
This issue affects the deprecated "step-and-compare-2.0 with OVPsim" used for CV32E40Pv1. The consensus opinion is that the issue was either in the OVPsim ISS or step-and-compare logic (not the RTL). As both step-and-compare-2.0 and OVPsim are deprecated I will close this issue.