core-v-verif

OVPSIM mret does not set MIE = MPIE in MSTATUS CSR

Open silabs-mateilga opened this issue 2 years ago • 70 comments

For a few select seeds of the "corev_rand_interrupt_wfi_mem_stress" test, what appears to be a bug in the reference model can be observed. The scenario: when the core is woken from a wfi instruction by a disabled interrupt and the next instruction is an mret, the MIE bit is not set to MPIE in the MSTATUS CSR, which violates the 1.12 privileged spec.
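For reference, the mstatus update that the report expects from mret can be sketched in a few lines of shell. This is a minimal sketch, not the OVPSIM or RTL implementation: the function name `mstatus_after_mret` is hypothetical, the bit positions (MIE = bit 3, MPIE = bit 7) are those of the privileged spec, and it assumes an M-mode-only core so that MPP is left unchanged. The input value 0x1880 is the rvvi value from the mismatch reported later in this thread.

```shell
#!/bin/sh
# Hypothetical helper: expected mstatus value after an mret retires.
# Assumption: M-mode-only core, so MPP (bits 12:11) is left untouched.
mstatus_after_mret() (
    mstatus=$(( $1 ))
    mpie=$(( (mstatus >> 7) & 1 ))        # read MPIE (bit 7)
    mstatus=$(( mstatus & ~(1 << 3) ))    # clear MIE (bit 3)
    mstatus=$(( mstatus | (mpie << 3) ))  # MIE <- MPIE
    mstatus=$(( mstatus | (1 << 7) ))     # MPIE <- 1
    printf '0x%08x\n' "$mstatus"
)

mstatus_after_mret 0x1880   # prints 0x00001888
```

Under these assumptions, an mret executed with MPIE = 1 and MIE = 0 should leave MIE = 1.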

Steps to Reproduce:

Branch: cv32e40x/dev https://github.com/openhwgroup/core-v-verif/tree/cv32e40x/dev
Hash: fe9bdeff9178adbff54ca5591e71e08c757bffae

Command line path: core-v-verif/cv32e40x/sim/uvmt/

Run commands:

make comp_corev-dv gen_corev-dv TEST=corev_rand_interrupt_wfi_mem_stress CFG=default SIMULATOR=xrun USE_ISS=YES COV=YES RUN_INDEX=27 GEN_START_INDEX=27 RNDSEED=-844432648

make test TEST=corev_rand_interrupt_wfi_mem_stress CFG=default SIMULATOR=xrun USE_ISS=YES COV=YES RUN_INDEX=27 GEN_START_INDEX=27 RNDSEED=-844432648

Waveform showing the difference in the MSTATUS register between RVFI and RVVI: image

@eroom1966 Could you take a look at this?

silabs-mateilga avatar Feb 18 '22 10:02 silabs-mateilga

@silabs-mateilga I am having trouble reproducing your steps. This is what I did after setting up my environment to run Xcelium:

$ git clone https://github.com/openhwgroup/core-v-verif
$ cd core-v-verif
$ git checkout cv32e40x/dev
$ git reset fe9bdef --hard
$ make comp_corev-dv gen_corev-dv TEST=corev_rand_interrupt_wfi_mem_stress CFG=default SIMULATOR=xrun USE_ISS=YES COV=YES RUN_INDEX=27 GEN_START_INDEX=27 RNDSEED=-844432648

make: *** No rule to make target 'comp_corev-dv'. Stop.

Please advise.
Thx, Lee

eroom1966 avatar Feb 18 '22 12:02 eroom1966

Ah, forgot to mention the path. The commands must be run from: core-v-verif/cv32e40x/sim/uvmt/

silabs-mateilga avatar Feb 18 '22 12:02 silabs-mateilga

Now I get:

/home/moore/git/bugs/core-v-verif/mk/Common.mk:265: *** CV_SW_PREFIX not defined in either the shell environment, test.yaml or cfg.yaml. Stop.

FYI, the variables I am setting are as follows; please advise if this is incorrect:

export SIMULATOR=xrun
export PATH=${PATH}:$(pwd)/core-v-verif/bin
export COREV=YES
export COREV_SW_TOOLCHAIN=/home/moore/riscv/corev-openhw-gcc-ubuntu1804-20200913
export CV_SW_TOOLCHAIN=/home/moore/riscv/corev-openhw-gcc-ubuntu1804-20200913
export CV_SW_VENDOR=corev

eroom1966 avatar Feb 18 '22 12:02 eroom1966

Looks like I need the following:

export CV_SW_PREFIX=riscv32-corev-elf-

eroom1966 avatar Feb 18 '22 14:02 eroom1966

Ah, apologies, I sometimes forget how much of the environment is "just there".

export CV_SW_TOOLCHAIN=/tool/gcc/riscv32-embecosm-gcc-centos7-20211031
export CV_SW_PREFIX=riscv32-unknown-elf-
export CV_SW_VENDOR=unknown
export CV_SIMULATOR=xrun
export CV_CORE=cv32e40x
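Since much of this environment is easy to forget, a small pre-flight check may help before invoking make. The sketch below only covers the five variables named in this comment, which is not necessarily the full list the makefiles require; the function name `check_corev_env` is hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-flight check: print each of the five variables from
# this comment that is unset or empty; fail if any of them is missing.
check_corev_env() (
    missing=0
    for name in CV_SW_TOOLCHAIN CV_SW_PREFIX CV_SW_VENDOR CV_SIMULATOR CV_CORE; do
        eval "value=\${$name:-}"          # indirect lookup, POSIX-compatible
        if [ -z "$value" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]                  # function status: 0 only if all are set
)
```

A possible use is `check_corev_env || echo "set the variables listed above first"` in the shell from which make is run.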

silabs-mateilga avatar Feb 18 '22 14:02 silabs-mateilga

Next issue. I am sure I have reported this before; I think the problem is the type of shell.

There is a 'for loop' construct used here that has a prerequisite on the type of shell being used to invoke make. I think this is assuming csh or maybe ksh? I am using /bin/bash.

If there are constructs requiring specific shell features, then the Makefiles must define the SHELL variable in order to use the correct interpreter: https://www.gnu.org/software/make/manual/html_node/Choosing-the-Shell.html

Do you know what shell this is expecting to invoke?

(We could probably benefit from some advice from @MikeOpenHWGroup.)

make: [/home/moore/git/bugs/core-v-verif/mk/uvmt/xrun.mk:446: gen_corev-dv] Error 1 (ignored)
for (( idx=27; idx < $((27 + 1)); idx++ )); do \
	cp -f /home/moore/git/bugs/core-v-verif/cv32e40x/bsp/link_corev-dv.ld /home/moore/git/bugs/core-v-verif/cv32e40x/sim/uvmt/xrun_results/default/corev_rand_interrupt_wfi_mem_stress/$idx/test_program/link.ld; \
	cp /home/moore/git/bugs/core-v-verif/cv32e40x/sim/uvmt/xrun_results/default/corev-dv/corev_rand_interrupt_wfi_mem_stress/corev_rand_interrupt_wfi_mem_stress_$idx.S /home/moore/git/bugs/core-v-verif/cv32e40x/sim/uvmt/xrun_results/default/corev_rand_interrupt_wfi_mem_stress/$idx/test_program; \
done

eroom1966 avatar Feb 18 '22 14:02 eroom1966

~~I believe that is plain sh-syntax~~

silabs-hfegran avatar Feb 18 '22 14:02 silabs-hfegran

I'm unsure of what the specific problem is, but here is the latest documentation of the environment variables required.

https://github.com/openhwgroup/core-v-verif/tree/master/mk#required-corev-environment-variables

silabs-mateilga avatar Feb 18 '22 14:02 silabs-mateilga

Do you think that this should run in a bash shell? Can I ask what shell you folks are using, and on what host OS?

eroom1966 avatar Feb 18 '22 14:02 eroom1966

This should run fine invoked from a bash shell on most linux-distributions.

What kind of OS are you attempting to run this on @eroom1966?

silabs-hfegran avatar Feb 18 '22 14:02 silabs-hfegran

I see the problem; I have constructed a small testcase. This syntax does NOT work in a Bourne shell (/bin/sh), but does work in a Bourne-again shell (/bin/bash).

What you may have on your Linux system is a symbolic link from /bin/sh -> /bin/bash. Could you check?

By default make will use /bin/sh, and this syntax will not work in /bin/sh. In the makefile I can force it to use /bin/bash by defining SHELL = /bin/bash.

Here is my makefile:

SHELL = /bin/bash
GEN_START_INDEX=0
GEN_NUM_TESTS=10
all:
	echo "shell=$(SHELL)"
	for (( idx=${GEN_START_INDEX}; idx < $$((${GEN_START_INDEX} + ${GEN_NUM_TESTS})); idx++ )); do \
		echo "$$idx"; \
	done

I am pretty sure that your Linux system has a link from /bin/sh -> /bin/bash, hence Make gets a bash-style interpreter.

If that is the case, this is not guaranteed to work on a machine without this link.
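The link can be checked directly. The sketch below prints what /bin/sh resolves to and then tries the C-style loop from the makefile recipe in both interpreters; the helper name `supports_cstyle_for` is hypothetical, and `readlink -f` assumes GNU coreutils or an equivalent. On Ubuntu, /bin/sh usually resolves to dash rather than bash, which would be consistent with the failure seen here.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given shell accepts the C-style
# "for (( ... ))" loop that the makefile recipe uses.
supports_cstyle_for() {
    "$1" -c 'for (( i = 0; i < 1; i++ )); do :; done' >/dev/null 2>&1
}

# Show what /bin/sh really is on this machine.
readlink -f /bin/sh

for candidate in /bin/sh /bin/bash; do
    if supports_cstyle_for "$candidate"; then
        echo "$candidate: accepts for (( ... ))"
    else
        echo "$candidate: rejects for (( ... ))"
    fi
done
```

If /bin/sh rejects the loop, any makefile recipe that uses it depends on SHELL being set to /bin/bash.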

eroom1966 avatar Feb 18 '22 14:02 eroom1966

> This should run fine invoked from a bash shell on most linux-distributions.
>
> What kind of OS are you attempting to run this on @eroom1966?

Ubuntu 18. The problem is not the invoking shell; it is the resolution of /bin/sh. As the manual states, make uses /bin/sh unless specified otherwise. This can be overridden with SHELL = /bin/bash.

eroom1966 avatar Feb 18 '22 14:02 eroom1966

Thanks for that @eroom1966, you are correct - it is linked to bash

silabs-hfegran avatar Feb 18 '22 14:02 silabs-hfegran

OK, the makefiles need to define the SHELL variable for the interpreter intended to be used, which it appears is /bin/bash

eroom1966 avatar Feb 18 '22 15:02 eroom1966

Apologies for these issues. To my knowledge Ubuntu has not been extensively tested, as it is not an officially supported Cadence platform. That said, this should not be related to the shell issues you are seeing. ExternalRepos.mk is included by core-v-verif/cv32e40x/sim/uvmt/Makefile, which is the makefile that should be used when executing make from the sim/uvmt folder, and it exports the SHELL variable as /bin/bash.

Just to double check, are you running from the core-v-verif/cv32e40x/sim/uvmt-folder as @mateilga mentioned previously in this discussion? I can see this problem arising if the makefiles in mk/uvmt are invoked directly.

To test this, I added echo $(SHELL) to the comp_corev-dv target in the xrun makefile. Running echo $SHELL in my local terminal, I correctly see the /bin/zsh shell that I am using, while running the makefile correctly echoes /bin/bash as set by ExternalRepos.mk.

silabs-hfegran avatar Feb 18 '22 15:02 silabs-hfegran

Slow progress; I am now getting a compilation error:

/home/moore/riscv/corev-openhw-gcc-ubuntu1804-20200913/bin/riscv32-corev-elf-gcc -Os -g -static -mabi=ilp32 -march=rv32imc_zba1p00_zbb1p00_zbc1p00_zbs1p00 -Wall -pedantic  -c /home/moore/git/bugs/core-v-verif/cv32e40x/bsp/crt0.S -o crt0.o
Assembler messages:
Fatal error: -march=rv32imc_zba1p0_zbb1p0_zbc1p0_zbs1p0: Invalid or unknown z ISA extension: 'zba'

I am guessing my toolchain is too old? Can you please provide a link to the toolchain necessary to compile this example?

Thx, Lee
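One way to confirm that a given toolchain is the limiting factor is to ask its gcc driver to assemble an empty input with the same -march string. This is a rough sketch under stated assumptions: the helper name `march_supported` is hypothetical, the first argument is the tool prefix including its directory, and -mabi=ilp32 is copied from the failing command above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if "<prefix>gcc" can assemble an empty
# input with the given -march string. Also fails if the tool is missing.
march_supported() (
    cc="${1}gcc"
    command -v "$cc" >/dev/null 2>&1 || exit 1
    printf '\n' | "$cc" -march="$2" -mabi=ilp32 -x assembler -c - -o /dev/null >/dev/null 2>&1
)
```

For example, `march_supported "$CV_SW_TOOLCHAIN/bin/$CV_SW_PREFIX" rv32imc_zba1p00_zbb1p00_zbc1p00_zbs1p00 || echo "toolchain rejects this -march string"` would flag a toolchain that does not know the bit-manipulation extensions.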

eroom1966 avatar Feb 18 '22 15:02 eroom1966

We are using the RISC-V top-of-tree compilers from Embecosm (currently our installed version is the one from October 31st last year). As that build may no longer be available for download, any later gcc build available at the link below should work and have the necessary support to run these tests.

https://www.embecosm.com/resources/tool-chain-downloads/

embecosm gcc for ubuntu 18.04 direct link: https://buildbot.embecosm.com/job/riscv32-gcc-ubuntu1804/91/artifact/riscv32-embecosm-gcc-ubuntu1804-20211205.tar.gz

You will need to have the following environment variables set/updated if not already in place:

export CV_SW_TOOLCHAIN=/tool/gcc/riscv32-embecosm-gcc-<os-build>
export CV_SW_PREFIX=riscv32-unknown-elf-
export CV_SW_VENDOR=unknown

silabs-hfegran avatar Feb 18 '22 16:02 silabs-hfegran

I can now build with this toolchain - thanks

eroom1966 avatar Feb 18 '22 17:02 eroom1966

@silabs-mateilga Approximately how many instructions before there is a difference ?

eroom1966 avatar Feb 18 '22 17:02 eroom1966

Hmm, I ran this and it passed? My logfile is attached. Any indication of what I am doing wrong, given that I cannot reproduce a FAILED run? logfile.txt

eroom1966 avatar Feb 18 '22 17:02 eroom1966

The difference happens at the retirement of instruction 38861. Unfortunately I don't have the opportunity to dig further into this now; I'll have a look first thing Monday morning. What I can say is that this failure showed up on 2 of 100 runs of this test, so any difference that changes the randomness of the build and/or instruction stream will likely result in a passed test.

silabs-mateilga avatar Feb 18 '22 18:02 silabs-mateilga

OK, let's look Monday. When you say difference, do you mean that the testbench reports a difference, and thus a FAILED test? Or are you eyeballing a difference from the screenshot provided?

I did not see a log at the beginning of this thread highlighting the reported difference.
Thx, Lee

eroom1966 avatar Feb 18 '22 18:02 eroom1966

@silabs-mateilga I suggest you send Lee your generated & compiled .elf file together with instructions on how to execute the test on that elf file to rule out any possible toolchain-related differences here.

silabs-hfegran avatar Feb 21 '22 05:02 silabs-hfegran

@eroom1966 The following is the output from the test:

UVM_ERROR @ 2599926.300 ns : uvme_cv32e40x_core_sb.sv(381) uvm_test_top.env.core_sb [CORESB] CSR Mismatch, order: 38861, pc: 0x00020c8e, csr: mstatus, rvfi = 0x00001888, rvvi = 0x00001880, mask = 0xffffffff

UVM_ERROR @ 2599956.300 ns : uvme_cv32e40x_core_sb.sv(305) uvm_test_top.env.core_sb [CORESB] PC Mismatch, rvfi_order: 38862, rvvi_order: 38862, rvfi.pc = 0x00004406, rvvi.pc = 0x00020c8e

UVM_ERROR @ 2599956.300 ns : uvme_cv32e40x_core_sb.sv(312) uvm_test_top.env.core_sb [CORESB] INSN Mismatch, order: 38862, rvfi.pc = 0x00004406, rvfi.insn = 0x024ea933, rvvi.insn = 0x30200073

UVM_ERROR @ 2599956.300 ns : uvme_cv32e40x_core_sb.sv(342) uvm_test_top.env.core_sb [CORESB] GPR Mismatch, order: 38862, pc: 0x00004406, rvfi_x[18] = 0x00000000, rvvi_x[18] = 0x0000003d

UVM_ERROR @ 2599956.300 ns : uvme_cv32e40x_core_sb.sv(381) uvm_test_top.env.core_sb [CORESB] CSR Mismatch, order: 38862, pc: 0x00004406, csr: mstatus, rvfi = 0x00001888, rvvi = 0x00001880, mask = 0xffffffff

As you can see, the error is reported as a failure on its own, and the difference in mstatus causes a mismatch on the following instruction retirement as well.
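The CSR mismatch lines above can be decoded mechanically to see which bits differ. The following is a minimal sketch that assumes the log line format shown in this comment; the function name `csr_mismatch_bits` is hypothetical and is not part of the testbench.

```shell
#!/bin/sh
# Hypothetical helper: read scoreboard log lines on stdin and, for each
# "CSR Mismatch" line, print the CSR name and the XOR of rvfi and rvvi.
csr_mismatch_bits() {
    sed -n 's/.*CSR Mismatch.*csr: \([a-z0-9_]*\), rvfi = \(0x[0-9a-fA-F]*\), rvvi = \(0x[0-9a-fA-F]*\).*/\1 \2 \3/p' |
    while read -r csr rvfi rvvi; do
        printf '%s differs in bits 0x%08x\n' "$csr" "$(( $rvfi ^ $rvvi ))"
    done
}

echo 'CSR Mismatch, order: 38861, pc: 0x00020c8e, csr: mstatus, rvfi = 0x00001888, rvvi = 0x00001880, mask = 0xffffffff' |
    csr_mismatch_bits   # prints: mstatus differs in bits 0x00000008
```

The single differing bit, 0x8, is bit 3 of mstatus, which is MIE. It is set in the rvfi value and clear in the rvvi value.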

I'll put together an .elf file and instructions as @silabs-hfegran suggested and send it to you.

silabs-mateilga avatar Feb 21 '22 08:02 silabs-mateilga

test_program.zip

@eroom1966 Extract these files to [your_results_path]/results/xrun_results/default/corev_rand_interrupt_wfi_mem_stress/27/test_program/ and then run the second command in the top post (make test ...).

This will at least let you run the same instruction stream. It still does not guarantee that you will see the same results, however, as the timing of the interrupts is influenced by the simulator version. We run Xcelium version 21.07.a001.

I think the easiest way for you to reproduce the error is if you are able to run your own regressions. What I did to find the original seed was to run an edited version of the "interrupt" regression list, where I only ran 100 runs of the failing test. This should produce at least one failure. I've seen quite consistently 1/50 test runs fail.

silabs-mateilga avatar Feb 21 '22 10:02 silabs-mateilga

@silabs-mateilga I have unpacked your example, which updated the following files:

inflating: test_program/corev_rand_interrupt_wfi_mem_stress_27.elf
inflating: test_program/corev_rand_interrupt_wfi_mem_stress_27.hex
inflating: test_program/corev_rand_interrupt_wfi_mem_stress_27.itb
inflating: test_program/corev_rand_interrupt_wfi_mem_stress_27.objdump
inflating: test_program/corev_rand_interrupt_wfi_mem_stress_27.readelf

I then ran the following command, as instructed:

make test TEST=corev_rand_interrupt_wfi_mem_stress CFG=default SIMULATOR=xrun USE_ISS=YES COV=YES RUN_INDEX=27 GEN_START_INDEX=27 RNDSEED=-844432648

The test reported PASSED and I had no errors; logfile attached: logfile.txt

I think we need to ask a third party to verify your findings. Let's also compare simulator versions. It looks like I am running the following:

xrun(64): 20.03-s010: (c) Copyright 1995-2020 Cadence Design Systems, Inc.

Do you have access to this version to try it on your testcase?

eroom1966 avatar Feb 21 '22 11:02 eroom1966

I also have 20.09 installed - trying to see if I can reproduce with that version

eroom1966 avatar Feb 21 '22 11:02 eroom1966

I highly doubt that a seed that fails on one Xcelium version will also fail on another Xcelium version. As I mentioned, the easiest would be if you are able to run a regression and find a seed that fails in your system. If not, I have access to 20.09.009 and can try to find a failing seed with that simulator version.

silabs-mateilga avatar Feb 21 '22 11:02 silabs-mateilga

I just tried on 20.09 and the test you provided passed

eroom1966 avatar Feb 21 '22 11:02 eroom1966

> As I mentioned, the easiest would be if you are able to run a regression and find a seed that fails in your system

The only checks I generally run are individual tests or the ci_check prior to committing changes. I am happy to try your suggestion, but I would need detailed instructions in order to do so.

eroom1966 avatar Feb 21 '22 12:02 eroom1966