riscv-qemu
riscv-qemu copied to clipboard
qemu-system-riscv32 store doesn't modify the memory during a `sw ra, 12(sp)`
Hi ! I'm trying to test a simple bare metal C code with qemu-system-riscv32 [1], and I think the jump instructions are off by 4 bytes. You can simply test it by running the "ROM" code of the sifive_e machine in gdb:
$ qemu-system-riscv32 -nographic -machine sifive_e -kernel foo.elf -s -S
$ # in another terminal
$ riscv64-unknown-elf-gdb
(gdb) target remote :1234
(gdb) layout asm /* should display 0x1000 lui t0, 0x20400 | 0x1004 jr t0 */
(gdb) p /x $pc
$1 = 0x1000 /* lui t0, 0x20400 */
(gdb) stepi
(gdb) p /x $pc /* jr t0 */
$2 = 0x1004
(gdb) p /x $t0
$3 = 0x2040000
(gdb) stepi
(gdb) p /x $pc
$4 = 0x20400004 /* should be 0x20400000 */
This as a nasty side effect with call: when we get back from a function call, the instruction restoring ra
is skipped, so if you have several chained function calls, the first ret
will return and the second ret
will trigger an illegal access to address 0
(most of the time).
For the record, my actual test code is a bare metal application printing "hello world" on the uart:
- my gcc compiler has been built from the https://github.com/riscv/riscv-tools repository
- my qemu is the one from this repository
- my application is compiled with the following flags:
CFLAGS := -ffreestanding -nostdlib -nostartfiles -march=rv32imafdc -mabi=ilp32 -mcmodel=medany -g -O1
[1] actual revision:
$ git log -1
commit 4d37030aa03e07ae927174c8fce0227648552e51
Author: Michael Clark <[email protected]>
Date: Tue Apr 24 10:12:27 2018 +1200
It seems there is two separate issues in fact:
- on both risc32 and risc64, the "jump" instructions seem to be performed correctly but gdb seems to do two steps at once, so it looks like the return address is wrong, but in fact its ok
- I tested my code on sifive_e, sifive_u and spike_v1.10 machines, and only the sifive_e has an issue (only one in 32 bits), where my stack is full of 0, so the
ra
is "correctly" set to 0, but it shouldn't be 0 in the first place on the stack.
You'll fine attached my current project (set to compile the sifive_e version) core.tar.gz
The issue seems related to sw
on the risc32 machines: on function entry, ra
is stored on the stack using sw ra, 12(sp)
but the memory doesn't seem correctly modified, so the future lw
will only recover a 0.
I only had the issue on 32 bit RISCV.
Thanks for the report, test case and GDB testing.
There are different code paths in target/riscv/translate.c
for when a debugger is attached. I wonder if the issue is only present when GDB is attached.
Now that I understand a bit better riscv, I'll try to reproduce the issue without gdb to confirm your idea. I'll let you know
I wrote a minimal code that basically just dump info from Machine CSR registers. I works in 32 bit mode (sifive_u) but not in 32 bit mode (sifive_e) with the following compilation / qemu options:
CFLAGS64 += -march=rv64imafdc -mabi=lp64 -mcmodel=medany
CFLAGS32 += -march=rv32imafdc -mabi=ilp32 -mcmodel=medany
qemu-system-riscv64 -machine sifive_u -nographic -kernel build/core.elf
qemu-system-riscv32 -machine sifive_e -kernel build/core.elf -nographic
Here are the two main files:
# start.S
.global _start
_start:
/* global pointer stuff */
.option push
.option norelax
la gp, __global_pointer$
.option pop
/* Clear bss section */
la a0, __bss_start
la a1, _end
bgeu a0, a1, 2f
1:
sw zero, 0(a0)
addi a0, a0, 4
bltu a0, a1, 1b
2:
la sp, machine_stack_start /* also correctly initialize sp for M mode */
/* Save M stack current address */
mv s0, sp
/*
* Allocate room on the kernel stack for the
* supervisor regs
*/
addi sp, sp, -(REGS_SIZE)
/* M main function */
mv a0, sp
call machine_main
/* main.c */
#define UART_BASE_PA UINT32_C(0x10013000)
static void uart_init(void)
{
/* Monitor Mode */
console_init(UART_BASE_PA);
}
static void entry(uint64_t mode);
void machine_main(regs *supervisor_regs)
{
uart_init(); /* It usually never exits this call because of the "ra == 0" issue */
entry(0);
while (1) ;
}
BTW this might help:
- https://github.com/michaeljclark/riscv-probe
riscv-probe
works on riscv32 and riscv64 in the spike_v1.9.1
and spike_v1.10
machines. e.g.
$ qemu-system-riscv32 -nographic -machine spike_v1.10 -kernel bin/riscv32/probe-htif
$ qemu-system-riscv64 -nographic -machine spike_v1.10 -kernel bin/riscv64/probe-htif
It would be possible to add the sifive uart to riscv-probe
by updating the Makefile and adding mbi_uart.c
and a new compile targets and linker scripts:
-
sifive_e
entry is0x20400000
-
sifive_u
entry is0x80000000
(same as spike)
Note sifive_e
and sifive_u
can be run in either riscv32 or riscv64 modes:
-
e31 processor for
sifive_e
on riscv32 -
e51 processor for
sifive_e
on riscv64 -
u34 processor for
sifive_u
on riscv32 -
u54 processor for
sifive_u
on riscv64
Something like this should work for the SiFive UART (borrowed from bbl):
#include <stdint.h>
enum {
UART_REG_TXFIFO = 0,
UART_REG_RXFIFO = 1,
UART_REG_TXCTRL = 2,
UART_REG_RXCTRL = 3,
UART_REG_DIV = 4
};
uint32_t *uart = (uint32_t *)0x10013000;
int mbi_console_getchar()
{
int32_t ch = uart[UART_REG_RXFIFO];
if (ch < 0) return -1;
return ch;
}
void mbi_console_putchar(uint8_t ch)
{
volatile uint32_t *tx = &uart[UART_REG_TXFIFO];
while ((int32_t)(*tx) < 0);
*tx = ch;
}
I might abstract riscv-probe
and build binaries for sifive_e
and sifive_u
as well as spike
by implementing mbi_uart.c
and adding linker scripts...
I'll have a look thanks ! It looks like a lot like the small code I wrote to understand all the machine registers :)
Yes. I used it for basic compatibility testing for reading and comparing machine CSRs between spike and QEMU in 32-bit mode and 64-bit mode. You'll get different results if you run it on spike_v1.9.1
compared to spike_v1.10
. It's possible to add more CSRs. It uses the CSR numbers in hex rather than the literal strings so that it can read CSRs that are not implemented in binutils.
There is also https://github.com/riscv/riscv-tests which can be run in QEMU. The 'v' tests run in the spike_v1.10
machine.
Okay, I updated riscv-probe (https://github.com/michaeljclark/riscv-probe) with support for both HTIF and UART, added invocation examples to the README and moved the linker script into a conf directory (defaults to 0x80000000
for spike
and sifive_u
).
I wrote a minimal code that basically just dump info from Machine CSR registers. I works in 32 bit mode (sifive_u) but not in 32 bit mode (sifive_e)
Were you able to resolve this @vsiles? Was it your code at fault, or is there an issue with sw ra
? I'm hitting the same problem with rv32 code built from Rust in qemu-system-riscv32
when using sifive_e
. The following code, at the end of a Rust function, loads 0 into ra
and then jumps to it, crashing the emulator:
0x204006a8: lw ra,124(sp)
0x204006ac: addi sp,sp,128
0x204006b0: ret
Which shouldn't be happening. I'd like to know if this is something I need to fix my end, or in qemu. Thanks!
I didn't try to resolve it, since we were only working with 64 bit archs. In such case, gdb display is funky (it might skip steps) but the behavior was correct. I did not look into the 32 bit case more in depth, sorry.
Thanks for the reply. Something funky is going on. The code enters this function:
IN: _ZN4core3fmt5write17h728046e032081804E
0x20400378: addi sp,sp,-128
0x2040037c: sw ra,124(sp)
At 0x20400378, ra
= 0x20400080 and sp
= 0x8001ffd0, according to Qemu's monitor, and both valid. sp
is not touched again until we reach...
0x204006a8: lw ra,124(sp)
0x204006ac: addi sp,sp,128
0x204006b0: ret
So ra
should be restored correctly to 0x20400080 by the lw
. Instead, it's set to 0, causing a crash. This is using QEMU emulator version 2.12.0 (Debian 1:2.12+dfsg-3)
running:
qemu-system-riscv32 -machine sifive_e -kernel target/riscv32imac-unknown-none-elf/release/kernel -nographic -d in_asm,cpu
I'll build from the latest source, but I can't see anything that's changed that would fix this. Is this something I could help with or patch if there's a problem with rv32 sw ra
@michaeljclark please?
Edit: It's still overwriting ra
with 0 using the latest riscv-all
branch (git log -1
--> commit 6e532472173162d479cdfd5f74d2db9f327967ca
)
I am wondering if it might be decompression related thus in target/riscv/translate.c
. I couldn't see from your objdump output whether you had RVC or not.
BTW The disassembler is distinct from the decoder in target/riscv/translate.c
so if you run with -d in_asm,op,op_ind,op_opt,out_asm
then you might see if something is mis-translated.
I have not encountered any issues like this on riscv32 yet so I'm curious what is going on... I'm surprised that riscv32 linux boots oif we can't sw ra
I can try your reproducer. Do you have the linker script?
Sorry, I no longer have access to these files (former job)... But I didn't have anything clever in it anyway
Perhaps there is a buffer overflow that is clobbering values on the stack. Since the problem only happens for 32-bit code, maybe the buffer overflow doesn't happen for 64-bit code. This could be a compiler or library bug where the size of something is computed wrong when you have a 32-bit target.
OK. It's my fault. I could have sworn there was 128KB of DRAM in the E300 series. It's 16KB. That'll be why ra
wasn't written to memory: sp
was set too high in memory, in non-existent DRAM. I'm so sorry. I suspect @vsiles was also hitting this same issue.
(PS: thanks for the port, it's appreciated!)
Might have been that indeed !