riscv-openocd
Avoid resetting the target when loading an elf
I am not sure where to ask this, but I am posting it here in the hope of getting some guidance.
I am using the Microsemi PolarFire SoC, or to be more accurate the Microchip PolarFire Icicle Kit, with the version of OpenOCD that ships with the latest version of SoftConsole.
Versions: Microchip SoftConsole is v2021.1-6.6.0.507, while OpenOCD is 0.10.0+dev-00859-g95a8cd9b5-dirty (2020-10-21-21:16).
Problem: I would like to attach to a running program without resetting the board, so that I can load an ELF into memory (L2LIM) and start executing it.
In the normal use case, this is what I do:
- I have a very basic bootloader, built using the bare metal library, which initializes the SoC
- then I jump to an application stored as an ELF file in an attached flash memory; the bootloader copies it into the LIM and executes it, and this works
In the debug use case, this is what I would like to do:
- the same bootloader initializes the SoC but, instead of loading an application into RAM and jumping to its entry point, it sits in a while loop once the SoC has been initialized
- with OpenOCD I load the same ELF directly into the LIM through GDB and the application starts executing
Sadly, however, when OpenOCD connects it resets the SoC, so I lose the initialization done by the bootloader, and when the app starts executing I end up trapped in here
By default the microsemi-riscv.cfg file is doing
proc do_board_reset_init {}
What I have tried so far is to modify this file and, instead of the aforementioned line, put:
reset_config none
but this did not work either.
The commands I am using with GDB to load the ELF are:
set arch riscv:rv64
set mem inaccessible-by-default off
target extended-remote localhost:3333
monitor halt
load
monitor resume
monitor shutdown
quit
Is that behavior intended (so there is no way to attach OpenOCD to a running program on the board without resetting the board itself), or am I doing something wrong?
Hi @fcuzzocrea, OpenOCD should be able to accomplish the connection to the target without reset.
AFAIK, OpenOCD does not trigger reset of the target unless you explicitly instruct it to do so.
Please double-check whether any of your OpenOCD configuration files contain a reset [halt|run|init] command (which would reset the CPU target) or adapter assert|deassert [...], which may trigger the SRST signal.
Check all your config files. Or run OpenOCD with higher verbosity (-d3 on the command line) and take a look at which TCL commands are executed, and whether reset or adapter assert is among them.
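The log check suggested above can be sketched in Python. This is only an illustration: the function name `find_reset_commands` and the sample list are made up here, and the regular expression assumes the `script_debug(): command - ...` line layout seen in the `-d3` output posted later in this thread. Note that a hit may be a reset that was merely registered inside an event handler rather than one that actually ran.

```python
import re

# Assumed layout of a -d3 line, taken from the log excerpts in this thread:
# "Debug: 186 1 command.c:143 script_debug(): command - <name> <full command>"
COMMAND_LINE = re.compile(r"script_debug\(\): command - (\S+) (.*)$")


def find_reset_commands(log_lines):
    """Return (line_number, command) pairs whose command mentions a reset.

    Matching is done on whole tokens, so "reset_config" and the event name
    "reset-init" are not reported, while "reset init" and "reset halt" are.
    """
    hits = []
    for number, line in enumerate(log_lines, start=1):
        match = COMMAND_LINE.search(line)
        if not match:
            continue
        command = match.group(2)
        tokens = command.split()
        has_reset = "reset" in tokens
        has_adapter = any(
            first == "adapter" and second in ("assert", "deassert")
            for first, second in zip(tokens, tokens[1:])
        )
        if has_reset or has_adapter:
            hits.append((number, command))
    return hits


# Sample lines copied from the debug output quoted in this thread.
SAMPLE_LOG = [
    "Debug: 170 1 command.c:143 script_debug(): command - mpfs.hart0_e51 mpfs.hart0_e51 configure -event reset-init init_regs",
    "Debug: 171 1 command.c:143 script_debug(): command - reset_config reset_config trst_only",
    "Debug: 186 1 command.c:143 script_debug(): command - mpfs.hart0_e51 mpfs.hart0_e51 configure -event gdb-flash-erase-start reset init",
    "Debug: 188 1 command.c:143 script_debug(): command - mpfs.hart0_e51 mpfs.hart0_e51 configure -event gdb-flash-write-end reset halt",
    "Debug: 190 1 command.c:143 script_debug(): command - mpfs.hart0_e51 mpfs.hart0_e51 configure -event gdb-attach halt",
]

if __name__ == "__main__":
    for number, command in find_reset_commands(SAMPLE_LOG):
        print(f"line {number}: {command}")
```

On the sample lines, only the two `gdb-flash-*` handler registrations are reported; the `reset_config` and `reset-init` lines are skipped because they configure reset behaviour rather than request a reset.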
Ciao @JanMatCodasip! Thanks a lot for your quick reply!
I checked my OpenOCD cfg files, and I found that scripts/target/microsemi-riscv.cfg contains the following lines
$_TARGETNAME_1 configure -event reset-init init_regs
$_TARGETNAME_2 configure -event reset-init init_regs
$_TARGETNAME_3 configure -event reset-init init_regs
$_TARGETNAME_4 configure -event reset-init init_regs
and when running OpenOCD in debug mode I see the following output
Debug: 169 1 command.c:143 script_debug(): command - mpfs.hart0_e51 mpfs.hart0_e51 configure -event reset-init board_reset_init
Debug: 170 1 command.c:143 script_debug(): command - mpfs.hart0_e51 mpfs.hart0_e51 configure -event reset-init init_regs
Debug: 171 1 command.c:143 script_debug(): command - reset_config reset_config trst_only
Debug: 173 1 command.c:143 script_debug(): command - mpfs.hart0_e51 mpfs.hart0_e51 configure -event gdb-detach
# resume execution on debugger detach
resume
Debug: 174 1 command.c:143 script_debug(): command - reset_config reset_config none
User : 176 1 options.c:63 configuration_output_handler(): none separate
User : 177 1 options.c:63 configuration_output_handler():
Info : 178 1 server.c:310 add_service(): Listening on port 6666 for tcl connections
Info : 179 1 server.c:310 add_service(): Listening on port 4444 for telnet connections
Debug: 180 1 command.c:143 script_debug(): command - init init
Debug: 182 1 command.c:143 script_debug(): command - target target init
Debug: 184 1 command.c:143 script_debug(): command - target target names
Debug: 185 1 command.c:143 script_debug(): command - mpfs.hart0_e51 mpfs.hart0_e51 cget -event gdb-flash-erase-start
Debug: 186 1 command.c:143 script_debug(): command - mpfs.hart0_e51 mpfs.hart0_e51 configure -event gdb-flash-erase-start reset init
Debug: 187 1 command.c:143 script_debug(): command - mpfs.hart0_e51 mpfs.hart0_e51 cget -event gdb-flash-write-end
Debug: 188 1 command.c:143 script_debug(): command - mpfs.hart0_e51 mpfs.hart0_e51 configure -event gdb-flash-write-end reset halt
Debug: 189 1 command.c:143 script_debug(): command - mpfs.hart0_e51 mpfs.hart0_e51 cget -event gdb-attach
Debug: 190 1 command.c:143 script_debug(): command - mpfs.hart0_e51 mpfs.hart0_e51 configure -event gdb-attach halt
Debug: 191 1 target.c:1428 handle_target_init_command(): Initializing targets...
Debug: 192 1 riscv.c:473 riscv_init_target(): riscv_init_target()
Debug: 193 1 semihosting_common.c:97 semihosting_common_init():
However, commenting out the aforementioned lines in the cfg, does not help :( The output I see after doing the edits is this one: https://pastebin.com/BZeYx2au
It seems you have reset commands in one of your configuration files, which will be triggered when GDB loads an application binary to flash. I wonder if that could be the issue. See the following lines from your log:
Debug: 186 1 command.c:143 script_debug(): command - mpfs.hart0_e51 mpfs.hart0_e51 configure -event gdb-flash-erase-start reset init
...
Debug: 188 1 command.c:143 script_debug(): command - mpfs.hart0_e51 mpfs.hart0_e51 configure -event gdb-flash-write-end reset halt
You can find out which .cfg file contains these commands and change them to a plain halt. Then check if it helps:
mpfs.hart0_e51 configure -event gdb-flash-erase-start halt
mpfs.hart0_e51 configure -event gdb-flash-write-end halt
Or you may try loading the application binary into RAM directly via OpenOCD's load_image command (that is, without GDB).
Thanks again for the help!
Sadly I wasn't able to work around the problem. These are the things I tried:
- first I tried to declare gdb-flash-erase-start halt and gdb-flash-write-end halt in the interface script (I wasn't able to find where they were declared, so maybe some sort of default was used?) but that didn't help.
- then I tried to use load_image as you suggested. What I tried so far was to create a proc do_app_load that calls load_image with the target address and then resumes at the target address, but this did not work either, as OpenOCD fails with
Error: invalid command name "load_image"
If it can be helpful, I attach here the cfg files I am using: scripts.zip
Edit:
I think that I am on the right track though. I modified my board/microsemi-riscv.cfg, adding the following commands:
$_TARGETNAME_0 configure -event gdb-flash-erase-start {
# halt execution
halt
}
$_TARGETNAME_0 configure -event gdb-flash-write-end {
# resume execution after write
resume
}
$_TARGETNAME_0 configure -event gdb-attach {
# halt execution on debugger attach
halt
}
My application does not start, but now it no longer traps; it is just stuck somewhere in the code. I can tell because if I start OpenOCD again and connect through GDB I can see where it is stuck:
Reading symbols from build/debug/app.elf...
(gdb) set arch riscv:rv64
The target architecture is set to "riscv:rv64".
(gdb) set mem inaccessible-by-default off
(gdb) target extended-remote localhost:3333
Remote debugging using localhost:3333
0x0000000008004d78 in prvSocInfoCommand (pcWriteBuffer=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>,
xWriteBufferLen=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>,
pcCommandString=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>) at ../../src/cli/cmd/util_cmd.c:222
222 buf += snprintf(pcWriteBuffer + buf, xWriteBufferLen - buf,
For what it's worth I used to work for Microchip on SoftConsole so have some knowledge of what you're working on here. However I'm a bit confused. I presume that this is the specific problem?
> In the debug use case, this is what I would like to do:
> - the same bootloader initializes the SoC but, instead of loading an application into RAM and jumping to its entry point, it sits in a while loop once the SoC has been initialized
> - with OpenOCD I load the same ELF directly into the LIM through GDB and the application starts executing
But I still find this a bit confusing.
- Is the while loop after the SoC has been initialized part of your basic bootloader?
- "With OpenOCD I load the same ELF directly into the LIM..." - I don't know what you mean by "same" here? The bootloader again or something else?
Have you tried the following from SoftConsole:
- Configure the Icicle board to use PolarFire SoC boot mode 1 to execute your bootloader from eNVM (I am assuming that you have done this already but if not then refer to .../Microchip/SoftConsole-v2021.1/extras/mpfs/mpfs-bootmodes-readme.txt for instructions).
- Create a debug launch configuration for your "main" program using an existing bare metal example program as a guide
- In the debug launch configuration go to Startup > Initialization Commands > Initial Reset and uncheck that to prevent the debugger from doing a reset [init] after connecting - that should leave the target "undisturbed" following the execution of the bootloader.
If you do that then power cycle the Icicle board it should boot your bootloader, it will (presumably) sit in your busy while loop, you launch the SoftConsole debug session, it will NOT reset the target and then the debugger will load the program and start executing/debugging it.
(Obviously all of this can be done without using SoftConsole and using just command line tools but it might be worth trying this SoftConsole based approach first).
If that does not help/work then please clarify what exactly happens/does not work as described and if I have misunderstood anything.
Hope this helps.
Hi @TommyMurphyTM1234, thanks a lot for your answer!
Regarding your first question: yes, the SoC is initialized by my basic bootloader (that is, I initialize the PLIC, enable IRQs, and initialize the RTC, GPIO and SPI drivers).
The bootloader is flashed into the eNVM (with boot mode 1 - non-secure boot from eNVM) and it is executed from the eNVM.
After it starts, the bootloader can either load an application from SPI flash (by copying the application from the flash to the LIM and then jumping to the entry point of the ELF), or, if a button is pressed when the board is powered up, it sits in a busy while loop waiting for debugger to upload an ELF file directly to the L2LIM.
By same I mean that the ELF that I am trying to program with OpenOCD is the same ELF that I store into my SPI flash and gets loaded by the bootloader if no button is pressed when the board is power cycled. So just the way to load it to the LIM is different. I wrote this just to point out that in one way the ELF gets loaded correctly (when I copy it from the SPI flash to the LIM and then jump to the entry point), while in the other way (when I write it to the LIM using openocd) it does not start.
Of course my application is built using IMAGE_LOADED_BY_BOOTLOADER=1
Anyway thanks for your suggestions! Sadly I have a custom setup to work without the need of softconsole, just using cli tools, but yeah, I will give a try to the softconsole based approach.
Let me know if you find this still confusing!
Hi @fcuzzocrea - thanks for the reply and clarifications.
> Sadly I have a custom setup to work without the need of softconsole, just using cli tools, but yeah, I will give a try to the softconsole based approach.
OK - I'm pretty sure that that should still be possible even using just command line tools and without changing the OpenOCD target/microsemi-riscv.cfg script.
Off the top of my head (so I could be missing something here)...
- Please ensure that any changes that you may have made to <path-to-softconsole>/openocd/share/openocd/scripts/target/microsemi-riscv.cfg have been undone first.
- Run the SoftConsole OpenOCD from the command line as follows:
cd <path-to-softconsole>/openocd/bin
openocd --command "set DEVICE MPFS" --file board/microsemi-riscv.cfg
- Run the SoftConsole RISC-V GDB from the command line as follows:
cd <path-to-softconsole>/riscv-unknown-elf-gcc/bin
riscv64-unknown-elf-gdb
(gdb) set mem inaccessible-by-default off
(gdb) set $target_riscv=1
(gdb) set arch riscv:rv64
(gdb) source <path-to-softconsole>/gdbinit/softconsole.gdbinit
(gdb) target remote localhost:3333
(gdb) load yourprogram.elf
(gdb) thread apply all set $pc=_start
(gdb) tb main
(gdb) continue
Maybe you can try the above and post back with the results? As I say I think that that should be the gist here, but I could have overlooked something and, unfortunately, don't have an Icicle board to try it out myself. The key thing here is that there is no reset of the target (e.g. monitor reset init) via GDB so the post bootloader execution state of the target should not be disturbed before your program is loaded and debugged.
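For a command-line-only setup, the GDB steps listed above could be collected into a single invocation. The following is a minimal sketch, not part of SoftConsole: `build_gdb_argv` and its parameters are hypothetical names, the command strings are the ones from the list above, and `-ex` is GDB's standard option for running a command at startup. The guard simply refuses any extra command that would reset the target.

```python
def build_gdb_argv(elf_path, gdbinit_path, gdb="riscv64-unknown-elf-gdb",
                   port=3333, entry_symbol="_start", extra_commands=()):
    """Build an argv list that loads an ELF over OpenOCD without any reset.

    Nothing is executed here; the caller decides how to run the result
    (for example with subprocess.run). Paths are assumed to contain no spaces.
    """
    commands = [
        "set mem inaccessible-by-default off",
        "set $target_riscv=1",
        "set arch riscv:rv64",
        f"source {gdbinit_path}",
        f"target remote localhost:{port}",
        f"load {elf_path}",
        f"thread apply all set $pc={entry_symbol}",
    ]
    commands.extend(extra_commands)

    # The whole point of this sequence is that the bootloader's state is kept,
    # so reject anything that would ask OpenOCD to reset the target.
    for command in commands:
        if command.strip().startswith("monitor reset"):
            raise ValueError(f"refusing to reset the target: {command!r}")

    argv = [gdb]
    for command in commands:
        argv += ["-ex", command]
    return argv


if __name__ == "__main__":
    argv = build_gdb_argv("yourprogram.elf", "softconsole.gdbinit",
                          extra_commands=["tb main", "continue"])
    print(" ".join(repr(part) for part in argv))
```

Keeping the sequence as data makes it easy to diff against what an IDE launch configuration sends, which is one way to spot a stray `monitor reset init`.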
Regards Tommy
Thanks again for your help.
I tried what you suggested using an untouched openocd copy taken from softconsole:
fcuzzocrea@Latitude-5420:~$ /opt/openocd/bin/openocd --command "set DEVICE MPFS" --command "set COREID 0" --file /opt/openocd/share/openocd/scripts/board/microsemi-riscv.cfg
xPack OpenOCD (Microchip SoftConsole build), x86_64 Open On-Chip Debugger 0.10.0+dev-00859-g95a8cd9b5-dirty (2020-10-21-21:16)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
MPFS
0
Info : only one transport option; autoselect 'jtag'
do_board_reset_init
Info : Listening on port 6666 for tcl connections
Info : Listening on port 4444 for telnet connections
Info : Embedded FlashPro6 (revision B) found (USB_ID=1514:200b path=/dev/hidraw1)
Info : Embedded FlashPro6 (revision B) CM3 firmware version: F4.0
Info : clock speed 6000 kHz
Info : JTAG tap: mpfs.cpu tap/device found: 0x0f81a1cf (mfg: 0x0e7 (GateField), part: 0xf81a, ver: 0x0)
Info : datacount=2 progbufsize=16
Info : Disabling abstract command reads from CSRs.
Info : Examined RISC-V core; found 5 harts
Info : hart 0: XLEN=64, misa=0x8000000000101105
Info : hart 1: currently disabled
Info : hart 2: currently disabled
Info : hart 3: currently disabled
Info : hart 4: currently disabled
Info : Listening on port 3333 for gdb connections
While from GDB
(gdb) file build/debug/c3app.elf
Reading symbols from build/debug/c3app.elf...
(gdb) set mem inaccessible-by-default off
(gdb) set $target_riscv=1
(gdb) set arch riscv:rv64
The target architecture is set to "riscv:rv64".
(gdb) source softconsole.gdbinit
(gdb) target remote localhost:3333
Remote debugging using localhost:3333
0x00000000202224e4 in ?? ()
(gdb) load build/debug/c3app.elf
Loading section .text, size 0x22230 lma 0x8000000
Loading section .sdata, size 0x70 lma 0x8022230
Loading section .data, size 0x3930 lma 0x80222a0
Loading section .sdram, size 0x1388 lma 0x8025bd0
Start address 0x0000000008000000, load size 159576
Transfer rate: 9 KB/sec, 13298 bytes/write.
(gdb) thread apply all set $pc=_start
Thread 1 (Remote target):
(gdb) tb e51
Temporary breakpoint 1 at 0x8005f4c: file ../../ext/pfsoc_platform/mpfs_hal/common/mss_plic.h, line 719.
(gdb) continue
Continuing.
^C
Program received signal SIGINT, Interrupt.
trap_from_machine_mode (regs=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>,
dummy=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>,
mepc=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>) at ../../ext/pfsoc_platform/mpfs_hal/common/mss_mtrap.c:755
755 i++; /* added some code as SC debugger hangs if in loop doing nothing */
(gdb)
and then nothing happens on my UART terminal because I end up in a trap :(
> While from GDB
> trap_from_machine_mode (regs=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>,
> dummy=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>,
> mepc=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>) at ../../ext/pfsoc_platform/mpfs_hal/common/mss_mtrap.c:755
> 755 i++; /* added some code as SC debugger hangs if in loop doing nothing */
> (gdb)
You should not be getting those "Corrupted DWARF expression" error messages. Are you sure that you compiled the program with the SoftConsole RISC-V GCC toolchain and not some other RISC-V toolchain? Using a different compiler may result in mismatches with the SoftConsole GDB.
> and then nothing happens on my UART terminal because I end up in a trap :(
Ignoring DWARF errors, it looks to me like your program is getting an exception - probably in the startup code since the temporary breakpoint at the e51() "main" function never fires - which leaves it in the trap_from_machine_mode() default trap handler. You need to debug the program to find out how/when/why that happens. E.g. at least look at the mcause CSR to see what kind of trap/exception is happening and maybe also debug the program from _start through the startup code to the point at which things go wrong.
BTW - probably related to the DWARF messages but this looks wrong because the e51() function is almost certainly not in that header file if you are using the PolarFire SoC bare metal library code for your program:
> (gdb) tb e51
> Temporary breakpoint 1 at 0x8005f4c: file ../../ext/pfsoc_platform/mpfs_hal/common/mss_plic.h, line 719.
How exactly are you compiling your program and what toolchain are you using?
Hi Tommy and thanks again for your help!
What leaves me very confused is this: if, instead of loading the ELF directly into the LIM with OpenOCD, I put it on an external SPI flash connected to the Icicle Kit, read and copy the ELF from the flash into the LIM using the MSS SPI driver, and then jump to the entry point (which I extract from the ELF header) using this function, the application starts and runs correctly (the application is linked against the LIM).
If instead I program it directly into the LIM using OpenOCD, the program traps. The program is the same, it is linked against the LIM, and the bootloader is also the same in both experiments.
I would expect the same behavior from the program once it is in the LIM, regardless of the way I load it.
As for the toolchain, I previously used the one bundled with SoftConsole, but I switched to a self-compiled version of this toolchain, which AFAIK should be the official RISC-V one? I prefer to build the tools I use myself when possible (in that regard, it would be nice to have the sources of the OpenOCD version that is shipped with SoftConsole, as standard OpenOCD does not support the onboard FP6).
As for the second question, my program is compiled with the aforementioned toolchain, it uses the bare metal library (with IMAGE_LOADED_BY_BOOTLOADER 1 in mss_sw_config.h), and it is FreeRTOS based (I am using vanilla upstream FreeRTOS 10).
The CFLAGS I am using are the one I extracted from SoftConsole projects:
- '-fdata-sections'
- '-ffunction-sections'
- '-fmessage-length=0'
- '-fsigned-char'
- '-mabi=lp64'
- '-march=rv64imac'
- '-mcmodel=medany'
- '-mno-save-restore'
- '-msmall-data-limit=8'
- '-mstrict-align'
- '-mtune=sifive-5-series'
- '-Os'
- '-D__DYNAMIC_REENT__'
- '-DDDR_INIT'
- '-DMSS_CAN_USER_ISR=1'
- '-DUSING_FREERTOS'
As well as the ASFLAGS
- '-fdata-sections'
- '-ffunction-sections'
- '-fmessage-length=0'
- '-fsigned-char'
- '-mabi=lp64'
- '-march=rv64imac'
- '-mcmodel=medany'
- '-mno-save-restore'
- '-msmall-data-limit=8'
- '-mstrict-align'
- '-mtune=sifive-5-series'
and the LDFLAGS
- '--specs=nano.specs'
- '-mabi=lp64'
- '-march=rv64imac'
- '-nostartfiles'
- '-Wl,--gc-sections'
I can give another try at the toolchain shipped with SoftConsole though
Did you debug the code to see what exception/fault is occurring and where? That's what I would do regardless of the fact that the program runs OK in one scenario but not in this one.
Alright, will dig more into my code and report back.
Just to be sure, can I use the bundled OpenOCD cfg files as they are? They do not reset the SoC when OpenOCD starts? A reset of the SoC will not happen unless explicitly requested through GDB, right?
> Alright, will dig more into my code and report back.
You don't need to dig into the code to get the exception type. Just Ctrl-C break into the program when it's stuck in the default fault handler and check the mcause CSR (p/x $mcause) to see what it is. And maybe the mepc CSR to see at what PC/instruction the trap occurred.
That should shed some light on the problem. But if it's still not obvious what's happening then you can single step debug from _start rather than continue after loading the program to see exactly where things go wrong.
> Just to be sure, can I use the bundled OpenOCD cfg files as they are?
Correct.
> They do not reset the SoC when OpenOCD starts? A reset of the SoC will not happen unless explicitly requested through GDB, right?
Correct. A target reset will only occur if you explicitly do monitor reset from GDB.
I tried to follow your suggestion, putting a breakpoint at _start, but I am even more confused: it seems it traps right after reset_vector :(
That is what I did:
(gdb) file build/debug/c3app.elf
Reading symbols from build/debug/c3app.elf...
(gdb) set mem inaccessible-by-default off
(gdb) set $target_riscv=1
(gdb) set arch riscv:rv64
The target architecture is set to "riscv:rv64".
(gdb) target remote localhost:3333
0x0000000020222358 in ?? ()
Loading section .text, size 0x22230 lma 0x8000000
Loading section .sdata, size 0x70 lma 0x8022230
Loading section .data, size 0x3930 lma 0x80222a0
Loading section .sdram, size 0x1388 lma 0x8025bd0
Start address 0x0000000008000000, load size 159576
Transfer rate: 9 KB/sec, 13298 bytes/write.
(gdb) thread apply all set $pc=_start
Thread 1 (Remote target):
(gdb)
Thread 1 (Remote target):
(gdb) b _start
Breakpoint 1 at 0x8000008
(gdb) step
Single stepping until exit from function reset_vector,
which has no line number information.
Breakpoint 1, 0x0000000008000008 in reset_vector ()
(gdb) step
Single stepping until exit from function reset_vector,
which has no line number information.
0x00000000080000ac in trap_vector ()
(gdb) step
Single stepping until exit from function trap_vector,
which has no line number information.
trap_from_machine_mode (regs=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>,
dummy=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>,
mepc=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>) at ../../ext/pfsoc_platform/mpfs_hal/common/mss_mtrap.c:731
731 volatile uintptr_t mcause = read_csr(mcause);
(gdb) p/x $mcause
$1 = 0x2
(gdb) p/x $mepc
$2 = 0x8000018
(gdb) p/x $mstatus
$3 = 0x200001880
So mcause should mean illegal instruction?
You do not need to put a breakpoint at _start. After loading the program and using thread apply all set $pc=_start the program is ready to run and you should use si (single instruction stepping: https://sourceware.org/gdb/download/onlinedocs/gdb/Continuing-and-Stepping.html) rather than C code line stepping.
However, again your logs suggest that your program is compiled and linked in a way that the debugging symbolic information may not be correct so any debugging is going to be difficult until you sort that out.
As per the RISC-V Privileged Specification (https://riscv.org/technical/specifications/) $mcause == 0x2 means illegal instruction. And $mepc == 0x8000018 is the program counter at which the offending instruction resides.
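As a quick reference for reading values like the one above, here is a small Python sketch that maps an mcause value to its standard meaning. The table follows the synchronous exception codes defined in the RISC-V Privileged Specification; the function name `decode_mcause` is just an illustrative choice, and interrupt causes are reported only by number.

```python
# Standard synchronous exception codes from the RISC-V Privileged Specification.
EXCEPTION_CODES = {
    0: "Instruction address misaligned",
    1: "Instruction access fault",
    2: "Illegal instruction",
    3: "Breakpoint",
    4: "Load address misaligned",
    5: "Load access fault",
    6: "Store/AMO address misaligned",
    7: "Store/AMO access fault",
    8: "Environment call from U-mode",
    9: "Environment call from S-mode",
    11: "Environment call from M-mode",
    12: "Instruction page fault",
    13: "Load page fault",
    15: "Store/AMO page fault",
}


def decode_mcause(mcause, xlen=64):
    """Describe an mcause value; the top bit selects interrupt vs exception."""
    is_interrupt = bool((mcause >> (xlen - 1)) & 1)
    code = mcause & ((1 << (xlen - 1)) - 1)
    if is_interrupt:
        return f"interrupt, code {code}"
    return EXCEPTION_CODES.get(code, f"reserved or custom exception code {code}")


if __name__ == "__main__":
    # The value read with "p/x $mcause" in this thread.
    print(decode_mcause(0x2))
```

For the `0x2` reported in this thread the sketch prints "Illegal instruction", matching the reading given above.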
Have you manually checked the contents of LIM (at least the start addresses) against the list file for your program to see if there's any mismatch?
Or at least do disassemble 0x08000000 to see what the start of LIM disassembles as.
No, actually I didn't check the contents of the LIM to see whether they are what I am expecting (I used to do this with Lauterbach Trace32; GDB is very new for me so I need to learn how to do it properly)
Normally you could do compare-sections in GDB to check the contents of memory against your local ELF file but (a) my recollection is that the Microchip OpenOCD script is not set up for this to work (no RAM work area defined which may be required) and (b) if your debug info is incorrect then such a comparison will almost certainly fail.
> No, actually I didn't check the contents of the LIM to see whether they are what I am expecting (I used to do this with Lauterbach Trace32; GDB is very new for me so I need to learn how to do it properly)
Try disassemble 0x08000000 and see what it gives and whether it matches the list file for your program.
One thing that I noticed:
> Start address 0x0000000008000000, load size 159576
> ...
> (gdb) b _start
> Breakpoint 1 at 0x8000008
Is this correct? I.e. is your program actually linked so that _start is at 0x08000008 and not the start of LIM which is 0x08000000? This doesn't match with the start address displayed when loading the program.
What happens if, instead of thread apply all set $pc = _start you do thread apply all set $pc = 0x08000000 and then continue?
Hi Tommy, sorry for the late reply.
On friday I did some more debugging using Lauterbach debugging tools, and I think that the issue could be related to the way I am preparing the board to do the debugger ELF loading.
In particular I found that programming the ELF into the LIM using Lauterbach led me to the same result I had when doing it with OpenOCD (ending up trapped). However, if with Lauterbach I put a breakpoint at this line:
https://github.com/polarfire-soc/polarfire-soc-bare-metal-examples/blob/3e45221cb287978a35213d7687ab050861e4bd9a/driver-examples/mss/mpfs-hal/mpfs-hal-ddr-demo/src/application/hart0/e51.c#L175
and then load the ELF, the application starts correctly, so I presume that I am doing something wrong in preparing the board to accept a program loaded through OpenOCD at this point. I tried searching through the HSS code, but I don't really understand where it implements the logic that allows HSS to execute an ELF built with IMAGE_LOADED_BY_BOOTLOADER 1.
For reference, the code which I am using to implement my while loop (shamefully copied from the jump_to_application_example) is this one:
void wait_for_debugger(HLS_DATA* hls)
{
    /* Store current hartid */
    uint32_t hartid = read_csr(mhartid);

    /* Restore PLIC to known state */
    __disable_irq();
    PLIC_init();

    /* Disable all interrupts: */
    write_csr(mie, 0);

    while (true) {
        static volatile uint64_t counter = 0U;
        /* Added some code as debugger hangs if in loop doing nothing */
        counter = counter + 1U;
    }

    register unsigned long a0 asm("a0") = hartid;
    register unsigned long a1 asm("a1") = (unsigned long)hls;
    __asm__ __volatile__("mret" : : "r"(a0), "r"(a1));
    __builtin_unreachable();
}
Regarding your questions: running the disassembly confirmed that the LIM actually contains the code I compiled.
Regarding the _start breakpoint, this leaves me very confused. Using nm on the ELF gives me:
0000000008000000 T _start
0000000008000000 t _start_non_bootloader_image
At address 0x8000008 objdump tells me that I have:
0000000008000000 <_start>:
8000000: 00000717 auipc a4,0x0
8000004: 0ac70713 addi a4,a4,172 # 80000ac <trap_vector>
**8000008: 30571073 csrw mtvec,a4**
800000c: 305027f3 csrr a5,mtvec
8000010: fef71ee3 bne a4,a5,800000c <_start+0xc>
8000014: 00050663 beqz a0,8000020 <_start+0x20>
Which matches what mss_entry.S is doing here.
So, in the end, I believe that my while loop probably isn't enough to put the board into a state where it can execute a file programmed using OpenOCD.
A simple way to work around this could be, I think, to load through the jump_to_application() function a simple ELF written in assembly like this:
_start:
nop
nop
nop
j _start
And then overwrite it with OpenOCD + GDB. I believe this should be enough to make it load the application as the board would be into a state which should be ready to run software (?)
I don't really understand why you have, once again, ignored my suggestion that you debug the program execution from 0x08000000 through the startup code to the point at which the illegal instruction exception occurs in order to actually understand what's happening here instead of guessing and proposing workaround hacks such as the endless loop stub program with the nops?
I tried to debug the execution from 0x08000000 through the startup code, both with OpenOCD and with the Lauterbach debugger.
All I was able to find is that the code goes through reset_vector, then through trap_vector, and right after that I end up trapped in trap_from_machine_mode. I wasn't able to pinpoint, in the assembly code that runs in trap_vector, where the actual illegal instruction exception occurs.
All I was trying to say is that the while loop I proposed in the previous comment, and which I implemented in my bootloader, is probably not sufficient to prepare the SoC for an application that is programmed into the LIM through OpenOCD and that expects to be loaded by the bootloader.
I believe that the startup code expects to find several registers populated correctly (for instance a0 and a1).
Sorry if it seemed that I wanted to ignore your suggestion, wasn't my intention.
As I said before when you end up in the trap handler with $mcause == 0x2 (illegal instruction) then $mepc gives the address of the instruction that caused the exception. According to your earlier post you had $mepc == 0x08000018 so what instruction is that in the disassembly and what is the value actually in memory at that address? If the first few instructions of the program are executing ok then I'm not sure that there's any evidence to place the blame on the bootloader busy wait loop for subsequent erroneous execution.
The other issue regarding what looks like a mismatch between the symbolic debugging information and the actual program also remains. E.g. _start is actually at 0x08000000 but your symbolic debugging information seems to think that it's at 0x08000008.
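The question of which instruction sits at $mepc can also be answered by decoding the raw word by hand. The sketch below is an illustration, not something posted by either participant: it takes the word 0x30305073 that objdump lists at 0x8000018 for this ELF and the misa value 0x8000000000101105 that OpenOCD reported for hart 0, and decodes both. The helper names and the short CSR table are choices made for this example; the field layout is the standard Zicsr encoding.

```python
# A few machine-mode CSR numbers from the RISC-V Privileged Specification.
CSR_NAMES = {
    0x300: "mstatus", 0x301: "misa", 0x302: "medeleg", 0x303: "mideleg",
    0x304: "mie", 0x305: "mtvec", 0x340: "mscratch", 0x341: "mepc",
    0x342: "mcause",
}
# funct3 values of the CSR instructions (opcode SYSTEM = 0x73).
CSR_OPS = {1: "csrrw", 2: "csrrs", 3: "csrrc", 5: "csrrwi", 6: "csrrsi", 7: "csrrci"}


def decode_csr_instruction(word):
    """Decode a 32-bit CSR instruction word, or return None if it is not one."""
    opcode = word & 0x7F
    funct3 = (word >> 12) & 0x7
    if opcode != 0x73 or funct3 not in CSR_OPS:
        return None
    csr = (word >> 20) & 0xFFF
    return {
        "op": CSR_OPS[funct3],
        "rd": (word >> 7) & 0x1F,
        "src": (word >> 15) & 0x1F,  # rs1 index, or the immediate for *i forms
        "csr": csr,
        "csr_name": CSR_NAMES.get(csr, hex(csr)),
    }


def misa_extensions(misa):
    """Return the extension letters whose bits are set in misa[25:0]."""
    return "".join(chr(ord("A") + bit) for bit in range(26) if (misa >> bit) & 1)


if __name__ == "__main__":
    print(decode_csr_instruction(0x30305073))
    print(misa_extensions(0x8000000000101105))
```

The word decodes as a `csrrwi` to mideleg with rd = x0, which is what the disassembler prints as `csrwi mideleg,0`, and the misa letters come out as ACIMU, with no S. As I read the Privileged Specification, medeleg and mideleg are only expected to exist on harts that implement S-mode, so one reading consistent with these logs is that hart 0 faults on that write. The `beqz a0` at 0x8000014 would skip both delegation writes when a0 is zero, which would fit the suspicion raised earlier that the startup code relies on a0 being set up before entry. This is an interpretation of the posted data, not something confirmed in the thread.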
If you are referring to the assembly instruction, the instruction that I see in GDB's disassembly when manually inspecting the memory matches what is in the ELF file as shown by objdump:
GDB Output:
(gdb) disassemble 0x8000018
Dump of assembler code for function reset_vector:
0x0000000008000000 <+0>: auipc a4,0x0
0x0000000008000004 <+4>: addi a4,a4,172 # 0x80000ac <trap_vector>
0x0000000008000008 <+8>: csrw mtvec,a4
0x000000000800000c <+12>: csrr a5,mtvec
0x0000000008000010 <+16>: bne a4,a5,0x800000c <reset_vector+12>
0x0000000008000014 <+20>: beqz a0,0x8000020 <reset_vector+32>
0x0000000008000018 <+24>: csrwi mideleg,0
0x000000000800001c <+28>: csrwi medeleg,0
0x0000000008000020 <+32>: csrw mscratch,zero
0x0000000008000024 <+36>: csrw mcause,zero
0x0000000008000028 <+40>: csrw mepc,zero
0x000000000800002c <+44>: beqz a0,0x8000030 <reset_vector+48>
0x0000000008000030 <+48>: csrr t0,misa
0x0000000008000034 <+52>: bltz t0,0x800003c <reset_vector+60>
0x0000000008000038 <+56>: j 0x8000030 <reset_vector+48>
0x000000000800003c <+60>: auipc gp,0x23
0x0000000008000040 <+64>: addi gp,gp,-1564 # 0x8022a20 <local_irq_handler_u54_1_table+16>
0x0000000008000044 <+68>: auipc a4,0x49
0x0000000008000048 <+72>: addi a4,a4,-68 # 0x8049000
0x000000000800004c <+76>: auipc a5,0x4b
0x0000000008000050 <+80>: addi a5,a5,-76 # 0x804b000
0x0000000008000054 <+84>: auipc sp,0x4b
0x0000000008000058 <+88>: addi sp,sp,-84 # 0x804b000
0x000000000800005c <+92>: sd zero,0(a4)
0x0000000008000060 <+96>: addi a4,a4,8
0x0000000008000064 <+100>: blt a4,a5,0x800005c <reset_vector+92>
0x0000000008000068 <+104>: auipc a4,0x31
0x000000000800006c <+108>: addi a4,a4,-1832 # 0x8030940
0x0000000008000070 <+112>: auipc a5,0x49
0x0000000008000074 <+116>: addi a5,a5,-1840 # 0x8048940
0x0000000008000078 <+120>: sd zero,0(a4)
0x000000000800007c <+124>: addi a4,a4,8
0x0000000008000080 <+128>: blt a4,a5,0x8000078 <reset_vector+120>
0x0000000008000084 <+132>: bnez a1,0x8000098 <reset_vector+152>
0x0000000008000088 <+136>: addi sp,sp,-64
--Type <RET> for more, q to quit, c to continue without paging--
0x000000000800008c <+140>: mv tp,sp
0x0000000008000090 <+144>: mv a0,tp
0x0000000008000094 <+148>: j 0x80023f8 <u54_single_hart>
0x0000000008000098 <+152>: mv a0,a1
0x000000000800009c <+156>: j 0x80023f8 <u54_single_hart>
0x00000000080000a0 <+160>: nop
0x00000000080000a4 <+164>: nop
0x00000000080000a8 <+168>: j 0x80000a0 <reset_vector+160>
End of assembler dump.
objdump output:
build/debug/c3app.elf: file format elf64-littleriscv
Disassembly of section .text:
0000000008000000 <_start>:
8000000: 00000717 auipc a4,0x0
8000004: 0ac70713 addi a4,a4,172 # 80000ac <trap_vector>
8000008: 30571073 csrw mtvec,a4
800000c: 305027f3 csrr a5,mtvec
8000010: fef71ee3 bne a4,a5,800000c <_start+0xc>
8000014: 00050663 beqz a0,8000020 <_start+0x20>
8000018: 30305073 csrwi mideleg,0
800001c: 30205073 csrwi medeleg,0
8000020: 34001073 csrw mscratch,zero
8000024: 34201073 csrw mcause,zero
8000028: 34101073 csrw mepc,zero
800002c: 00050263 beqz a0,8000030 <_start+0x30>
8000030: 301022f3 csrr t0,misa
8000034: 0002c463 bltz t0,800003c <_start+0x3c>
8000038: ff9ff06f j 8000030 <_start+0x30>
800003c: 00023197 auipc gp,0x23
8000040: 9e418193 addi gp,gp,-1564 # 8022a20 <__global_pointer$>
8000044: 00049717 auipc a4,0x49
8000048: fbc70713 addi a4,a4,-68 # 8049000 <__app_stack_bottom>
800004c: 0004b797 auipc a5,0x4b
8000050: fb478793 addi a5,a5,-76 # 804b000 <__app_stack_top>
8000054: 0004b117 auipc sp,0x4b
8000058: fac10113 addi sp,sp,-84 # 804b000 <__app_stack_top>
800005c: 00073023 sd zero,0(a4)
8000060: 00870713 addi a4,a4,8
8000064: fef74ce3 blt a4,a5,800005c <_start+0x5c>
8000068: 00031717 auipc a4,0x31
800006c: 8d870713 addi a4,a4,-1832 # 8030940 <__bss_end>
8000070: 00049797 auipc a5,0x49
8000074: 8d078793 addi a5,a5,-1840 # 8048940 <__heap_end>
8000078: 00073023 sd zero,0(a4)
800007c: 00870713 addi a4,a4,8
8000080: fef74ce3 blt a4,a5,8000078 <_start+0x78>
8000084: 00059a63 bnez a1,8000098 <_start+0x98>
8000088: fc010113 addi sp,sp,-64
800008c: 00010213 mv tp,sp
8000090: 00020513 mv a0,tp
8000094: 3640206f j 80023f8 <u54_single_hart>
8000098: 00058513 mv a0,a1
800009c: 35c0206f j 80023f8 <u54_single_hart>
80000a0: 00000013 nop
80000a4: 00000013 nop
80000a8: ff9ff06f j 80000a0 <_start+0xa0>
Actually, I was also able to replicate the issue with this example from Microchip, an application that is meant to be loaded by the bootloader (so compiled with IMAGE_LOADED_BY_BOOTLOADER 1), which is exactly what I am trying to do in my use case.
Before doing this experiment I flashed reference design ver 2021.11 (which also ships HSS, so the board was loaded with HSS). All I did was open the latest release of the example in SoftConsole and build it using the build button. Afterwards, I manually invoked OpenOCD:
fcuzzocrea@Latitude-5420:~$ /opt/openocd/bin/openocd --command "set DEVICE MPFS" --command "set COREID 1" --file /opt/openocd/share/openocd/scripts/board/microsemi-riscv.cfg
xPack OpenOCD (Microchip SoftConsole build), x86_64 Open On-Chip Debugger 0.10.0+dev-00859-g95a8cd9b5-dirty (2020-10-21-21:16)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
MPFS
1
Info : only one transport option; autoselect 'jtag'
do_board_reset_init
Info : Listening on port 6666 for tcl connections
Info : Listening on port 4444 for telnet connections
Info : Embedded FlashPro6 (revision B) found (USB_ID=1514:200b path=/dev/hidraw1)
Info : Embedded FlashPro6 (revision B) CM3 firmware version: F4.0
Info : clock speed 6000 kHz
Info : JTAG tap: mpfs.cpu tap/device found: 0x0f81a1cf (mfg: 0x0e7 (GateField), part: 0xf81a, ver: 0x0)
Info : datacount=2 progbufsize=16
Info : Disabling abstract command reads from CSRs.
Info : Examined RISC-V core; found 5 harts
Info : hart 0: currently disabled
Info : hart 1: XLEN=64, misa=0x800000000014112d
Info : hart 2: currently disabled
Info : hart 3: currently disabled
Info : hart 4: currently disabled
Info : Listening on port 3333 for gdb connections
Info : accepting 'gdb' connection on tcp/3333
Info : Disabling abstract command writes to CSRs.
The arguments match what the SoftConsole default configuration sets.
After that I manually invoked the SoftConsole GDB and got trapped again:
fcuzzocrea@Latitude-5420:~/.local/microchip/SoftConsole-v2021.1/extras/home/polarfire-soc-bare-metal-examples/applications/mpfs-pmp-demo/mpfs-pmp-app-u54-1/DDR-Release$ /home/fcuzzocrea/.local/microchip/SoftConsole-v2021.1/riscv-unknown-elf-gcc/bin/riscv64-unknown-elf-gdb mpfs-pmp-app-u54-1.elf
GNU gdb (xPack GNU RISC-V Embedded GCC (Microsemi SoftConsole build), 64-bit) 9.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "--host=x86_64-unknown-linux-gnu --target=riscv64-unknown-elf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://github.com/sifive/freedom-tools/issues>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Really redefine built-in command "remote"? (y or n) [answered Y; input not from terminal]
Reading symbols from mpfs-pmp-app-u54-1.elf...
(gdb) set mem inaccessible-by-default off
(gdb) set $target_riscv=1
(gdb) set arch riscv:rv64
The target architecture is assumed to be riscv:rv64
(gdb) target remote localhost:3333
0x000000000800c9ce in ?? ()
Loading section .text, size 0x2780 lma 0x80000000
Loading section .sdata, size 0x10 lma 0x80002780
Loading section .data, size 0xe30 lma 0x80002790
Start address 0x0000000080000000, load size 13760
Transfer rate: 11 KB/sec, 4586 bytes/write.
(gdb) thread apply all set $pc=_start
Thread 1 (Remote target):
(gdb) tb u54_1
Temporary breakpoint 1 at 0x8000138a: file ../src/application/hart1/u54_1.c, line 46.
(gdb) continue
Continuing.
^C
Program received signal SIGINT, Interrupt.
trap_from_machine_mode (regs=0x80413e8, dummy=3149939, mepc=2147483696) at ../src/platform/mpfs_hal/common/mss_mtrap.c:806
806 if(i == 0x1000U)
(gdb) p/x $mcause
$1 = 0x2
(gdb) p/x $mepc
$2 = 0x80000030
(gdb)
(gdb) disassemble 0x80000030
Dump of assembler code for function reset_vector:
0x0000000080000000 <+0>: auipc a4,0x0
0x0000000080000004 <+4>: addi a4,a4,176 # 0x800000b0 <trap_vector>
0x0000000080000008 <+8>: csrw mtvec,a4
0x000000008000000c <+12>: csrr a5,mtvec
0x0000000080000010 <+16>: bne a4,a5,0x8000000c <reset_vector+12>
0x0000000080000014 <+20>: beqz a0,0x80000020 <reset_vector+32>
0x0000000080000018 <+24>: csrwi mideleg,0
0x000000008000001c <+28>: csrwi medeleg,0
0x0000000080000020 <+32>: csrw mscratch,zero
0x0000000080000024 <+36>: csrw mcause,zero
0x0000000080000028 <+40>: csrw mepc,zero
0x000000008000002c <+44>: beqz a0,0x80000034 <reset_vector+52>
0x0000000080000030 <+48>: fscsr zero
0x0000000080000034 <+52>: csrr t0,misa
0x0000000080000038 <+56>: bltz t0,0x80000040 <reset_vector+64>
0x000000008000003c <+60>: j 0x80000034 <reset_vector+52>
0x0000000080000040 <+64>: auipc gp,0x3
0x0000000080000044 <+68>: addi gp,gp,-192 # 0x80002f80 <local_irq_handler_u54_1_table+112>
0x0000000080000048 <+72>: auipc a4,0x4
0x000000008000004c <+76>: addi a4,a4,-72 # 0x80004000
0x0000000080000050 <+80>: auipc a5,0x6
0x0000000080000054 <+84>: addi a5,a5,-80 # 0x80006000
0x0000000080000058 <+88>: auipc sp,0x6
0x000000008000005c <+92>: addi sp,sp,-88 # 0x80006000
0x0000000080000060 <+96>: sd zero,0(a4)
0x0000000080000064 <+100>: addi a4,a4,8
0x0000000080000068 <+104>: blt a4,a5,0x80000060 <reset_vector+96>
0x000000008000006c <+108>: auipc a4,0x4
0x0000000080000070 <+112>: addi a4,a4,-1404 # 0x80003af0
0x0000000080000074 <+116>: auipc a5,0x4
0x0000000080000078 <+120>: addi a5,a5,-1412 # 0x80003af0
0x000000008000007c <+124>: sd zero,0(a4)
0x0000000080000080 <+128>: addi a4,a4,8
0x0000000080000084 <+132>: blt a4,a5,0x8000007c <reset_vector+124>
0x0000000080000088 <+136>: bnez a1,0x8000009c <reset_vector+156>
0x000000008000008c <+140>: addi sp,sp,-64
0x0000000080000090 <+144>: mv tp,sp
0x0000000080000094 <+148>: mv a0,tp
0x0000000080000098 <+152>: j 0x800003ee <u54_single_hart>
0x000000008000009c <+156>: mv a0,a1
0x00000000800000a0 <+160>: j 0x800003ee <u54_single_hart>
0x00000000800000a4 <+164>: nop
0x00000000800000a8 <+168>: nop
0x00000000800000ac <+172>: j 0x800000a4 <reset_vector+164>
I also tried single stepping, but the reason why I end up in trap_from_machine_mode is not clear (at least to me). The only thing I can think of is that a0 and a1 do not contain valid data.
(gdb) thread apply all set $pc = _start
Thread 1 (Remote target):
(gdb) bt
#0 reset_vector () at ../src/platform/mpfs_hal/startup_gcc/mss_entry.S:300
(gdb) si
0x0000000080000004 300 la a4, trap_vector
(gdb) si
301 csrw mtvec, a4 # initalise machine trap vector address
(gdb) si
304 csrr a5, mtvec
(gdb) si
305 bne a4, a5, 2b
(gdb) si
311 beqz a0, 3f
(gdb) si
312 csrw mideleg, 0
(gdb) si
313 csrw medeleg, 0
(gdb) si
316 csrw mscratch, zero
(gdb) si
317 csrw mcause, zero
(gdb) si
318 csrw mepc, zero
(gdb) si
323 beqz a0, 1f
(gdb) si
325 fscsr x0
(gdb) si
trap_vector () at ../src/platform/mpfs_hal/startup_gcc/mss_entry.S:391
391 addi sp, sp, -INTEGER_CONTEXT_SIZE # moves sp down stack to make I
(gdb) si
394 STORE sp, 2*REGBYTES(sp) # sp
(gdb) si
395 STORE a0, 10*REGBYTES(sp) # save a0,a1 in the created CONTEXT
(gdb) si
396 STORE a1, 11*REGBYTES(sp)
(gdb) si
397 STORE ra, 1*REGBYTES(sp)
(gdb) si
398 STORE gp, 3*REGBYTES(sp)
(gdb) si
399 STORE tp, 4*REGBYTES(sp)
(gdb) si
400 STORE t0, 5*REGBYTES(sp)
(gdb) si
401 STORE t1, 6*REGBYTES(sp)
(gdb) si
402 STORE t2, 7*REGBYTES(sp)
(gdb) si
403 STORE s0, 8*REGBYTES(sp)
(gdb) si
404 STORE s1, 9*REGBYTES(sp)
(gdb) si
405 STORE a2,12*REGBYTES(sp)
(gdb) si
406 STORE a3,13*REGBYTES(sp)
(gdb) si
407 STORE a4,14*REGBYTES(sp)
(gdb) si
408 STORE a5,15*REGBYTES(sp)
(gdb) si
409 STORE a6,16*REGBYTES(sp)
(gdb) si
410 STORE a7,17*REGBYTES(sp)
(gdb) si
411 STORE s2,18*REGBYTES(sp)
(gdb) si
412 STORE s3,19*REGBYTES(sp)
(gdb) si
413 STORE s4,20*REGBYTES(sp)
(gdb) si
414 STORE s5,21*REGBYTES(sp)
(gdb) si
415 STORE s6,22*REGBYTES(sp)
(gdb) si
416 STORE s7,23*REGBYTES(sp)
(gdb) si
417 STORE s8,24*REGBYTES(sp)
(gdb) si
418 STORE s9,25*REGBYTES(sp)
(gdb) si
419 STORE s10,26*REGBYTES(sp)
(gdb) si
420 STORE s11,27*REGBYTES(sp)
(gdb) si
421 STORE t3,28*REGBYTES(sp)
(gdb) si
422 STORE t4,29*REGBYTES(sp)
(gdb) si
423 STORE t5,30*REGBYTES(sp)
(gdb) si
424 STORE t6,31*REGBYTES(sp)
(gdb) si
426 mv a0, sp # a0 <- regs
(gdb) si
432 csrr a1, mbadaddr # useful for anaysis when things go wrong
(gdb) si
433 csrr a2, mepc
(gdb) si
434 jal trap_from_machine_mode
(gdb) si
trap_from_machine_mode (regs=0x8041258, dummy=3149939, mepc=2147483696) at ../src/platform/mpfs_hal/common/mss_mtrap.c:760
760 volatile uintptr_t mcause = read_csr(mcause);
(gdb) si
0x0000000080000a6a 760 volatile uintptr_t mcause = read_csr(mcause);
(gdb) si
0x0000000080000a6c 760 volatile uintptr_t mcause = read_csr(mcause);
(gdb) si
0x0000000080000a6e 760 volatile uintptr_t mcause = read_csr(mcause);
(gdb) si
0x0000000080000a70 760 volatile uintptr_t mcause = read_csr(mcause);
(gdb) si
0x0000000080000a72 760 volatile uintptr_t mcause = read_csr(mcause);
(gdb) si
760 volatile uintptr_t mcause = read_csr(mcause);
(gdb) si
762 if (((mcause & MCAUSE_INT) == MCAUSE_INT) && ((mcause & MCAUSE_CAUSE) > 15U)&& ((mcause & MCAUSE_CAUSE) < 64U))
(gdb) si
0x0000000080000a7a 762 if (((mcause & MCAUSE_INT) == MCAUSE_INT) && ((mcause & MCAUSE_CAUSE) > 15U)&& ((mcause & MCAUSE_CAUSE) < 64U))
(gdb) si
766 else if (((mcause & MCAUSE_INT) == MCAUSE_INT) && ((mcause & MCAUSE_CAUSE) == IRQ_M_EXT))
(gdb) si
0x0000000080000aaa 766 else if (((mcause & MCAUSE_INT) == MCAUSE_INT) && ((mcause & MCAUSE_CAUSE) == IRQ_M_EXT))
(gdb) si
770 else if (((mcause & MCAUSE_INT) == MCAUSE_INT) && ((mcause & MCAUSE_CAUSE) == IRQ_M_SOFT))
(gdb) si
0x0000000080000ac2 770 else if (((mcause & MCAUSE_INT) == MCAUSE_INT) && ((mcause & MCAUSE_CAUSE) == IRQ_M_SOFT))
(gdb) si
774 else if (((mcause & MCAUSE_INT) == MCAUSE_INT) && ((mcause & MCAUSE_CAUSE) == IRQ_M_TIMER))
(gdb) si
0x0000000080000ada 774 else if (((mcause & MCAUSE_INT) == MCAUSE_INT) && ((mcause & MCAUSE_CAUSE) == IRQ_M_TIMER))
(gdb) si
778 else if ((mcause == CAUSE_STORE_ACCESS) | (mcause == CAUSE_LOAD_ACCESS) | (mcause == CAUSE_FETCH_ACCESS))
(gdb) si
0x0000000080000af2 778 else if ((mcause == CAUSE_STORE_ACCESS) | (mcause == CAUSE_LOAD_ACCESS) | (mcause == CAUSE_FETCH_ACCESS))
(gdb) si
0x0000000080000af4 778 else if ((mcause == CAUSE_STORE_ACCESS) | (mcause == CAUSE_LOAD_ACCESS) | (mcause == CAUSE_FETCH_ACCESS))
(gdb) si
0x0000000080000af6 778 else if ((mcause == CAUSE_STORE_ACCESS) | (mcause == CAUSE_LOAD_ACCESS) | (mcause == CAUSE_FETCH_ACCESS))
(gdb) si
0x0000000080000af8 778 else if ((mcause == CAUSE_STORE_ACCESS) | (mcause == CAUSE_LOAD_ACCESS) | (mcause == CAUSE_FETCH_ACCESS))
(gdb) si
0x0000000080000afa 778 else if ((mcause == CAUSE_STORE_ACCESS) | (mcause == CAUSE_LOAD_ACCESS) | (mcause == CAUSE_FETCH_ACCESS))
(gdb) si
0x0000000080000afe 778 else if ((mcause == CAUSE_STORE_ACCESS) | (mcause == CAUSE_LOAD_ACCESS) | (mcause == CAUSE_FETCH_ACCESS))
(gdb) si
0x0000000080000b02 778 else if ((mcause == CAUSE_STORE_ACCESS) | (mcause == CAUSE_LOAD_ACCESS) | (mcause == CAUSE_FETCH_ACCESS))
(gdb) si
0x0000000080000b04 778 else if ((mcause == CAUSE_STORE_ACCESS) | (mcause == CAUSE_LOAD_ACCESS) | (mcause == CAUSE_FETCH_ACCESS))
(gdb) si
0x0000000080000b06 778 else if ((mcause == CAUSE_STORE_ACCESS) | (mcause == CAUSE_LOAD_ACCESS) | (mcause == CAUSE_FETCH_ACCESS))
(gdb) si
0x0000000080000b0a 778 else if ((mcause == CAUSE_STORE_ACCESS) | (mcause == CAUSE_LOAD_ACCESS) | (mcause == CAUSE_FETCH_ACCESS))
(gdb) si
0x0000000080000b0c 778 else if ((mcause == CAUSE_STORE_ACCESS) | (mcause == CAUSE_LOAD_ACCESS) | (mcause == CAUSE_FETCH_ACCESS))
(gdb) si
805 i++; /* added some code as SC debugger hangs if in loop doing nothing */
(gdb) si
806 if(i == 0x1000U)
(gdb) si
805 i++; /* added some code as SC debugger hangs if in loop doing nothing */
If the illegal instruction exception is happening on the fscsr instruction at 0x08000030 in the second disassembly listing, then it suggests to me that the target hart doesn't support floating point (F/D extension), or maybe this extension has been disabled via $misa? However, I'm a bit confused because there are three disassembly listings and I'm not sure which one relates to the debug session and trap scenario. The earlier two have a different instruction at 0x08000030 (csrr) from the third (fscsr). And earlier in the thread you said that the trap was happening at 0x08000018, so it's difficult to keep track of what the issue is and what scenario is being exercised, tested and debugged.
I think you probably need to take this up with Microchip customer support. It's not an OpenOCD issue as far as I can see.
The first disassembly listing is from my custom app (the one for which 0x08000030 was flagged as the offending instruction); the second is what I get when I run objdump against the ELF I am loading through OpenOCD (again my custom app). I posted both just to show that OpenOCD is loading the ELF correctly (what is in memory matches what I compiled).
The third disassembly listing is what I get with the Microchip example.
In both cases I get trapped, although the offending instruction differs between the two. I tried the Microchip example just to run some code that is supposed to be tested and working, and noticed that it traps too.
Anyway, thanks for all your help! I'll try to get in touch with Microchip support!
It seems that, regardless of what code you're using, the problem is always an illegal instruction exception. That being the case, I doubt that any code that runs before your program proper (e.g. the bootloader) is relevant, unless perhaps it is disabling features/extensions by changing $misa (e.g. switching off the floating point F/D extensions), thus disabling instructions on which your code depends. That's assuming that the target supports a "dynamic" $misa in the first place. So either your program is using instructions that the target doesn't support (or for which support has been disabled by changing the default $misa), or the instruction fetched from memory is simply invalid or corrupted.
Just a brief update to let you know that I was able to achieve my goal using ebreak. It is probably just a hack, but it may be useful for someone else in the future.
Basically I have created a wait_for_debugger() function:
void wait_for_debugger(HLS_DATA* hls, MODE_CHOICE mode_choice,
                       uint64_t next_addr)
{
    /* Store current hartid */
    uint32_t hartid = read_csr(mhartid);

    /* Restore PLIC to known state */
    __disable_irq();
    PLIC_init();

    /* Disable all interrupts: */
    write_csr(mie, 0);

    switch (mode_choice) {
    default:
    case M_MODE:
        /**
         * User application execution should now start and never return
         * here....
         */
        write_csr(mepc, next_addr);
        break;
    case S_MODE:
        /**
         * User application execution should now start and never return
         * here....
         */
        write_csr(mepc, next_addr);
        break;
    }

    /* Pass hartid and hls to the application in a0 and a1 */
    register unsigned long a0 asm("a0") = hartid;
    register unsigned long a1 asm("a1") = (unsigned long)hls;

    /* Hold for debugger to upload the app */
    __asm__("ebreak");

    /* Jump to the address stored in mepc */
    __asm__ __volatile__("mret" : : "r"(a0), "r"(a1));
    __builtin_unreachable();
}
Of course I know the value of next_addr, since the entry point of the ELF I want to load is the start of the LIM, so I can just hardcode it.
When my bootloader is started from a debugging environment (Trace32 or OpenOCD+GDB), it executes the function and then hits the ebreak, where it waits for the application to be loaded into the LIM.
With GDB I do:
fcuzzocrea@Latitude-5420:~$ riscv64-unknown-elf-gdb build/debug/c3app.elf
GNU gdb (GDB) 10.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "--host=x86_64-pc-linux-gnu --target=riscv64-unknown-elf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Really redefine built-in command "remote"? (y or n) [answered Y; input not from terminal]
Reading symbols from build/debug/c3app.elf...
(gdb) set mem inaccessible-by-default off
(gdb) set $target_riscv=1
(gdb) set arch riscv:rv64
The target architecture is set to "riscv:rv64".
(gdb) target extended-remote localhost:3333
Remote debugging using localhost:3333
0x000000002022149a in ?? ()
(gdb) start
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Function "main" not defined.
Make breakpoint pending on future shared library load? (y or [n]) n
Starting program: /home/fcuzzocrea/Documenti/Progetti/core3_template_app/build/debug/c3app.elf
Disabling abstract command writes to CSRs.
Program received signal SIGTRAP, Trace/breakpoint trap.
0x0000000020221ed8 in ?? ()
(gdb) load
Loading section .text, size 0x22200 lma 0x8000000
Loading section .sdata, size 0x70 lma 0x8022200
Loading section .data, size 0x3930 lma 0x8022270
Loading section .sdram, size 0x1388 lma 0x8025ba0
Start address 0x0000000008000000, load size 159528
Transfer rate: 9 KB/sec, 13294 bytes/write.
(gdb) continue
Continuing.
And my application loads correctly.
As I said, this is probably just a hack and can hardly be integrated into an IDE for interactive debugging, but at least it works and I can use GDB from the CLI.
Also, regarding my DWARF messages: I was able to fix them. I was missing the -gdwarf-2 cflag.
Hello,
Could you please let me know the file and line number where you added the wait_for_debugger() function, as well as where you called it? Thanks a lot for your investigation.
Best regards,