HW breakpoints: "Invalid hex digit 116" message from GDB during OpenTitan Verilator simulation

Open bilgiday opened this issue 3 years ago • 3 comments

My Environment

EDA tool and version:

  • Verilator 4.210 2021-07-07 rev v4.210
  • (openocd) Open On-Chip Debugger 0.11.0
  • (riscv32-unknown-elf-gdb) GNU gdb (crosstool-NG 1.24.0.498_5075e1f) 11.1

Operating system:

    PRETTY_NAME="Debian GNU/Linux rodete"
    NAME="Debian GNU/Linux"
    VERSION_ID="rodete"
    VERSION="12 (rodete)"
    VERSION_CODENAME=rodete
    ID=debian

Version of the Ibex source code: Two versions:

  • Version 1: OpenTitan repo at commit dd14447ef716bb62ce539e48b4841c7f4e1d2ed7, with local modifications

    • In [rtl] Add support for additional HW breakpoints #1125, the DbgHwBreakNum parameter was added to support multiple hardware breakpoints in the Ibex core. I made the following modifications in my local OpenTitan repo to enable 8 breakpoints in the earlgrey configuration of OpenTitan:
    diff --git a/hw/ip/rv_core_ibex/data/rv_core_ibex.hjson b/hw/ip/rv_core_ibex/data/rv_core_ibex.hjson
    index 0f05f52cf..d82feb8b9 100644
    --- a/hw/ip/rv_core_ibex/data/rv_core_ibex.hjson
    +++ b/hw/ip/rv_core_ibex/data/rv_core_ibex.hjson
    @@ -317,6 +317,13 @@
           expose:  "true"
         },
    
    +    { name:    "DbgHwBreakNum"
    +      type:    "int unsigned"
    +      default: "8"
    +      local:   "false"
    +      expose:  "true"
    +    },
    +
         { name:    "SecureIbex"
           type:    "bit"
           default: "0"
    diff --git a/hw/ip/rv_core_ibex/rtl/rv_core_ibex.sv b/hw/ip/rv_core_ibex/rtl/rv_core_ibex.sv
    index 847168636..56ca88c81 100644
    --- a/hw/ip/rv_core_ibex/rtl/rv_core_ibex.sv
    +++ b/hw/ip/rv_core_ibex/rtl/rv_core_ibex.sv
    @@ -29,6 +29,7 @@ module rv_core_ibex
       parameter bit                   ICacheScramble   = 1'b1,
       parameter bit                   BranchPredictor  = 1'b1,
       parameter bit                   DbgTriggerEn     = 1'b1,
    +  parameter int unsigned          DbgHwBreakNum     = 1,
       parameter bit                   SecureIbex       = 1'b1,
       parameter ibex_pkg::lfsr_seed_t RndCnstLfsrSeed  = ibex_pkg::RndCnstLfsrSeedDefault,
       parameter ibex_pkg::lfsr_perm_t RndCnstLfsrPerm  = ibex_pkg::RndCnstLfsrPermDefault,
    @@ -355,6 +356,7 @@ module rv_core_ibex
         .ICacheScramble           ( ICacheScramble           ),
         .BranchPredictor          ( BranchPredictor          ),
         .DbgTriggerEn             ( DbgTriggerEn             ),
    +    .DbgHwBreakNum            ( DbgHwBreakNum            ),
         // SEC_CM: LOGIC.SHADOW
         // SEC_CM: PC.CTRL_FLOW.CONSISTENCY, CTRL_FLOW.UNPREDICTABLE, CORE.DATA_REG_SW.SCA
         // SEC_CM: EXCEPTION.CTRL_FLOW.GLOBAL_ESC, EXCEPTION.CTRL_FLOW.LOCAL_ESC
    diff --git a/hw/top_earlgrey/data/top_earlgrey.hjson b/hw/top_earlgrey/data/top_earlgrey.hjson
    index e9422f146..6ae37d98c 100644
    --- a/hw/top_earlgrey/data/top_earlgrey.hjson
    +++ b/hw/top_earlgrey/data/top_earlgrey.hjson
    @@ -721,6 +721,7 @@
                        ICacheScramble: "1",
                        BranchPredictor: "0",
                        DbgTriggerEn: "1",
    +		   DbgHwBreakNum: "8",
                        SecureIbex: "1",
                        DmHaltAddr: "tl_main_pkg::ADDR_SPACE_RV_DM__ROM + dm::HaltAddress[31:0]",
                        DmExceptionAddr: "tl_main_pkg::ADDR_SPACE_RV_DM__ROM + dm::ExceptionAddress[31:0]",
    
    
  • Version 2: Unmodified, fresh OpenTitan repo supporting only a single hardware breakpoint (OT repo at commit f7dd866865479c79e07a086f073dd44d3a87e13d)

Issue

  • I followed the OpenTitan Verilator Setup guide to build the Verilator executable, build the OpenTitan firmware, and connect to the Verilator simulation via OpenOCD and GDB.
  • After a few commands run without any problems, GDB displays the following message and the simulation freezes: Ignoring packet error, continuing...Invalid hex digit 116. From this point on, the only way to end the simulation is kill -9 <verilator-proc-id> (see the examples below).
  • The same hardware configuration (with 8 hardware breakpoints) runs without any problem on a CW310 FPGA board.
  • I observe similar behavior with the original OpenTitan configuration, which supports a single hardware breakpoint (see GDB example 3 below).
  • Increasing the timeout for remote commands did not solve the issue: set remotetimeout 30.
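
A note on the number in the message: as far as I can tell, GDB reports the offending character as a decimal character code when it fails to parse a hex digit in a remote-protocol reply, and 116 is the ASCII code for `t`. That would suggest some non-hex text reached GDB where hex data was expected. The Python sketch below only illustrates that reading; `from_hex` is a hypothetical helper written for this example, not GDB's actual code.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def from_hex(ch: str) -> int:
    """Convert a single hex digit to its value.

    On failure, raise an error that reports the character code in decimal,
    which is how the "Invalid hex digit 116" message appears to be formed
    (assumption based on the message text).
    """
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    if ch in HEX_DIGITS:
        return int(ch, 16)
    raise ValueError(f"Invalid hex digit {ord(ch)}")


# 116 is the ASCII code of the character that could not be parsed.
print(chr(116))  # t

try:
    from_hex("t")
except ValueError as err:
    print(err)  # Invalid hex digit 116
```

If this reading is right, the interesting question is which reply contained the stray `t`, not the value 116 itself.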

Examples

  • Here are a few example outputs from different debug sessions:
    • verilator command: build-out/hw/sim-verilator/Vchip_sim_tb --meminit=rom,build-bin/sw/device/lib/testing/test_rom/test_rom_sim_verilator.scr.39.vmem --meminit=flash,build-bin/sw/device/tests/otp_ctrl_smoketest_sim_verilator.64.scr.vmem --meminit=otp,build-bin/sw/device/otp_img/otp_img_sim_verilator.vmem

    • openocd command and output: openocd -s util/openocd -f board/lowrisc-earlgrey-verilator.cfg

    Info : only one transport option; autoselect 'jtag'
    Info : Hardware thread awareness created
    force hard breakpoints
    Info : Listening on port 6666 for tcl connections
    Info : Listening on port 4444 for telnet connections
    Info : Initializing remote_bitbang driver
    Info : Connecting to localhost:44853
    Info : remote_bitbang driver initialized
    Info : This adapter doesn't support configurable speed
    Info : JTAG tap: riscv.tap tap/device found: 0x04f5484d (mfg: 0x426 (Google Inc), part: 0x4f54, ver: 0x0)
    Info : datacount=2 progbufsize=8
    Info : Examined RISC-V core; found 1 harts
    Info :  hart 0: XLEN=32, misa=0x40101106
    Info : starting gdb server for riscv.tap.0 on 3333
    Info : Listening on port 3333 for gdb connections
    Info : accepting 'gdb' connection on tcp/3333
    Warn : keep_alive() was not invoked in the 1000 ms timelimit. GDB alive packet not sent! (1043 ms). Workaround: increase "set remotetimeout" in GDB
    Info : [0] Found 8 triggers
    Warn : keep_alive() was not invoked in the 1000 ms timelimit. GDB alive packet not sent! (1083 ms). Workaround: increase "set remotetimeout" in GDB
    Warn : keep_alive() was not invoked in the 1000 ms timelimit. GDB alive packet not sent! (1018 ms). Workaround: increase "set remotetimeout" in GDB
    Warn : keep_alive() was not invoked in the 1000 ms timelimit. GDB alive packet not sent! (1008 ms). Workaround: increase "set remotetimeout" in GDB
    Warn : keep_alive() was not invoked in the 1000 ms timelimit. GDB alive packet not sent! (1003 ms). Workaround: increase "set remotetimeout" in GDB
    Warn : keep_alive() was not invoked in the 1000 ms timelimit. GDB alive packet not sent! (1091 ms). Workaround: increase "set remotetimeout" in GDB
    Warn : keep_alive() was not invoked in the 1000 ms timelimit. GDB alive packet not sent! (1002 ms). Workaround: increase "set remotetimeout" in GDB
    Warn : keep_alive() was not invoked in the 1000 ms timelimit. GDB alive packet not sent! (1059 ms). Workaround: increase "set remotetimeout" in GDB
    
    • GDB example 1: Modified OT, supporting 8 hardware breakpoints
      • gdb command and output: riscv32-unknown-elf-gdb -iex "set remotetimeout 30" -iex "target extended-remote :3333" -iex "info reg" build-bin/sw/device/lib/testing/test_rom/test_rom_sim_verilator.elf
    (gdb) tbreak rom_test_main 
    Temporary breakpoint 1 at 0x823a: file ../sw/device/lib/testing/test_rom/test_rom.c, line 52.
    (gdb) tbreak *0x20000480 ==> ENTRY POINT FOR THE FLASH
    Temporary breakpoint 2 at 0x20000480
    (gdb) j *0x8084 ==> CHIP RESET VECTOR'S ADDRESS
    Continuing at 0x8084.
    
    Temporary breakpoint 1, rom_test_main () at ../sw/device/lib/testing/test_rom/test_rom.c:52
    52	  CHECK_DIF_OK(dif_pinmux_init(
    (gdb) c
    Continuing.
    Ignoring packet error, continuing...
    Warning:
    Cannot insert breakpoint 2: Invalid hex digit 116
    
    Command aborted.
    (gdb)
    
    • GDB example 2: Modified OT, supporting 8 hardware breakpoints:
    (gdb) tbreak rom_test_main 
    Temporary breakpoint 1 at 0x823a: file ../sw/device/lib/testing/test_rom/test_rom.c, line 52.
    (gdb) j *0x8084
    Line 91 is not in `mmio_region_read32'.  Jump anyway? (y or n) y
    Continuing at 0x8084.
    
    Temporary breakpoint 1, rom_test_main () at ../sw/device/lib/testing/test_rom/test_rom.c:52
    52	  CHECK_DIF_OK(dif_pinmux_init(
    (gdb) si
    0x0000823c	52	  CHECK_DIF_OK(dif_pinmux_init(
    (gdb) si
    0x0000823e	52	  CHECK_DIF_OK(dif_pinmux_init(
    (gdb) si
    0x00008240	52	  CHECK_DIF_OK(dif_pinmux_init(
    (gdb) tbreak rom_test_main 
    Temporary breakpoint 2 at 0x823a: file ../sw/device/lib/testing/test_rom/test_rom.c, line 52.
    (gdb) j *0x8084
    Line 91 is not in `rom_test_main'.  Jump anyway? (y or n) y
    Continuing at 0x8084.
    
    Temporary breakpoint 2, rom_test_main () at ../sw/device/lib/testing/test_rom/test_rom.c:52
    52	  CHECK_DIF_OK(dif_pinmux_init(
    (gdb) si
    0x0000823c	52	  CHECK_DIF_OK(dif_pinmux_init(
    (gdb) 
    0x0000823e	52	  CHECK_DIF_OK(dif_pinmux_init(
    (gdb) 
    0x00008240	52	  CHECK_DIF_OK(dif_pinmux_init(
    (gdb) 
    0x00008242	52	  CHECK_DIF_OK(dif_pinmux_init(
    (gdb) 
    0x00008246	52	  CHECK_DIF_OK(dif_pinmux_init(
    (gdb) 
    0x0000824a	52	  CHECK_DIF_OK(dif_pinmux_init(
    (gdb) 
    Ignoring packet error, continuing...
    Invalid hex digit 116
    (gdb) 
    
    • GDB example 3: Unmodified OT, supporting only a single hardware breakpoint:
      (gdb) si
      [0] Found 1 triggers
      mmio_region_write32 (base=..., offset=28, value=48) at /usr/local/google/home/bilgiday/Documents/opentitan/ot-repo3/sw/device/lib/base/mmio.h:145
      145	  ((volatile uint32_t *)base.base)[offset / sizeof(uint32_t)] = value;
      (gdb) tbreak rom_test_main 
      Temporary breakpoint 1 at 0x823a: file ../sw/device/lib/testing/test_rom/test_rom.c, line 52.
      (gdb) j *0x8084
      Line 91 is not in `mmio_region_write32'.  Jump anyway? (y or n) y
      Continuing at 0x8084.
    
      Temporary breakpoint 1, rom_test_main () at ../sw/device/lib/testing/test_rom/test_rom.c:52
      52	  CHECK_DIF_OK(dif_pinmux_init(
      (gdb) si
      0x0000823c	52	  CHECK_DIF_OK(dif_pinmux_init(
      (gdb) si
      0x0000823e	52	  CHECK_DIF_OK(dif_pinmux_init(
      (gdb) 
      0x00008242	52	  CHECK_DIF_OK(dif_pinmux_init(
      (gdb) 
      0x00008246	52	  CHECK_DIF_OK(dif_pinmux_init(
      (gdb) 
      0x0000824a	52	  CHECK_DIF_OK(dif_pinmux_init(
      (gdb) 
      Ignoring packet error, continuing...
      Invalid hex digit 116
      (gdb) 
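
The keep_alive() warnings in the OpenOCD output each report how long OpenOCD went without servicing GDB. All eight values in that log exceed the 1000 ms limit by less than 100 ms. A minimal sketch for pulling these overruns out of a saved OpenOCD log follows; `keep_alive_overruns` is a hypothetical helper, and the sample text reuses lines from the log above.

```python
import re

# Matches the limit and the measured delay in an OpenOCD keep_alive() warning.
KEEP_ALIVE_RE = re.compile(
    r"keep_alive\(\) was not invoked in the (\d+) ms timelimit.*?\((\d+) ms\)"
)


def keep_alive_overruns(log_text: str) -> list:
    """Return, for each warning line, how many ms the limit was exceeded by."""
    overruns = []
    for line in log_text.splitlines():
        match = KEEP_ALIVE_RE.search(line)
        if match:
            limit_ms, actual_ms = int(match.group(1)), int(match.group(2))
            overruns.append(actual_ms - limit_ms)
    return overruns


sample = """\
Warn : keep_alive() was not invoked in the 1000 ms timelimit. GDB alive packet not sent! (1043 ms). Workaround: increase "set remotetimeout" in GDB
Info : [0] Found 8 triggers
Warn : keep_alive() was not invoked in the 1000 ms timelimit. GDB alive packet not sent! (1083 ms). Workaround: increase "set remotetimeout" in GDB
"""

print(keep_alive_overruns(sample))  # [43, 83]
```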
    
    

Question

  • When I run the GDB sequences from the examples on Verilator, I consistently get the same errors. On the FPGA, everything works fine.
  • I am not sure whether something is wrong with my environment (e.g., OpenOCD/GDB settings) or whether there is a problem with how the Verilator model handles requests from GDB/OpenOCD. I opened this thread to discuss the issue and to check whether anyone has an idea about its root cause.
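
One way to narrow this down could be to log the raw remote-protocol traffic in GDB (`set debug remote 1`) and check the packets offline. In the GDB Remote Serial Protocol a packet has the form `$<data>#<checksum>`, where the checksum is the modulo-256 sum of the data bytes, written as two hex digits. The sketch below checks that framing; `rsp_checksum`, `make_packet`, and `check_packet` are hypothetical helper names, and the sample packets are illustrative rather than taken from a captured session.

```python
HEX_BYTES = b"0123456789abcdefABCDEF"


def rsp_checksum(data: bytes) -> int:
    """Modulo-256 sum of the packet data bytes as transmitted."""
    return sum(data) % 256


def make_packet(data: bytes) -> bytes:
    """Frame data as $<data>#<two hex digits>."""
    return b"$" + data + b"#" + f"{rsp_checksum(data):02x}".encode("ascii")


def check_packet(packet: bytes) -> bool:
    """Return True if the packet is well framed and its checksum matches."""
    if len(packet) < 4 or packet[:1] != b"$" or packet[-3:-2] != b"#":
        return False
    data, trailer = packet[1:-3], packet[-2:]
    # Both checksum characters must be hex digits; a stray 't' here would fail.
    if any(byte not in HEX_BYTES for byte in trailer):
        return False
    return rsp_checksum(data) == int(trailer.decode("ascii"), 16)


print(make_packet(b"OK"))           # b'$OK#9a'
print(check_packet(b"$OK#9a"))      # True
print(check_packet(b"$OK#9t"))      # False
```

A reply that fails this check, or whose hex payload contains a non-hex character, would be a candidate for the packet that triggers the error.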

bilgiday, Apr 26 '22 21:04

Thanks for the report. I suspect it has something to do with the Verilator/OpenOCD interface, given that it works fine on FPGA. As a first step, I'll check whether 8 hardware breakpoints work on Ibex Super System (an FPGA system that gives you an Ibex core and debug and not much else; it is far simpler than OT and so easier to debug).

GregAC, Apr 27 '22 16:04

@GregAC - not sure if there were any updates on this, or if this has been resolved in one-to-one communications. When you have a chance, could you check whether this is still a live issue?

johngt, Jul 19 '22 12:07

Quick update: I've had a look at an Ibex Super System build with 8 trigger points on FPGA, and everything appears to work fine. So whatever is happening here is specific to the OT Verilator infrastructure.

GregAC, Sep 22 '22 10:09