Checkpoint files are not generated
Before start
- [x] I have read the XiangShan Documents.
- [x] I have searched the previous issues and did not find anything relevant.
- [x] I have searched the previous discussions and did not find anything relevant.
- [x] I have reproduced the problem using the latest commit on the master branch.
Describe your problem
I am trying to generate checkpoints with NEMU so that I can run them on XiangShan. I followed the documented steps: profiling and clustering both appear to complete, but the checkpoint generation step does not produce any checkpoint files.
What did you do before
Setup tools
git clone https://github.com/OpenXiangShan/xs-env.git
cd /xs-env && sudo -s ./setup-tools.sh && ./setup.sh && source env.sh && source update-submodule.sh
Setup NEMU and simpoint
cd $NEMU_HOME
git submodule update --init
cd $NEMU_HOME/resource/simpoint/simpoint_repo
make clean
make
cd $NEMU_HOME
make clean
make riscv64-xs-cpt_defconfig
make -j 8
cd $NEMU_HOME/resource/gcpt_restore
make
Set an example from nexus-am/apps for checkpoint
cd /xs-env/nexus-am/apps/hello/
Rework hello.c so that the NEMU traps are issued around the workload.
#define DISABLE_TIME_INTR 0x100
#define NOTIFY_PROFILER 0x101
#define GOOD_TRAP 0x0
void nemu_signal(int a){
    asm volatile ("mv a0, %0\n\t"
                  ".insn r 0x6B, 0, 0, x0, x0, x0\n\t"
                  :
                  : "r"(a)
                  : "a0");
}
#include <klib.h>
int main()
{
    nemu_signal(DISABLE_TIME_INTR);
    nemu_signal(NOTIFY_PROFILER);
    printf("Hello, XiangShan!\n");
    nemu_signal(GOOD_TRAP);
    return 0;
}
Compile hello
make ARCH=riscv64-xs
Run the checkpoint steps
I used the following script.
#!/bin/bash
# prepare env
export NEMU_HOME=/xs-env/NEMU
export NEMU=$NEMU_HOME/build/riscv64-nemu-interpreter
export GCPT=$NEMU_HOME/resource/gcpt_restore/build/gcpt.bin
export SIMPOINT=$NEMU_HOME/resource/simpoint/simpoint_repo/bin/simpoint
export WORKLOAD_ROOT_PATH=/xs-env/nexus-am/apps/hello/build/
export LOG_PATH=$NEMU_HOME/hello/logs
export RESULT=$NEMU_HOME/hello_result
export profiling_result_name=simpoint-profiling
export PROFILING_RES=$RESULT/$profiling_result_name
export interval=$((2))
# Profiling
# using config: riscv64-xs-cpt_defconfig
profiling(){
    set -x
    workload=$1
    log=$LOG_PATH/profiling_logs
    mkdir -p $log
    $NEMU ${WORKLOAD_ROOT_PATH}/${workload}.bin \
        -D $RESULT -w $workload -C $profiling_result_name \
        -b --simpoint-profile --cpt-interval ${interval} > $log/${workload}-out.txt 2>${log}/${workload}-err.txt
}
export -f profiling
# Cluster
cluster(){
    set -x
    workload=$1
    export CLUSTER=$RESULT/cluster/${workload}
    mkdir -p $CLUSTER
    random1=`head -20 /dev/urandom | cksum | cut -c 1-6`
    random2=`head -20 /dev/urandom | cksum | cut -c 1-6`
    log=$LOG_PATH/cluster_logs/cluster
    mkdir -p $log
    $SIMPOINT \
        -loadFVFile $PROFILING_RES/${workload}/simpoint_bbv.gz \
        -saveSimpoints $CLUSTER/simpoints0 -saveSimpointWeights $CLUSTER/weights0 \
        -inputVectorsGzipped -maxK 30 -numInitSeeds 2 -iters 1000 -seedkm ${random1} -seedproj ${random2} \
        > $log/${workload}-out.txt 2> $log/${workload}-err.txt
}
export -f cluster
# Checkpointing
# using config: riscv64-xs-cpt_defconfig
checkpoint(){
    set -x
    workload=$1
    export CLUSTER=$RESULT/cluster
    log=$LOG_PATH/checkpoint_logs
    mkdir -p $log
    $NEMU ${WORKLOAD_ROOT_PATH}/${workload}.bin \
        -D $RESULT -w ${workload} -C spec-cpt \
        -b -S $CLUSTER --cpt-interval $interval \
        --checkpoint-format zstd > $log/${workload}-out.txt 2>$log/${workload}-err.txt
}
export -f checkpoint
profiling hello-riscv64-xs
cluster hello-riscv64-xs
checkpoint hello-riscv64-xs
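A sanity check between the cluster and checkpoint steps could help narrow down where the flow stops. The following is only a sketch: `check_cluster_output` is a hypothetical helper, not part of NEMU or SimPoint, and it assumes that `simpoints0` and `weights0` each hold one line per selected simulation point, so their line counts should match.

```shell
#!/bin/bash
# Hypothetical helper (not part of NEMU or SimPoint): verify that the
# clustering step left usable output before starting the checkpoint step.
# Assumption: simpoints0 and weights0 contain one line per selected point.
check_cluster_output() {
    local dir=$1
    local sp=$dir/simpoints0
    local wt=$dir/weights0
    # -s is true only if the file exists and is not empty
    if [ ! -s "$sp" ] || [ ! -s "$wt" ]; then
        echo "missing or empty simpoints0/weights0 in $dir"
        return 1
    fi
    local n_sp n_wt
    n_sp=$(wc -l < "$sp" | tr -d ' ')
    n_wt=$(wc -l < "$wt" | tr -d ' ')
    if [ "$n_sp" -ne "$n_wt" ]; then
        echo "line count mismatch: $n_sp simpoints vs $n_wt weights"
        return 1
    fi
    echo "$n_sp simpoint(s) selected"
}
```

With the variables from the script above, it could be called as `check_cluster_output "$RESULT/cluster/hello-riscv64-xs"` right before `checkpoint hello-riscv64-xs`.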
The files I see generated
tree NEMU/hello*
NEMU/hello
`-- logs
    |-- checkpoint_logs
    |   |-- hello-riscv64-xs-err.txt
    |   `-- hello-riscv64-xs-out.txt
    |-- cluster_logs
    |   `-- cluster
    |       |-- hello-riscv64-xs-err.txt
    |       `-- hello-riscv64-xs-out.txt
    `-- profiling_logs
        |-- hello-riscv64-xs-err.txt
        `-- hello-riscv64-xs-out.txt
NEMU/hello_result
|-- cluster
|   `-- hello-riscv64-xs
|       |-- simpoints0
|       `-- weights0
|-- simpoint-profiling
|   `-- hello-riscv64-xs
|       `-- simpoint_bbv.gz
`-- spec-cpt
    `-- hello-riscv64-xs
        `-- 1
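The same check can be scripted rather than read off the tree output. This is a minimal sketch; `count_cpt_files` and `report_cpt_files` are hypothetical helper names, and the only assumption is that a successful run leaves at least one regular file somewhere under the `spec-cpt` directory.

```shell
#!/bin/bash
# Hypothetical helper: count regular files anywhere under a directory.
# Prints 0 if the directory does not exist.
count_cpt_files() {
    local dir=$1
    if [ ! -d "$dir" ]; then
        echo 0
        return
    fi
    find "$dir" -type f | wc -l | tr -d ' '
}

# Hypothetical helper: report whether any checkpoint files were produced.
# Returns non-zero when the directory tree holds no regular files.
report_cpt_files() {
    local dir=$1
    local n
    n=$(count_cpt_files "$dir")
    if [ "$n" -eq 0 ]; then
        echo "no checkpoint files under $dir"
        return 1
    fi
    echo "$n checkpoint file(s) under $dir"
}
```

For the layout above, `report_cpt_files NEMU/hello_result/spec-cpt` would distinguish a tree that contains only directories from one that actually holds checkpoint files.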
Environment
- XiangShan branch: master
- XiangShan commit id: 4bbdccbb077840af5e1b65c7138d31af3966f625
- NEMU commit id: 4a24b77a61505e34745667b1ad712a817b090cf8
- SPIKE commit id:
- Operating System: Ubuntu 22.04
- gcc version: 11.4.0
- mill version: 0.12.10
- java version: 11.0.26
Additional context
I also tried this with the stream application (nexus-am/apps/stream), which has been used in some tutorials such as ASPLOS 2025, and ran into the same problem.
I think the cause of this issue is that the app in nexus-am runs entirely in M-mode, and NEMU requires the --cpt-mmode option to generate checkpoints in M-mode.
While this option does allow checkpoint generation, the generated checkpoints cannot be used for restore with emu, because resource/gcpt_restore in NEMU does not support restoring M-mode checkpoints.
Therefore, my suggestion is to wrap the workload you want to checkpoint inside OpenSBI and Linux, and run it as a user-space program.
Additionally, the stream used in the tutorial you mentioned is not built directly from the stream in nexus-am. As explained above, it is packaged as a user-space program under Linux.
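For completeness, here is a rough sketch of where `--cpt-mmode` would slot into the checkpoint step of the script in this issue. It only prints the command instead of running NEMU. `build_checkpoint_cmd` is a hypothetical helper, the fallback values after `:-` are placeholders, and the sketch assumes the option takes no argument and can sit next to the other options. The caveat above still applies: checkpoints produced this way cannot be restored with emu.

```shell
#!/bin/bash
# Hypothetical dry-run helper: print the checkpoint command with --cpt-mmode
# added, without invoking NEMU.
# Assumption: --cpt-mmode takes no argument and its position among the
# other options does not matter.
build_checkpoint_cmd() {
    local workload=$1
    local result=${RESULT:-./result}   # placeholder default
    local cmd=(
        "${NEMU:-riscv64-nemu-interpreter}"
        "${WORKLOAD_ROOT_PATH:-.}/${workload}.bin"
        -D "$result" -w "$workload" -C spec-cpt
        -b -S "$result/cluster" --cpt-interval "${interval:-2}"
        --cpt-mmode
        --checkpoint-format zstd
    )
    # Print one space-separated line so the command can be inspected first
    printf '%s ' "${cmd[@]}"
    echo
}
```

Calling `build_checkpoint_cmd hello-riscv64-xs` prints the full command line, which can be compared against the `checkpoint` function before changing the real script.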