
[Bug] k230 utest opencv 19_getStructuringElement fails at runtime

Open yixinghua121 opened this issue 11 months ago • 4 comments

RT-Thread Version

c1166e0bf16aeda85e5facd01506973afcf6e8c6

Hardware Type/Architectures

k230

Develop Toolchain

GCC

Describe the bug

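1. Copy the tests folder from the opencv repository into software/userapps/testcases/opencv.
2. Copy the build/install folder from the opencv repository into software/userapps/testcases/opencv.
3. Build:
   cd software                      # enter the software directory
   source ./smart-env.sh riscv64    # run smart-env.sh to configure the riscv64 environment
   cd userapps/testcases/opencv
   python3 build.py
4. The build artifacts are placed in software/userapps/testcases/opencv/root.
5. The root directory can be built and packaged directly by following the readme in the maix3 repository, then downloaded to the Maix3 platform and run. (Note: when packaging with romfs, package only one elf file at a time.)

Attachment: tests.tar.gz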

Other additional context

25_opencv_calib3d.elf: the depth information shown in the displayed image is wrong

yixinghua121 · Jan 10 '25 07:01

Making lwp_munmap return immediately avoids the crash, so the problem can be roughly narrowed down to somewhere between lwp_mmap2 and lwp_munmap.


heyuanjie87 · Jan 15 '25 13:01

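Some helpful debugging information has now been gathered for this issue. The faulting address belongs to a region that malloc allocates by calling mmap; the first 64 bytes are used by malloc's management algorithm and the remainder is used by the caller. The relevant musl code is: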

void *malloc(size_t n)
{
	struct chunk *c;
	int i, j;

	if (adjust_size(&n) < 0) return 0;

	if (n > MMAP_THRESHOLD) {
		size_t len = n + OVERHEAD + PAGE_SIZE - 1 & -PAGE_SIZE;
		char *base = __mmap(0, len, PROT_READ|PROT_WRITE,
			MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
		if (base == (void *)-1) return 0;
		c = (void *)(base + SIZE_ALIGN - OVERHEAD);
		c->csize = len - (SIZE_ALIGN - OVERHEAD);
		c->psize = SIZE_ALIGN - OVERHEAD;
		return CHUNK_TO_MEM(c);
	}

	/* ... non-mmap (bin-based) allocation path omitted from this excerpt ... */
}

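This shows that the first 64 bytes are always written first, so in theory the first page-fault exception should occur within those 64 bytes. In practice it does not: on the problematic page, the page-fault address lies outside the first 64 bytes (that is, the first write succeeded while the physical page was not yet mapped).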
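As a rough cross-check of the reasoning above, the sketch below recomputes where the header stores land for an mmap-backed chunk. The struct and macros are assumed to mirror musl's oldmalloc definitions (they are not quoted in this issue), a 64-bit target with 4 KiB pages is assumed, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror musl's oldmalloc layout; verify against your musl tree. */
struct chunk {
    size_t psize, csize;
    struct chunk *next, *prev;
};
#define SIZE_ALIGN (4 * sizeof(size_t))
#define OVERHEAD   (2 * sizeof(size_t))
#define SKETCH_PAGE_SIZE ((size_t)4096) /* assumed page size */

/* Length malloc would request from mmap for a large allocation of n bytes. */
size_t mmap_len(size_t n)
{
    return (n + OVERHEAD + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Address touched by the first header store in the excerpt, c->csize = ... */
uintptr_t first_header_write(uintptr_t base)
{
    assert(base % SKETCH_PAGE_SIZE == 0); /* mmap returns page-aligned addresses */
    uintptr_t c = base + SIZE_ALIGN - OVERHEAD;
    return c + offsetof(struct chunk, csize);
}

/* Address handed back to the caller, i.e. CHUNK_TO_MEM(c). */
uintptr_t user_pointer(uintptr_t base)
{
    uintptr_t c = base + SIZE_ALIGN - OVERHEAD;
    return c + OVERHEAD;
}
```

Under these assumptions `first_header_write(0x618000)` evaluates to `0x618018`, which is the address the log reports as the normal first fault for that mapping.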

The log follows:

/* Print the page-fault address */
int rt_aspace_fault_try_fix(rt_aspace_t aspace, struct rt_aspace_fault_msg *msg)
{
    int err = MM_FAULT_FIXABLE_FALSE;
    uintptr_t va = (uintptr_t)msg->fault_vaddr;
    if (va >= 0x618000 && va < 0x100000000) /* filter out addresses we don't care about */
    {
        rt_kprintf("fault addr %p\n", (void *)va);
    }
    /* ... rest of the original function omitted ... */
}
mmap addr = 0x0000000000618000, 41000 /* allocated address, printed inside the mmap system call */
fault addr 0x0000000000618018 /* normal case */
fault addr 0x0000000000658642

mmap addr = 0x00000000007da000, 181000
fault addr 0x00000000007da018 /* normal case */

mmap addr = 0x0000000000618000, 41000
fault addr 0x0000000000618243  /* page-fault address outside the expected range */
fault addr 0x0000000000619000
munmap addr = 0x0000000000618030 length = 0[I/mm.aspace] rt_aspace_unmap_range(addr=0x0000000000618030): Unaligned address
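The check applied by eye to this log (is the first fault after each mmap inside the 64-byte header?) can be sketched as a small tracker. This is an illustrative helper, not RT-Thread code: every name below is hypothetical, the 64-byte header size is taken from the description above, and the lengths in the log are assumed to be hexadecimal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HEADER_BYTES 64u /* header size as stated in the issue */

enum fault_kind {
    FAULT_IGNORED,              /* not the first fault of the tracked region */
    FAULT_FIRST_IN_HEADER,      /* expected: header is written first */
    FAULT_FIRST_OUTSIDE_HEADER  /* unexpected: header write did not fault */
};

struct region_tracker {
    uintptr_t base;
    size_t len;
    bool first_fault_pending;
};

/* Call at the point where the mmap system call logs its result. */
void tracker_on_mmap(struct region_tracker *t, uintptr_t base, size_t len)
{
    t->base = base;
    t->len = len;
    t->first_fault_pending = true;
}

/* Call at the point where the page-fault address is logged. */
enum fault_kind tracker_on_fault(struct region_tracker *t, uintptr_t va)
{
    if (!t->first_fault_pending || va < t->base || va - t->base >= t->len)
        return FAULT_IGNORED;
    t->first_fault_pending = false;
    return (va - t->base < HEADER_BYTES) ? FAULT_FIRST_IN_HEADER
                                         : FAULT_FIRST_OUTSIDE_HEADER;
}
```

Fed the sequence from the log, the first two mappings classify as `FAULT_FIRST_IN_HEADER` (0x618018 and 0x7da018), while the reused mapping at 0x618000 classifies as `FAULT_FIRST_OUTSIDE_HEADER` for 0x618243.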

heyuanjie87 · Mar 05 '25 07:03

Cleaning the dcache on exit makes the unaligned-address problem go away, but on the second run another, more serious problem appears.
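For context on the unaligned-address symptom: the `munmap addr = 0x0000000000618030 length = 0` line in the earlier log is what the free path would compute if the chunk header read back as zero. The sketch below is an interpretation, not something the issue confirms. The arithmetic is assumed to follow musl oldmalloc's `unmap_chunk`, the type and function names are hypothetical, and the chunk address and header values are chosen only to be consistent with the log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Header fields as read back from memory when free() runs. */
struct chunk_header {
    size_t psize; /* mmap-backed chunk: offset from mapping base to the header */
    size_t csize; /* chunk size; bit 0 is a flag, masked off below */
};

struct munmap_args {
    uintptr_t addr;
    size_t len;
};

/* Arithmetic assumed to mirror musl oldmalloc's unmap_chunk(). */
struct munmap_args unmap_args(uintptr_t chunk_addr, struct chunk_header h)
{
    struct munmap_args a;
    size_t extra = h.psize;
    a.addr = chunk_addr - extra;            /* start of the mapping */
    a.len = (h.csize & ~(size_t)1) + extra; /* length of the mapping */
    return a;
}
```

With an intact header at 0x618030 (psize = 0x30, csize = 0x40fd0) the result is the page-aligned mapping (0x618000, 0x41000). With both fields zero it is (0x618030, 0), matching the logged arguments. This arithmetic does not settle which mechanism lost the header contents.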


heyuanjie87 · Mar 10 '25 05:03

https://github.com/RT-Thread/rt-thread/pull/10100

With a hint from DeepSeek, the problem was located in five seconds.

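DeepSeek-V3
The advantages and disadvantages of a CPU data cache (DCache) that uses the PIPT (Physically Indexed, Physically Tagged) policy are as follows:

/* The rest was deleted; only the key hint below is kept */
Disadvantages
Increased latency:
    The virtual-to-physical address translation must complete first, which adds access latency.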
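To make the quoted hint concrete, here is a minimal sketch of a PIPT lookup. It is a toy model only: the cache geometry, the one-page mapping struct, and all names are hypothetical and are not taken from the k230 or from RT-Thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical geometry for illustration; not the k230's actual cache. */
#define LINE_BYTES 64u
#define NUM_SETS   128u
#define PAGE_BYTES 4096u

struct cache_slot {
    uint64_t set;
    uint64_t tag;
};

/* Toy translation: one virtual page number mapped to one physical page number. */
struct mapping {
    uint64_t vpn;
    uint64_t ppn;
};

uint64_t translate(struct mapping m, uint64_t va)
{
    assert(va / PAGE_BYTES == m.vpn); /* the toy model only knows one page */
    return m.ppn * PAGE_BYTES + va % PAGE_BYTES;
}

/* PIPT: both the set index and the tag are derived from the physical address. */
struct cache_slot pipt_slot(uint64_t pa)
{
    struct cache_slot s;
    s.set = (pa / LINE_BYTES) % NUM_SETS;
    s.tag = pa / LINE_BYTES / NUM_SETS;
    return s;
}
```

Two virtual addresses that translate to the same physical address land in the same set with the same tag, so the lookup cannot start until translation has produced the physical address.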

Below is the k230 test log after the fix (no error messages):

msh />19_getStructuringElement.elf
msh />[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!

msh />19_getStructuringElement.elf
msh />[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!

msh />19_getStructuringElement.elf
msh />[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!

msh />
msh />19_getStructuringElement.elf
msh />[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!
[W/time] Cannot find a RTC device!

heyuanjie87 · Mar 10 '25 10:03