PaddleSeg
FatalError: `Segmentation fault` is detected by the operating system.
Search before asking
- [X] I have searched the open and closed issues and found no similar bug report.
Describe the Bug
While following the demo at https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.6/docs/quick_start_cn.md, both python train.py and python val.py run normally, but python predict.py fails with the following error:
C++ Traceback (most recent call last):
0 ImagingZipEncode
1 deflateReset
Error Message Summary:
FatalError: Segmentation fault is detected by the operating system.
[TimeInfo: *** Aborted at 1707286977 (unix time) try "date -d @1707286977" if you are using GNU date ***]
[SignalInfo: *** SIGSEGV (@0x0) received by PID 4193083 (TID 0x7f6cb3650480) from PID 0 ***]
Segmentation fault (core dumped)
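After debugging, I confirmed that the error is raised when the pred_mask.save(pred_saved_path) statement at the end of the script executes. Further debugging showed that the crash occurs while executing the statement errcode, data = encoder.encode(bufsize)[1:] in ImageFile.py.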
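One way to confirm the failing Python frame without stepping through a debugger is the standard-library faulthandler module, which dumps the Python traceback when the process receives SIGSEGV. This is a minimal sketch; the helper name enable_crash_traceback and the log path are hypothetical and not part of PaddleSeg.

```python
import faulthandler


def enable_crash_traceback(log_path):
    """Dump the Python traceback of all threads to log_path on a fatal signal.

    Hypothetical helper for illustration; call it once near the top of the
    script that crashes (for example before the prediction loop starts).
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    # The file must stay open for as long as faulthandler is enabled,
    # so the handle is returned to the caller instead of being closed here.
    return log_file
```

Running the script as python -X faulthandler predict.py (followed by its usual arguments) has a similar effect and writes the traceback to stderr, without any code change.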
Environment
------------Environment Information-------------
platform: Linux-5.19.0-50-generic-x86_64-with-glibc2.35
Python: 3.9.18 (main, Sep 11 2023, 13:41:44) [GCC 11.2.0]
Paddle compiled with cuda: True
NVCC: Build cuda_11.7.r11.7/compiler.31294372_0
cudnn: 8.4
GPUs used: 1
CUDA_VISIBLE_DEVICES: 0
GPU: ['GPU 0: NVIDIA RTX', 'GPU 1: NVIDIA RTX', 'GPU 2: NVIDIA RTX', 'GPU 3: NVIDIA RTX']
GCC: gcc (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
PaddleSeg: 2.9.0
PaddlePaddle: 2.6.0
OpenCV: 4.5.5
Paddle was installed with conda: conda install paddlepaddle-gpu==2.6.0 cudatoolkit=11.6 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/Paddle/ -c conda-forge
Bug description confirmation
- [X] I confirm that the bug reproduction steps, a description of any code changes, and the environment information have been provided, and that the problem can be reproduced.
Are you willing to submit a PR?
- [ ] I'd like to help by submitting a PR!
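Hello. Based on the information you provided, this error occurs while the file is being saved. Please check whether pred_mask contains any invalid values.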
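The check suggested above could look roughly like this. It is a sketch under stated assumptions: that pred_mask can be converted with np.asarray (a PIL image or a NumPy array both can), and that valid labels lie in 0..255 so they fit an 8-bit palette PNG. The function name check_mask and the num_classes default are hypothetical, not part of PaddleSeg.

```python
import numpy as np


def check_mask(mask, num_classes=256):
    """Return a list of problems found in a predicted label mask (empty if none).

    Hypothetical helper; num_classes=256 assumes labels must fit in 8 bits.
    """
    arr = np.asarray(mask)
    problems = []
    if arr.ndim != 2:
        problems.append(f"expected a 2-D mask, got shape {arr.shape}")
    if arr.size == 0:
        problems.append("mask is empty")
        return problems
    if not np.issubdtype(arr.dtype, np.integer):
        # Float masks can carry NaN/Inf, which no image encoder can represent.
        if np.issubdtype(arr.dtype, np.floating) and not np.isfinite(arr).all():
            problems.append("mask contains NaN or Inf")
        problems.append(f"expected an integer dtype, got {arr.dtype}")
    else:
        lo, hi = int(arr.min()), int(arr.max())
        if lo < 0 or hi >= num_classes:
            problems.append(
                f"label values {lo}..{hi} fall outside 0..{num_classes - 1}"
            )
    return problems
```

Calling print(check_mask(pred_mask)) immediately before pred_mask.save(pred_saved_path) shows whether the mask itself is suspect. An empty list suggests the data is fine and the crash originates elsewhere in the save path.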
I ran into a similar issue when training my own model, HrSegNet. Training works on Windows but not on Linux, and my Linux environment is the same as yours. I'm not sure what causes the issue.