RocketQA
Why does training fail with this error when running your training script example.py?
Training file: example.py
Error message:
Traceback (most recent call last):
File "/opt/qa/RocketQA/examples/example.py", line 66, in
Hello, I can't train the model; running the script raises an error.
You could try setting device_id to 0.
I'm on CUDA 11.2 with cuDNN version 8.1 and get this error:
Error Message Summary:
FatalError: Segmentation fault
is detected by the operating system.
[TimeInfo: *** Aborted at 1655791849 (unix time) try "date -d @1655791849" if you are using GNU date ***]
[SignalInfo: *** SIGSEGV (@0x8497) received by PID 1872 (TID 0x7f275a7fc700) from PID 33943 ***]
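From what I've read online, this is probably because the model doesn't support cuDNN version 8 yet. Exhausting.
Are there any plans to fix this?
Running it in a container and changing device_id to 0 solves the problem.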
docker pull paddlepaddle/paddle:2.3.1-gpu-cuda11.2-cudnn8
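For anyone landing here later, this is a rough sketch of what the device_id change could look like inside a training script. It is not the actual examples/example.py: the model name, batch size, data path, and hyperparameters are assumptions in the style of the RocketQA README, and `load_model_kwargs` and `train_on_first_gpu` are hypothetical helper names made up for illustration.

```python
def load_model_kwargs(device_id=0):
    """Build keyword arguments for rocketqa.load_model.

    device_id=0 selects the first GPU, which is the only valid index
    on a single-GPU machine or in a container that exposes one GPU.
    """
    return {
        "model": "zh_dureader_de",  # assumed model; use the one from your own script
        "use_cuda": True,
        "device_id": device_id,
        "batch_size": 32,           # assumed value
    }


def train_on_first_gpu():
    # Imported lazily so the helper above can be inspected without rocketqa installed.
    import rocketqa

    dual_encoder = rocketqa.load_model(**load_model_kwargs(device_id=0))
    dual_encoder.train(
        "./examples/data/dual.train.tsv",  # assumed training data path
        2,                                 # number of epochs (assumed)
        "de_models",                       # directory for saved checkpoints (assumed)
        save_steps=1000,
        learning_rate=1e-5,
        log_folder="log_de",
    )
```

Call `train_on_first_gpu()` from inside the container once the image above is pulled and the GPU is visible to it. Whether the segfault is caused by the device index, the cuDNN version, or both is not established in this thread; the sketch only shows the setting that was reported to work.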