InternEvo

[Bug] Error when fine-tuning InternLM on Ascend 910

Open rourouZ opened this issue 1 year ago • 3 comments

Describe the bug

Traceback (most recent call last):
  File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/pool.py", line 131, in worker
    put((job, i, result))
  File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/queues.py", line 368, in put
    self._writer.send_bytes(obj)
  File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
    self._send(header + buf)
  File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/pool.py", line 131, in worker
    put((job, i, result))
  File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
  File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/queues.py", line 368, in put
    self._writer.send_bytes(obj)
  File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
BrokenPipeError: [Errno 32] Broken pipe
  File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
    self._send(header + buf)

During handling of the above exception, another exception occurred:

  File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
Traceback (most recent call last):
BrokenPipeError: [Errno 32] Broken pipe

Environment

python==3.8 torch==2.0.1

Other information

No response

rourouZ avatar Apr 23 '24 12:04 rourouZ

Hello @rourouZ, it looks like the error stack printed by torch_npu does not contain much useful information. The environment we use for adapting to Huawei NPUs is:

        torch: 2.1.0+cpu
        torch_npu: 2.1.0.post3+git7c4136d
        cann: 8.0.RC1.alpha003

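You could try running with this environment; it should work, based on my testing. If you have any questions, you can also @ me in the InternLM discussion group.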
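A minimal sketch for comparing a local Python environment against the torch and torch_npu versions listed above. The helper names `installed_version` and `compare_environment` are hypothetical, and it is an assumption that the packages are installed under the distribution names `torch` and `torch_npu`. CANN is not a Python package, so it is not covered here and would need to be checked separately.

```python
from importlib import metadata  # in the standard library since Python 3.8

# Versions quoted in the reply above; the distribution names are assumed.
REFERENCE = {
    "torch": "2.1.0+cpu",
    "torch_npu": "2.1.0.post3+git7c4136d",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def compare_environment(reference, lookup=installed_version):
    """Map each package name to a tuple (expected, found, matches)."""
    report = {}
    for name, expected in reference.items():
        found = lookup(name)
        report[name] = (expected, found, found == expected)
    return report


if __name__ == "__main__":
    for name, (expected, found, ok) in compare_environment(REFERENCE).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: expected {expected}, found {found} [{status}]")
```

Run with the environment from the report (torch==2.0.1), this would flag torch as a mismatch. An exact string comparison is deliberately strict; a looser check on the major and minor version may be enough in practice.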

SolenoidWGT avatar Apr 25 '24 03:04 SolenoidWGT

Could you please share the NPU image that runs successfully? Thanks!

weiliangxiong avatar May 16 '24 05:05 weiliangxiong

Could you please share the NPU image that runs successfully? Thanks!

You can try this one: docker pull internlm/opencompass:opencompass-20240607

li126com avatar Jun 27 '24 10:06 li126com