File "/content/ChatDoctor/train_lora.py", line 17, in <module>
from peft import ( # noqa: E402
ImportError: cannot import name 'BottleneckConfig' from 'peft' (/usr/local/lib/python3.10/dist-packages/peft/__init__.py)
Traceback (most recent call last):
File "/content/ChatDoctor/train_lora.py", line 17, in <module>
from peft import ( # noqa: E402
ImportError: cannot import name 'BottleneckConfig' from 'peft' (/usr/local/lib/python3.10/dist-packages/peft/__init__.py)
Traceback (most recent call last):
File "/content/ChatDoctor/train_lora.py", line 17, in <module>
from peft import ( # noqa: E402
ImportError: cannot import name 'BottleneckConfig' from 'peft' (/usr/local/lib/python3.10/dist-packages/peft/__init__.py)
Traceback (most recent call last):
File "/content/ChatDoctor/train_lora.py", line 17, in <module>
from peft import ( # noqa: E402
ImportError: cannot import name 'BottleneckConfig' from 'peft' (/usr/local/lib/python3.10/dist-packages/peft/__init__.py)
Traceback (most recent call last):
File "/content/ChatDoctor/train_lora.py", line 17, in <module>
from peft import ( # noqa: E402
ImportError: cannot import name 'BottleneckConfig' from 'peft' (/usr/local/lib/python3.10/dist-packages/peft/__init__.py)
Traceback (most recent call last):
File "/content/ChatDoctor/train_lora.py", line 17, in <module>
from peft import ( # noqa: E402
ImportError: cannot import name 'BottleneckConfig' from 'peft' (/usr/local/lib/python3.10/dist-packages/peft/__init__.py)
[2024-02-09 11:16:17,756] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 17760) of binary: /usr/bin/python3
Traceback (most recent call last):
File "/usr/local/bin/torchrun", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/run.py", line 806, in main
run(args)
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/run.py", line 797, in run
elastic_launch(
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
/content/ChatDoctor/train_lora.py FAILED
Failures:
[1]:
time : 2024-02-09_11:16:17
host : 10ca0fca3068
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 17761)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[2]:
time : 2024-02-09_11:16:17
host : 10ca0fca3068
rank : 2 (local_rank: 2)
exitcode : 1 (pid: 17762)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[3]:
time : 2024-02-09_11:16:17
host : 10ca0fca3068
rank : 3 (local_rank: 3)
exitcode : 1 (pid: 17763)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[4]:
time : 2024-02-09_11:16:17
host : 10ca0fca3068
rank : 4 (local_rank: 4)
exitcode : 1 (pid: 17764)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[5]:
time : 2024-02-09_11:16:17
host : 10ca0fca3068
rank : 5 (local_rank: 5)
exitcode : 1 (pid: 17765)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
Root Cause (first observed failure):
[0]:
time : 2024-02-09_11:16:17
host : 10ca0fca3068
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 17760)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html