Hello, thank you very much for the code you provided. I get the following error when running it. How can I solve it?
Open • ThelilinNB opened this issue 1 year ago • 0 comments
[INFO: 2023-07-30 00:36:50,416] Model cotnet50 created, flops_count: 3.29 GMac, param count: 22.22 M
[INFO: 2023-07-30 00:36:50,474] AMP not enabled. Training in float32.
[INFO: 2023-07-30 00:36:50,474] Using native Torch DistributedDataParallel.
Traceback (most recent call last):
File "/data/master21/lipl/CoTNet-master/train.py", line 379, in <module>
main()
File "/data/master21/lipl/CoTNet-master/train.py", line 321, in main
loader_train, mixup_active, mixup_fn = setup_loader(data_config)
File "/data/master21/lipl/CoTNet-master/train.py", line 145, in setup_loader
assert os.path.exists(train_dir)
AssertionError
Traceback (most recent call last):
File "/data/master21/lipl/CoTNet-master/train.py", line 379, in <module>
main()
File "/data/master21/lipl/CoTNet-master/train.py", line 321, in main
loader_train, mixup_active, mixup_fn = setup_loader(data_config)
File "/data/master21/lipl/CoTNet-master/train.py", line 145, in setup_loader
assert os.path.exists(train_dir)
AssertionError
Traceback (most recent call last):
File "/data/master21/lipl/CoTNet-master/train.py", line 379, in <module>
main()
File "/data/master21/lipl/CoTNet-master/train.py", line 321, in main
loader_train, mixup_active, mixup_fn = setup_loader(data_config)
File "/data/master21/lipl/CoTNet-master/train.py", line 145, in setup_loader
assert os.path.exists(train_dir)
AssertionError
Traceback (most recent call last):
File "/data/master21/lipl/CoTNet-master/train.py", line 379, in <module>
main()
File "/data/master21/lipl/CoTNet-master/train.py", line 321, in main
loader_train, mixup_active, mixup_fn = setup_loader(data_config)
File "/data/master21/lipl/CoTNet-master/train.py", line 145, in setup_loader
assert os.path.exists(train_dir)
AssertionError
Traceback (most recent call last):
File "/data/master21/lipl/CoTNet-master/train.py", line 379, in <module>
main()
File "/data/master21/lipl/CoTNet-master/train.py", line 321, in main
loader_train, mixup_active, mixup_fn = setup_loader(data_config)
File "/data/master21/lipl/CoTNet-master/train.py", line 145, in setup_loader
assert os.path.exists(train_dir)
AssertionError
Traceback (most recent call last):
File "/data/master21/lipl/CoTNet-master/train.py", line 379, in <module>
main()
File "/data/master21/lipl/CoTNet-master/train.py", line 321, in main
loader_train, mixup_active, mixup_fn = setup_loader(data_config)
File "/data/master21/lipl/CoTNet-master/train.py", line 145, in setup_loader
assert os.path.exists(train_dir)
AssertionError
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 59360) of binary: /home/lipl/anaconda3/envs/dot/bin/python
Traceback (most recent call last):
File "/home/lipl/anaconda3/envs/dot/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/lipl/anaconda3/envs/dot/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/lipl/anaconda3/envs/dot/lib/python3.9/site-packages/torch/distributed/launch.py", line 193, in <module>
main()
File "/home/lipl/anaconda3/envs/dot/lib/python3.9/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/home/lipl/anaconda3/envs/dot/lib/python3.9/site-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/home/lipl/anaconda3/envs/dot/lib/python3.9/site-packages/torch/distributed/run.py", line 710, in run
elastic_launch(
File "/home/lipl/anaconda3/envs/dot/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/lipl/anaconda3/envs/dot/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 259, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
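Every worker stops at the same statement, assert os.path.exists(train_dir) in setup_loader (train.py, line 145), which means the path held in train_dir does not exist, or is not reachable, on the machine running the job. The ChildFailedError from the launcher only reports that the workers exited. The log does not show how train_dir is built, so the following is only a rough sketch of a check that could be run before launching training. It assumes the dataset root contains train and val subfolders; data_root, check_dataset_dirs and format_report are hypothetical names for illustration and are not part of CoTNet.

```python
import os
import tempfile


def check_dataset_dirs(data_root, splits=("train", "val")):
    """Map each split name to (expected path, whether that directory exists).

    Assumption: the dataset is laid out as <data_root>/<split>. The real
    layout expected by train.py is not visible in the log above.
    """
    report = {}
    for split in splits:
        path = os.path.join(data_root, split)
        report[split] = (path, os.path.isdir(path))
    return report


def format_report(report):
    """Render the report as one readable line per split."""
    lines = []
    for split, (path, exists) in report.items():
        status = "ok" if exists else "MISSING"
        lines.append(f"{split}: {path} [{status}]")
    return "\n".join(lines)


if __name__ == "__main__":
    # Demo on a throwaway directory that has "train" but no "val".
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "train"))
        print(format_report(check_dataset_dirs(root)))
```

If a split is reported as MISSING for the real dataset location, the data path passed to the training script (or set in its config) is the first thing to check.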