Missing logger folder: outputs/aedet_lss_r50_256x704_128x128_24e_2key/lightning_logs
Restoring states from the checkpoint path at /home/ww/Coding/AeDet/data/nuscenes/nuscenes_12hz_infos_train.pkl
Traceback (most recent call last):
File "/home/ww/Coding/AeDet/exps/aedet/aedet_lss_r50_256x704_128x128_24e_2key.py", line 109, in <module>
run_cli()
File "/home/ww/Coding/AeDet/exps/aedet/aedet_lss_r50_256x704_128x128_24e_2key.py", line 105, in run_cli
main(args)
File "/home/ww/Coding/AeDet/exps/aedet/aedet_lss_r50_256x704_128x128_24e_2key.py", line 75, in main
trainer.fit(model, ckpt_path=args.ckpt_path)
File "/home/ww/.conda/envs/aedet2/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 771, in fit
self._call_and_handle_interrupt(
File "/home/ww/.conda/envs/aedet2/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 722, in _call_and_handle_interrupt
return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, **kwargs)
File "/home/ww/.conda/envs/aedet2/lib/python3.8/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 93, in launch
return function(*args, **kwargs)
File "/home/ww/.conda/envs/aedet2/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 812, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "/home/ww/.conda/envs/aedet2/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1180, in _run
self._restore_modules_and_callbacks(ckpt_path)
File "/home/ww/.conda/envs/aedet2/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1140, in _restore_modules_and_callbacks
self._checkpoint_connector.resume_start(checkpoint_path)
File "/home/ww/.conda/envs/aedet2/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/checkpoint_connector.py", line 84, in resume_start
self._loaded_checkpoint = self._load_and_validate_checkpoint(checkpoint_path)
File "/home/ww/.conda/envs/aedet2/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/checkpoint_connector.py", line 88, in _load_and_validate_checkpoint
loaded_checkpoint = self.trainer.strategy.load_checkpoint(checkpoint_path)
File "/home/ww/.conda/envs/aedet2/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 316, in load_checkpoint
return self.checkpoint_io.load_checkpoint(checkpoint_path)
File "/home/ww/.conda/envs/aedet2/lib/python3.8/site-packages/pytorch_lightning/plugins/io/torch_plugin.py", line 85, in load_checkpoint
return pl_load(path, map_location=map_location)
File "/home/ww/.conda/envs/aedet2/lib/python3.8/site-packages/pytorch_lightning/utilities/cloud_io.py", line 47, in load
return torch.load(f, map_location=map_location)
File "/home/ww/.conda/envs/aedet2/lib/python3.8/site-packages/torch/serialization.py", line 608, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/home/ww/.conda/envs/aedet2/lib/python3.8/site-packages/torch/serialization.py", line 780, in _legacy_load
raise RuntimeError("Invalid magic number; corrupt file?")
RuntimeError: Invalid magic number; corrupt file?
What is your training script?
sudo /home/ww/.conda/envs/aedet2/bin/python /home/ww/Coding/AeDet/exps/aedet/aedet_lss_r101_512x1408_256x256_24e_2key.py --amp_backend native -b 8 --gpus 1 --ckpt_path /home/ww/Coding/AeDet/data/nuScenes/nuscenes_12hz_infos_train.pkl
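--ckpt_path expects the path to a model checkpoint, but you passed the dataset info file (nuscenes_12hz_infos_train.pkl). The trainer tries to restore from it with torch.load, which fails with "Invalid magic number; corrupt file?". Remove the argument, namely: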
sudo /home/ww/.conda/envs/aedet2/bin/python /home/ww/Coding/AeDet/exps/aedet/aedet_lss_r101_512x1408_256x256_24e_2key.py --amp_backend native -b 8 --gpus 1
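To catch this kind of mix-up before training starts, a small guard could be run on the --ckpt_path value first. The following is only a sketch, not part of AeDet or PyTorch Lightning: check_ckpt_path is a hypothetical helper, and it assumes the checkpoint was written in the zip-based format that torch.save has used by default since PyTorch 1.6 (an older, legacy-format checkpoint would be rejected by this check).

```python
import os
import zipfile
from typing import Optional


def check_ckpt_path(ckpt_path: Optional[str]) -> Optional[str]:
    """Hypothetical helper: return ckpt_path if it plausibly is a torch checkpoint.

    Assumption: checkpoints use the zip-based torch.save format (the default
    since PyTorch 1.6). A plain pickle file, such as a dataset info .pkl,
    is not a zip archive and is rejected here.
    """
    if ckpt_path is None:
        # No checkpoint given: train from scratch.
        return None
    if not os.path.isfile(ckpt_path):
        raise FileNotFoundError(f"Checkpoint not found: {ckpt_path}")
    if not zipfile.is_zipfile(ckpt_path):
        raise ValueError(
            f"{ckpt_path} does not look like a model checkpoint; "
            "did you pass a dataset file to --ckpt_path?"
        )
    return ckpt_path
```

In the experiment script this could be called as check_ckpt_path(args.ckpt_path) just before trainer.fit(model, ckpt_path=args.ckpt_path), so a wrong file produces a readable error instead of the traceback above.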
I really, really appreciate it!!
This problem has been solved.
From now on, I'll order my food delivery through Meituan.
Thanks again!!