basedet
Training error
Python version: 3.8. Training command: basedet_train -f playground/examples/atss/config.py -n 4. The error log is as follows:
/home/csy/.local/lib/python3.8/site-packages/numpy/core/getlimits.py:518: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
setattr(self, word, getattr(machar, word).flat[0])
/home/csy/.local/lib/python3.8/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
return self._float_to_str(self.smallest_subnormal)
/home/csy/.local/lib/python3.8/site-packages/numpy/core/getlimits.py:518: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
setattr(self, word, getattr(machar, word).flat[0])
/home/csy/.local/lib/python3.8/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
return self._float_to_str(self.smallest_subnormal)
(py3.8) csy@hpc:~/megvii/basedet$ basedet_train -f playground/examples/atss/config.py -n 4
/home/csy/.local/lib/python3.8/site-packages/numpy/core/getlimits.py:518: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
setattr(self, word, getattr(machar, word).flat[0])
/home/csy/.local/lib/python3.8/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
return self._float_to_str(self.smallest_subnormal)
/home/csy/.local/lib/python3.8/site-packages/numpy/core/getlimits.py:518: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
setattr(self, word, getattr(machar, word).flat[0])
/home/csy/.local/lib/python3.8/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
return self._float_to_str(self.smallest_subnormal)
2023-08-26 11:59:26.656 | INFO | basedet.tools.det_train:launch_workers:69 - Init process group for gpu3 done
2023-08-26 11:59:26.668 | INFO | basedet.tools.det_train:launch_workers:69 - Init process group for gpu0 done
2023-08-26 11:59:26.668 | INFO | basedet.tools.det_train:launch_workers:69 - Init process group for gpu1 done
2023-08-26 11:59:26.670 | INFO | basedet.tools.det_train:launch_workers:69 - Init process group for gpu2 done
Process Process-2:
Traceback (most recent call last):
File "/home/csy/anaconda3/envs/py3.8/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/home/csy/anaconda3/envs/py3.8/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/csy/anaconda3/envs/py3.8/lib/python3.8/site-packages/megengine/distributed/launcher.py", line 52, in _run_wrapped
ret = func(*args, **kwargs)
File "/home/csy/megvii/basedet/basedet/tools/det_train.py", line 88, in launch_workers
setup_basedet_logger(log_path=cfg.GLOBAL.OUTPUT_DIR, to_loguru=True)
File "/home/csy/megvii/basedet/basedet/utils/logger_utils.py", line 35, in setup_basedet_logger
logger.add(
File "/home/csy/anaconda3/envs/py3.8/lib/python3.8/site-packages/loguru/_logger.py", line 776, in add
wrapped_sink = FileSink(path, **kwargs)
File "/home/csy/anaconda3/envs/py3.8/lib/python3.8/site-packages/loguru/_file_sink.py", line 194, in __init__
self._create_dirs(path)
File "/home/csy/anaconda3/envs/py3.8/lib/python3.8/site-packages/loguru/_file_sink.py", line 226, in _create_dirs
os.makedirs(dirname, exist_ok=True)
File "/home/csy/anaconda3/envs/py3.8/lib/python3.8/os.py", line 213, in makedirs
makedirs(head, exist_ok=exist_ok)
File "/home/csy/anaconda3/envs/py3.8/lib/python3.8/os.py", line 213, in makedirs
makedirs(head, exist_ok=exist_ok)
File "/home/csy/anaconda3/envs/py3.8/lib/python3.8/os.py", line 213, in makedirs
makedirs(head, exist_ok=exist_ok)
[Previous line repeated 2 more times]
File "/home/csy/anaconda3/envs/py3.8/lib/python3.8/os.py", line 223, in makedirs
mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/data'
2023-08-26 11:59:27.541 | ERROR | __main__:<module>:33 - An error has been caught in function '<module>', process 'MainProcess' (1554369), thread 'MainThread' (139692293293120):
Traceback (most recent call last):
> File "/home/csy/anaconda3/envs/py3.8/bin/basedet_train", line 33, in <module>
sys.exit(load_entry_point('basedet', 'console_scripts', 'basedet_train')())
│ │ └ <function importlib_load_entry_point at 0x7f0ca4627700>
│ └ <built-in function exit>
└ <module 'sys' (built-in)>
File "/home/csy/megvii/basedet/basedet/tools/det_train.py", line 150, in main
run()
└ <function main.<locals>.run at 0x7f0b93e799d0>
File "/home/csy/megvii/basedet/basedet/tools/det_train.py", line 139, in run
train(args, cfg)
│ │ └ ╒═════════╤════════════════════════════════════════════════════════════════════════════════╕
│ │ │ keys │ values ...
│ └ Namespace(amp=False, debug_mode=False, dir=None, dtr=False, ema=False, fastrun=False, file='playground/examples/atss/config.p...
└ <megengine.distributed.launcher.launcher object at 0x7f0b93ef7be0>
File "/home/csy/anaconda3/envs/py3.8/lib/python3.8/site-packages/megengine/distributed/launcher.py", line 149, in __call__
assert (
AssertionError: subprocess 0 exit with code 1
Traceback (most recent call last):
File "/home/csy/anaconda3/envs/py3.8/bin/basedet_train", line 33, in <module>
sys.exit(load_entry_point('basedet', 'console_scripts', 'basedet_train')())
File "/home/csy/anaconda3/envs/py3.8/lib/python3.8/site-packages/loguru/_logger.py", line 1251, in catch_wrapper
return function(*args, **kwargs)
File "/home/csy/megvii/basedet/basedet/tools/det_train.py", line 150, in main
run()
File "/home/csy/megvii/basedet/basedet/tools/det_train.py", line 139, in run
train(args, cfg)
File "/home/csy/anaconda3/envs/py3.8/lib/python3.8/site-packages/megengine/distributed/launcher.py", line 149, in __call__
assert (
AssertionError: subprocess 0 exit with code 1
I installed it following install.md.
PermissionError: [Errno 13] Permission denied: '/data'
Please read your error log carefully and search the web for this error.
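In the traceback, `setup_basedet_logger` passes `cfg.GLOBAL.OUTPUT_DIR` to loguru, which calls `os.makedirs` and fails at `mkdir('/data')`. So the configured output directory appears to sit under `/data`, which the current user is not allowed to create. Below is a minimal sketch for checking a path before launching training. The helper names (`nearest_existing_ancestor`, `can_create`, `pick_output_dir`) are hypothetical and not part of basedet, and how `OUTPUT_DIR` is overridden depends on the basedet config, which this thread does not show.

```python
import os
from pathlib import Path


def nearest_existing_ancestor(path):
    """Walk upwards from `path` until a component that already exists is found."""
    p = Path(path).expanduser().absolute()
    while not p.exists():
        p = p.parent  # the filesystem root always exists, so this terminates
    return p


def can_create(path):
    """Best-effort check: could os.makedirs(path, exist_ok=True) succeed?

    os.makedirs needs write and search permission on the deepest ancestor
    that already exists, and that ancestor must be a directory.
    """
    ancestor = nearest_existing_ancestor(path)
    return ancestor.is_dir() and os.access(ancestor, os.W_OK | os.X_OK)


def pick_output_dir(preferred, fallback):
    """Return `preferred` if it looks creatable, otherwise `fallback`."""
    return preferred if can_create(preferred) else fallback
```

If `can_create` returns `False` for the configured output directory, the usual options are to point the output directory at a location you own (for example somewhere under your home directory) or to ask an administrator to create the directory and grant you write access. The check is advisory only: `os.access` reports the permissions at the moment of the call, so the actual `os.makedirs` can still fail.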