ICON
Multiprocessing error in visibility phase
I followed the instructions in dataset.md to process THuman2.0.
The rendering phase works fine using python -m scripts.render_batch -debug -headless.
However, the visibility phase, run with python -m scripts.visibility_batch -debug, fails:
(dev) pawting@pc0809:/media/pawting/SN640/hello_worlds/ICON$ python -m scripts.visibility_batch_mod -debug
Start Visibility Computing thuman2 with 36 views.
Output dir: ./debug/thuman2_36views
0%| | 0/2 [00:00<?, ?it/s]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/pawting/anaconda3/envs/dev/lib/python3.10/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/media/pawting/SN640/hello_worlds/ICON/scripts/visibility_batch_mod.py", line 36, in visibility_subject
smpl_verts = torch.from_numpy(rescale_fitted_body.vertices).to(device).float()
File "/home/pawting/anaconda3/envs/dev/lib/python3.10/site-packages/torch/cuda/__init__.py", line 284, in _lazy_init
raise RuntimeError(
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/pawting/anaconda3/envs/dev/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/pawting/anaconda3/envs/dev/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/media/pawting/SN640/hello_worlds/ICON/scripts/visibility_batch_mod.py", line 122, in <module>
for _ in tqdm(
File "/home/pawting/anaconda3/envs/dev/lib/python3.10/site-packages/tqdm/std.py", line 1182, in __iter__
for obj in iterable:
File "/home/pawting/anaconda3/envs/dev/lib/python3.10/multiprocessing/pool.py", line 873, in next
raise value
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
I'm not familiar with multiprocessing; maybe it's related to calling .to(device) in subprocesses?
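For reference, here is a minimal sketch of what the error message seems to ask for: a pool created from a 'spawn' context, so that workers start as fresh interpreters rather than forked copies of a parent that may already have initialized CUDA. The names square and run_with_spawn are hypothetical stand-ins, not code from the ICON scripts, and no CUDA call is made here.

```python
import multiprocessing as mp


def square(x):
    # Hypothetical stand-in for the real per-subject worker. It must be
    # defined at module top level so spawned children can import it.
    return x * x


def run_with_spawn(func, items, processes=2):
    # get_context("spawn") returns a context whose Pool starts fresh
    # interpreter processes instead of forking the parent.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=processes) as pool:
        # imap yields results in the same order as the inputs.
        return list(pool.imap(func, items))


# Usage: the call has to sit under the main guard, because each spawned
# child re-imports this module and must not start its own pool.
# if __name__ == "__main__":
#     print(run_with_spawn(square, range(4)))
```

Using a context object instead of mp.set_start_method keeps the choice local to this pool and avoids changing the start method for the rest of the program.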
I've tried the suggestion here to force 'spawn' as the start method, but it doesn't work:
(dev) pawting@pc0809:/media/pawting/SN640/hello_worlds/ICON$ python -m scripts.visibility_batch -debug
Start Visibility Computing thuman2 with 36 views.
Output dir: ./debug/thuman2_36views
0%| | 0/2 [00:06<?, ?it/s]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/pawting/anaconda3/envs/dev/lib/python3.10/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/media/pawting/SN640/hello_worlds/ICON/scripts/visibility_batch.py", line 25, in visibility_subject
gpu_id = queue.get()
NameError: name 'queue' is not defined. Did you mean: 'Queue'?
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/pawting/anaconda3/envs/dev/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/pawting/anaconda3/envs/dev/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/media/pawting/SN640/hello_worlds/ICON/scripts/visibility_batch.py", line 97, in <module>
for _ in tqdm(
File "/home/pawting/anaconda3/envs/dev/lib/python3.10/site-packages/tqdm/std.py", line 1182, in __iter__
for obj in iterable:
File "/home/pawting/anaconda3/envs/dev/lib/python3.10/multiprocessing/pool.py", line 873, in next
raise value
NameError: name 'queue' is not defined
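One possible explanation for this second error, assuming queue is a global created under the main guard of visibility_batch.py (the script itself is not shown here): with 'spawn', each worker re-imports the module in a fresh interpreter and skips everything under if __name__ == "__main__":, so globals created there do not exist in the workers. A common workaround is to hand the queue to every worker through the pool's initializer. The sketch below uses hypothetical names (init_worker, process_subject, run) and returns the GPU id instead of touching CUDA.

```python
import multiprocessing as mp

_gpu_queue = None  # filled in inside each worker by init_worker


def init_worker(gpu_queue):
    # Runs once in every worker process; stores the shared queue in a
    # module-level global so the worker function can reach it.
    global _gpu_queue
    _gpu_queue = gpu_queue


def process_subject(subject):
    # Borrow a GPU id for the duration of one task, then hand it back.
    gpu_id = _gpu_queue.get()
    try:
        # Assumption: the real worker would select a device here,
        # e.g. torch.device(f"cuda:{gpu_id}"), and do its computation.
        return subject, gpu_id
    finally:
        _gpu_queue.put(gpu_id)


def run(subjects, gpu_ids):
    ctx = mp.get_context("spawn")
    gpu_queue = ctx.Queue()
    for gpu_id in gpu_ids:
        gpu_queue.put(gpu_id)
    # initargs are passed to each worker at process creation, which is
    # the supported way to share a multiprocessing queue with a pool.
    with ctx.Pool(
        processes=len(gpu_ids),
        initializer=init_worker,
        initargs=(gpu_queue,),
    ) as pool:
        return list(pool.imap_unordered(process_subject, subjects))


# Usage (under the main guard, as required by the spawn start method):
# if __name__ == "__main__":
#     print(run(["subject_a", "subject_b"], gpu_ids=[0]))
```

With this layout nothing in the worker depends on state created under the main guard, so the same code behaves the same under 'fork' and 'spawn'.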