Hierarchical-Localization
Fix: Retrieval OOM for large datasets
I tried running pairs_from_retrieval with a database of more than 100,000 images.
```
[2023/12/21 10:41:51 hloc INFO] Extracting image pairs from a retrieval database.
Traceback (most recent call last):
  File "/home/xxxxxx/miniconda3/envs/sdfstudio/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/xxxxxx/miniconda3/envs/sdfstudio/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/xxxxxx/programs/Hierarchical-Localization/hloc/pairs_from_retrieval.py", line 122, in <module>
    main(**args.__dict__)
  File "/home/xxxxxx/programs/Hierarchical-Localization/hloc/pairs_from_retrieval.py", line 98, in main
    sim = torch.einsum('id,jd->ij', query_desc.to(device), db_desc.to(device))
  File "/home/xxxxxx/miniconda3/envs/sdfstudio/lib/python3.10/site-packages/torch/functional.py", line 378, in einsum
    return _VF.einsum(equation, operands)  # type: ignore[attr-defined]
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 140.13 GiB (GPU 0; 39.43 GiB total capacity; 2.96 GiB already allocated; 35.52 GiB free; 2.96 GiB reserved in total by PyTorch) If rese..
```
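For scale (a back-of-the-envelope estimate, not taken from the traceback): the dense float32 similarity matrix needs num_query × num_db × 4 bytes, so even 100,000 × 100,000 descriptors come to roughly 37 GiB, which is already close to the 39.43 GiB capacity of the GPU above before any other allocations.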
It runs out of GPU memory, and the full similarity matrix would also take a significant amount of CPU memory, so I added an argument (--chunk_size) to process the sim matrix in chunks.
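Since only the top num_matched scores per query image are needed, the query descriptors can be processed in row chunks so that only a small block of the similarity matrix is ever resident on the GPU. Below is a minimal sketch of that idea; the function and variable names and the default chunk size are illustrative and not necessarily the exact code this change adds to pairs_from_retrieval.py.

```python
# Minimal sketch of chunked top-k retrieval, assuming the query/db global
# descriptors are already loaded as float32 tensors. Names and the default
# chunk size are illustrative only.
import torch


def chunked_topk_similarity(query_desc, db_desc, num_matched,
                            chunk_size=10000, device="cuda"):
    """Top-k database indices per query, without materializing the full
    (num_query x num_db) similarity matrix on the GPU."""
    db_desc = db_desc.to(device)
    all_scores, all_indices = [], []
    for start in range(0, len(query_desc), chunk_size):
        chunk = query_desc[start:start + chunk_size].to(device)
        # Only a (chunk_size x num_db) block is resident on the GPU at a time.
        sim = torch.einsum("id,jd->ij", chunk, db_desc)
        topk = torch.topk(sim, num_matched, dim=1)
        all_scores.append(topk.values.cpu())
        all_indices.append(topk.indices.cpu())
        del sim
    return torch.cat(all_scores), torch.cat(all_indices)
```

With this scheme, peak GPU memory for the similarity block is roughly chunk_size × num_db × 4 bytes, so the chunk size can be tuned to whatever memory is available.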
I verified that the outputs are the same but with different ordering.