RAGatouille
Indexing failing: subcommand issues
I'm testing the Studio Ghibli sample code from the README on an EC2 machine running Ubuntu 22.04.3 with Python 3.10.
The problem seems to be related to CUDA. Here is my code:
from ragatouille import RAGPretrainedModel
from ragatouille.utils import get_wikipedia_page
from ragatouille.data import CorpusProcessor
import os
os.environ['CUDA_HOME'] = '/usr/local/cuda-12.3'
RAG = RAGPretrainedModel.from_pretrained('colbert-ir/colbertv2.0')
my_documents = [get_wikipedia_page("Hayao_Miyazaki"), get_wikipedia_page("Studio_Ghibli")]
processor = CorpusProcessor()
processed_docs = processor.process_corpus(my_documents)
index_path = RAG.index(index_name="ghibli_test", collection=processed_docs)
and here is the output:
[Jan 16, 14:37:46] #> Creating directory .ragatouille/colbert/indexes/ghibli_test
#> Starting...
nranks = 1 num_gpus = 1 device=0
[Jan 16, 14:37:49] [0] #> Encoding 96 passages..
[Jan 16, 14:37:51] [0] avg_doclen_est = 189.6770782470703 len(local_sample) = 96
[Jan 16, 14:37:51] [0] Creating 2,048 partitions.
[Jan 16, 14:37:51] [0] *Estimated* 18,208 embeddings.
[Jan 16, 14:37:51] [0] #> Saving the indexing plan to .ragatouille/colbert/indexes/ghibli_test/plan.json ..
WARNING clustering 17299 points to 2048 centroids: please provide at least 79872 training points
Process Process-2:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
subprocess.run(
File "/usr/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/ubuntu/.local/lib/python3.10/site-packages/colbert/infra/launcher.py", line 115, in setup_new_process
return_val = callee(config, *args)
File "/home/ubuntu/.local/lib/python3.10/site-packages/colbert/indexing/collection_indexer.py", line 33, in encode
encoder.run(shared_lists)
File "/home/ubuntu/.local/lib/python3.10/site-packages/colbert/indexing/collection_indexer.py", line 68, in run
self.train(shared_lists) # Trains centroids from selected passages
File "/home/ubuntu/.local/lib/python3.10/site-packages/colbert/indexing/collection_indexer.py", line 229, in train
bucket_cutoffs, bucket_weights, avg_residual = self._compute_avg_residual(centroids, heldout)
File "/home/ubuntu/.local/lib/python3.10/site-packages/colbert/indexing/collection_indexer.py", line 307, in _compute_avg_residual
compressor = ResidualCodec(config=self.config, centroids=centroids, avg_residual=None)
File "/home/ubuntu/.local/lib/python3.10/site-packages/colbert/indexing/codecs/residual.py", line 24, in __init__
ResidualCodec.try_load_torch_extensions(self.use_gpu)
File "/home/ubuntu/.local/lib/python3.10/site-packages/colbert/indexing/codecs/residual.py", line 103, in try_load_torch_extensions
decompress_residuals_cpp = load(
File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1308, in load
return _jit_compile(
File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1710, in _jit_compile
_write_ninja_file_and_build_library(
File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1823, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 2116, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'decompress_residuals_cpp': [1/2] /usr/local/cuda-12.3/bin/nvcc -DTORCH_EXTENSION_NAME=decompress_residuals_cpp -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /usr/local/lib/python3.10/dist-packages/torch/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.10/dist-packages/torch/include/THC -isystem /usr/local/cuda-12.3/include -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -std=c++17 -c /home/ubuntu/.local/lib/python3.10/site-packages/colbert/indexing/codecs/decompress_residuals.cu -o decompress_residuals.cuda.o
FAILED: decompress_residuals.cuda.o
/usr/local/cuda-12.3/bin/nvcc -DTORCH_EXTENSION_NAME=decompress_residuals_cpp -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /usr/local/lib/python3.10/dist-packages/torch/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.10/dist-packages/torch/include/THC -isystem /usr/local/cuda-12.3/include -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -std=c++17 -c /home/ubuntu/.local/lib/python3.10/site-packages/colbert/indexing/codecs/decompress_residuals.cu -o decompress_residuals.cuda.o
/bin/sh: 1: /usr/local/cuda-12.3/bin/nvcc: not found
ninja: build stopped: subcommand failed.
Clustering 17299 points in 128D to 2048 clusters, redo 1 times, 20 iterations
Preprocessing in 0.00 s
Iteration 19 (0.05 s, search 0.04 s): objective=2913.76 imbalance=1.486 nsplit=0
[Jan 16, 14:37:51] Loading decompress_residuals_cpp extension (set COLBERT_LOAD_TORCH_EXTENSION_VERBOSE=True for more info)...
When trying to run this without setting CUDA_HOME, indexing fails as well. Looking for help on this, thanks!
Hey! This clearly appears to be from a problem loading the custom torch/cuda files from upstream ColBERT.
This bit in particular:
/bin/sh: 1: /usr/local/cuda-12.3/bin/nvcc: not found
Seems to indicate that it's simply not managing to find nvcc at the cuda path? I'm assuming that it is properly there though?
Even stranger, I've managed to uncover a similar problem in Colab, which ran fine using the exact same code yesterday (as reported by others as well), but now takes over 3 minutes to silently build the .cpp extensions... It's quite unclear why, I'm trying to figure it out.
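A quick way to check the "nvcc: not found" hypothesis is to look for the compiler both under CUDA_HOME and on PATH. This is only a minimal sketch using the standard library; `locate_nvcc` is a hypothetical helper name, not something RAGatouille or ColBERT provides.

```python
import os
import shutil
from pathlib import Path


def locate_nvcc(cuda_home=None):
    """Return (nvcc path under CUDA_HOME or None, nvcc path on PATH or None)."""
    # Fall back to the CUDA_HOME environment variable if no path is given.
    cuda_home = cuda_home or os.environ.get("CUDA_HOME")
    in_cuda_home = None
    if cuda_home:
        candidate = Path(cuda_home) / "bin" / "nvcc"
        # The extension build runs this exact binary, so it must exist and be executable.
        if candidate.is_file() and os.access(candidate, os.X_OK):
            in_cuda_home = str(candidate)
    return in_cuda_home, shutil.which("nvcc")


under_cuda_home, on_path = locate_nvcc()
print("nvcc under CUDA_HOME:", under_cuda_home)
print("nvcc on PATH:", on_path)
```

If the first value is None while CUDA_HOME is set, the directory probably holds only the runtime libraries and not the full toolkit, which would match the error in the traceback.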
The indexer is really a mess. I have this problem too. I have nvcc at /usr/bin/nvcc, so I guess the problem is with my CUDA installation (I really don't have the nerves to deal with it just to try out an indexer).
So I tried indexing on Colab instead of on my local machine.
But on Colab I still get:
WARNING! You have a GPU available, but only faiss-cpu is currently installed.
This means that indexing will be slow. To make use of your GPU.
Please install faiss-gpu by running:
pip uninstall --y faiss-cpu & pip install faiss-gpu
Will continue with CPU indexing in 5 seconds...
I get this even though I have installed faiss-gpu. Once it did manage to load it, but indexing still took about 5 minutes for one document (something about residuals, probably the same .cpp issue you mention).
Hey, thanks for helping us track it down. It seems there are a few separate issues that appeared around the same time: the custom cpp/cu code taking a long time to load on Colab, and some CUDA problems.
There's been upstream work on colbert-ai to lessen reliance on multiprocessing (cc @Anmol6) and I'm interested in figuring out whether this was the cause of the issue. Does local indexing work on your machine if you downgrade colbert-ai from 0.2.17 to 0.2.16?
I installed CUDA following https://developer.nvidia.com/cuda-downloads and no longer have those problems. Now I'm stuck on:
[Jan 17, 16:27:27] Loading packbits_cpp extension (set COLBERT_LOAD_TORCH_EXTENSION_VERBOSE=True for more info)...
with both versions.
With VERBOSE=True:
Detected CUDA files, patching ldflags
Emitting ninja build file /home/sjao/.cache/torch_extensions/py310_cu121/decompress_residuals_cpp/build.ninja...
Building extension module decompress_residuals_cpp...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
Loading extension module decompress_residuals_cpp...
Using /home/sjao/.cache/torch_extensions/py310_cu121 as PyTorch extensions root...
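One possible cause of a hang at "Loading extension module" (an assumption, not something confirmed in this thread) is a leftover `lock` file from an interrupted build: PyTorch places a file named `lock` in each extension's build directory and waits while it exists. The sketch below lists such files under the extensions root shown in the log. `find_extension_locks` is a hypothetical helper name; the default path is taken from the output above, and `TORCH_EXTENSIONS_DIR` overrides it.

```python
import os
from pathlib import Path


def find_extension_locks(root=None):
    """List files named 'lock' under the PyTorch extensions cache directory."""
    root = Path(
        root
        or os.environ.get("TORCH_EXTENSIONS_DIR")
        or Path.home() / ".cache" / "torch_extensions"
    )
    if not root.is_dir():
        return []
    # Each extension builds in its own subdirectory, so search recursively.
    return sorted(p for p in root.rglob("lock") if p.is_file())


for lock in find_extension_locks():
    print("possible stale lock:", lock)
```

If a lock shows up while no build is running, deleting that extension's build directory forces a clean rebuild on the next run.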
@gsajko could you provide the output of your pip freeze?
aiohttp==3.9.1
aiosignal==1.3.1
altair==5.2.0
annotated-types==0.6.0
anyio==4.2.0
asttokens==2.4.1
async-timeout==4.0.3
attrs==23.2.0
backoff==2.2.1
beautifulsoup4==4.12.2
bitarray==2.9.2
black==23.12.1
blinker==1.7.0
blis==0.7.11
cachetools==5.3.2
catalogue==2.0.10
certifi==2023.11.17
chardet==5.2.0
charset-normalizer==3.3.2
click==8.1.7
cloudpathlib==0.16.0
colbert-ai==0.2.16
comm==0.2.1
confection==0.1.4
cymem==2.0.8
dataclasses-json==0.6.3
datasets==2.14.4
debugpy==1.8.0
decorator==5.1.1
Deprecated==1.2.14
deprecation==2.1.0
dill==0.3.7
distro==1.9.0
emoji==2.9.0
exceptiongroup==1.2.0
executing==2.0.1
faiss-cpu==1.7.4
filelock==3.13.1
filetype==1.2.0
Flask==3.0.0
frozenlist==1.4.1
fsspec==2023.12.2
git-python==1.0.3
gitdb==4.0.11
GitPython==3.1.41
greenlet==3.0.3
h11==0.14.0
httpcore==1.0.2
httpx==0.26.0
huggingface-hub==0.20.2
idna==3.6
importlib-metadata==7.0.1
ipykernel==6.29.0
ipython==8.20.0
itsdangerous==2.1.2
jedi==0.19.1
Jinja2==3.1.3
joblib==1.3.2
jsonpatch==1.33
jsonpath-python==1.0.6
jsonpointer==2.4
jsonschema==4.21.0
jsonschema-specifications==2023.12.1
jupyter_client==8.6.0
jupyter_core==5.7.1
lancedb==0.4.4
langchain==0.1.1
langchain-community==0.0.13
langchain-core==0.1.11
langcodes==3.3.0
langdetect==1.0.9
langsmith==0.0.81
llama-index==0.9.32
lxml==5.1.0
Markdown==3.5.2
markdown-it-py==3.0.0
MarkupSafe==2.1.3
marshmallow==3.20.2
matplotlib-inline==0.1.6
mdurl==0.1.2
mpmath==1.3.0
multidict==6.0.4
multiprocess==0.70.15
murmurhash==1.0.10
mypy-extensions==1.0.0
nest-asyncio==1.5.9
networkx==3.2.1
ninja==1.11.1.1
nltk==3.8.1
numpy==1.26.3
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.18.1
nvidia-nvjitlink-cu12==12.3.101
nvidia-nvtx-cu12==12.1.105
onnx==1.15.0
openai==1.8.0
overrides==7.4.0
packaging==23.2
pandas==2.1.4
parso==0.8.3
pathspec==0.12.1
pexpect==4.9.0
pillow==10.2.0
platformdirs==4.1.0
preshed==3.0.9
prompt-toolkit==3.0.43
protobuf==4.25.2
psutil==5.9.7
ptyprocess==0.7.0
pure-eval==0.2.2
py==1.11.0
pyarrow==14.0.2
pydantic==2.5.3
pydantic_core==2.14.6
pydeck==0.8.0
Pygments==2.17.2
pylance==0.9.6
python-dateutil==2.8.2
python-dotenv==1.0.0
python-iso639==2024.1.2
python-magic==0.4.27
pytz==2023.3.post1
PyYAML==6.0.1
pyzmq==25.1.2
RAGatouille==0.0.4b2
rapidfuzz==3.6.1
ratelimiter==1.2.0.post0
referencing==0.32.1
regex==2023.12.25
requests==2.31.0
retry==0.9.2
rich==13.7.0
rpds-py==0.17.1
ruff==0.1.13
safetensors==0.4.1
scikit-learn==1.3.2
scipy==1.11.4
semver==3.0.2
sentence-transformers==2.2.2
sentencepiece==0.1.99
six==1.16.0
smart-open==6.4.0
smmap==5.0.1
sniffio==1.3.0
soupsieve==2.5
spacy==3.7.2
spacy-legacy==3.0.12
spacy-loggers==1.0.5
SQLAlchemy==2.0.25
srsly==2.4.8
stack-data==0.6.3
streamlit==1.30.0
sympy==1.12
tabulate==0.9.0
tenacity==8.2.3
thinc==8.2.2
threadpoolctl==3.2.0
tiktoken==0.5.2
tokenizers==0.15.0
toml==0.10.2
tomli==2.0.1
toolz==0.12.0
torch==2.1.2
torchvision==0.16.2
tornado==6.4
tqdm==4.66.1
traitlets==5.14.1
transformers==4.36.2
triton==2.1.0
typer==0.9.0
typing-inspect==0.9.0
typing_extensions==4.9.0
tzdata==2023.4
tzlocal==5.2
ujson==5.9.0
unstructured==0.11.8
unstructured-client==0.15.2
urllib3==2.1.0
validators==0.22.0
voyager==2.0.2
wasabi==1.1.2
watchdog==3.0.0
wcwidth==0.2.13
weasel==0.3.4
Werkzeug==3.0.1
wrapt==1.16.0
xxhash==3.4.1
yarl==1.9.4
zipp==3.17.0
@gsajko in Colab, after your RAGatouille install, can you try:
pip uninstall -y faiss-cpu
pip install faiss-gpu
The above fixes the slow indexing on a Colab A100 for me. Can you tell me what GPU you're using on Colab?
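To confirm which faiss build actually ended up in the environment after those two commands, the installed distributions can be queried directly. This is a small sketch with the standard library only; `installed_versions` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


faiss_builds = installed_versions(["faiss-cpu", "faiss-gpu"])
print(faiss_builds)
if faiss_builds["faiss-cpu"] and faiss_builds["faiss-gpu"]:
    # Both distributions install the same top-level `faiss` module.
    print("Both builds are installed; uninstall faiss-cpu to avoid a clash.")
```

In a notebook, the runtime usually needs a restart after swapping the packages so that the already imported `faiss` module is not reused.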
I was using a T4. This didn't work for me, and I tried many times. I got either "module not found", or it reverted to the CPU version.
Locally I use poetry to manage my env, and 0.0.4a2 doesn't match any version (but installing with pip works).
@gsajko could you please try again with the latest version (0.0.6b2)?
Got it working! The problem was with CUDA, not RAGatouille.
The CUDA installation was incomplete: I am running this on an EC2 instance that came with the GPU preconfigured, and the CUDA toolkit was not included. For anyone who has this issue, I found this article helpful: https://blog.devgenius.io/tutorial-setting-up-cuda-gpu-drivers-and-cudnn-on-ubuntu-22-04-928138d66fc6
I'm encountering the same issue on my NVIDIA 3090 GPU on Ubuntu 22.04. My CUDA version is 12.2 (from nvidia-smi) and my gcc version is 11.4:
[Feb 27, 14:51:47] #> Will delete 1 files already at .ragatouille/colbert/indexes/hallucination_papers in 20 seconds...
[Feb 27, 14:52:09] [0] #> Encoding 894 passages..
[Feb 27, 14:52:11] [0] avg_doclen_est = 170.8467559814453 len(local_sample) = 894
[Feb 27, 14:52:11] [0] Creating 4,096 partitions.
[Feb 27, 14:52:11] [0] Estimated 152,736 embeddings.
[Feb 27, 14:52:11] [0] #> Saving the indexing plan to .ragatouille/colbert/indexes/hallucination_papers/plan.json ..
WARNING clustering 145101 points to 4096 centroids: please provide at least 159744 training points
Clustering 145101 points in 128D to 4096 clusters, redo 1 times, 20 iterations
Preprocessing in 0.01 s
Iteration 19 (0.50 s, search 0.41 s): objective=31844.2 imbalance=1.401 nsplit=0
[Feb 27, 14:52:12] Loading decompress_residuals_cpp extension (set COLBERT_LOAD_TORCH_EXTENSION_VERBOSE=True for more info)...
Traceback (most recent call last):
File "/home/cribin/Documents/FaithfulSummarization/Repositories/Dspy_rag_tests/venv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2096, in _run_ninja_build
subprocess.run(
File "/usr/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/cribin/Documents/FaithfulSummarization/Repositories/Dspy_rag_tests/ragatouille_academic_papers_loader.py", line 45, in
Here are my installed packages:
aiohttp==3.9.1
aiosignal==1.3.1
annotated-types==0.6.0
anyio==4.3.0
arxiv==2.1.0
async-timeout==4.0.3
attrs==23.2.0
bitarray==2.9.2
blinker==1.7.0
catalogue==2.0.10
certifi==2024.2.2
charset-normalizer==3.3.2
click==8.1.7
colbert-ai==0.2.19
dataclasses-json==0.6.4
datasets==2.17.1
Deprecated==1.2.14
dill==0.3.8
dirtyjson==1.0.8
distro==1.9.0
exceptiongroup==1.2.0
faiss-gpu @ https://github.com/kyamagu/faiss-wheels/releases/download/v1.7.3/faiss_gpu-1.7.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=a85a3101d1d865646d074c0fb266b97cd8dc85a3b1825b721a91aaf77493a830
feedparser==6.0.10
filelock==3.13.1
Flask==3.0.2
frozenlist==1.4.1
fsspec==2023.10.0
git-python==1.0.3
gitdb==4.0.11
GitPython==3.1.42
greenlet==3.0.3
h11==0.14.0
httpcore==1.0.4
httpx==0.27.0
huggingface-hub==0.21.0
idna==3.6
itsdangerous==2.1.2
Jinja2==3.1.3
joblib==1.3.2
jsonpatch==1.33
jsonpointer==2.4
langchain==0.1.9
langchain-community==0.0.24
langchain-core==0.1.27
langsmith==0.1.9
llama-index==0.9.48
MarkupSafe==2.1.5
marshmallow==3.21.0
mpmath==1.3.0
multidict==6.0.5
multiprocess==0.70.16
mypy-extensions==1.0.0
nest-asyncio==1.6.0
networkx==3.2.1
ninja==1.11.1.1
nltk==3.8.1
numpy==1.26.4
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.19.3
nvidia-nvjitlink-cu12==12.2.140
nvidia-nvtx-cu12==12.1.105
onnx==1.15.0
openai==1.12.0
orjson==3.9.15
packaging==23.2
pandas==2.2.1
pillow==10.2.0
protobuf==4.25.3
pyarrow==15.0.0
pyarrow-hotfix==0.6
pydantic==2.6.2
pydantic_core==2.16.3
pypdf==4.0.2
python-dateutil==2.8.2
python-dotenv==1.0.1
pytz==2024.1
PyYAML==6.0.1
RAGatouille==0.0.7.post7
regex==2023.12.25
requests==2.31.0
ruff==0.1.15
safetensors==0.4.2
scikit-learn==1.4.1.post1
scipy==1.12.0
sentence-transformers==2.4.0
sgmllib3k==1.0.0
six==1.16.0
smmap==5.0.1
sniffio==1.3.1
SQLAlchemy==2.0.27
srsly==2.4.8
sympy==1.12
tenacity==8.2.3
threadpoolctl==3.3.0
tiktoken==0.6.0
tokenizers==0.15.2
torch==2.2.1
tqdm==4.66.2
transformers==4.38.1
triton==2.2.0
typing-inspect==0.9.0
typing_extensions==4.10.0
tzdata==2024.1
ujson==5.9.0
urllib3==2.2.1
voyager==2.0.2
Werkzeug==3.0.1
wrapt==1.16.0
xxhash==3.4.1
yarl==1.9.4
Any idea why this issue occurs?
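One thing worth ruling out here (not confirmed for this report): the CUDA version printed by nvidia-smi is the highest version the driver supports, not the version of an installed toolkit, so it says nothing about whether nvcc is available for the extension build. The sketch below reads the toolkit release from `nvcc --version` instead. The helper names are hypothetical, and the parser assumes the usual "release X.Y" wording in nvcc's output.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Extract 'major.minor' from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+\.\d+)", text)
    return match.group(1) if match else None


def toolkit_release():
    """Return the release reported by the nvcc on PATH, or None if nvcc is missing."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


print("CUDA toolkit release from nvcc:", toolkit_release())
```

A result of None means no nvcc is on PATH. Otherwise, the release can be compared with `torch.version.cuda` to see whether the toolkit and the PyTorch build target the same CUDA major version.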