I'm trying to run beir/examples/retrieval/evaluation/dense/evaluate_sbert_multi_gpu.py. When I do, I get the error below.
Traceback (most recent call last):
File "evaluate_sbert_multi_gpu.py", line 62, in <module>
results = retriever.retrieve(corpus, queries)
File "/data/user/beir/beir/retrieval/evaluation.py", line 23, in retrieve
return self.retriever.search(corpus, queries, self.top_k, self.score_function, **kwargs)
File "/data/user/beir/beir/retrieval/search/dense/exact_search_multi_gpu.py", line 150, in search
cos_scores_top_k_values, cos_scores_top_k_idx, chunk_ids = metric.compute()
File "/home/user/miniconda3/envs/beir/lib/python3.7/site-packages/evaluate/module.py", line 433, in compute
self._finalize()
File "/home/user/miniconda3/envs/beir/lib/python3.7/site-packages/evaluate/module.py", line 390, in _finalize
self.data = Dataset(**reader.read_files([{"filename": f} for f in file_paths]))
File "/home/user/miniconda3/envs/beir/lib/python3.7/site-packages/datasets/arrow_reader.py", line 260, in read_files
pa_table = self._read_files(files, in_memory=in_memory)
File "/home/user/miniconda3/envs/beir/lib/python3.7/site-packages/datasets/arrow_reader.py", line 195, in _read_files
pa_table: Table = self._get_table_from_filename(f_dict, in_memory=in_memory)
File "/home/user/miniconda3/envs/beir/lib/python3.7/site-packages/datasets/arrow_reader.py", line 331, in _get_table_from_filename
table = ArrowReader.read_table(filename, in_memory=in_memory)
File "/home/user/miniconda3/envs/beir/lib/python3.7/site-packages/datasets/arrow_reader.py", line 352, in read_table
return table_cls.from_file(filename)
File "/home/user/miniconda3/envs/beir/lib/python3.7/site-packages/datasets/table.py", line 1065, in from_file
table = _memory_mapped_arrow_table_from_file(filename)
File "/home/user/miniconda3/envs/beir/lib/python3.7/site-packages/datasets/table.py", line 52, in _memory_mapped_arrow_table_from_file
pa_table = opened_stream.read_all()
File "pyarrow/ipc.pxi", line 750, in pyarrow.lib.RecordBatchReader.read_all
File "pyarrow/error.pxi", line 115, in pyarrow.lib.check_status
OSError: Expected to be able to read 80088040 bytes for message body, got 80088032
--
command used: CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python evaluate_sbert_multi_gpu.py
@thakur-nandan Any idea how to proceed?
This error is caused by insufficient host memory (CPU RAM), not GPU memory. I would suggest evaluating on a larger GPU cluster or reducing the batch size.
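If it helps, here is a rough sketch of how one might check free host RAM before a run and pick a smaller batch size to match. It is not part of BEIR: `available_host_memory_bytes` and `fit_batch_size` are hypothetical helper names, and `bytes_per_item` is a per-item memory estimate you would have to measure or guess for your own model and corpus.

```python
import os


def available_host_memory_bytes():
    """Return free physical RAM in bytes, or None if the platform does not expose it."""
    try:
        # These sysconf names are available on Linux; other platforms may not have them.
        return os.sysconf("SC_AVPHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    except (ValueError, OSError, AttributeError):
        return None


def fit_batch_size(batch_size, bytes_per_item, budget_bytes, minimum=1):
    """Halve batch_size until batch_size * bytes_per_item fits within budget_bytes."""
    while batch_size > minimum and batch_size * bytes_per_item > budget_bytes:
        batch_size //= 2
    return max(batch_size, minimum)


if __name__ == "__main__":
    free = available_host_memory_bytes()
    if free is None:
        print("Could not determine free host memory on this platform.")
    else:
        # Assumption: leave half of the free RAM as headroom for everything else.
        budget = free // 2
        # Assumption: 4 MiB per item is a placeholder, not a measured value.
        print(fit_batch_size(128, 4 * 1024 * 1024, budget))
```

The resulting value would then go into whatever batch-size setting your copy of the evaluation script exposes. Halving is only a simple heuristic; if the run still fails at the smallest batch size, more host RAM is the remaining option.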