IndexError: too many indices for array: array is 2-dimensional, but 3 were indexed

Open ShuhaoZhangTony opened this issue 10 months ago • 0 comments

Describe the bug

I'm trying to use haystack's API to build a RAG pipeline. I'm using FAISSDocumentStore and EmbeddingRetriever.

The code looks like the following:

# Create the document store using the factory
document_store = create_document_store(store_type, **store_config)

documents = []
documents_dir = args.docs_path
for filename in os.listdir(documents_dir):
    file_path = os.path.join(documents_dir, filename)
    if os.path.isfile(file_path):
        with open(file_path, 'r', encoding='utf-8') as file:
            content = file.read()
            document = Document(content=content)
            documents.append(document)
document_store.write_documents(documents)

# Ensure the retriever is initialized before updating embeddings
retriever = RetrieverFactory.get_retriever(retriever_type=args.retriever_type,
                                           document_store=document_store,
                                           query_embedding_model=args.query_embedding_model,
                                           passage_embedding_model=args.passage_embedding_model
                                           )

# Update embeddings right after writing documents.
# The hasattr check ensures this block only runs if the document_store
# instance has an update_embeddings method.
if hasattr(document_store, 'update_embeddings'):
    document_store.update_embeddings(retriever=retriever, batch_size=10)

Error message

haystack/modeling/model/language_model.py", line 222, in _pool_tokens
    ignore_mask_3d[:, :, :] = ignore_mask_2d[:, :, np.newaxis]
                              ~~~~~~~~~~~~~~^^^^^^^^^
IndexError: too many indices for array: array is 2-dimensional, but 3 were indexed
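For reference, here is a minimal NumPy-only sketch of how this message can arise. It does not use Haystack, and the shapes and the helper name `describe_mask_assignment` are made up for illustration. In NumPy, `np.newaxis` does not count as an index, so indexing a 2-D mask with `[:, :, np.newaxis]` succeeds. The message "array is 2-dimensional, but 3 were indexed" is what NumPy raises when the left-hand side array has only two dimensions. That would be consistent with the model output having shape (batch, hidden) rather than (batch, seq_len, hidden), though the traceback alone does not confirm this.

```python
import numpy as np


def describe_mask_assignment(token_shape, mask_shape):
    """Run the assignment from the traceback on arrays of the given shapes.

    Returns "ok" on success, otherwise the IndexError message.
    """
    ignore_mask_2d = np.zeros(mask_shape, dtype=bool)
    ignore_mask_3d = np.zeros(token_shape, dtype=bool)
    try:
        ignore_mask_3d[:, :, :] = ignore_mask_2d[:, :, np.newaxis]
    except IndexError as exc:
        return str(exc)
    return "ok"


# Per-token vectors (batch, seq_len, hidden) with a (batch, seq_len) mask.
print(describe_mask_assignment((2, 5, 8), (2, 5)))

# Target with only two dimensions, e.g. (batch, hidden): the left-hand
# side index [:, :, :] raises the IndexError.
print(describe_mask_assignment((2, 8), (2, 5)))
```

If this is what is happening, it may be worth checking whether the configured embedding model and the retriever's model format match, since a model that already returns pooled sentence embeddings would produce a 2-D output.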

Expected behavior

Additional context

To Reproduce

System:

  • OS: Ubuntu 18.04
  • GPU/CPU:
  • FARM version:

ShuhaoZhangTony, Apr 06 '24 08:04