
`Table_QA` gives an error when querying a table

AayushSameerShah opened this issue 2 years ago • 3 comments

Background

I have been following this tutorial to query structured data. There I found that we can also combine structured data (tables) with unstructured data (text) in the DocumentStore and then query both at once.
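
Roughly, the indexing part looks like this. This is a minimal sketch of what I'm doing rather than my exact notebook; the document store choice and file names are illustrative, but the `Document(content=..., content_type="table")` pattern is the one from the tutorial:

import pandas as pd
from haystack import Document
from haystack.document_stores import InMemoryDocumentStore

document_store = InMemoryDocumentStore(embedding_dim=768)

# Unstructured documents: plain wiki text
text_doc = Document(content="The 2020 Summer Olympics were held in Tokyo...",
                    content_type="text")

# Structured documents: each table is stored as a pandas DataFrame
covid_df = pd.read_csv("covid_india.csv")  # illustrative file name
table_doc = Document(content=covid_df, content_type="table")

document_store.write_documents([text_doc, table_doc])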

Data used

So I stored this data:

  1. Unstructured content of Olympics Games (wiki)
  2. Unstructured content of Ukraine War (wiki)
  3. Unstructured content of World Heritage sites (wiki)
  4. Structured tables for Covid-19 in India, Covid-19 worldwide, and sales data (thus 3 tables)

Error message

But while querying I get an error, both with and without GPU. Without GPU (CPU runtime):

Exception: Exception while running node 'TableReader': index out of range in self
Enable debug logging to see the data that was passed when the pipeline failed.

With the GPU runtime you get:

Exception: Exception while running node 'EmbeddingRetriever': CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasLtMatmul( ltHandle, computeDesc.descriptor(), &alpha_val, mat1_ptr, Adesc.descriptor(), mat2_ptr, Bdesc.descriptor(), &beta_val, result_ptr, Cdesc.descriptor(), result_ptr, Cdesc.descriptor(), &heuristicResult.algo, workspace.data_ptr(), workspaceSize, at::cuda::getCurrentCUDAStream())`
Enable debug logging to see the data that was passed when the pipeline failed.
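
(For reference, this is how I enabled the debug output the error message asks for. I believe Haystack v1 accepts a debug flag in the run params, though I'm writing this from memory; `pipeline` here is the query pipeline from the tutorial:)

import logging

# Verbose logs from haystack itself
logging.getLogger("haystack").setLevel(logging.DEBUG)

# Re-run the failing query with per-node debug info enabled
prediction = pipeline.run(
    query="In which city GrossSales was the highest?",
    params={"debug": True},
)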

Expected behavior

It should give a response to the query, which it does when querying "When was Olympics 2020 hosted?" -> Japan.

But if I query something like: "In which city GrossSales was the highest?" -> Error.

Resources:

All you have to do is download these 3 files, upload them to the Colab runtime, and then run the NON-COMMENTED cells in order. That should lead you to the error.

Thanks 🙏🏻

AayushSameerShah avatar Mar 16 '23 07:03 AayushSameerShah

Similar error in issue #1833.

In addition, your tables seem very long (one is 14K rows), so this issue may also be related: #1723
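
As a possible workaround in the meantime, you could split the long tables into smaller table Documents before indexing. A rough sketch (the chunk size is arbitrary, pick whatever works for your reader):

import pandas as pd
from haystack import Document

def split_table(df, rows_per_chunk=100):
    # One long table -> many small table Documents
    return [
        Document(content=df.iloc[start:start + rows_per_chunk], content_type="table")
        for start in range(0, len(df), rows_per_chunk)
    ]

# e.g. a 14K-row table becomes ~140 Documents of 100 rows each
docs = split_table(pd.read_csv("sales.csv"))  # illustrative file name
document_store.write_documents(docs)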

anakin87 avatar Mar 20 '23 14:03 anakin87

Thanks @anakin87! For this issue I have thought of a workaround: use the Haystack document store and retrievers to fetch the documents, and then pass them as context in a prompt to some generative model. Like...

# Step 0: Imports
from transformers import AutoTokenizer, AutoModelForCausalLM

# Step 1: Create document store (in Haystack) and store all documents

# Step 2: Ask a question. Assume that, based on the question, the retriever
# has returned the top 10 relevant documents as `retrieved_docs`
query = "In which city GrossSales was the highest?"

# Step 3: Append these documents together and pass them as the context
context = "\n\n".join(str(doc.content) for doc in retrieved_docs)

# Step 4: Create the prompt
prompt = \
f"""Answer the question from the context below. And don't try to make up an answer. 
If you don't know the answer, then say I don't know.

Context: {context}

Question: {query}

Answer:"""

# Step 5: Create a model and try generating the answers
# using gpt-neo-125m for now
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M")

# Step 6: Get the answers (tokenize the full prompt, not just the query)
tokens = tokenizer(prompt, return_tensors="pt")
output = model.generate(
    **tokens,
    temperature=0.5,
    min_length=5,
    max_length=200,
    early_stopping=True,
    do_sample=True,
    num_beams=8,
    repetition_penalty=2.0,
    top_k=50,
)

print(tokenizer.decode(output[0]))

But the question is...

Sometimes the context becomes so big (2000 tokens) that it returns errors like:

Input length of input_ids is 553, but max_length is set to 200. This can lead to unexpected behavior. You should consider increasing max_new_tokens.

And... The expanded size of the tensor (200) must match the existing size (554) at non-singleton dimension 0. Target sizes: [200]. Tensor sizes: [554]

So obviously, I need to truncate the text somehow (`max_length` in `generate` counts the prompt tokens too, not just the newly generated ones). But truncating the context in the prompt will lose information. So is there any way to pass the prompt in batches or something like that? See the sketch below for what I have in mind.
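
To make it concrete, something like this is the idea, as an untested sketch: one prompt per retrieved document instead of one giant prompt, so every prompt fits in GPT-Neo's 2048-token window (`retrieved_docs` stands for the top-10 documents from Step 2, and `max_new_tokens` counts only the generated tokens):

answers = []
for doc in retrieved_docs:
    prompt = f"""Answer the question from the context below. And don't try to make up an answer. 
If you don't know the answer, then say I don't know.

Context: {doc.content}

Question: {query}

Answer:"""
    # Truncate so prompt + 200 new tokens stay within the 2048-token window
    tokens = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=1848)
    output = model.generate(**tokens, do_sample=True, temperature=0.5, max_new_tokens=200)
    # Decode only the newly generated tokens, skipping the echoed prompt
    answers.append(tokenizer.decode(output[0][tokens["input_ids"].shape[1]:],
                                    skip_special_tokens=True))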

Please help. Thanks 🙏

AayushSameerShah avatar Mar 23 '23 07:03 AayushSameerShah

@sjrl can you take a look at this? Thanks!

silvanocerza avatar Mar 29 '23 13:03 silvanocerza