
save and import embeddings for faster answer generation

Open n400peanuts opened this issue 1 year ago • 10 comments

Hello,

Thank you for the fantastic work on PaperQA. I've been able to use it to ask questions over more than 100 papers, using only local models via Ollama. Everything is working well, but I'd like to know how I can avoid reloading the same files and re-computing the embeddings each time I have a new question.

Is there a way to save the vector store and load it later, so it can be used by the LLM to generate answers? I couldn't find documentation about that; I found something about caching, but it's unclear to me how to use it properly. Can you provide some help?

best wishes

My code so far:

from paperqa import Settings, ask
from paperqa.settings import AgentSettings  # needed for the AgentSettings used below

question = "Can you list all climate-sensitive pathways that affect rotavirus incidence?"



model = "ollama/llama3.1" # "ollama/llama3.2" 
embedding = "ollama/nomic-embed-text" # "nomic-embed-text"

local_llm_config = {
    "model_list": [
        {
            "model_name": model,
            "litellm_params": {
                "model": model,
                "api_base": "http://localhost:11434",
            },
            "answer": {
                "evidence_k": 40,
                "evidence_detailed_citations": True,
                "evidence_summary_length": "about 100 words",
                "answer_max_sources": 5,
                "answer_length": "about 300 words, but can be longer",
                "max_concurrent_requests": 4,
                "answer_filter_extra_background": False
            }
            
        }
    ]
}

answer = ask(
    question,
    settings=Settings(
        llm=model,
        llm_config=local_llm_config,
        summary_llm=model,
        summary_llm_config=local_llm_config,
        agent=AgentSettings(
            agent_llm=model, agent_llm_config=local_llm_config
        ),
        embedding=embedding,
        paper_directory="papers/"
    ),
)

n400peanuts · Nov 26 '24

Hey @n400peanuts 👋, I'm here to help you with any issues or questions you have about PaperQA. Let's work together to solve any bugs or address any questions you have.

To save and import embeddings in PaperQA and avoid reloading files and retraining an embedding model each time you have a new question, you can pickle the Docs object. This will cache the embeddings, allowing you to load them later without re-embedding your documents. Here's how you can do it:

import pickle
from paperqa import Docs

# Assuming you have already added documents to your Docs object
docs = Docs()
# Add documents to docs...

# Save the Docs object with embeddings
with open("my_docs.pkl", "wb") as f:
    pickle.dump(docs, f)

# Later, load the Docs object with embeddings
with open("my_docs.pkl", "rb") as f:
    docs = pickle.load(f)

By pickling the Docs object, you ensure that the embeddings are cached and can be reused, thus avoiding the need to reload files and retrain the embedding model for each new question [1].


dosubot[bot] · Nov 26 '24

OK, let's take a step back, @dosu. Show me:

  1. How to add my documents to the Docs object, and how this is used to generate the embeddings

  2. How to use the loaded embeddings in the code to generate the answer

n400peanuts · Nov 26 '24

To add your documents to the Docs object and generate embeddings, you can use the add, add_file, or add_url methods, depending on the source of your document. Here's how you can do it:

  1. Add a document from a file:

from pathlib import Path
from paperqa import Docs

docs = Docs()
file_path = Path("path/to/your/document.pdf")
docs.add(file_path)

  2. Add a document from a URL:

from paperqa import Docs

docs = Docs()
url = "http://example.com/document.pdf"
docs.add_url(url)

  3. Add a document from a binary file object:

from paperqa import Docs

docs = Docs()
with open("path/to/your/document.pdf", "rb") as file:
    docs.add_file(file)

Once your documents are added, embeddings are generated for them. To use these embeddings to generate an answer, you can use the query method:

query = "Your question here"
session = docs.query(query)
answer = session.formatted_answer
print(answer)

This process retrieves relevant texts using the embeddings and generates an answer based on your question [1][2].
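
Putting those two steps together for the local Ollama setup used in this thread, here is a minimal sketch; the directory name, model strings, and pickle filename simply mirror the ones used earlier in the thread, and the exact keyword arguments may vary between paper-qa versions:

import pickle
from pathlib import Path

from paperqa import Docs, Settings

# Settings pointing at the local models, as in the original post
settings = Settings(
    llm="ollama/llama3.1",
    summary_llm="ollama/llama3.1",
    embedding="ollama/nomic-embed-text",
)

docs = Docs()
for pdf in Path("papers/").glob("*.pdf"):
    # each added document is chunked and embedded with the local embedding model
    docs.add(pdf, settings=settings)

# persist the embedded corpus so later runs can skip re-embedding
with open("my_docs.pkl", "wb") as f:
    pickle.dump(docs, f)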


dosubot[bot] · Nov 26 '24

@dosu, show me how to set the following embedding model to be used when generating the embeddings stored in Docs:

embedding = "ollama/nomic-embed-text" # "nomic-embed-text"

n400peanuts · Nov 26 '24

@n400peanuts Have you figured this out? I'm in the same boat, using llama3.2 and the mxbai embedding to pickle the docs, but I can't load them: I get an OpenAI authentication error. Any help would be greatly appreciated.

rahmanmus · Jan 14 '25

To set the embedding model to "ollama/nomic-embed-text" when generating embeddings stored in the Docs object, you can specify the embedding model directly in the add method of the Docs object. Here's how you can do it:

from paperqa import Docs

docs = Docs()
docs.add("path/to/your/document.pdf", embedding_model="ollama/nomic-embed-text")

This code sets the embedding model to "ollama/nomic-embed-text" for the document being added to the Docs object [1][2].
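
As a side note, depending on the paper-qa version, the embedding_model keyword may expect a model object rather than a plain string; a safer pattern is usually to pass the model name through a Settings object, which paper-qa resolves into an embedding model internally. A hedged sketch:

from paperqa import Docs, Settings

settings = Settings(embedding="ollama/nomic-embed-text")
docs = Docs()
# the document is chunked and embedded with the Ollama model named in the settings
docs.add("path/to/your/document.pdf", settings=settings)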




dosubot[bot] · Jan 14 '25

No, I have already embedded the documents and pickled the Docs object. I then try to load it and query it like so:

with open("docs.pkl", "rb") as f:
    docs = pickle.load(f)

query = "Test query"
session = docs.query(
    query,
    # settings=Settings(
    #     embedding="ollama/mxbai-embed-large",
    #     llm="ollama/llama3.2",
    #     llm_config=local_llm_config,
    # ),
    # embedding_model="ollama/mxbai-embed-large",
    # llm_model="ollama/llama3.2",
)
answer = session.format_answer()
print(answer)

When I do so, I get this error:

litellm.exceptions.AuthenticationError: litellm.AuthenticationError: AuthenticationError: OpenAIException - Error code: 401 - {'error': {'message': 'Incorrect API key provided: foo. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}} LiteLLM Retried: 3 times
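
For reference, the commented-out settings in the snippet above are the important part: if docs.query is called without any settings, paper-qa falls back to its default models, which point at OpenAI, and that is most likely what triggers the 401. A hedged sketch of the same call with the local settings passed in; the model strings and the litellm config simply mirror the ones used earlier in this thread:

import pickle

from paperqa import Settings

local_llm_config = {
    "model_list": [
        {
            "model_name": "ollama/llama3.2",
            "litellm_params": {
                "model": "ollama/llama3.2",
                "api_base": "http://localhost:11434",
            },
        }
    ]
}

with open("docs.pkl", "rb") as f:
    docs = pickle.load(f)

settings = Settings(
    llm="ollama/llama3.2",
    llm_config=local_llm_config,
    summary_llm="ollama/llama3.2",
    summary_llm_config=local_llm_config,
    embedding="ollama/mxbai-embed-large",
)

session = docs.query("Test query", settings=settings)
print(session.formatted_answer)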

rahmanmus · Jan 14 '25

Hi @rahmanmus, if you get weird OpenAI authentication errors, in my case it meant that somewhere a variable was calling a model whose default pointed at OpenAI. Dig into settings.py to see where/when the package calls which LLM, and set all the variables so they call your local LLM/embedding model. This should solve the issue.

For instance, this is what I did:

# set a placeholder API key so nothing falls back to real OpenAI authentication
import os

api_key = "sk-no-key-required"
os.environ["OPENAI_API_KEY"] = api_key

model = "ollama/llama3.1:latest" # "ollama/llama3.2" 
embedding = "ollama/mxbai-embed-large:latest" # "nomic-embed-text"

local_llm_config = {
    "model_list": [
        {
            "model_name": model,
            "litellm_params": {
                "model": model,
                "api_base": "http://0.0.0.0:11434",
            },
            "answer": {
                "evidence_k": 40,
                "evidence_detailed_citations": True,
                "evidence_summary_length": "about 100 words",
                "answer_max_sources": 10,
                "answer_length": "about 600 words, but can be longer",
                "max_concurrent_requests": 10,
                "answer_filter_extra_background": False
            },
            "parsing": {
                "use_doc_details": True
            }
            
        }
    ]
}

settings = Settings(
    llm=model,
    llm_config=local_llm_config,
    summary_llm=model,
    summary_llm_config=local_llm_config,
    paper_directory="./papers",
    embedding=embedding,
    agent=AgentSettings(
        agent_llm=model,
        agent_llm_config=local_llm_config,
        return_paper_metadata=True,
    ),
)

So basically I checked every setting and made sure that whenever the package calls an LLM, it is one of my choice and not the OpenAI default.

n400peanuts · Jan 14 '25

@n400peanuts Thank you so much! I got it to work by modifying the source code in settings.py, as you said, and replacing all of the default values that call OpenAI with my local models.

rahmanmus · Jan 14 '25

@rahmanmus yay! happy that it helped!

n400peanuts · Jan 14 '25