paper-qa
Incorrect LLM Used and API Environment Variable Warnings
When running the script, I encountered two main issues:
- The library defaults to using the GPT-4o model despite setting the LLM to gpt-4o-mini, leading to higher costs.
- I received warnings about missing environment variables for external APIs, for reasons unknown to me.
Code Snippet:
import os
from paperqa import Docs, Settings
.
.
.
doc_paths = get_absolute_paths(folder_path)
docs = Docs()
for doc in doc_paths:
    docs.add(doc)
settings = Settings()
settings.llm = "gpt-4o-mini"
settings.summary_llm = "gpt-4o-mini"
settings.prompts.qa = my_qa_prompt
# settings.answer.answer_max_sources = 3
answer = docs.query(
    question,
    settings=settings,
)
print(answer.formatted_answer)
Warnings:
SEMANTIC_SCHOLAR_API_KEY environment variable not set. Semantic Scholar API rate limits may apply.
CROSSREF_MAILTO environment variable not set. Crossref API rate limits may apply.
CROSSREF_API_KEY environment variable not set. Crossref API rate limits may apply.
Hey there, @jukti3742! I'm Dosu, a bot here to help you with bugs, questions, and becoming a contributor. Let's tackle this issue together while we wait for a human maintainer.
- LLM Defaulting Issue: The paperqa library defaults to using "gpt-4o-2024-08-06" as the LLM. If your script is still using this model despite setting settings.llm and settings.summary_llm to "gpt-4o-mini", ensure that these settings are correctly applied before the query method is called. Double-check that there are no other parts of your code or library defaults that might override these settings.
- Environment Variable Warnings: The warnings about missing environment variables indicate that the library is trying to access external APIs without the necessary credentials. For Semantic Scholar and Crossref, you need to set the SEMANTIC_SCHOLAR_API_KEY, CROSSREF_MAILTO, and CROSSREF_API_KEY environment variables to avoid rate limits. You can set these in your environment using:
export SEMANTIC_SCHOLAR_API_KEY='your_api_key'
export CROSSREF_MAILTO='your_email'
export CROSSREF_API_KEY='your_api_key'
These variables are crucial for accessing the APIs without hitting rate limits [1][2].
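To see which of these variables are missing before running a script, a small check along the following lines may help. This is only a sketch: the variable names are taken from the warnings quoted above, while METADATA_ENV_VARS and missing_env_vars are hypothetical names introduced here and are not part of paperqa.

```python
import os

# Names copied from the warnings in this thread.
METADATA_ENV_VARS = (
    "SEMANTIC_SCHOLAR_API_KEY",
    "CROSSREF_MAILTO",
    "CROSSREF_API_KEY",
)


def missing_env_vars(environ=None):
    """Return the names of the variables that are unset or empty.

    `environ` defaults to os.environ; passing a plain dict makes the
    function easy to try out without touching the real environment.
    """
    env = os.environ if environ is None else environ
    return [name for name in METADATA_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    for name in missing_env_vars():
        print(f"{name} is not set")
```

Running it in the same shell as the script shows whether the export commands actually took effect in that session.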
Not helpful, since settings are correctly set to "gpt-4o-mini" before the query method is called. Please see the code snippet.
For the warnings, please read https://github.com/Future-House/paper-qa#installation
For the model selection issue:
I had the same problem as you, and realized that the documentation is not updated to reflect API changes.
The model settings are actually taken from the agent key of the Settings object, and not the llm or summary_llm as the docs would suggest. The fix is to also give your model name inside that key as follows:
from paperqa import Settings
from paperqa.agents.main import AgentSettings
settings = Settings(
    agent=AgentSettings(
        agent_llm="gpt-4o-mini",  # your desired LLM
    ),
)
If this fixes your problem, please close the issue. I'll open a new issue for the stale docs.
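Since three separate fields were mentioned in this thread (llm, summary_llm, and agent_llm under the agent key), it may be safest to set all of them to the same model in one place. The sketch below is an assumption-laden illustration, not the library's documented approach: model_overrides and build_settings are hypothetical helpers, and passing a nested dict for agent assumes that Settings is a pydantic model that validates it into the agent settings object.

```python
def model_overrides(model: str) -> dict:
    """Build keyword arguments that point every model field at `model`.

    The field names (llm, summary_llm, agent.agent_llm) are the ones
    discussed in this thread.
    """
    return {
        "llm": model,
        "summary_llm": model,
        "agent": {"agent_llm": model},  # assumed to be accepted as a nested dict
    }


def build_settings(model: str):
    """Create a paperqa Settings object; requires paper-qa to be installed."""
    from paperqa import Settings  # imported lazily so the helper above has no dependency

    return Settings(**model_overrides(model))
```

With paper-qa installed, settings = build_settings("gpt-4o-mini") could then be passed to docs.query(question, settings=settings) as in the original snippet.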