paper-qa
Example using locally-hosted model is not working
I am trying to use paper-qa with a locally-hosted model. However, the provided example:
from paperqa import Settings, ask

local_llm_config = dict(
    model_list=[
        dict(
            model_name="my_llm_model",
            litellm_params=dict(
                model="my-llm-model",
                api_base="http://localhost:8080/v1",
                api_key="sk-no-key-required",
                temperature=0.1,
                frequency_penalty=1.5,
                max_tokens=512,
            ),
        )
    ]
)

answer = ask(
    "What manufacturing challenges are unique to bispecific antibodies?",
    settings=Settings(
        llm="my-llm-model",
        llm_config=local_llm_config,
        summary_llm="my-llm-model",
        summary_llm_config=local_llm_config,
    ),
)
raises the following exception:
litellm.exceptions.BadRequestError: litellm.BadRequestError: LLM Provider NOT provided. Pass in the LLM provider you are trying to call. You passed model=my-llm-model
Pass model as E.g. For 'Huggingface' inference endpoints pass in `completion(model='huggingface/starcoder',..)` Learn more: https://docs.litellm.ai/docs/providers
Hello,
PaperQA's documentation is not very clear about this... it took me many attempts to understand what was going on. You have to specify the inference provider as a prefix on the model name: with a llamafile server you'll have to specify "openai/my-llm-model" as the model name, and with Ollama "ollama/my-llm-model" (a sketch of the Ollama case follows the llamafile example below).
Example for a llamafile hosted locally:
local_llm_config = dict(
    model_list=[
        dict(
            model_name="openai/my-llm-model",
            litellm_params=dict(
                model="openai/my-llm-model",
                api_base="http://localhost:8080/v1",
                api_key="sk-no-key-required",
                temperature=0.1,
                frequency_penalty=1.5,
                max_tokens=1024,
            ),
        )
    ]
)
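For the Ollama case mentioned above, a minimal sketch following the same pattern (the model name "ollama/llama3" and the port 11434 are assumptions based on Ollama's defaults, not something I tested in this thread):

local_llm_config = dict(
    model_list=[
        dict(
            model_name="ollama/llama3",
            litellm_params=dict(
                model="ollama/llama3",              # "ollama/" prefix tells litellm which provider to call
                api_base="http://localhost:11434",  # Ollama's default local endpoint
                temperature=0.1,
                max_tokens=1024,
            ),
        )
    ]
)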
However, you'll still get an API connection error that seems to be due to the embedding model... So I don't use the 'ask' function and use docs.query instead, as follows:
# Imports (module paths may vary across paper-qa versions);
# file_list and settings are defined elsewhere in this thread
from tqdm import tqdm
from paperqa import Docs
from paperqa.llms import SparseEmbeddingModel

embedding_model = SparseEmbeddingModel(ndim=256)

docs = Docs()
for doc in tqdm(file_list):
    try:
        docs.add(
            "./Papers/" + str(doc),
            citation="File " + doc,
            docname=doc,
            settings=settings,
            embedding_model=embedding_model,
        )
    except Exception as e:
        # sometimes this happens if PDFs aren't downloaded or readable
        print("Could not read", doc, e)
        continue

answer = docs.query(
    "Your question.",
    settings=settings,
    embedding_model=embedding_model,
)
I guess clearer and more complete documentation would be welcome.
Hope it helps.
Best regards.
I think it's simply not implemented or merged into the main branch. Searching for "API_BASE" in the project turns up no relevant code: https://github.com/search?q=repo%3AFuture-House%2Fpaper-qa+API_BASE&type=code
Hi @thiner, we use litellm and it handles that kind of config. It should be parsed correctly, and you can find more information here.
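For reference, the llm_config dict follows litellm's Router model_list format, so you can sanity-check a local endpoint with litellm alone before involving paper-qa. A minimal standalone sketch of that format (the model names and endpoint are placeholders, not taken from this thread):

import litellm

# Same model_list shape as the llm_config examples above
router = litellm.Router(
    model_list=[
        {
            "model_name": "my-llm-model",
            "litellm_params": {
                "model": "openai/my-llm-model",      # provider prefix is required by litellm
                "api_base": "http://localhost:8080/v1",
                "api_key": "sk-no-key-required",
            },
        }
    ]
)

response = router.completion(
    model="my-llm-model",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)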
I see. But why run LiteLLM inside PQA? It would be better to deploy the service independently and decouple the model configuration from PQA itself. It's also common to already have a LiteLLM instance running, so configuring another one seems redundant.
Hi @Snikch63200, thank you for your help.
Could you also share the content of the settings variable, please?
Sure,
settings = Settings(
    llm="openai/my-llm-model",
    llm_config=local_llm_config,
    summary_llm="openai/my-llm-model",
    summary_llm_config=local_llm_config,
    index_directory="indexes",
    paper_directory="./Papers",
)
Best regards.
NB: I cannot use a local LLM embedding model, so I use the default SparseEmbeddingModel, which is not optimal because it doesn't really understand the meaning of the question and searches by keywords.
We've added a new feature to use the local sentence-transformers library, which may be an easier way than trying to get litellm configured correctly for local embeddings:
https://github.com/Future-House/paper-qa?tab=readme-ov-file#local-embedding-models-sentence-transformers
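If I read that README section right, a minimal sketch would look like the following; the "st-" prefix and the multi-qa-MiniLM-L6-cos-v1 model name are assumptions on my part, so double-check them against the linked docs:

from paperqa import Settings, ask

# Assumes sentence-transformers is installed and that an "st-" prefixed name
# selects a local SentenceTransformer embedding model (see the README link above);
# local_llm_config is the llamafile config shown earlier in this thread
answer = ask(
    "What manufacturing challenges are unique to bispecific antibodies?",
    settings=Settings(
        llm="openai/my-llm-model",
        llm_config=local_llm_config,
        summary_llm="openai/my-llm-model",
        summary_llm_config=local_llm_config,
        embedding="st-multi-qa-MiniLM-L6-cos-v1",
    ),
)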