
Using the LlamaIndex OpenAI-like API to request exo via the ChatGPT API endpoint

Open Pentar0o opened this issue 1 year ago • 3 comments

Hey,

I've tried to use the LlamaIndex OpenAI-like API with the exo ChatGPT API endpoint so that I can use a ChromaDB RAG with it, but when calling this code:

from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(model="llama3.1:8b", api_base="http://127.0.0.1:8000/v1", api_key="fake")

response = llm.complete("Hello World!")
print(str(response))

I get this error: openai.APIStatusError: 405: Method Not Allowed

and with this code:

from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(model="llama3.1:8b", api_base="http://127.0.0.1:8000/v1/chat/completions", api_key="fake")

response = llm.complete("Hello World!")
print(str(response))

I get the same error: openai.APIStatusError: 405: Method Not Allowed

Am I missing something? The exo API is OpenAI-compatible, and so is LlamaIndex.

Thx for your help!

Pentar0o avatar Sep 02 '24 09:09 Pentar0o

from llama_index.llms.openai_like import OpenAILike
from llama_index.core.base.llms.types import ChatMessage

# exo's ChatGPT-compatible endpoint expects chat-style requests,
# so configure the LLM as a chat model and send a list of messages.
messages = [ChatMessage(role="user", content="Hello World!")]

llm = OpenAILike(model="llama-3.1-8b", api_base="http://127.0.0.1:8000/v1", is_chat_model=True, api_key="fake")
res = llm.chat(messages)
print(res)

This works, @Pentar0o. You can get the model names from models.py. I suppose the request requires a list of messages, so you have to use it as a chat model.

pranav4501 avatar Sep 03 '24 20:09 pranav4501

Thanks! Is it possible to have a context memory of the chat to keep the conversation going, or does LlamaIndex manage it automatically?

Pentar0o avatar Sep 07 '24 12:09 Pentar0o

You can pass context in the list of messages by including all the previous messages you want the model to see.
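
For example, a minimal sketch reusing the OpenAILike setup from above: the full history is resent on every llm.chat call, and the second user question is just an illustrative placeholder.

from llama_index.llms.openai_like import OpenAILike
from llama_index.core.base.llms.types import ChatMessage

llm = OpenAILike(model="llama-3.1-8b", api_base="http://127.0.0.1:8000/v1", is_chat_model=True, api_key="fake")

# Keep the running conversation in a list and resend it on every call.
history = [ChatMessage(role="user", content="Hello World!")]
first = llm.chat(history)
history.append(first.message)  # the assistant reply becomes part of the context

# Next turn: append the new user message and send the whole history again.
history.append(ChatMessage(role="user", content="Can you repeat that?"))
second = llm.chat(history)
print(second)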

pranav4501 avatar Sep 08 '24 21:09 pranav4501