
[Bug]: llama_index.llms.ollama httpx.HTTPStatusError: Server error '500 Internal Privoxy Error' for url 'http://localhost:11434/api/chat'

Lyy0617 opened this issue 7 months ago · 4 comments

Bug Description

I am building a local question-answering system with LlamaIndex, with the LLM served through Ollama. It initially worked fine, but after I started a global Privoxy instance, calls to the Ollama-backed LLM began to fail. The local embedding model based on HuggingFaceEmbedding is unaffected. The specific error is:

```
httpx.HTTPStatusError: Server error '500 Internal Privoxy Error' for url 'http://localhost:11434/api/chat'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500
```
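The "500 Internal Privoxy Error" suggests the request to `localhost:11434` is being routed through Privoxy instead of reaching Ollama directly: httpx, which `llama_index.llms.ollama` uses for transport, honors the `HTTP_PROXY`/`HTTPS_PROXY` environment variables by default (`trust_env=True`). A likely workaround, assuming the global Privoxy is exported through those variables, is to exclude localhost via `NO_PROXY` before the Ollama client is created; a minimal sketch:

```python
import os

# Assumption: Privoxy is set globally via HTTP_PROXY/HTTPS_PROXY.
# httpx reads these variables, so requests to http://localhost:11434
# get proxied. Excluding localhost keeps Ollama traffic off the proxy.
# This must run before the Ollama / httpx client is constructed.
os.environ["NO_PROXY"] = "localhost,127.0.0.1"
os.environ["no_proxy"] = "localhost,127.0.0.1"
```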

Version

0.10.55

Steps to Reproduce

```python
from llama_index.core import Settings
from llama_index.core.postprocessor import SentenceTransformerRerank
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

embed_args = {
    'model_name': './bce-embedding-base_v1',
    'max_length': 512,
    'embed_batch_size': 32,
    'device': 'cuda',
}
embed_model = HuggingFaceEmbedding(**embed_args)
reranker_model = SentenceTransformerRerank(
    top_n=5,
    model="./bce-reranker-base_v1",
    device='cuda',
)
llm = Ollama(model="qwen2:latest", request_timeout=60.0)
Settings.embed_model = embed_model
Settings.llm = llm
```
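To confirm the 500 comes from Privoxy rather than Ollama itself, one quick check is to call the same endpoint directly with environment proxies disabled (`trust_env=False` tells httpx to ignore `HTTP_PROXY`/`HTTPS_PROXY`). A sketch under those assumptions:

```python
import httpx

# Bypass any proxy settings inherited from the environment and hit
# the same endpoint the Ollama LLM wrapper calls. If this succeeds
# while the query engine fails, the proxy is the culprit.
with httpx.Client(trust_env=False, timeout=60.0) as client:
    resp = client.post(
        "http://localhost:11434/api/chat",
        json={
            "model": "qwen2:latest",
            "messages": [{"role": "user", "content": "hello"}],
            "stream": False,
        },
    )
    resp.raise_for_status()
    print(resp.json()["message"]["content"])
```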

```python
import chromadb
from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.chroma import ChromaVectorStore

db = chromadb.PersistentClient(path="./ChromaDB")
chroma_collection = db.get_or_create_collection("MyDocuments")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
index = VectorStoreIndex.from_vector_store(vector_store)
```
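The retrieval path (Chroma plus the local HuggingFaceEmbedding) never leaves the machine, which matches the observation that embedding keeps working while the LLM call fails. Assuming the collection already holds documents, this can be verified independently of the LLM:

```python
# Sanity check: retrieval is purely local and should work even while
# the proxied LLM call returns the Privoxy 500.
print("documents in collection:", chroma_collection.count())

retriever = index.as_retriever(similarity_top_k=5)
for node in retriever.retrieve("test query"):  # placeholder query string
    print(node.score, node.node.node_id)
```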

```python
from llama_index.core import get_response_synthesizer
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import VectorIndexRetriever

vector_retriever = VectorIndexRetriever(index=index, similarity_top_k=5)
response_synthesizer = get_response_synthesizer(
    # llm=llm,
    response_mode="tree_summarize",
    streaming=True,
)
query_engine = RetrieverQueryEngine(
    retriever=vector_retriever,
    response_synthesizer=response_synthesizer,
    node_postprocessors=[reranker_model],
)
query_engine.query("XX").print_response_stream()
```
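When the failure reproduces, inspecting the body of the 500 response can confirm that Privoxy (not Ollama) generated it; a hedged sketch wrapping the failing call:

```python
import httpx

try:
    query_engine.query("XX").print_response_stream()
except httpx.HTTPStatusError as e:
    # A Privoxy-generated 500 carries an HTML error page, whereas an
    # Ollama failure returns a JSON body; the text tells them apart.
    print("status:", e.response.status_code)
    print("body:", e.response.text[:500])
```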

Relevant Logs/Tracebacks

No response

Lyy0617 · Jul 22 '24 16:07