[Bug]: llama_index.llms.ollama httpx.HTTPStatusError: Server error '500 Internal Privoxy Error' for url 'http://localhost:11434/api/chat'
Bug Description
I am building a local question-answering system with LlamaIndex, with the LLM served through Ollama. It initially worked fine, but after I started a global Privoxy instance, every call to the Ollama-backed LLM began to fail, while the local embedding model loaded with HuggingFaceEmbedding continued to work without problems. This suggests that the HTTP request to the local Ollama server is being routed through the proxy. The specific error message is:

httpx.HTTPStatusError: Server error '500 Internal Privoxy Error' for url 'http://localhost:11434/api/chat'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500
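If the cause is that httpx picks up Privoxy from the standard proxy environment variables (httpx reads HTTP_PROXY/HTTPS_PROXY by default when trust_env is enabled), then a minimal workaround sketch is to exempt localhost from proxying before the Ollama LLM is created. This is an untested assumption on my part, not something confirmed by LlamaIndex:

```python
import os

# Workaround sketch, assuming Privoxy is configured through the standard
# HTTP_PROXY/HTTPS_PROXY environment variables that httpx honors by default:
# exclude localhost so requests to the Ollama server bypass the proxy.
os.environ["NO_PROXY"] = "localhost,127.0.0.1"
os.environ["no_proxy"] = "localhost,127.0.0.1"
```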
Version
0.10.55
Steps to Reproduce
```python
import chromadb

from llama_index.core import Settings, VectorStoreIndex, get_response_synthesizer
from llama_index.core.postprocessor import SentenceTransformerRerank
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.vector_stores.chroma import ChromaVectorStore

# Local embedding and reranker models; these keep working with Privoxy running.
embed_args = {
    "model_name": "./bce-embedding-base_v1",
    "max_length": 512,
    "embed_batch_size": 32,
    "device": "cuda",
}
embed_model = HuggingFaceEmbedding(**embed_args)
reranker_model = SentenceTransformerRerank(
    top_n=5,
    model="./bce-reranker-base_v1",
    device="cuda",
)

# LLM served by Ollama; its chat request is what fails with the 500 error.
llm = Ollama(model="qwen2:latest", request_timeout=60.0)
Settings.embed_model = embed_model
Settings.llm = llm

# Open the persisted Chroma collection and wrap it as a vector store index.
db = chromadb.PersistentClient(path="./ChromaDB")
chroma_collection = db.get_or_create_collection("MyDocuments")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
index = VectorStoreIndex.from_vector_store(vector_store)

# Build a streaming query engine with retrieval, reranking, and tree summarization.
vector_retriever = VectorIndexRetriever(index=index, similarity_top_k=5)
response_synthesizer = get_response_synthesizer(
    # llm=llm,
    response_mode="tree_summarize",
    streaming=True,
)
query_engine = RetrieverQueryEngine(
    retriever=vector_retriever,
    response_synthesizer=response_synthesizer,
    node_postprocessors=[reranker_model],
)

query_engine.query("XX").print_response_stream()
```
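To confirm the failure sits in the HTTP layer rather than in LlamaIndex itself, the failing endpoint can be hit directly with httpx. This is a minimal check sketch that assumes the default Ollama port and that the qwen2 model is already pulled:

```python
import httpx

# Minimal check, independent of LlamaIndex: POST directly to the Ollama chat
# endpoint. With Privoxy running globally this should reproduce the 500 error,
# because httpx routes the request through the proxy taken from the environment.
response = httpx.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "qwen2:latest",
        "messages": [{"role": "user", "content": "hello"}],
        "stream": False,
    },
    timeout=60.0,
)
response.raise_for_status()
print(response.json())
```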
Relevant Logs/Tracebacks
No response