[BUG] Summarize fails even when a model response is generated with the error "HTTP request failed: POST predict: Post "http://127.0.0.1:36333/completion": EOF"
Pre-check
- [X] I have searched the existing issues and none cover this bug.
Description
While using Summarize I keep getting the error below. I had to fix summarize_service.py and ui.py to catch it properly. Here is the returned error:
10:30:57.046 [ERROR ] private_gpt.server.recipes.summarize.summarize_service - HTTP request failed: POST predict: Post "http://127.0.0.1:36333/completion": EOF
It looked like a backend issue, but we can clearly see that the response is created correctly by the Ollama backend:
Given the information from multiple sources and not prior knowledge, answer the query. Query: Provide a comprehensive summary of the provided context information. The summary should cover all the key points and main ideas presented in the original text, while also condensing the information into a concise and easy-to-understand format. Please ensure that the summary includes relevant details and examples that support the main ideas, while avoiding any unnecessary information or repetition.
Answer:
**Response:** assistant: The provided context information outlines various aspects of user management, software usage tracking, and computer inventory in an IT system. Here's a comprehensive summary:
Now, I had to fix a lot in the code. I fixed all of the following:
Summary of the Main Problem
The main problem involved handling asynchronous operations correctly in the summarization service and the UI. Specifically, the issues were (a sketch of these changes follows this list):
- The code was erroring because of nested async calls, which required the use of `nest_asyncio` to allow nested event loops.
- The `_summarize` method needed to be converted to an async generator to handle streaming responses correctly. For that, the `stream_summarize` and `summarize` methods needed to use the async generator correctly.
- Proper error handling was added for exceptions such as `asyncio.CancelledError`, `ResponseError`, and `StopAsyncIteration`.
- The `_chat` method in the UI needed to handle async streaming responses correctly, making sure that `stream_summarize` was consumed in an `async for` loop in the UI.
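For reference, here is a minimal sketch of the shape of those fixes. The names `_summarize` and `_chat` follow the methods mentioned above, but the bodies are illustrative only, not the actual patch; the full changes are in the attached files.

```python
import asyncio
from collections.abc import AsyncGenerator

import nest_asyncio

nest_asyncio.apply()  # allow nested event loops (UI loop + query-engine loop in one process)


async def _summarize(query_engine, prompt: str) -> AsyncGenerator[str, None]:
    # Illustrative async generator: stream summary tokens instead of returning one string.
    try:
        streaming_response = await query_engine.aquery(prompt)
        # Assumes the engine exposes an async token generator; the real patch also
        # catches ollama's ResponseError at this point.
        async for token in streaming_response.async_response_gen():
            yield token
    except (asyncio.CancelledError, StopAsyncIteration):
        # Stream cancelled (e.g. the client disconnected) or exhausted: end cleanly.
        return


async def _chat(query_engine, prompt: str) -> str:
    # Illustrative UI-side consumer: the summary must be read with `async for`.
    partial = ""
    async for token in _summarize(query_engine, prompt):
        partial += token
    return partial
```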
As I see it, the error here is due to one of the following, since I have ruled out server-side and networking issues and we can clearly see that a response is generated by the model:
- Timeout: if the server takes too long to respond, the client might close the connection, which can result in an EOF error (see the timeout sketch after this list).
- Incorrect query parameters: if these are incorrect or malformed, the server will close the connection.
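If the timeout theory is right, the knob that matters should be the client-side request timeout on the Ollama LLM itself rather than on the query engine. A minimal sketch, assuming the llama-index Ollama wrapper is what sits underneath (the model name and the 300-second value are just examples):

```python
from llama_index.llms.ollama import Ollama

# Assumption: the LLM component wraps llama-index's Ollama client. request_timeout is the
# per-request HTTP timeout in seconds; if the server takes longer than this, the client
# closes the connection, which would surface as an EOF on the /completion call.
llm = Ollama(
    model="llama3",                      # example model name
    base_url="http://localhost:11434",
    request_timeout=300.0,
)
```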
Steps to Reproduce
- Ingest (RAG) some larger documents
- Run Summarize
Expected Behavior
No errors and correct summarization
Actual Behavior
Summarize fails even though a model response is generated, with the error "HTTP request failed: POST predict: Post "http://127.0.0.1:36333/completion": EOF"
Environment
CUDA 12, Ubuntu, Ollama profile
Additional Information
No response
Version
No response
Setup Checklist
- [X] Confirm that you have followed the installation instructions in the project’s documentation.
- [X] Check that you are using the latest version of the project.
- [X] Verify disk space availability for model storage and data processing.
- [X] Ensure that you have the necessary permissions to run the project.
NVIDIA GPU Setup Checklist
- [X] Check that all CUDA dependencies are installed and are compatible with your GPU (refer to CUDA's documentation)
- [X] Ensure an NVIDIA GPU is installed and recognized by the system (run `nvidia-smi` to verify).
- [X] Ensure proper permissions are set for accessing GPU resources.
- [ ] Docker users - Verify that the NVIDIA Container Toolkit is configured correctly (e.g. run `sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi`)
Adding my changes: changes_summarize_service.txt, changes_ui.txt
So, the SummarizeService is retrieved from the request state using `request.state.injector.get(SummarizeService)`, and for the streaming response it uses `to_openai_stream` to convert the response to an SSE stream. The issue is somewhere here.
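Roughly, the path I am describing looks like this. This is a simplified sketch: `SummarizeService`, the injector call and `to_openai_stream` are the ones referenced above, while the import paths and request body shape are assumptions on my part, not the project's actual router code.

```python
from fastapi import APIRouter, Request
from starlette.responses import StreamingResponse

# Assumed project-internal imports (paths not verified here):
# from private_gpt.server.recipes.summarize.summarize_service import SummarizeService
# from private_gpt.open_ai.openai_models import to_openai_stream

summarize_router = APIRouter()


@summarize_router.post("/summarize")
def summarize(request: Request, body: dict) -> StreamingResponse:
    # The service is resolved from the per-request injector, as noted above.
    service = request.state.injector.get(SummarizeService)
    # stream_summarize yields summary chunks; to_openai_stream turns them into an SSE stream.
    chunks = service.stream_summarize(text=body["text"])
    return StreamingResponse(to_openai_stream(chunks), media_type="text/event-stream")
```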
I also tested increasing the timeout value here, which did not have any effect:

```python
query_engine = summary_index.as_query_engine(
    llm=self.llm_component.llm,
    response_mode=ResponseMode.TREE_SUMMARIZE,
    streaming=stream,
    use_async=self.settings.summarize.use_async,
    timeout=360,  # <-- increase timeout to 360 seconds
)
```

As you can see, the nested async issue is solved using my attached code changes; we are at least seeing clear logs now.
- Software management includes three sub-tabs: Applications, Raw Usage, and Active Usage.
- These tabs display information about software applications used by users, such as total usage, active usage, raw usage data, and more.
PART OF THE MODEL RESPONSE:
... Overall, the system provides a comprehensive platform for managing subscriptions, uploaded files, integration platforms, user profiles, and software usage. The system's features are designed to streamline processes, provide insights into user behavior, and support various workflows.
```
11:37:04.888 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.888 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.889 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.889 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.889 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.889 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.890 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.890 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.890 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.890 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.890 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.891 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.891 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.891 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.891 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.891 [ERROR ] private_gpt.server.recipes.summarize.summarize_service - HTTP request failed: POST predict: Post "http://127.0.0.1:33899/completion": EOF
11:37:04.893 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.893 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.893 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.934 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.935 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.935 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
11:37:04.974 [INFO ] uvicorn.access - 127.0.0.1:53322 - "POST /run/predict HTTP/1.1" 200
```
Can you check the Ollama server logs? The problem is Ollama-related, since it's throwing 500s. Something with the context window, computer resources, etc.
Any fix for this issue?