haystack
Docker image crashes upon querying
Describe the bug
Short issue: The Docker container, which runs a web server using Haystack, crashes while running a pipeline.
Long text: I have a web app that exposes a query API. When someone makes a POST call, the controller runs a RAG pipeline. The container crashes every time the pipeline runs. I have been unable to find any error log. This does not happen if I run the web server natively on my OS.
Error message No error message is thrown
Expected behavior The RAG Pipeline should run.
Additional context
I have enabled DEBUG logging in Haystack and in my app, and I am attaching the log file docker_crash.log. The crash happens right after the output Batches: 0%| | 0/1 [00:00<?, ?it/s]%. My Dockerfile is pasted below:
FROM python:3.11 AS base
WORKDIR /app
COPY . .
RUN pip install --upgrade pip
RUN pip install --no-cache-dir -r ./requirements.txt
EXPOSE 8080
CMD ["uvicorn", "friday.server:app", "--host", "0.0.0.0", "--port", "8080"]
To Reproduce
FAQ Check
- [x] Have you had a look at our new FAQ page?
System:
- OS: Alpine
- GPU/CPU: CPU
- Haystack version (commit or version number): 1.22.1
- DocumentStore: ElasticSearch
- Reader: None. Reranker present
- Retriever: BM25Retriever, EmbeddingRetriever
In the logs I see:
DEBUG - haystack.telemetry - Telemetry couldn't make a POST request to PostHog.
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/haystack/telemetry.py", line 98, in send_event
json.dumps({**self.event_properties, **dynamic_specs, **event_properties}, sort_keys=True)
File "/usr/local/lib/python3.11/json/__init__.py", line 238, in dumps
**kw).encode(obj)
^^^^^^^^^^^
File "/usr/local/lib/python3.11/json/encoder.py", line 200, in encode
chunks = self.iterencode(o, _one_shot=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/json/encoder.py", line 258, in iterencode
return _iterencode(o, 0)
^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/json/encoder.py", line 180, in default
raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type PromptTemplate is not JSON serializable
Can you try disabling telemetry by setting an environment variable HAYSTACK_TELEMETRY_ENABLED=False in your docker container?
Tried with telemetry disabled. No change in behaviour. I am on an Apple M1. Here are the additional steps I tried:
- Instead of docker run -p 8080:8080 --env-file .env friday, I opened an interactive terminal in the container and started uvicorn from there. I noticed that a segmentation fault occurs while the pipeline runs.
- I noticed that I had use_gpu=True in my EmbeddingRetriever, so I set it to False, rebuilt the image, and repeated step 1.
- That didn't change anything, so I attached a debugger.
- I ran bt (backtrace) in the debugger and am attaching the logs here. I am not much of an expert in low-level debugging, but it seems to have something to do with the PyTorch libs. docker_crash_with_gdb.log
I will continue to fiddle more.
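For a segmentation fault like the one above, Python's standard-library faulthandler module can print the Python-level traceback of every thread at the moment of the crash, which is often easier to read than a gdb backtrace. A minimal sketch, assuming it is called once at the top of the server module before the pipeline is built; the helper name enable_crash_tracebacks is hypothetical and not part of Haystack:

```python
import faulthandler
import sys


def enable_crash_tracebacks(stream=None):
    """Dump Python tracebacks of all threads on fatal signals such as SIGSEGV.

    `stream` must be a file object with a real file descriptor (or an
    integer fd). It defaults to sys.stderr so the dump shows up in the
    container's log output. Hypothetical helper, not a Haystack API.
    """
    target = stream if stream is not None else sys.stderr
    faulthandler.enable(file=target, all_threads=True)
    return faulthandler.is_enabled()
```

Calling enable_crash_tracebacks() at startup should be enough. Setting PYTHONFAULTHANDLER=1 in the container environment has the same effect without a code change.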
Can you also try setting HAYSTACK_MPS_ENABLED to false?
I tried with both HAYSTACK_MPS_ENABLED=False and HAYSTACK_MPS_ENABLED=false in the env, and it still crashes. Attached is the gdb log when HAYSTACK_MPS_ENABLED is set to false.
docker_crash2.log
I tried to reproduce on my Mac using the docker-compose file from here https://github.com/deepset-ai/haystack-demos/tree/main/explore_the_world pulling 1.21.1 but it works. If you can post a minimal setup I can reproduce locally I'll try again.
I faced the same issue, but then realized that the problem was caused by wrong OpenSearch credentials. I don't know why the error message is so far off from the actual cause of the error.
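Since wrong credentials can surface as an unrelated-looking failure, it may help to check the search backend directly before starting the pipeline. A rough sketch using only the standard library, assuming the cluster uses HTTP Basic auth and answers on its root URL; the function names, URL, and credentials are placeholders, and a cluster with a self-signed TLS certificate would be reported as unreachable by this check:

```python
import base64
import urllib.error
import urllib.request


def basic_auth_header(user, password):
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def check_search_backend(url, user, password, timeout=5.0):
    """Return (ok, message) for a GET on the cluster root URL.

    Hypothetical pre-flight check, not part of Haystack.
    """
    request = urllib.request.Request(
        url, headers={"Authorization": basic_auth_header(user, password)}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return True, f"HTTP {response.status}"
    except urllib.error.HTTPError as err:
        # HTTPError must be handled before the broader OSError below.
        if err.code in (401, 403):
            return False, f"credentials rejected (HTTP {err.code})"
        return False, f"unexpected HTTP {err.code}"
    except OSError as err:
        # Covers URLError (DNS, refused connection, TLS) and socket timeouts.
        return False, f"cannot reach {url}: {err}"
```

Running check_search_backend with the same host, user, and password that the DocumentStore is configured with, from inside the container, would show whether the credentials are rejected before any pipeline code runs.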