Make query endpoint ready for production workloads

Open julian-risch opened this issue 3 years ago • 0 comments

Problem Statement

As a developer running Haystack in a production environment, I want the /query endpoint to be scalable and reliable so that my system is stable.

User Tasks

Pull a Docker image
Customize the default Haystack setup
Configure additional services
Configure a pipeline
Deploy the container in a pod
- 🔴 The official Helm chart is not working with the new Haystack images
Ensure the container is healthy
- 🔴 We don't know whether caching the model failed until we use it
Receive query from user
Ensure we have enough resources to respond fast
- 🔴 No idea if there's enough GPU RAM left for larger batches
- 🔴 Missing guidance on horizontal autoscaling
- 🔴 Concurrency is an issue
Combine queries in batches
- 🔴 Feature doesn't exist; look at other ML frameworks for inspiration
Send queries to Haystack
Run pipeline
- 🟡 We suspect it can be more efficient
Get the query result
Handle result schema problems
- 🔴 Schema change shouldn't go unnoticed
Send results to user
Change Haystack version
- 🔴 Updating is risky because we don't know in advance whether an upgrade will cause problems
Ensure the deployment is healthy
- 🔴 Very limited service observability
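
To make the "Combine queries in batches" step above more concrete, here is a minimal sketch of request micro-batching using only the Python standard library. It is not an existing Haystack feature or API: `MicroBatcher`, `run_batch`, `max_batch_size`, and `max_wait_s` are hypothetical names chosen for illustration, and `run_batch` stands in for whatever batched pipeline call the endpoint would eventually make.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple


class MicroBatcher:
    """Hypothetical helper: groups single queries into batches before running them."""

    def __init__(
        self,
        run_batch: Callable[[List[str]], Awaitable[List[str]]],
        max_batch_size: int = 8,
        max_wait_s: float = 0.01,
    ) -> None:
        self._run_batch = run_batch            # assumed batched pipeline call
        self._max_batch_size = max_batch_size  # upper bound on queries per batch
        self._max_wait_s = max_wait_s          # how long to wait for a batch to fill
        self._queue: asyncio.Queue = asyncio.Queue()

    async def submit(self, query: str) -> str:
        """Enqueue one query and wait for its individual result."""
        future = asyncio.get_running_loop().create_future()
        await self._queue.put((query, future))
        return await future

    async def worker(self) -> None:
        """Drain the queue forever, running one batch per iteration."""
        loop = asyncio.get_running_loop()
        while True:
            # Block until at least one query arrives, then collect more
            # until the batch is full or the wait deadline has passed.
            items: List[Tuple[str, asyncio.Future]] = [await self._queue.get()]
            deadline = loop.time() + self._max_wait_s
            while len(items) < self._max_batch_size:
                timeout = deadline - loop.time()
                if timeout <= 0:
                    break
                try:
                    items.append(await asyncio.wait_for(self._queue.get(), timeout))
                except asyncio.TimeoutError:
                    break
            try:
                results = await self._run_batch([query for query, _ in items])
                for (_, future), result in zip(items, results):
                    if not future.done():
                        future.set_result(result)
            except Exception as exc:
                # Propagate a batch failure to every caller waiting on it.
                for _, future in items:
                    if not future.done():
                        future.set_exception(exc)


async def demo() -> Tuple[List[str], List[int]]:
    """Send six concurrent queries through a batcher capped at four per batch."""
    batch_sizes: List[int] = []

    async def run_batch(queries: List[str]) -> List[str]:
        # Stand-in for a real batched pipeline run: record the batch size
        # and return one result per query, in order.
        batch_sizes.append(len(queries))
        return [query.upper() for query in queries]

    batcher = MicroBatcher(run_batch, max_batch_size=4, max_wait_s=0.05)
    worker = asyncio.create_task(batcher.worker())
    answers = await asyncio.gather(*(batcher.submit(f"q{i}") for i in range(6)))
    worker.cancel()
    return list(answers), batch_sizes


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The trade-off in this sketch is latency versus throughput: a larger `max_wait_s` gives batches more time to fill but delays the first query in each batch. Suitable values would depend on the model and the available GPU memory, which ties into the resource pain points listed above.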

### Tasks
- [ ] https://github.com/deepset-ai/haystack-helm/issues/3 
- [ ] https://github.com/deepset-ai/haystack-demos/issues/6
- [ ] https://github.com/deepset-ai/haystack/issues/3910
- [ ] https://github.com/deepset-ai/haystack/issues/3618
- [x] https://github.com/deepset-ai/haystack/issues/3870
- [ ] https://github.com/deepset-ai/haystack/issues/3911
- [ ] https://github.com/deepset-ai/haystack/issues/3912
- [x] https://github.com/deepset-ai/haystack/issues/3913
- [ ] https://github.com/deepset-ai/haystack/issues/3914
- [ ] https://github.com/deepset-ai/haystack/issues/3790
- [ ] https://github.com/deepset-ai/haystack/issues/3915
- [ ] https://github.com/deepset-ai/haystack-private/issues/46

Dec 23 '22 07:12 julian-risch