Make query endpoint ready for production workloads
Problem Statement
As a developer running Haystack in a production environment, I want the /query endpoint to be scalable and reliable so that my system is stable.
User Tasks
- Pull a Docker image
- Customize the default Haystack setup
- Configure additional services
- Configure a pipeline
- Deploy the container in a pod
  - 🔴 The official Helm chart does not work with the new Haystack images
- Ensure the container is healthy
  - 🔴 Pain point: we don't know whether caching the model failed until we use it
- Receive query from user
- Ensure we have enough resources to respond fast
  - 🔴 No visibility into whether enough GPU RAM is left for larger batches
  - 🔴 Missing guidance on horizontal autoscaling
  - 🔴 Concurrency is an issue
- Combine queries in batches
  - 🔴 Feature doesn't exist; look at other ML frameworks for inspiration
- Send queries to Haystack
- Run pipeline
  - 🟡 We suspect it can be more efficient
- Get the query result
- Handle result schema problems
  - 🔴 Schema changes shouldn't go unnoticed
- Send results to user
- Change Haystack version
  - 🔴 Updating is risky because we don't know in advance whether an upgrade causes problems
- Ensure the deployment is healthy
  - 🔴 Very limited service observability
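
To make the "Combine queries in batches" task more concrete, here is a minimal sketch of request micro-batching in plain `asyncio`. It is not an existing Haystack feature: `MicroBatcher`, `run_batch`, `max_batch_size`, and `max_wait_s` are hypothetical names chosen for illustration, and `run_batch` is assumed to return exactly one result per query, in the same order.

```python
import asyncio
from typing import Awaitable, Callable, List


class MicroBatcher:
    """Collects single queries and hands them to `run_batch` as one list.

    Hypothetical helper for illustration only, not part of Haystack.
    Create it inside a running event loop.
    """

    def __init__(
        self,
        run_batch: Callable[[List[str]], Awaitable[List[str]]],
        max_batch_size: int = 8,
        max_wait_s: float = 0.01,
    ) -> None:
        self._run_batch = run_batch
        self._max_batch_size = max_batch_size
        self._max_wait_s = max_wait_s
        self._queue: asyncio.Queue = asyncio.Queue()

    async def submit(self, query: str) -> str:
        """Enqueue one query and wait for its individual result."""
        future = asyncio.get_running_loop().create_future()
        await self._queue.put((query, future))
        return await future

    async def worker(self) -> None:
        """Runs forever: builds batches and resolves the waiting futures."""
        loop = asyncio.get_running_loop()
        while True:
            # Block until at least one query is available.
            batch = [await self._queue.get()]
            deadline = loop.time() + self._max_wait_s
            # Keep collecting until the batch is full or the wait budget is spent.
            while len(batch) < self._max_batch_size:
                remaining = deadline - loop.time()
                if remaining <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self._queue.get(), remaining))
                except asyncio.TimeoutError:
                    break
            try:
                # Assumption: one result per query, in the same order.
                results = await self._run_batch([query for query, _ in batch])
            except Exception as exc:
                # Propagate the failure to every caller in this batch.
                for _, future in batch:
                    if not future.done():
                        future.set_exception(exc)
                continue
            for (_, future), result in zip(batch, results):
                if not future.done():
                    future.set_result(result)


async def demo() -> List[str]:
    # Stand-in for a batched pipeline call; a real deployment would run the pipeline here.
    async def fake_pipeline(queries: List[str]) -> List[str]:
        return [f"answer to: {q}" for q in queries]

    batcher = MicroBatcher(fake_pipeline, max_batch_size=4, max_wait_s=0.05)
    worker = asyncio.create_task(batcher.worker())
    answers = await asyncio.gather(*(batcher.submit(q) for q in ["q1", "q2", "q3"]))
    worker.cancel()
    return list(answers)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The two knobs trade latency for throughput: a larger `max_wait_s` gives bigger batches but delays the first query in each batch, which ties back to the open question above about how much GPU RAM is left for larger batches.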
### Tasks
- [ ] https://github.com/deepset-ai/haystack-helm/issues/3
- [ ] https://github.com/deepset-ai/haystack-demos/issues/6
- [ ] https://github.com/deepset-ai/haystack/issues/3910
- [ ] https://github.com/deepset-ai/haystack/issues/3618
- [x] https://github.com/deepset-ai/haystack/issues/3870
- [ ] https://github.com/deepset-ai/haystack/issues/3911
- [ ] https://github.com/deepset-ai/haystack/issues/3912
- [x] https://github.com/deepset-ai/haystack/issues/3913
- [ ] https://github.com/deepset-ai/haystack/issues/3914
- [ ] https://github.com/deepset-ai/haystack/issues/3790
- [ ] https://github.com/deepset-ai/haystack/issues/3915
- [ ] https://github.com/deepset-ai/haystack-private/issues/46
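
For the "Handle result schema problems" task in the user journey, a client-side check along the following lines could make schema changes visible instead of silent. This is a sketch under assumptions: the field names in `EXPECTED_FIELDS` (`query`, `answers`) are illustrative and should be replaced with whatever the deployed /query endpoint actually returns, and `schema_problems` is a hypothetical helper, not a Haystack API.

```python
from typing import Any, Dict, List

# Assumed response shape, for illustration only; adjust to the real /query response.
EXPECTED_FIELDS: Dict[str, type] = {"query": str, "answers": list}


def schema_problems(
    response: Dict[str, Any],
    expected: Dict[str, type] = EXPECTED_FIELDS,
) -> List[str]:
    """Return human-readable differences between a response and the expected fields."""
    problems: List[str] = []
    for name, expected_type in expected.items():
        if name not in response:
            problems.append(f"missing field: {name}")
        elif not isinstance(response[name], expected_type):
            actual = type(response[name]).__name__
            problems.append(
                f"wrong type for {name}: expected {expected_type.__name__}, got {actual}"
            )
    # Fields the client does not know about often signal a version change.
    for name in sorted(set(response) - set(expected)):
        problems.append(f"unexpected field: {name}")
    return problems
```

Running a check like this in a smoke test before and after changing the Haystack version would surface a schema difference as a list of findings that can be logged or turned into an alert.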