Trino Ops: Implement KEDA to scale Trino to 0-N
What is it?
I only learned of this recently, but there's a tool (listed below) with a queuing HTTP proxy that can scale any deployment to 0. The proxy holds incoming requests until the target scales back up, which lets us put idle services to sleep. I believe this would be very good for many of the things we run, but particularly Trino (at least for now).
Tool:
- https://github.com/kedacore/keda
- https://github.com/kedacore/http-add-on
This will be useful for running Trino on-demand (e.g. through the website/API). For now, we scale things via Dagster.
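For reference, this is roughly what the HTTP add-on expects: an HTTPScaledObject that tells the interceptor which host to queue requests for and which Deployment/Service to scale. This is only a sketch; the names, namespace, host, and port below are placeholders, and field names can differ slightly between add-on versions.

```yaml
# Hypothetical HTTPScaledObject: KEDA scales the target Deployment to 0 when
# idle and back up when requests arrive for the listed host (the interceptor
# queues requests while the coordinator starts).
apiVersion: http.keda.sh/v1alpha1
kind: HTTPScaledObject
metadata:
  name: trino-coordinator          # placeholder name
  namespace: trino                 # placeholder namespace
spec:
  hosts:
    - trino.example.com            # requests with this Host header are routed/queued
  scaleTargetRef:
    name: trino-coordinator        # Deployment to scale
    kind: Deployment
    apiVersion: apps/v1
    service: trino                 # Service the interceptor forwards to
    port: 8080                     # Trino HTTP port
  replicas:
    min: 0                         # scale to zero when there is no traffic
    max: 1                         # single coordinator
```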
After the Nessie work, I think this is something we should pick up @IcaroG
This will be deprioritized for now.
The current state:
KEDA has been deployed in the keda namespace along with the HTTP add-on.
We still need to route requests through the HTTP add-on's interceptor proxy.
For consumer Trino, we will need to map the ingress to the http-proxy (interceptor) service, which then forwards requests to consumer Trino.
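A rough sketch of that mapping, assuming a default Helm install of the add-on (interceptor service keda-add-ons-http-interceptor-proxy in the keda namespace, proxy port 8080). Ingress backends must live in the Ingress's own namespace, hence the ExternalName hop; all other names and hosts are placeholders, and some ingress controllers need extra annotations for ExternalName backends.

```yaml
# Hypothetical: expose the KEDA interceptor inside the consumer Trino namespace
# so the existing Ingress can use it as a backend.
apiVersion: v1
kind: Service
metadata:
  name: keda-http-interceptor
  namespace: trino-consumer              # placeholder namespace
spec:
  type: ExternalName
  externalName: keda-add-ons-http-interceptor-proxy.keda.svc.cluster.local
---
# Point the consumer Trino Ingress at the interceptor instead of the coordinator
# Service; the interceptor matches the Host header and forwards (or queues).
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: trino-consumer                   # placeholder name
  namespace: trino-consumer
spec:
  rules:
    - host: trino.example.com            # must also appear in HTTPScaledObject spec.hosts
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: keda-http-interceptor
                port:
                  number: 8080           # default interceptor proxy port
```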
For producer Trino, we need to change all the internal references to point to the http-proxy service. We also need to check how to add the Host header to all the requests, since the HTTP add-on routes on it.
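One option for the producer side, sketched under the same assumptions: keep (or introduce) an internal Service name that resolves to the interceptor, so existing callers only change the hostname they connect to. The interceptor routes purely on the Host header, so whatever hostname clients end up sending has to be listed in the producer's HTTPScaledObject spec.hosts; it is worth verifying whether the interceptor matches the host with or without a port suffix. All names below are placeholders.

```yaml
# Hypothetical: an ExternalName Service in the producer namespace that sends
# internal traffic to the KEDA interceptor instead of straight to Trino.
apiVersion: v1
kind: Service
metadata:
  name: trino-producer                   # hostname internal callers would use (placeholder)
  namespace: trino-producer              # placeholder namespace
spec:
  type: ExternalName
  externalName: keda-add-ons-http-interceptor-proxy.keda.svc.cluster.local
# Clients connecting to trino-producer.trino-producer.svc.cluster.local will
# send that name as the Host header; listing it in the producer
# HTTPScaledObject's spec.hosts would avoid injecting a custom Host header
# into every request.
```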
There is still an open question about how to scale the workers when the coordinator is scaled up by HTTP requests, and how to also scale them based on Trino metrics. For Trino metrics, we need to enable PodMonitoring.
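A possible shape for the worker side, as a sketch only: expose Trino metrics via PodMonitoring (assuming the pods already export Prometheus-format metrics, e.g. through a JMX exporter, on a port named metrics), then drive a regular KEDA ScaledObject on the worker Deployment. The kubernetes-workload trigger keeps at least one worker whenever a coordinator pod exists, and the prometheus trigger adds more based on load. The Prometheus address, metric name, labels, and thresholds are all placeholders.

```yaml
# Hypothetical PodMonitoring so Trino metrics land in managed Prometheus.
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: trino
  namespace: trino                        # placeholder namespace
spec:
  selector:
    matchLabels:
      app: trino                          # placeholder labels
  endpoints:
    - port: metrics                       # placeholder metrics port name
      interval: 30s
---
# Hypothetical ScaledObject for the workers: follow the coordinator (at least
# one worker while a coordinator pod exists) and add workers based on load.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: trino-worker
  namespace: trino
spec:
  scaleTargetRef:
    name: trino-worker                    # worker Deployment (placeholder)
  minReplicaCount: 0
  maxReplicaCount: 5
  triggers:
    - type: kubernetes-workload
      metadata:
        podSelector: "app=trino,component=coordinator"   # placeholder selector
        value: "1"                        # 1 worker per running coordinator pod
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090   # placeholder endpoint
        query: sum(trino_running_queries) # placeholder metric/query
        threshold: "2"                    # add a worker per 2 running queries
```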