
Trino gateway health state design

Open andythsu opened this issue 1 year ago • 22 comments

Problem: Currently, the health state of each Trino cluster is stored in memory in trino-gateway. This is a problem because there can be multiple instances of trino-gateway, and their views of the Trino cluster states can become inconsistent. As a result, the gateway UI can display different information depending on the instance.

Solution: Create a DB that stores the health state of the Trino clusters for ALL instances of trino-gateway. The DB will have a table named backend_state with the schema (name: String, is_healthy: boolean, last_update_time: timestamp).
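To make the proposed row shape concrete, here is a minimal Java sketch of one backend_state row. The field names follow the columns listed above; the record name `BackendState` and the null/blank checks are assumptions for illustration only, not existing trino-gateway code.

```java
import java.time.Instant;

// Hypothetical in-memory view of one backend_state row.
// Fields mirror the proposed columns: name, is_healthy, last_update_time.
record BackendState(String name, boolean isHealthy, Instant lastUpdateTime) {
    BackendState {
        // Assumed validation: every row needs a backend name and a timestamp.
        if (name == null || name.isBlank()) {
            throw new IllegalArgumentException("name must be non-empty");
        }
        if (lastUpdateTime == null) {
            throw new IllegalArgumentException("lastUpdateTime must be set");
        }
    }
}
```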

Each trino-gateway instance should have a different taskDelayMin, as defined here: https://github.com/trinodb/trino-gateway/blob/9bbf62c4da2084159374eac4bacdd5592035eee4/gateway-ha/src/main/java/io/trino/gateway/ha/clustermonitor/ActiveClusterMonitor.java#L35

(Note: currently this line is not configurable, but we will create a PR to make it configurable.)

Then, once trino-gateway is up, it will get the health state from the DB before checking with the Trino clusters at intervals of taskDelayMin. If last_update_time is older than X (this will also be configurable) or if is_healthy=false, then trino-gateway will get the health status from the Trino backend and store the result in the DB. Otherwise, it will just use the state from the DB.
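The decision described above could be sketched roughly as follows. This is only an illustration: the class `HealthStateRefreshPolicy`, the method `shouldProbeBackend`, and the `maxAge` parameter (standing in for X) are hypothetical names, and the current time is passed in explicitly so the check is easy to test.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper: decides whether to trust the DB row or probe the backend.
final class HealthStateRefreshPolicy {
    private HealthStateRefreshPolicy() {}

    // Returns true when the gateway should query the Trino backend itself:
    // either the stored state is unhealthy, or the row is older than maxAge (X).
    static boolean shouldProbeBackend(
            boolean isHealthy, Instant lastUpdateTime, Instant now, Duration maxAge) {
        if (!isHealthy) {
            return true;
        }
        Duration age = Duration.between(lastUpdateTime, now);
        return age.compareTo(maxAge) > 0;
    }
}
```

With this shape, a fresh and healthy row is reused as-is, while a stale or unhealthy row triggers a direct health check whose result is then written back to the DB.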

If trino-gateway sees that the health state is unhealthy, it will keep in-memory state while it makes Y tries (again, configurable) against the Trino backend. Trino-gateway will treat the Trino backend as healthy only AFTER it gets Z successes (again, configurable) from that backend. After it reaches Z successes, trino-gateway will update the Trino backend's record in the DB.
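One possible reading of the Y/Z rule is sketched below. It assumes that the Z successes do not have to be consecutive and that the counters start over once Y tries are used up without reaching Z; the proposal does not specify either point. The class `RecoveryTracker` and its method names are hypothetical.

```java
// Hypothetical in-memory tracker for one unhealthy backend.
// maxTries corresponds to Y, requiredSuccesses to Z.
final class RecoveryTracker {
    private final int maxTries;
    private final int requiredSuccesses;
    private int tries;
    private int successes;

    RecoveryTracker(int maxTries, int requiredSuccesses) {
        if (requiredSuccesses < 1 || maxTries < requiredSuccesses) {
            throw new IllegalArgumentException("need 1 <= Z <= Y");
        }
        this.maxTries = maxTries;
        this.requiredSuccesses = requiredSuccesses;
    }

    // Records one probe result. Returns true once Z successes are reached
    // within the current window of Y tries; the caller would then mark the
    // backend healthy in the DB.
    boolean recordProbe(boolean succeeded) {
        tries++;
        if (succeeded) {
            successes++;
        }
        if (successes >= requiredSuccesses) {
            reset();
            return true;
        }
        if (tries >= maxTries) {
            reset(); // window exhausted without Z successes; start a new round
        }
        return false;
    }

    private void reset() {
        tries = 0;
        successes = 0;
    }
}
```

Because the counters live only in memory, each trino-gateway instance would run its own recovery rounds, and only the final healthy result is written to the shared DB.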

Initially, all Trino backends will start as PENDING in the DB.

andythsu • Jan 30 '24 18:01