Improve inter-replica fairness for real-time APIs

Open deliahu opened this issue 5 years ago • 0 comments

Description

Currently, requests are assigned to replicas at random. A smarter approach would be to route each request to the least recently accessed replica (i.e. strict ordering), to the replica with the smallest queue, or something similar.
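To make the difference between these strategies concrete, here is a minimal toy simulation. It is only a sketch and not Cortex's actual routing code: the function names (`pick_random`, `make_round_robin`, `pick_least_queue`, `simulate`) are hypothetical, and it assumes requests never complete, so each replica's "queue size" is simply the number of requests it has been assigned so far.

```python
import itertools
import random


def pick_random(queue_sizes, rng):
    """Behaviour described above: choose any replica uniformly at random."""
    return rng.randrange(len(queue_sizes))


def make_round_robin(num_replicas):
    """Strict ordering: cycle through replicas, so the least recently used one is next."""
    order = itertools.cycle(range(num_replicas))
    return lambda queue_sizes, rng: next(order)


def pick_least_queue(queue_sizes, rng):
    """Choose the replica with the smallest queue (ties go to the lowest index)."""
    return min(range(len(queue_sizes)), key=queue_sizes.__getitem__)


def simulate(pick, num_replicas=4, num_requests=1000, seed=0):
    """Assign requests one at a time and return the per-replica request counts.

    Toy assumption: requests never finish, so a queue only ever grows.
    """
    rng = random.Random(seed)
    queue_sizes = [0] * num_replicas
    for _ in range(num_requests):
        queue_sizes[pick(queue_sizes, rng)] += 1
    return queue_sizes


if __name__ == "__main__":
    print("random:     ", simulate(pick_random))
    print("round robin:", simulate(make_round_robin(4)))
    print("least queue:", simulate(pick_least_queue))
```

Under these assumptions, round robin and least-queue both split the requests evenly across replicas, while random assignment generally leaves some replicas with more work than others. In a real deployment, request durations vary, which is where least-queue and round robin would start to differ.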

Notes

Istio's destination rules might be relevant for this, e.g. something like:

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: api-iris-classifier
spec:
  host: api-iris-classifier.default.svc.cluster.local
  trafficPolicy:
    loadBalancer:
      simple: LEAST_CONN  # or ROUND_ROBIN

deliahu, Jul 29 '20 02:07