Support HA for shards

Open jet-go opened this issue 2 years ago • 1 comments

Hello, I'm currently trying to set up shards with high availability, i.e. running multiple replicas of each shard. I'm not sure whether this is already possible. Could you please let me know whether it can be achieved? I tried deploying 2 StatefulSets with the same config/coordinator, but the target distribution was not the same across the pods of the two StatefulSets. If this isn't supported, I would love to help with the implementation (with some guidance to start with).

Example: assume 2 StatefulSets (sts) with 2 shards each. I would expect the pods that serve the same shard in both StatefulSets (e.g. prom-replica-0-0 and prom-replica-1-0) to scrape the same targets, and likewise for the other shards.

sts: prom-replica-0
pod: prom-replica-0-$(SHARD)
----config----
global:
  external_labels:
    prometheus: prom-kvass-$(SHARD)
    prometheus_replica: $(POD_NAME)
scrape_configs:
- job_name: metrics
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 20s
  metrics_path: /metrics
  scheme: http
  authorization:
    type: Bearer
    credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  proxy_url: http://127.0.0.1:8008
  follow_redirects: true
  relabel_configs:
  - separator: ;
    regex: __invalid_label_(.+)
    replacement: $1
    action: labelmap
  static_configs:
  - targets:
    - "TARGET_0"
    labels:
      __address__: "TARGET_0"
      __metrics_path__: /metrics
      __param__hash: "TARGET_0_HASH"
      __param__jobName: metrics
      __param__scheme: http
      __scheme__: http
....
sts: prom-replica-1
pod: prom-replica-1-$(SHARD)
----config----
global:
  external_labels:
    prometheus: prom-kvass-$(SHARD)
    prometheus_replica: $(POD_NAME)
scrape_configs:
- job_name: metrics
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 20s
  metrics_path: /metrics
  scheme: http
  authorization:
    type: Bearer
    credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  proxy_url: http://127.0.0.1:8008
  follow_redirects: true
  relabel_configs:
  - separator: ;
    regex: __invalid_label_(.+)
    replacement: $1
    action: labelmap
  static_configs:
  - targets:
    - "TARGET_0"
    labels:
      __address__: "TARGET_0"
      __metrics_path__: /metrics
      __param__hash: "TARGET_0_HASH"
      __param__jobName: metrics
      __param__scheme: http
      __scheme__: http

Thanks, Jet

jet-go • Jun 27 '22 10:06

> Hello, I'm currently trying to set up shards with high availability, i.e. running multiple replicas of each shard. I'm not sure whether this is already possible. Could you please let me know whether it can be achieved? I tried deploying 2 StatefulSets with the same config/coordinator, but the target distribution was not the same across the pods of the two StatefulSets.

I also want to set up shards with high availability. For now, I run two coordinators, each with a different shard.selector, to get HA.

u-kyou • Jul 06 '22 12:07