Joseph Zhang

Results: 13 comments by Joseph Zhang

Hi, is there any plan to update the rapidyaml version in jsonnet to address this issue? I'm not sure why it wasn't fixed in the v0.20.0 cycle.

Yeah, it's a good idea to try to bring the stream and consumer back in sync without losing the state. Since in my case the leader (nats-0) had sequence number...

I think the impact of this issue is concerning. Would be great if it can be addressed. I'm happy to provide all the deployment setups for troubleshooting. Though I'm not...

I did more testing in a test environment with the same NATS setup and found that this issue isn't related to the NATS version upgrade. A rolling restart of the NATS StatefulSet could trigger...

The graph above shows the `nats_stream_last_seq` of my test stream. In theory, the three lines representing the three replicas should stay close together. But we can see that sometimes one or two lines...
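The same check can be made numerically instead of by eye. Below is a minimal sketch, assuming the per-replica `nats_stream_last_seq` values have already been collected into a dictionary. The function name `lagging_replicas`, the `tolerance` parameter, and the sample numbers are hypothetical and only illustrate the idea; they are not taken from the real deployment.

```python
from typing import Dict, List


def lagging_replicas(last_seq: Dict[str, int], tolerance: int = 0) -> List[str]:
    """Return the replicas whose last sequence trails the highest one.

    last_seq maps a replica name to its reported last stream sequence.
    A replica is flagged when it is more than `tolerance` behind the
    highest sequence seen across all replicas.
    """
    if not last_seq:
        return []
    highest = max(last_seq.values())
    return sorted(
        name for name, seq in last_seq.items() if highest - seq > tolerance
    )


# Hypothetical sample: two replicas agree closely, one has fallen far behind.
samples = {"nats-0": 1500, "nats-1": 1498, "nats-2": 12}
print(lagging_replicas(samples, tolerance=10))
```

A small non-zero tolerance avoids flagging a replica that is only a few messages behind while replication is still catching up.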

We are using the latest NATS Helm chart (`1.1.6`), which [provides](https://github.com/nats-io/k8s/blob/main/helm/charts/nats/files/stateful-set/nats-container.yaml#L38-L49) this startupProbe by default:

```
startupProbe:
  failureThreshold: 90
  httpGet:
    path: /healthz
    port: monitor
    scheme: HTTP
  initialDelaySeconds: 10
  periodSeconds: 10
...
```

The `successThreshold` of a `startupProbe` can't be set to anything other than 1. Instead, I added a 30s preStop sleep before NATS enters lame duck mode so that there's a 30s gap between...

Indeed, and nice theory about `publishNotReadyAddresses`. In my setup, although we are using the latest Helm chart, some clients are connecting through the headless service, so if an early...

I just moved all NATS clients to the ClusterIP service (except NATS itself, which uses the headless service for discovery, as expected). Unfortunately, the issue is still reproducible: after a rolling...

@derekcollison @wallyqs: here is the `-DV` log of a pod that had its stream sequence number reset after a restart. I grepped the log by the RAFT group ID of...