congqixia
Do we have a coredump file for this run? @elstic
The pod was killed by SIGTERM due to a health check failure:

```
{"metadata":{"name":"perf-standalone95800-5-80-8957-milvus-standalone-6fc69f7bckqdsm.17a6e1a22dbbdd13","namespace":"qa-milvus","uid":"c0e649ac-930e-4ffc-ab15-4cc00e34a83b","resourceVersion":"555389320","creationTimestamp":"2024-01-03T15:59:55Z","managedFields":[{"manager":"kubelet","operation":"Update","apiVersion":"v1","time":"2024-01-03T15:59:55Z","fieldsType":"FieldsV1","fieldsV1":{"f:count":{},"f:firstTimestamp":{},"f:involvedObject":{},"f:lastTimestamp":{},"f:message":{},"f:reason":{},"f:source":{"f:component":{},"f:host":{}},"f:type":{}}}]},"involvedObject":{"kind":"Pod","namespace":"qa-milvus","name":"perf-standalone95800-5-80-8957-milvus-standalone-6fc69f7bckqdsm","uid":"f62b8a09-bf84-49bd-bf72-25f8e671fc9b","apiVersion":"v1","resourceVersion":"555373580","fieldPath":"spec.containers{standalone}"},"reason":"Unhealthy","message":"Liveness probe failed: HTTP probe failed with statuscode: 500","source":{"component":"kubelet","host":"4am-node33"},"firstTimestamp":"2024-01-03T15:59:55Z","lastTimestamp":"2024-01-03T15:59:55Z","count":1,"type":"Warning","eventTime":null,"reportingComponent":"","reportingInstance":""}
```

It looks like a known issue for healthz...
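For anyone triaging similar runs, here is a rough sketch of how the relevant fields can be pulled out of a kubelet event like the one above. It only uses Python's standard `json` module; the helper names `summarize_event` and `is_liveness_failure` are made up for illustration, and `SAMPLE_EVENT` is a trimmed copy of the event from this run, not the full payload.

```python
import json

# Trimmed copy of the event above (only the fields used below are kept).
SAMPLE_EVENT = '''
{"involvedObject":{"kind":"Pod","namespace":"qa-milvus",
 "name":"perf-standalone95800-5-80-8957-milvus-standalone-6fc69f7bckqdsm",
 "fieldPath":"spec.containers{standalone}"},
 "reason":"Unhealthy",
 "message":"Liveness probe failed: HTTP probe failed with statuscode: 500",
 "source":{"component":"kubelet","host":"4am-node33"},
 "lastTimestamp":"2024-01-03T15:59:55Z","count":1,"type":"Warning"}
'''


def summarize_event(raw: str) -> dict:
    """Hypothetical helper: reduce a Kubernetes event JSON to the triage fields."""
    event = json.loads(raw)
    involved = event.get("involvedObject", {})
    return {
        "pod": involved.get("name"),
        "container": involved.get("fieldPath"),
        "reason": event.get("reason"),
        "message": event.get("message"),
        "node": event.get("source", {}).get("host"),
        "count": event.get("count"),
        "last_seen": event.get("lastTimestamp"),
    }


def is_liveness_failure(summary: dict) -> bool:
    """Hypothetical check: True when the event reports a failed liveness probe."""
    message = summary.get("message") or ""
    return summary.get("reason") == "Unhealthy" and message.startswith(
        "Liveness probe failed"
    )


summary = summarize_event(SAMPLE_EVENT)
print(summary["pod"], summary["node"], is_liveness_failure(summary))
```

This only classifies the event; it says nothing about why healthz returned 500, which still needs the standalone logs.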
BTW, the standalone was building an index during the period when the healthz check was failing.
> > do we have any coredump file for this run? @elstic
>
> No. It's not on by default. Do you need coredump?

@elstic since it's caused by SIGTERM,...
Found lots of "request resource failed" errors, which may trigger the connection reset issue as well. Fixed in the latest 2.3: PR #29061
Since the log is incomplete, I cannot be sure of the root cause of segment `446490888338773121` going missing. It could be a known issue fixed by #29344.
I will check the tsafe lag problem
The latest data flow shall be 