kr521
I deleted all the data and rebuilt the machine-IDs for all nodes, then restarted all services. It takes about 1 hour from startup for the service to encounter a...
I cleared Prometheus' data and modified the startup parameters to add --query.max-samples=50000000. After running for about 40 minutes, a 422 error started to appear. `1bl5lkk5|rate(container_kafka_requests_duration_seconds_total_bucket[$RANGE])|1719885105|422 Unprocessable Entity`
The error was likely caused by the short data retention period in Prometheus. I increased it with `--storage.tsdb.retention.time=1d`, and the error disappeared. I will keep monitoring it for a while.
> I guess the error is related to the query.max-samples limit in Prometheus. Could you please try to increase it?
>
> ```
> --query.max-samples=50000000
> Maximum number of samples...
> ```
> it happened for me several times, every time to fix it I need to delete coroot/prometheus PVC (not sure if the latter is needed), I have retention period set...
[new issue](https://github.com/coroot/coroot-node-agent/issues/103)
> @kr521, `--query.max-samples=50000000` is the default value, so you changed nothing.

Changed to `--query.max-samples=5000000000`.

> @def keep increasing it; however, it depends on the size of the cluster. It would be better if coroot could handle errors in Prometheus queries and at least...
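
For anyone debugging the same 422, it may help to look at the reason Prometheus returns with it. The Prometheus HTTP API sends a JSON body with `errorType` and `error` fields on failed queries, so a client can report that instead of only the status code. Below is a minimal Python sketch of that idea. It is not how coroot handles errors; the function names `describe_prometheus_error` and `run_query`, and the base URL you pass in, are hypothetical and only for illustration.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def describe_prometheus_error(status_code: int, body: str) -> str:
    """Turn a failed Prometheus API response into a readable message.

    Hypothetical helper: falls back to the raw body if it is not the
    JSON error object that the Prometheus HTTP API normally returns.
    """
    try:
        payload = json.loads(body)
    except ValueError:
        return f"HTTP {status_code}: {body[:200]}"
    if not isinstance(payload, dict):
        return f"HTTP {status_code}: {body[:200]}"
    error_type = payload.get("errorType", "unknown")
    error = payload.get("error", "")
    return f"HTTP {status_code} ({error_type}): {error}"


def run_query(base_url: str, query: str) -> dict:
    """Run an instant query; raise RuntimeError with the server's reason on failure.

    base_url is an assumption, e.g. the address of your Prometheus service.
    """
    url = f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": query})
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # 422 means the expression could not be executed; the body says why.
        body = err.read().decode("utf-8", errors="replace")
        raise RuntimeError(describe_prometheus_error(err.code, body)) from err
```

If the reason mentions too many samples, `--query.max-samples` is the relevant limit; any other reason points elsewhere.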