--watch-progress-notify-interval doesn't work through JSON config
Bug report criteria
- [x] This bug report is not security related, security issues should be disclosed privately via [email protected].
- [x] This is not a support request or question, support requests or questions should be raised in the etcd discussion forums.
- [x] You have read the etcd bug reporting guidelines.
- [x] Existing open issues along with etcd frequently asked questions have been checked and this is not a duplicate.
What happened?
I got this error from etcd when setting "watch-progress-notify-interval": "1m", in a JSON config. I assume that passing it as a flag from the commandline does work.
[etcdmain/etcd.go:66] failed to verify flags
error unmarshaling JSON: while decoding JSON: json: cannot unmarshal string into Go struct field configYAML.watch-progress-notify-interval of type time.Duration
I suspect this is because time.Duration doesn't have an UnmarshalJSON method, so there's nothing doing the duration parsing. It can be given as an integer, but then it's interpreted as nanoseconds.
Both watch-progress-notify-interval and experimental-watch-progress-notify-interval exhibit this behavior.
What did you expect to happen?
It to work the same as passing --watch-progress-notify-interval=1m.
How can we reproduce it (as minimally and precisely as possible)?
Add "experimental-watch-progress-notify-interval": "1m", to JSON.
Etcd version (please run commands below)
etcd Version: 3.6.1
Git SHA: a4708be
Go Version: go1.23.10
Go OS/Arch: linux/amd64
I was able to reproduce it for other configuration options, that are durations - basically it fails if the configuration contains anything that is not integer (mostly it matters for duration strings
What is interesting in cli it works conversely - it fails with integer, but works with duration string
@Jille So the issue is caused by the fact, that time.Duration is fancy way to present int64 - therefore it can be unmarshalled as integer in case it is provided from yaml configuration - if it is provided from cli as flag it is processed by flag.DurationVar, which is fancy and more useful way of presenting the time.ParseDuration which accepts strings.
This is not consistent behaviour and good luck using nanoseconds in your yaml config :).
What is interesting in cli it works conversely - it fails with integer, but works with duration string
@BBQing, can you provide an example or a reference to this issue?
From seeing #20360, we have many time.Durations, not only this one. And changing the behaviour for this one will make it inconsistent for others.
@ivanvc if I run etcd with this args:
./bin/etcd --watch-progress-notify-interval 5m
it runs (it spits lots of logs).
but if I create config file (from provided config sample) and I put in the config file watch-progress-notify-interval: 5m
I receive this log:
vscode ➜ /workspaces/etcd (main) $ ./bin/etcd --config-file ./files/config,yaml {"level":"info","ts":"2025-07-25T20:20:10.986152Z","caller":"etcdmain/etcd.go:63","msg":"Running: ","args":["./bin/etcd","--config-file","./files/config,yaml"]} {"level":"warn","ts":"2025-07-25T20:20:10.986374Z","caller":"etcdmain/etcd.go:65","msg":"failed to verify flags","error":"error unmarshaling JSON: while decoding JSON: json: cannot unmarshal string into Go struct field configYAML.Config.watch-progress-notify-interval of type time.Duration"}
but with watch-progress-notify-interval: 5000_000_000 in config file
it runs well again.
I would expect same behavior for cmd args and for config file - here it would mean to use duration string. But at the moment it works with duration string from cmd args and number from config file.
Thanks for the explanation, @BBQing. The issue is that this is not the only duration that I saw in your pull request. So, perhaps we should address the rest of them as well? We should also ensure that it's backward compatible. Otherwise, I would lean towards documenting the behavior.
Thanks for the explanation, @BBQing. The issue is that this is not the only duration that I saw in your pull request. So, perhaps we should address the rest of them as well? We should also ensure that it's backward compatible. Otherwise, I would lean towards documenting the behavior.
The implementation I have proposed, keeps the numbers from config file, so anybody using it, should not feel any difference. I would not document the numbers in config file - because I have a feeling, it works by accident and not by intent. And I would create some feature request for version 4.0, that would forbid to use numbers in config file (it is breaking change). Of course it would be accompanied deprecation warnings.
@BBQing, I see your point. Thanks for the clarification. Additionally, upon reviewing the pull request, it appears to be backward compatible.
For now, let's keep it as you proposed, and let's not consider further changes to the schema for 4.0.
#20429 This issue is to some extent similar, even though it seems the ConfigFromFile path of embed.Config is not unmarshaling and not setting the rafthttp.timeouts
There were discussions on this topic in both etcd and golang communities. Some people proposed to resolve this in golang stdlib; however, It seems that golang eventually rejected the change due to being a breaking change.
In etcd, I propose that we only update document to clarify that users should configure an integer (nanoseconds) for time.Duration fields in config file rather than implementing a customized time.Duration type. Eventually I'd still expect golang to fix it in stdlib.
In etcd,
- https://github.com/etcd-io/etcd/issues/13480
- https://github.com/ahrtr/etcd-issues/tree/master/issues/13480
In golang,
- https://github.com/golang/go/issues/10275
- https://go-review.googlesource.com/c/go/+/24473
@ahrtr thank you for the insights and references - let's hope the eventually won't be another 10 years
I have tried to make the change minimal, so that is stays within configFromFile so as a result most of etcd won't be notice the change. So once the json unmarshalling default is set within stdlib it won't be painful to refactor the code to stdlib (at that time it will be easily detected and refactored by capable agentic developers). The only thing that is ugly is the overhead of six or seven JSON suffixed fields in the config struct +2 for the rafthttp timeouts
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.