API to validate config before saving
Provide an API that runs amtool check-config. This would help validate the config before saving or reloading it.
You only mean to check the general syntax, correct? Does this program solve your problem?
package main

import (
	"fmt"
	"os"

	"github.com/alecthomas/kingpin/v2"
	"github.com/prometheus/alertmanager/config"
)

func main() {
	configFile := kingpin.Flag("config.file", "Alertmanager configuration file name.").Default("alertmanager.yml").String()
	kingpin.Parse()

	// LoadFile parses the file and runs Alertmanager's own config validation.
	_, err := config.LoadFile(*configFile)
	if err != nil {
		fmt.Fprintf(os.Stderr, "ERROR: %s\n", err)
		// 78 is EX_CONFIG (configuration error) in sysexits.h.
		os.Exit(78)
	}
	fmt.Println("OK.")
}
Hi, no, not just the general syntax. We want exactly the same validations that amtool performs.
If an API were available where I pass a YAML string and it runs the equivalent of "amtool check-config" on that string, it would be very helpful for us.
Ok, so this means you want to take the existing amtool check-config command and expose it as an endpoint?
@SoloJacobs Yes, that would be great. A check before committing changes to Alertmanager would be helpful.
What's wrong with a pipeline step which runs amtool check-config in your CI/CD pipeline that ships the config change?
We deploy Alertmanager with a Helm chart and update the ConfigMap. Alertmanager checks and rejects the config, but upon restart it goes into CrashLoopBackOff because of the faulty config in the ConfigMap.
The API we requested would let us validate this config before updating the ConfigMap.
I also use Helm and the Prometheus Operator. My deployment pipeline has a stage, before the config is applied, that renders the config and runs amtool check-config. The result is then used in a later stage to deploy the k8s secret with that config, which is consumed by the Alertmanager resource.
In any case, you need to run some command to do the validations. Why is posting your config to an HTTP API for validation better or easier than calling amtool?
We have a separate service that manages the config. We have to fail early and show the error when a user tries to save the config. Since Alertmanager runs as a different service, we need an API to validate the config before pushing it to the ConfigMap.
To be honest, the issue is still not entirely clear to me. If I read this correctly, the service you use to manage the ConfigMap would call an Alertmanager endpoint to verify the configuration. But what is special about this service that prevents it from using amtool?
Ideally, I would need some kind of minimal setup so I can play around with the problem on my machine. But before you invest the effort: I currently feel that I have to prioritize some other issues, and therefore can't promise that this feature will ever be completed.
Say I have a Java or Python service: how would we run amtool? Are you suggesting that we bundle amtool with the service and run the tool from there?
We have exposed an API through which you can update the Alertmanager config.
We do this by directly updating the Kubernetes ConfigMap at runtime. Once the ConfigMap is updated, amtool picks up those changes and runs validation on them. If validation fails and Alertmanager restarts, the pod goes into CrashLoopBackOff.
To avoid this, we have currently bundled the amtool binary and execute the validation commands using ProcessBuilder in Java.
If Alertmanager exposed an API for amtool validation, we could skip the ProcessBuilder step, which is susceptible to timeouts and other command-line issues.
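For readers following along, the bundled-binary approach with an explicit timeout could look roughly like the sketch below. It is written in Go to match the program earlier in the thread, not in Java, and it is only an illustration: `checkConfig`, `defaultTimeout`, and the 10-second limit are made-up names and values, and the only assumption about amtool is that `amtool check-config <file>` exits non-zero when the config is rejected.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os"
	"os/exec"
	"time"
)

// defaultTimeout is an assumed upper bound for one validation run.
const defaultTimeout = 10 * time.Second

// checkConfig (hypothetical helper) runs "<amtoolPath> check-config <configPath>".
// It returns an error if the binary cannot be started, exits non-zero,
// or does not finish within the timeout.
func checkConfig(amtoolPath, configPath string, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	// CommandContext kills the child process once ctx expires.
	cmd := exec.CommandContext(ctx, amtoolPath, "check-config", configPath)
	out, err := cmd.CombinedOutput()
	if errors.Is(ctx.Err(), context.DeadlineExceeded) {
		return fmt.Errorf("amtool timed out after %s", timeout)
	}
	if err != nil {
		// Include amtool's own output so the caller can show it to the user.
		return fmt.Errorf("amtool check-config failed: %w\n%s", err, out)
	}
	return nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: checkcfg <config-file>")
		return
	}
	// "amtool" is resolved via PATH here; a bundled binary would use its full path.
	if err := checkConfig("amtool", os.Args[1], defaultTimeout); err != nil {
		fmt.Fprintln(os.Stderr, "ERROR:", err)
		os.Exit(1)
	}
	fmt.Println("OK.")
}
```

The timeout and the error from a missing binary are reported separately from a rejected config, so the calling service can tell "the config is bad" apart from "the check itself could not run".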
I'm sorry to say that this feature does not fit well with the project as a whole, and thus won't be implemented. This decision was made during the bug scrub and it was unanimous.
Here are some suggestions, which I would explore if I was in your position:
- Alertmanager offers a reload option
curl -X POST http://<alertmanager-host>:<port>/-/reload
This could make it easier to detect whether a faulty configuration was deployed.
- Fork alertmanager or amtool and extend it with the functionality you need.
- I still think packaging amtool with your python or java service is the correct way to approach this.
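As a rough illustration of the reload suggestion, a service could trigger the reload itself and inspect the response. This is a minimal sketch under stated assumptions: `reload` and `fakeServer` are hypothetical helpers, the fake server only stands in for Alertmanager so the example runs on its own, and the sketch assumes that a non-2xx status means the reload was rejected.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// reload (hypothetical helper) POSTs to <baseURL>/-/reload and treats any
// non-2xx response as a failed reload. That interpretation is an assumption.
func reload(baseURL string) error {
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Post(strings.TrimRight(baseURL, "/")+"/-/reload", "", nil)
	if err != nil {
		return fmt.Errorf("reload request failed: %w", err)
	}
	defer resp.Body.Close()

	// Read a bounded amount of the body so the error message stays short.
	body, _ := io.ReadAll(io.LimitReader(resp.Body, 4096))
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("reload rejected (HTTP %d): %s",
			resp.StatusCode, strings.TrimSpace(string(body)))
	}
	return nil
}

// fakeServer is a stand-in for Alertmanager used only in this demonstration.
// It answers every request with the given status and message.
func fakeServer(status int, msg string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(status)
		io.WriteString(w, msg)
	}))
}

func main() {
	good := fakeServer(http.StatusOK, "")
	defer good.Close()
	bad := fakeServer(http.StatusInternalServerError, "failed to reload config")
	defer bad.Close()

	fmt.Println("good server:", reload(good.URL))
	fmt.Println("bad server:", reload(bad.URL))
}
```

Note that this only detects a faulty configuration after it has been written to the place Alertmanager reads from, so it complements a pre-save check rather than replacing one.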
Kind regards
@SoloJacobs We have tried all the available solutions! I understand your concern, but this use case would really help our project. Can I try raising a pull request? If you would be willing to accept it, I will try to solve this.
As I already mentioned, I did not make this decision by myself. You would have to convince a number of people. The implementation is not the main concern:
You have to give an explanation of why the avenues I have provided are not viable. This explanation needs to make sense to an outsider of your company (like me).
That being said, providing an implementation will certainly help your chances overall. Just be aware that it is still likely to be declined.
Yeah, I understand the concerns that come with a big project like Alertmanager. I will try my best to raise a pull request for this use case as soon as possible, and I hope you will pick it up in one of your upcoming releases.
I will try to raise a pull request within a month. Can you please leave the issue open until then?
You can close the issue if there is no progress after a month.