webhook-example

failed with: OpenAPI spec does not exist

Open deyaeddin opened this issue 4 years ago • 13 comments

Hi, I'm trying to implement the Hetzner DNS API based on this sample. Everything is working as expected and the webhook issues the certificates normally; however, I'm getting this log error constantly:

controller.go:129] OpenAPI AggregationController: action for item v1alpha1.acme.mycompany.com: Rate Limited Requeue.
controller.go:116] loading OpenAPI spec for "v1alpha1.acme.mycompany.com" failed with: OpenAPI spec does not exist

Any idea how to fix it?

deyaeddin avatar Jun 03 '21 11:06 deyaeddin

I have got the same error with the freenom DNS resolver.

Still no idea on how to fix it.

andreee94 avatar Aug 16 '21 10:08 andreee94

I have the same issue with Cert Manager Webhook for Dynu

Any idea?

rbaumgar avatar Feb 17 '22 14:02 rbaumgar

controller.go:116] loading OpenAPI spec for "v1alpha1.acme.mycompany.com" failed with: OpenAPI spec does not exist

v1alpha1.acme.mycompany.com is the default API GroupVersion in this sample webhook repository. The error is raised when a new apiserver config is created in Kubernetes.

Perhaps the group name in chart values has not been changed https://github.com/cert-manager/webhook-example/blob/master/deploy/example-webhook/values.yaml#L9 ?

If the webhook repository does not seem to reference v1alpha1.acme.mycompany.com anywhere but the error is still present, it would be good if someone could add instructions on how to reproduce it.
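
As a rough illustration of the group-name check suggested above, here is a minimal Go sketch that warns when the webhook is still running with the sample group. It assumes the webhook reads its group from a `GROUP_NAME` environment variable set by the chart; `checkGroupName` and `sampleGroupName` are hypothetical names used only for this example, not part of cert-manager.

```go
package main

import (
	"fmt"
	"os"
)

// sampleGroupName is the placeholder group that appears in the log line
// quoted in this thread (v1alpha1.acme.mycompany.com).
const sampleGroupName = "acme.mycompany.com"

// checkGroupName returns an error if the group name is empty or is still
// the sample placeholder. Hypothetical helper, for illustration only.
func checkGroupName(name string) error {
	switch name {
	case "":
		return fmt.Errorf("GROUP_NAME must be specified")
	case sampleGroupName:
		return fmt.Errorf("GROUP_NAME is still the sample default %q", sampleGroupName)
	}
	return nil
}

func main() {
	// Assumption: the deployment passes the chart's group name to the
	// webhook through the GROUP_NAME environment variable.
	if err := checkGroupName(os.Getenv("GROUP_NAME")); err != nil {
		fmt.Fprintln(os.Stderr, "warning:", err)
	}
}
```

Note that a later comment in this thread reports the error even with an explicit group name, so a passing check only rules out this one cause.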

irbekrm avatar Jan 20 '23 15:01 irbekrm

Looking at the referenced cert manager webhooks (dynu, freenom, and others on GitHub) and my own, it looks like the error messages are present even if we update the group name to something explicit.

It seems another project implements the openapi spec https://github.com/kubernetes-sigs/prometheus-adapter/pull/335

May need to implement this: https://github.com/kubernetes/kube-openapi

atsai1220 avatar Jan 20 '23 17:01 atsai1220

I got the same problem with PowerDNS webhook, still no fix?

vdobes avatar Mar 29 '23 11:03 vdobes

Update: Ignore this, my issue is different, seems to be a problem with using too new of a version of k8s.io/api

I hit what appears to be a variation of this on my attempt at a custom webhook for Porkbun

cert-manager: error executing command" err="error installing APIGroup for solvers: unable to get openapi models: OpenAPIV3 config must not be nil

I dug into the webhook code and nothing obvious stuck out to me.

bcspragu avatar Aug 22 '23 15:08 bcspragu

@bcspragu if you are still interested / haven't solved your problem yet, consider this: I ran into the same error message and as it turns out, the problem was the go version used in the Dockerfile, i.e. the go version used for building the hook.

I overcame the problem by looking at other working hooks, realizing the most recent version any of them used was:

FROM golang:1.19-alpine AS build_deps

If you set your Dockerfile to use v1.19, your problem will likely disappear. To make my build work, I referenced the webhook for google-domains, merging their go.mod and go.sum files with the one go v1.21 created for my own project, simply adding the additional lines required for my project to their files.

irreleph4nt avatar Sep 16 '23 08:09 irreleph4nt

Thanks for the tip! In my case, I don't remember the exact resolution, but I think I fixed my issue by just not upgrading k8s.io/api (i.e. reverting most of my go.mod changes). That Porkbun webhook now works great for my use case, and I'm running it on Go 1.21.

bcspragu avatar Sep 17 '23 03:09 bcspragu

This issue should now be solved.

  • compile the webhook against cert-manager 1.13.0
  • run the webhook with cert-manager 1.13.0

I haven't seen a single error across my cluster for the last 24 hours.

Note that upgrading cert-manager to 1.13.0 alone isn't enough. The webhook also needs to be compiled against cert-manager 1.13.0.

Here is an excerpt of my go.mod

go 1.20

require (
	github.com/cert-manager/cert-manager v1.13.0
	github.com/ovh/go-ovh v1.4.2
	k8s.io/api v0.28.1
	k8s.io/apiextensions-apiserver v0.28.1
	k8s.io/apimachinery v0.28.1
	k8s.io/client-go v0.28.1
)

aureq avatar Sep 18 '23 04:09 aureq

Whilst I can confirm building against v1.13 works when also deploying with the same version, the information provided is incomplete. More development has to be done for the hook to actually work:

  • between v1.11 and v1.13, cert-manager moved the test/acme/dns folder contents out of the repo and back in. They messed up, however, and omitted the /dns/ folder when moving it back in, which requires an update to the test suite in main_test.go
  • the controller-runtime in v1.13 has a bug, making it complain log.SetLogger(...) has never been called, requiring adjustments in the solver code
  • v1.13 is significantly more aggressive when triggering requests in the extended test suite. This got so out of control that a rework of a solver was required to adjust for API rate limits. The same solver works perfectly fine against the same API when compiled with v1.11

irreleph4nt avatar Sep 22 '23 23:09 irreleph4nt

@irreleph4nt I'm not part of the cert-manager/jetstack team but what you describe sounds like separate issues (though connected perhaps). Have you considered raising separate issues and linking them to this one?

On your 2nd point, do you have more details and some code to propose?

aureq avatar Sep 23 '23 01:09 aureq

I really appreciate you all commenting here and discussing this! We (the maintainer team) are stretched pretty thin at the moment and it looks like the webhook-example has passed us by a bit. I've put out a call to action to hopefully get someone to take a look at this repo soon!

Thank you all for being part of the community, appreciate you all! ❤️

SgtCoDFish avatar Nov 16 '23 09:11 SgtCoDFish

Likewise @SgtCoDFish we all also very much appreciate all the work you and the entire team are doing.

aureq avatar Nov 16 '23 10:11 aureq