
calico: CRD missing for IPAMConfig

Open hakoerber opened this issue 2 years ago • 15 comments

Environment:

  • "Bare metal" (A Hetzner VM)

  • OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):

Linux 3.10.0-1160.49.1.el7.x86_64 x86_64
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
  • Version of Ansible (ansible --version):
ansible 2.9.14
  • Version of Python (python --version):
3.10.4

Kubespray version (commit) (git rev-parse --short HEAD):

1f65e6d3 (tag v2.19.0)

Network plugin used: Calico


My ansible run, updating to v2.19.0, failed with the following message:

TASK [network_plugin/calico : Calico | Create ipamconfig resources] *******************************************************************************************
fatal: [cloud01]: FAILED! => {"changed": false, "msg": "error running kubectl (/usr/local/bin/kubectl apply --force --filename=/etc/kubernetes/calico-ipamconfig.yml) command (rc=1), out='', err='error: unable to recognize \"/etc/kubernetes/calico-ipamconfig.yml\": no matches for kind \"IPAMConfig\" in version \"crd.projectcalico.org/v1\"\n'"}

I ran the command manually on the host and got the same output:

[root@cloud01 bin]# /usr/local/bin/kubectl apply --filename=/etc/kubernetes/calico-ipamconfig.yml
error: unable to recognize "/etc/kubernetes/calico-ipamconfig.yml": no matches for kind "IPAMConfig" in version "crd.projectcalico.org/v1"

The manifest looks like this:

apiVersion: crd.projectcalico.org/v1
kind: IPAMConfig
metadata:
  name: default
spec:
  autoAllocateBlocks: True
  strictAffinity: False
  maxBlocksPerHost: 0

It looks to me like the CRD for IPAMConfig is missing. I cannot find a place in kubespray where it should be applied to the cluster.
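
One way to confirm whether the API server knows the kind at all is to ask for the CRD object directly. The following is only a sketch: it assumes kubectl is on PATH (or that KUBECTL points at it) and is configured for the affected cluster, and `crd_exists` is a helper name made up for illustration.

```shell
# KUBECTL can be overridden, e.g. KUBECTL=/usr/local/bin/kubectl
KUBECTL="${KUBECTL:-kubectl}"

# crd_exists NAME: succeeds if a CustomResourceDefinition called NAME
# is registered in the cluster (hypothetical helper, not part of kubespray).
crd_exists() {
  "$KUBECTL" get crd "$1" >/dev/null 2>&1
}

# Example:
#   crd_exists ipamconfigs.crd.projectcalico.org && echo present || echo missing
```

If this reports the CRD as missing, `kubectl apply` has nothing to match `kind: IPAMConfig` against, which would be consistent with the error above.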

I couldn't find any information about the CRD via google (there is very little information about calico's CRDs out there).

Can someone advise?

Relates to #8839 (the change that added the manifest)

hakoerber avatar Jun 02 '22 23:06 hakoerber

/cc @liupeng0518

@hakoerber which version of calico are you using?

cristicalin avatar Jun 05 '22 19:06 cristicalin

I too am blocked with this - I notice that in k8_cluster calico_ipam_host_local: true makes no difference

TomMcAvoy avatar Jun 06 '22 13:06 TomMcAvoy

Could you share some details about your setup? See my question above, about the calico version in particular. We don't see this error in CI, and I have not observed it on any production system.

cristicalin avatar Jun 06 '22 14:06 cristicalin

Same error here, upgrading from 2.18.1 to 2.19.0 on a bare-metal cluster.
So that's calico v3.20.3 => v3.22.3 (default versions of each kubespray tag).

Nothing special on this cluster I think except that I had to follow https://github.com/kubernetes-sigs/kubespray/pull/8434 while upgrading otherwise it failed saying inventory was not in line with current cluster configuration.
So I had to specify

calico_ipip_mode: Always
calico_network_backend: bird
calico_vxlan_mode: Never

olevitt avatar Jun 08 '22 10:06 olevitt

Checking CRDs available on the cluster and related to calico :

gon@laboitemagique:~$ kubectl get crd | grep calico
bgpconfigurations.crd.projectcalico.org               2020-10-22T09:47:27Z
bgppeers.crd.projectcalico.org                        2020-10-22T09:47:27Z
blockaffinities.crd.projectcalico.org                 2020-10-22T09:47:27Z
clusterinformations.crd.projectcalico.org             2020-10-22T09:47:27Z
felixconfigurations.crd.projectcalico.org             2020-10-22T09:47:27Z
globalnetworkpolicies.crd.projectcalico.org           2020-10-22T09:47:27Z
globalnetworksets.crd.projectcalico.org               2020-10-22T09:47:27Z
hostendpoints.crd.projectcalico.org                   2020-10-22T09:47:27Z
ipamblocks.crd.projectcalico.org                      2020-10-22T09:47:27Z
ipamconfigs.crd.projectcalico.org                     2020-10-22T09:47:27Z
ipamhandles.crd.projectcalico.org                     2020-10-22T09:47:27Z
ippools.crd.projectcalico.org                         2020-10-22T09:47:27Z
kubecontrollersconfigurations.crd.projectcalico.org   2020-10-22T09:47:27Z
networkpolicies.crd.projectcalico.org                 2020-10-22T09:47:27Z
networksets.crd.projectcalico.org                     2020-10-22T09:47:27Z

So those CRDs date back to the initial install of the cluster.
The IPAMConfig CRD does not contain the maxBlocksPerHost property:

gon@laboitemagique:~$ kubectl get crd ipamconfigs.crd.projectcalico.org -o yaml
...
 properties:
              autoAllocateBlocks:
                type: boolean
              strictAffinity:
                type: boolean
...

So maybe the CRD should be updated somewhere in the upgrade process?
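
To repeat this check on another cluster, the installed CRD can be searched for the property name. A rough sketch: `crd_has_property` is a hypothetical helper that only does a plain-text match on the YAML, it does not walk the OpenAPI schema.

```shell
# crd_has_property NAME: reads CRD YAML on stdin and succeeds if a key
# called NAME appears at the start of any line (after indentation).
crd_has_property() {
  grep -Eq "^[[:space:]]*$1:"
}

# Example:
#   kubectl get crd ipamconfigs.crd.projectcalico.org -o yaml \
#     | crd_has_property maxBlocksPerHost || echo "field not in installed CRD"
```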

olevitt avatar Jun 08 '22 10:06 olevitt

A workaround that seems to work for us is commenting out the maxBlocksPerHost directive in roles/network_plugin/calico/templates/calico-ipamconfig.yml.j2, as it's not a required field according to https://doc.crds.dev/github.com/kubernetes-sigs/cluster-api-provider-azure/crd.projectcalico.org/IPAMConfig/[email protected] (could not find any better source for the definition).
EDIT : this link is better for official CRD : https://github.com/projectcalico/calico/blob/v3.22.3/libcalico-go/config/crd/crd.projectcalico.org_ipamconfigs.yaml
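
The commenting-out step can be scripted. This is a sketch only: it assumes the directive sits on a line of its own in the template and that the command is run from the root of a kubespray checkout; `comment_out_key` is a made-up helper name.

```shell
# comment_out_key KEY FILE: prefix every line that sets KEY with "# ",
# keeping the original indentation. Goes through a temp file instead of
# "sed -i" so it behaves the same with GNU and BSD sed.
comment_out_key() {
  key=$1
  file=$2
  tmp=$(mktemp) || return 1
  sed "s/^\([[:space:]]*\)\($key:.*\)\$/\1# \2/" "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return $status
}

# Example:
#   comment_out_key maxBlocksPerHost \
#     roles/network_plugin/calico/templates/calico-ipamconfig.yml.j2
```

Note that this edits the role in place, so the change would be lost on the next checkout of a different kubespray tag.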

I think finding a way to upgrade the CRDs somewhere in the upgrade process would be a better solution

olevitt avatar Jun 08 '22 10:06 olevitt

Also of note, we are running calico in etcd datastore mode (calico_datastore: "etcd").
From what we have seen reading the source code of Kubespray, calico CRDs installation / upgrade may be tied to kdd (see https://github.com/kubernetes-sigs/kubespray/blob/master/roles/download/defaults/main.yml#L1437-L1452 ).
That may explain why CRDs are not installed / upgraded, leading to this error
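
To see which datastore an inventory explicitly selects, the variable can be searched for. A sketch, where the inventory path in the example is an assumption and `find_datastore_setting` is a hypothetical helper; an empty result would mean the value comes from somewhere else, such as a role default.

```shell
# find_datastore_setting DIR: print every uncommented calico_datastore
# assignment found under DIR, prefixed with file name and line number.
find_datastore_setting() {
  grep -rnE '^[[:space:]]*calico_datastore:' "$1"
}

# Example:
#   find_datastore_setting inventory/mycluster/group_vars
```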

olevitt avatar Jun 08 '22 14:06 olevitt

When calico is running in etcd datastore mode, it seems there is no need to install the CRD resources. Maybe this can be fixed by the following:

- name: Calico | Create ipamconfig resources
  kube:
    kubectl: "{{ bin_dir }}/kubectl"
    filename: "{{ kube_config_dir }}/calico-ipamconfig.yml"
    state: "latest"
  when:
    - inventory_hostname == groups['kube_control_plane'][0]
    - calico_datastore == "kdd"  # add this line

https://github.com/kubernetes-sigs/kubespray/blob/57c3aa4560f31b95cf098ede4a89d0327607a503/roles/network_plugin/calico/tasks/install.yml#L586

I'm not sure, WDYT?

cyclinder avatar Jun 09 '22 01:06 cyclinder

Yes, adding the line calico_datastore == "kdd" worked in etcd datastore mode. Can we assume calico is working properly that way?

joceluss avatar Jun 10 '22 13:06 joceluss

My understanding is that this line will prevent applying the resource in the case of non-kdd datastore. Isn't that resource needed even in non-kdd datastore ? If so, this line will work for upgrades (as they will keep the existing IPAMConfig resource) but will break new installs (as they won't have this resource at all).

olevitt avatar Jun 10 '22 14:06 olevitt

Calico can run in non-Kubernetes environments in etcd datastore mode, so it does not require CRD resources; all data is stored in etcd. https://projectcalico.docs.tigera.io/getting-started/kubernetes/hardway/the-calico-datastore

cyclinder avatar Jun 13 '22 06:06 cyclinder

Thanks for the explanations!

joceluss avatar Jun 13 '22 20:06 joceluss

Is this not the issue? https://github.com/projectcalico/calico/issues/5950

EDIT: nevermind! jumped to conclusions because I'm hitting this trying to deploy a fresh cluster in 2.19 using etcd backend

DomHoney avatar Jun 28 '22 19:06 DomHoney

Being unfamiliar with the codebase, I did a naive search through /tests, which suggests there is only test coverage using calico_datastore: etcd for Canal. If I'm correct, is it worth adding a new test case for Calico? Personally we use Calico this way to simplify deployment and initial scaling, and ideally would like to continue deployments using etcd directly, assuming it's still supported by Calico.

DomHoney avatar Jun 28 '22 20:06 DomHoney

I encountered the same problem. However, my case is different. I have installed and I have been upgrading my cluster with Kubespray since version 2.13. Now I attempted to upgrade it to 2.19 and stumbled upon this problem.

I realized that the Calico backend has been changed from etcd to kdd by default around version 2.15. Calico upgraded just fine until 2.18 and surprisingly Kubespray did not migrate it to kdd. I just realized that I have kdd in settings and it won't install the CRDs. Then I changed it back to etcd and it would still fail with:

no matches for kind "IPAMConfig" in version "crd.projectcalico.org/v1"

I added the line as given by https://github.com/kubernetes-sigs/kubespray/issues/8917#issuecomment-1150588144. I also had to change defaults related to bird as given by https://github.com/kubernetes-sigs/kubespray/issues/8917#issuecomment-1149737374. Let's see how it goes. (38 minutes later) It worked.

zzvara avatar Jul 30 '22 15:07 zzvara

@oomichi looks like PR https://github.com/kubernetes-sigs/kubespray/pull/8839 forces the use of CRD's, which aren't supported when Calico is installed with etcd as backend. Am I correct in concluding that ipamconfig wouldn't be applied / supported anymore with this PR merged?

Jeroen0494 avatar Aug 24 '22 07:08 Jeroen0494

@oomichi looks like PR #8839 forces the use of CRD's, which aren't supported when Calico is installed with etcd as backend. Am I correct in concluding that ipamconfig wouldn't be applied / supported anymore with this PR merged?

You are correct. It seems #8839 broke calico_datastore: etcd and removed ipamconfig support for etcd mode since there's no longer feature parity for ipamconfig between both calico_datastore options.

To fix builds, I will submit a PR with the ipamconfig tasks disabled when calico_datastore == etcd. If someone wants to restore ipamconfig support for calico_datastore: etcd please submit a follow up PR for that.

chadswen avatar Aug 29 '22 16:08 chadswen

@chadswen Thank you for the fix! Will this PR get backported to Kubespray 2.19? Right now this release is still broken for me.

Jeroen0494 avatar Aug 30 '22 07:08 Jeroen0494

It could be 😄

floryut avatar Aug 30 '22 07:08 floryut

It could be 😄

I haven't signed the CLA, would you be so kind?

Jeroen0494 avatar Aug 30 '22 08:08 Jeroen0494

@chadswen Thank you for the fix! Will this PR get backported to Kubespray 2.19? Right now this release is still broken for me.

@Jeroen0494 release-2.19 backport PR created: #9234

chadswen avatar Aug 30 '22 15:08 chadswen