How to use a Cinder block volume as the root volume on Openstack with Kops?
/kind bug
1. What kops version are you running? The command kops version, will display
this information.
Client version: 1.27.0 (git-v1.27.0)
2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.
Client Version: v1.28.0 Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
3. What cloud provider are you using?
Openstack 2023.1 (kolla-ansible)
4. What commands did you run? What is the simplest way to reproduce this issue?
kops create cluster \
--cloud openstack \
--name dust.k8s.local \
--state ${KOPS_STATE_STORE} \
--zones nova \
--network-cidr 10.0.0.0/24 \
--image debian-10-amd64 \
--master-count=3 \
--node-count=5 \
--node-size k1.worker \
--master-size k1.master \
--api-loadbalancer-type public \
--etcd-storage-type __DEFAULT__ \
--topology private \
--bastion \
--ssh-public-key ~/.ssh/id_rsa.pub \
--networking calico \
--os-octavia=true \
--os-ext-net public
kops update cluster --name dust.k8s.local --yes --admin
5. What happened after the commands executed?
Hello, I am running an Openstack all-in-one (AIO) deployment with the Cinder LVM backend as the default storage for volumes.
When deploying a cluster, I run into this error during the last steps (creation of the VMs):
W0822 03:19:56.109181 359260 executor.go:139] error running task "Instance/nodes-nova-5-dust-k8s-local" (1m57s remaining to succeed): Error creating instance: unable to create server: {500 2023-08-22 01:19:55 +0000 UTC No valid host was found. There are not enough hosts available.}
W0822 03:19:56.109218 359260 executor.go:139] error running task "Instance/nodes-nova-3-dust-k8s-local" (1m57s remaining to succeed): Error creating instance: unable to create server: {500 2023-08-22 01:19:54 +0000 UTC No valid host was found. There are not enough hosts available.}
W0822 03:19:56.109233 359260 executor.go:139] error running task "Instance/nodes-nova-4-dust-k8s-local" (1m57s remaining to succeed): Error creating instance: unable to create server: {500 2023-08-22 01:19:55 +0000 UTC No valid host was found. There are not enough hosts available.}
W0822 03:19:56.109247 359260 executor.go:139] error running task "Instance/nodes-nova-2-dust-k8s-local" (1m57s remaining to succeed): Error creating instance: unable to create server: {500 2023-08-22 01:19:54 +0000 UTC No valid host was found. There are not enough hosts available.}
I searched for the root cause and figured out that Openstack refuses to schedule the instances because it is running low on disk space after creating a few of them.
I can create tens of instances using Terraform or the Horizon WebUI, but not with Kops: for some reason it provisions the VMs on the hypervisor's local storage (which is limited to a few tens of GB, as it is not really meant for my use case) instead of my Cinder LVM storage.
My guess is that I'm missing an argument to change this behaviour. This setting looks promising, but I can't find any documentation for it:
cloudConfig:
  openstack:
    blockStorage:
Could also be related to my Openstack setup.
6. What did you expect to happen?
Kops not to use the hypervisor local storage for VM data.
7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: "2023-08-22T01:48:51Z"
  name: dust.k8s.local
spec:
  api:
    loadBalancer:
      type: Public
  authorization:
    rbac: {}
  channel: stable
  cloudConfig:
    openstack:
      blockStorage:
        bs-version: v3
        clusterName: dust.k8s.local
        ignore-volume-az: false
      loadbalancer:
        floatingNetwork: public
        floatingNetworkID: 36d62804-9ed9-44fa-97d3-b27f67fdbd5b
        method: ROUND_ROBIN
        provider: octavia
        useOctavia: true
      monitor:
        delay: 15s
        maxRetries: 3
        timeout: 10s
      router:
        externalNetwork: public
  cloudControllerManager:
    clusterName: dust.k8s.local
  cloudProvider: openstack
  configBase: swift://dust-k8s-kops-state/dust.k8s.local
  etcdClusters:
  - cpuRequest: 200m
    etcdMembers:
    - instanceGroup: control-plane-nova-1
      name: etcd-1
      volumeType: __DEFAULT__
    - instanceGroup: control-plane-nova-2
      name: etcd-2
      volumeType: __DEFAULT__
    - instanceGroup: control-plane-nova-3
      name: etcd-3
      volumeType: __DEFAULT__
    manager:
      backupRetentionDays: 90
    memoryRequest: 100Mi
    name: main
  - cpuRequest: 100m
    etcdMembers:
    - instanceGroup: control-plane-nova-1
      name: etcd-1
      volumeType: __DEFAULT__
    - instanceGroup: control-plane-nova-2
      name: etcd-2
      volumeType: __DEFAULT__
    - instanceGroup: control-plane-nova-3
      name: etcd-3
      volumeType: __DEFAULT__
    manager:
      backupRetentionDays: 90
    memoryRequest: 100Mi
    name: events
  iam:
    allowContainerRegistry: true
    legacy: false
  kubelet:
    anonymousAuth: false
  kubernetesApiAccess:
  - 0.0.0.0/0
  - ::/0
  kubernetesVersion: 1.27.4
  networkCIDR: 10.0.0.0/24
  networking:
    calico: {}
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - 0.0.0.0/0
  - ::/0
  subnets:
  - cidr: 10.0.0.32/27
    name: nova
    type: Private
    zone: nova
  - cidr: 10.0.0.0/30
    name: utility-nova
    type: Utility
    zone: nova
  topology:
    dns:
      type: Private
    masters: private
    nodes: private
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2023-08-22T01:48:51Z"
  labels:
    kops.k8s.io/cluster: dust.k8s.local
  name: bastions
spec:
  image: debian-10-amd64
  machineType: m1.small
  maxSize: 1
  minSize: 1
  role: Bastion
  subnets:
  - nova
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2023-08-22T01:48:51Z"
  labels:
    kops.k8s.io/cluster: dust.k8s.local
  name: control-plane-nova-1
spec:
  image: debian-10-amd64
  machineType: k1.master
  maxSize: 1
  minSize: 1
  role: Master
  rootVolumeSize: 1
  subnets:
  - nova
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2023-08-22T01:48:51Z"
  labels:
    kops.k8s.io/cluster: dust.k8s.local
  name: control-plane-nova-2
spec:
  image: debian-10-amd64
  machineType: k1.master
  maxSize: 1
  minSize: 1
  role: Master
  subnets:
  - nova
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2023-08-22T01:48:51Z"
  labels:
    kops.k8s.io/cluster: dust.k8s.local
  name: control-plane-nova-3
spec:
  image: debian-10-amd64
  machineType: k1.master
  maxSize: 1
  minSize: 1
  role: Master
  subnets:
  - nova
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2023-08-22T01:48:51Z"
  labels:
    kops.k8s.io/cluster: dust.k8s.local
  name: nodes-nova
spec:
  image: debian-10-amd64
  machineType: k1.worker
  maxSize: 5
  minSize: 5
  role: Node
  subnets:
  - nova
8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.
9. Anything else do we need to know?
I'll try to check how Kops creates VMs on Openstack tomorrow and compare it to how they are created on Terraform / Horizon. I think it'll help.
Thanks!
Comparing the Kops requests with Terraform's in Fiddler, I noticed a field, "os:scheduler_hints", that was not present in the Terraform requests.
Removing this field with breakpoints allowed me to provision all the machines with no problems. I assume this flag is set by Kops to put the instance in a specific instance group?
Anyway, I think the issue mostly comes from an Openstack misconfiguration on my end, but is there a way to remove this flag with a command-line argument so I can use Kops while I figure out what's wrong?
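To illustrate what that field does, here is a rough Go sketch (gophercloud v1, the Openstack SDK Kops builds on; untested and not Kops' actual code) of how a server-group scheduler hint is normally attached to a create-server request. The flavor, image and network UUIDs are the ones from this thread; the server-group UUID is a placeholder. If that server group carries an anti-affinity policy, a single-host AIO could only schedule one instance per group, which would line up with the "No valid host was found" errors above, but that is just a guess on my part.

// Illustrative only: attach an "os:scheduler_hints" server-group reference the
// way it is usually done with gophercloud. The server-group UUID is a placeholder.
package main

import (
	"fmt"

	"github.com/gophercloud/gophercloud"
	"github.com/gophercloud/gophercloud/openstack"
	"github.com/gophercloud/gophercloud/openstack/compute/v2/extensions/schedulerhints"
	"github.com/gophercloud/gophercloud/openstack/compute/v2/servers"
)

func main() {
	// Authenticate from the usual OS_* environment variables.
	authOpts, err := openstack.AuthOptionsFromEnv()
	if err != nil {
		panic(err)
	}
	provider, err := openstack.AuthenticatedClient(authOpts)
	if err != nil {
		panic(err)
	}
	compute, err := openstack.NewComputeV2(provider, gophercloud.EndpointOpts{})
	if err != nil {
		panic(err)
	}

	base := servers.CreateOpts{
		Name:      "nodes-nova-1",
		FlavorRef: "07ed0b5e-1daa-47b8-882e-c2e2d1a6cdac", // flavor UUID from this thread
		ImageRef:  "e2330d75-9703-492d-89f8-0d2b186cb959", // image UUID from this thread
		Networks:  []servers.Network{{UUID: "ae7f60ff-6e97-4e1a-93e7-97807737f63f"}},
	}

	// Wrapping the base options adds "os:scheduler_hints" to the request body,
	// pointing at an existing Nova server group (placeholder UUID).
	withHints := schedulerhints.CreateOptsExt{
		CreateOptsBuilder: base,
		SchedulerHints:    schedulerhints.SchedulerHints{Group: "00000000-0000-0000-0000-000000000000"},
	}

	server, err := servers.Create(compute, withHints).Extract()
	if err != nil {
		panic(err)
	}
	fmt.Println("created server", server.ID)
}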
OK, I figured out why provisioning VMs directly from Horizon wasn't increasing my hypervisor's local storage usage.
By default, Kops and Terraform create the root disk from the image directly and store it on the hypervisor's local storage:
{
  "server": {
    "availability_zone": "nova",
    "config_drive": false,
    "flavorRef": "07ed0b5e-1daa-47b8-882e-c2e2d1a6cdac",
    "imageRef": "e2330d75-9703-492d-89f8-0d2b186cb959",
    "key_name": "test-keypair",
    "name": "basic",
    "networks": [
      {
        "uuid": "ae7f60ff-6e97-4e1a-93e7-97807737f63f"
      }
    ],
    "security_groups": [
      {
        "name": "default"
      }
    ],
    "user_data": ""
  }
}
On the other hand, Horizon creates a new volume from the image UUID and boots the VM from it, so it never touches the precious hypervisor local storage (note the block_device_mapping_v2 field):
{
  "availability_zone": "nova",
  "config_drive": false,
  "user_data": "",
  "default_user_data": "",
  "disk_config": "AUTO",
  "instance_count": 1,
  "name": "test5",
  "scheduler_hints": {},
  "security_groups": [
    "691575f1-84bc-4081aea7b6ebe01980e2"
  ],
  "create_volume_default": true,
  "hide_create_volume": false,
  "source_id": null,
  "block_device_mapping_v2": [
    {
      "source_type": "image",
      "destination_type": "volume",
      "delete_on_termination": false,
      "uuid": "e2330d75-9703-492d-89f8-0d2b186cb959",
      "boot_index": "0",
      "volume_size": 20
    }
  ],
  "flavor_id": "07ed0b5e-1daa-47b8-882e-c2e2d1a6cdac",
  "nics": [
    {
      "net-id": "ae7f60ff-6e97-4e1a-93e7-97807737f63f",
      "v4-fixed-ip": ""
    }
  ],
  "key_name": "test-keypair"
}
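For what it's worth, here is the rough gophercloud (v1) equivalent of that request as a Go sketch (untested, purely illustrative): the image UUID goes into the block-device mapping instead of imageRef, so Nova builds the root disk as a new Cinder volume rather than on the hypervisor's local disk. Only the UUIDs are taken from the requests above; the helper itself is made up.

// Illustrative only (untested): create a server whose root disk is a new Cinder
// volume built from an image, mirroring Horizon's "block_device_mapping_v2" request.
package main

import (
	"github.com/gophercloud/gophercloud"
	"github.com/gophercloud/gophercloud/openstack/compute/v2/extensions/bootfromvolume"
	"github.com/gophercloud/gophercloud/openstack/compute/v2/servers"
)

func createBootFromVolumeServer(compute *gophercloud.ServiceClient) (*servers.Server, error) {
	base := servers.CreateOpts{
		Name:      "test5",
		FlavorRef: "07ed0b5e-1daa-47b8-882e-c2e2d1a6cdac", // flavor UUID from this thread
		// ImageRef is deliberately left empty: the image is referenced in the
		// block-device mapping below instead.
		Networks: []servers.Network{{UUID: "ae7f60ff-6e97-4e1a-93e7-97807737f63f"}},
	}

	// Wrapping the base options adds "block_device_mapping_v2" to the request body.
	opts := bootfromvolume.CreateOptsExt{
		CreateOptsBuilder: base,
		BlockDevice: []bootfromvolume.BlockDevice{{
			BootIndex:           0,
			UUID:                "e2330d75-9703-492d-89f8-0d2b186cb959", // image UUID
			SourceType:          bootfromvolume.SourceImage,
			DestinationType:     bootfromvolume.DestinationVolume,
			VolumeSize:          20, // GB
			DeleteOnTermination: true,
		}},
	}

	return servers.Create(compute, opts).Extract()
}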
Therefore the issue is not coming from Kops directly, but from my cluster instance configuration. I just need to figure out from the documentation how to get Kops to create instances with a volume mapping on Openstack (I did it once on AWS, so it shouldn't be that different).
Here's what a Terraform script that lets me use a Cinder-backed block device as the root volume looks like:
resource "openstack_compute_instance_v2" "test-server" {
availability_zone = "nova"
name = "basic"
flavor_id = "07ed0b5e-1daa-47b8-882e-c2e2d1a6cdac"
key_pair = "test-keypair"
security_groups = ["default"]
block_device {
uuid = "e2330d75-9703-492d-89f8-0d2b186cb959" // My Image UUID (debian)
source_type = "image"
destination_type = "volume"
boot_index = 0
volume_size = 20
delete_on_termination = true
}
network {
name = "test"
}
count = 2
}
I can't seem to find a way to modify InstanceRootVolumeSpec to achieve the same thing. Is this even possible in Kops, or should I open a feature request?
/kind support
(Moving this to kind/support, as it's not a bug but a question; I don't know how to remove the kind/bug tag.)
@zetaab Any ideas?
You need to define annotations on your instance groups: https://github.com/kubernetes/kops/blob/a913d3c0dba757653761ee8d2f0b16bedab0d34a/pkg/model/openstackmodel/servergroup.go#L119-L125
For instance:
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  annotations:
    openstack.kops.io/osVolumeBoot: "true"
    openstack.kops.io/osVolumeSize: "10"
  name: nodes
spec:
  ...
Hi, thanks for the response!
Unfortunately I had already tried that, but I may have an idea why it still uses the hypervisor's local storage: sniffing the requests Kops makes with Fiddler, I noticed that while it did provision a boot volume on Cinder correctly, it still sent the imageRef argument, which seems to confuse Openstack.
Trying to replicate this, it seems that Terraform does not send the imageRef field even when it is set manually, which creates the instances with no problem and without using the hypervisor's local storage.
This feature was added in https://github.com/kubernetes/kops/pull/7652. However, I have never used it because I have no need for it, so I cannot say whether it works correctly or not.
I mean, the feature does work correctly: it provisions a volume and boots from it, but it still puts another volume on the local storage (probably because of that imageRef field). A quick temporary fix would be to remove the imageRef field manually to match Terraform's behaviour and see if it works. I can try that on my end.
After removing the imageRef field with this Fiddler script:
if (oSession.HTTPMethodIs("POST") && oSession.PathAndQuery.StartsWith("/v2.1/servers") && oSession.HostnameIs("192.168.1.190:8774")) {
    FiddlerObject.log("Detecting server creation request, setting imageRef to empty...");
    oSession.utilDecodeRequest();
    var requestBody = System.Text.Encoding.UTF8.GetString(oSession.requestBodyBytes);
    if (!requestBody.Contains("bastions")) {
        requestBody = requestBody.Replace("\"imageRef\":\"e2330d75-9703-492d-89f8-0d2b186cb959\"", "\"imageRef\":\"\"");
        FiddlerObject.log(requestBody);
        oSession.utilSetRequestBody(requestBody);
    }
}
Kops managed to create all the VMs without problems and without using the hypervisor's local storage.
I think Kops' default behaviour should be: if openstack.kops.io/osVolumeBoot is set to "true", the imageRef field should be left blank (for Openstack, that is).
I'd be more than happy to dig into the Openstack API documentation to confirm that this behaviour is intended, and to open a PR to fix it in Kops. Even though I'm not really familiar with Go, it should only need an additional condition around here in instance.go.
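To make the idea concrete, here is the rough shape of the condition I have in mind, written as a standalone Go sketch with gophercloud types (not actual Kops code; the helper name and parameters are made up for illustration):

// Hypothetical helper sketching the proposed behaviour: when boot-from-volume is
// requested, reference the image only in the block-device mapping and leave
// imageRef empty, matching what Horizon/Terraform send.
package main

import (
	"github.com/gophercloud/gophercloud/openstack/compute/v2/extensions/bootfromvolume"
	"github.com/gophercloud/gophercloud/openstack/compute/v2/servers"
)

func buildCreateOpts(base servers.CreateOpts, imageID string, bootFromVolume bool, volumeSizeGB int) servers.CreateOptsBuilder {
	if !bootFromVolume {
		// Classic behaviour: Nova stages the image on the hypervisor's local disk.
		base.ImageRef = imageID
		return base
	}

	// Boot-from-volume: leave ImageRef empty so Nova only builds the root disk
	// as a new Cinder volume (per the behaviour observed above).
	base.ImageRef = ""
	return bootfromvolume.CreateOptsExt{
		CreateOptsBuilder: base,
		BlockDevice: []bootfromvolume.BlockDevice{{
			BootIndex:           0,
			UUID:                imageID,
			SourceType:          bootfromvolume.SourceImage,
			DestinationType:     bootfromvolume.DestinationVolume,
			VolumeSize:          volumeSizeGB,
			DeleteOnTermination: true,
		}},
	}
}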
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.