cluster-api-provider-openstack `additionalBlockDevices` should consider the flavor's `OS-FLV-EXT-DATA:ephemeral` property

/kind feature

Describe the solution you'd like

(Heads up: this is speculative based on code inspection. I have not yet attempted to reproduce the scenarios described here to prove things work the way I've said. I will though...unless someone beats me to it)

`additionalBlockDevices` is expected to represent all non-root block devices attached to the instance. As of #1692, we can indicate the type of a block device by setting its `type` property, which allows us to create Nova-provisioned (a.k.a. "ephemeral") disks as well as the Cinder-provisioned disks previously supported. However, there are two nuances with Nova-provisioned disks that I don't think are accounted for.

Firstly, while you can create multiple Nova-provisioned disks, the total capacity of those disks cannot exceed the GB value given in the `OS-FLV-EXT-DATA:ephemeral` property of the flavor. This will presumably cause non-obvious reconciler issues if a user tries to create e.g. two `local` type BDMs of 10GB each but the flavor has `OS-FLV-EXT-DATA:ephemeral = 15`.

Secondly, if no block device mapping is given but your flavor specifies a non-zero value for `OS-FLV-EXT-DATA:ephemeral`, you will still get a single ephemeral disk sized according to that property. This won't cause any reconciliation issues, but will lead to a situation where k8s' view of the server resource diverges from OpenStack's, which sounds like a Bad Thing :tm:
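To make the two nuances concrete, here is a minimal Python sketch of the behavior as described above. It is a simplified model for illustration, not Nova's or CAPO's actual code: the `BlockDevice` class, the `EphemeralSizeExceeded` exception, the `effective_local_disks` function and the `"Local"`/`"Volume"` type strings are all hypothetical names, and the behavior itself is (as noted) still unverified.

```python
from dataclasses import dataclass


@dataclass
class BlockDevice:
    """Hypothetical stand-in for one additionalBlockDevices entry."""
    name: str
    size_gib: int
    storage_type: str  # assumed values: "Local" (Nova) or "Volume" (Cinder)


class EphemeralSizeExceeded(Exception):
    """Raised when requested local disks exceed the flavor's ephemeral size."""


def effective_local_disks(flavor_ephemeral_gib, devices):
    """Return the sizes of the ephemeral disks the server would end up with.

    Models the two nuances from the issue text:
    1. the sum of requested local disks must not exceed the flavor's
       OS-FLV-EXT-DATA:ephemeral value;
    2. with no local disks requested, a non-zero flavor value still
       yields one implicit disk of that size.
    """
    requested = [d.size_gib for d in devices if d.storage_type == "Local"]
    if not requested:
        # Nuance 2: an implicit disk appears that the spec never mentioned.
        return [flavor_ephemeral_gib] if flavor_ephemeral_gib > 0 else []
    total = sum(requested)
    if total > flavor_ephemeral_gib:
        # Nuance 1: the request cannot be satisfied by this flavor.
        raise EphemeralSizeExceeded(
            f"requested {total} GiB of local disk, "
            f"flavor allows {flavor_ephemeral_gib} GiB"
        )
    return requested
```

With the example from the text, two 10GB local devices against a flavor with `OS-FLV-EXT-DATA:ephemeral = 15` raise `EphemeralSizeExceeded`, while an empty device list against the same flavor returns `[15]`, i.e. a disk that the spec never declared.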

Anything else you would like to add:

I recently wrote a blog post about block devices in OpenStack to jot down my own understanding of BDMs. It may or may not be helpful. In any case, it can be found here: https://that.guru/blog/block-devices-in-openstack/. Nova's own docs on the matter are spread out but the most important pieces (IMO) can be found here and here.

Aug 29 '24 15:08 stephenfin

Awesome input @stephenfin, thanks for looking into it.

I see two things here:

When additional block devices are provided, we should compute their total size and make sure the OpenStackMachine flavor can support that size in OS-FLV-EXT-DATA:ephemeral. If it can't, return an error before creating the Machine.
When no additional block devices are provided, we should check that the flavor doesn't have a value greater than 0 in OS-FLV-EXT-DATA:ephemeral. Maybe send a warning rather than an error, since it won't cause reconciliation issues?
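The two checks could be sketched roughly as follows. This is a hedged Python illustration rather than a proposed implementation for the provider: `validate_ephemeral` and its `strict` flag are hypothetical, and the sketch assumes the caller has already looked up the flavor's ephemeral size and filtered the block devices down to the Nova-provisioned (local) ones.

```python
def validate_ephemeral(flavor_ephemeral_gib, local_sizes_gib, strict=False):
    """Return (errors, warnings) for the two cases discussed above.

    flavor_ephemeral_gib: the flavor's OS-FLV-EXT-DATA:ephemeral value.
    local_sizes_gib: sizes of the requested Nova-provisioned disks.
    strict: hypothetical switch that turns the "implicit disk" case
            into an error instead of a warning.
    """
    errors, warnings = [], []
    total = sum(local_sizes_gib)

    if local_sizes_gib and total > flavor_ephemeral_gib:
        # Case 1: requested local disks do not fit in the flavor.
        errors.append(
            f"local block devices total {total} GiB but the flavor "
            f"only allows {flavor_ephemeral_gib} GiB of ephemeral disk"
        )
    elif not local_sizes_gib and flavor_ephemeral_gib > 0:
        # Case 2: no local disks requested, but the flavor will still
        # produce one, so the spec and the server would diverge.
        message = (
            f"flavor defines {flavor_ephemeral_gib} GiB of ephemeral disk "
            "that is not declared in additionalBlockDevices"
        )
        (errors if strict else warnings).append(message)

    return errors, warnings
```

For instance, `validate_ephemeral(15, [10, 10])` reports one error, while `validate_ephemeral(15, [])` reports one warning by default and one error when called with `strict=True`.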

Aug 29 '24 15:08 EmilienM

Mostly agreed on both. I'd probably lean towards an error for the latter case also due to the aforementioned divergence between k8s'/OS' view of the world, but I can see both sides of the argument.

Aug 29 '24 15:08 stephenfin

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Nov 27 '24 15:11 k8s-triage-robot

/remove-lifecycle stale

Nov 27 '24 16:11 EmilienM

/lifecycle stale

Feb 25 '25 16:02 k8s-triage-robot

/remove-lifecycle stale

Feb 25 '25 16:02 stephenfin

/lifecycle stale

May 26 '25 17:05 k8s-triage-robot

/remove-lifecycle stale

May 26 '25 21:05 stephenfin

/lifecycle stale

Aug 24 '25 22:08 k8s-triage-robot

/remove-lifecycle stale

Aug 25 '25 19:08 stephenfin
