spec icon indicating copy to clipboard operation
spec copied to clipboard

[WIP] Relax offline criteria

Open gnufied opened this issue 4 years ago • 11 comments

Also overload meaning of readonly expansion on node.

xref https://github.com/container-storage-interface/spec/issues/423

gnufied avatar Apr 29 '20 12:04 gnufied

I have been thinking some more about this and I think that - removing/deprecating plugin level online/offline capability is fine.

Pros:

  1. Since no CO actually uses this capability, removing the capability while is not strictly backward compatible change but it will not actually break any plugin (even older versions!). For example, a plugin compiled with 1.2.0 may still advertise those capabilities but CO which uses 1.3.0( or even 1.2.0) does not simply uses it.
  2. Another problem with these capabilities is - they enforce a ordering of RPC calls via capabilities. There are other CSI RPC calls that have similar problems for example - can you attach a volume that is being deleted?, can you delete a volume that is attached?, can you snapshot a volume that is being expanded? Having these capabilities for volume expansion causes snowflakes which is hard to maintain.

Cons:

  1. A biggest con I think is communication around whether we support offline expansion at all or not. This will become more implicit (still supported though).
  2. Plugins will have to delete the capability code when they compile themselves against 1.3.0 (or whichever version removes this capability).

For me I think biggest pro is - not having to enforce ordering via capabilities. We have learnt that, this does not work well and is hard to implement correctly. cc @saad-ali @xing-yang @msau42 @jsafrane

gnufied avatar Apr 29 '20 19:04 gnufied

Deprecating the capability until the next major release would allow us to avoid making a breaking change, I think. Plugins using the value could continue to advertise the capability, and COs could continue to ignore it. We'll discuss it more in the call next week, but I would vote for deprecation over removal unless we have some other reason we need to cut a 2.0 release.

julian-hj avatar Apr 29 '20 21:04 julian-hj

Offline volume expansion is a released feature that is already implemented by some CSI drivers and used by customers. It also has well-documented procedure in both Kubernetes and CSI spec on how to achieve that. Yes, there are issues, but it is still a useful feature. For some CSI drivers, this is the only way for users to expand an existing volume.

I also think that we should separate the issue of controller offline expansion vs node offline expansion. According to the current CSI spec, offline expansion refers to controller offline expansion only. Node offline expansion (a way for a plugin to opt-in to not call NodeExpandVolume automatically when volume is in-use by a running workload) is a new feature that has not been addressed. In stead of removing the offline capability altogether, I think we should keep the existing offline expansion capability for controller expansion, and introduce a new alpha capability for Node offline expansion.

If there is a strong reason to absolutely remove this capability, then I agree with Julian that we need to have a deprecation period. Isn’t the deprecation period usually 2 releases? I don’t think we should remove a released capability without a deprecation period.

I also think that we should come up with a replacement plan first before deprecating an already released capability. Can we discuss how this new alpha capability would work first? Once we can agree upon a new alpha capability, then we can introduce the new alpha capability and in the same time deprecate the existing capability.

xing-yang avatar Apr 30 '20 00:04 xing-yang

Just to clarify - we are not removing offline capability. Volumes can be still expanded offline but what we are talking about removing/deprecating is, ability for a plugin to declare that it only supports offline expansion.

Plugins can still only support offline expansion and in which case they would report Volume in-use error and CO should respect that and not retry and until volume has been unpublished.

gnufied avatar Apr 30 '20 01:04 gnufied

After the ControllerExpandVolume had been called and expansion successful, how often the NodeExpandVolume will be called automatically when volume is in-use by a running workload? @gnufied @xing-yang

zhucan avatar Apr 30 '20 02:04 zhucan

Just to clarify - we are not removing offline capability. Volumes can be still expanded offline but what we are talking about removing/deprecating is, ability for a plugin to declare that it only supports offline expansion.

Plugins can still only support offline expansion and in which case they would report Volume in-use error and CO should respect that and not retry and until volume has been unpublished.

Removing the ability for plugin to declare that it only supports offline expansion makes it look like offline expansion support is no longer official. I don't see how the spec can be clarified so that a plugin cannot declare that it only supports offline expansion but somehow offline expansion is still supported. I think in that case, it is better not to make any changes to the existing CSI spec. I think removing it will add confusion and will not make things better.

xing-yang avatar Apr 30 '20 03:04 xing-yang

We had another meeting to discuss this today.

There are two areas we want fix: "offline controller expand" and "offline node expand".

For "offline controller expand" the following options were discussed:

  1. Don't change anything in CSI, and do "best effort" on Kubernetes side.
    • This violates the spec and may cause confusion amongst SPs who depend on this functionality.
  2. Change wording to say "offline" capability may not always be respected, and do "best effort" on Kubernetes side.
    • Why have offline capability at all?
  3. Deprecate "online/offline" capability BUT allow for the offline use case via the "Volume In Use" error code being returned by the plugin and Kubernetes handling that correctly.
    • Technically means CSI 1.x would break backwards compatibility.
  4. Cut CSI 2.0 and reintroduce as volume expansion as alpha
    • Very heavy weight option.

Folks on the call seem to be leaning towards option 3 above.

For "offline node expand" the following options were discussed:

  1. Do nothing, the existing says that VolumeInUse error should be returned if "online node expand" is not supported.
    • Confusion around how Kubernetes handles this code today and current Kubernetes behavior around NodeStageVolume (NodeExpand always called after NodeStage but before NodePublish - but that doesn't help plugins that don't implement NodeStage).
  2. Decouple NodeExpand from NodeStage. If there is a pending expand, and workload is started always call NodeExpand immediately, let SP return VolumeInUse if not supported, and always call NodeExpand before NodePublish.
    • Some storage systems may support only ONLINE or only OFFLINE.
    • Would unblock offline only volumes, but may cause issues for online only volumes.

Folks on the call seem to be leaning towards option 1 above, but I'm not so sure.

We want to make sure that this does not indefinitely block us from cutting a CSI 1.3 with VolumeHealth.

Next steps:

  1. Follow up meeting to 1) get @jdef's input on proposal for "offline controller expand", 2) agree on wording if we agree to deprecate, 3) discuss and come to conclusion on "offline node expand", and 4) get @cduchesne's input on "offline node expand" since he has a driver that depends on it.

  2. Finalize any changes required to CSI for volume expansion by May 22. CSI release will happen week of May 25-29 with our without volume expansion changes. This will allow VolumeHealth to be implemented in Kubernetes using the CSI 1.3 release from June 1-25 for Kubernetes 1.19 release.

saad-ali avatar May 07 '20 18:05 saad-ali

Spoke with @cduchesne offline on slack and their use case is to support expansion on node, after NodeStage but before NodePublish. This is already supported and we discussed if a driver does not support calling NodeExpandVolume after NodePublish it could return volume-in-use error that already exists.

Having said that - he has a concern that since readonly flag present in pod.Spec.Volumes is not available inside NodeStageVolume, the CSI driver at that point does not know if volume is readonly. I think that is a valid general concern but is kinda out of scope for this design.

gnufied avatar May 11 '20 15:05 gnufied

Meeting notes from today's meeting

For "offline controller expand"

  • Questions raised about going with option 4 above (cutting CSI 2.0 and potentially even calling volume expansion alpha).
    • Doing so would mean adoption would be delayed by potentially many quarters, and require potentially non-trivial amount of additional work on CO and SP side.
  • Questions raised about about option 3 above (Deprecate "online/offline" capability)
    • Even if controller offline is implemented via error codes (instead of capability) there is an existing problem that has not been addressed: usability. User must terminate workload, wait for the volume expansion to complete, before starting the workload again -- very manual process. The only benefit between this and having an explicit capability is this approach will discourage SPs from using offline
    • Xing raised the point that some SPs still want this behavior, so we can't eliminate it altogether.
    • Maybe we should document this usability issue in kubernetes-csi and elsewhere

For "offline node expand"

  • Chris says that unless NodeStage has a readonly field, NodeExpand call will never work for his case (and he's worked around that by doing a FS expansion during NodePublish.
  • Debate about when a CO should call NodeExpand (first time after NodeStage before NodePublish, only after NodePublish, etc.)

Next steps:

  • Follow up meeting Wed May 20 11 AM PT

saad-ali avatar May 14 '20 19:05 saad-ali

For "offline controller expand"

  • Discussion:
    • Cutting a CSI 2.0 would be the right way to handle this. But getting CSI 2.0 out and implementing it in COs and SPs will likely take a year or more, further delaying adoption of volume expansion.
    • That said, we should probably start talking about CSI 2.0 on the community calls and all the things we would like to improve now given the long lead time.
  • Conclusion:
    • Deprecate ONLINE/OFFLINE capabilities in 1.3.

For "offline node expand"

  • Discussion:
    • Plugins that support NodeStage but don't support node expand of mounted volumes (@cduchesne's use case) will have to continue to not implement NodeExpandVolume and instead hack NodePublishVolume or NodeStageVolume.
  • Conclusion:
    • Add "Volume in use" error code to allow offline node expand and readonly case (#431)
    • I'm not super confident about this, but none of the SPs on the call objected.

Next steps:

  • For "offline controller expand":
    • [ ] Add comments in spec indicating ONLINE and OFFLINE are deprecated and that CO will not look at those capabilities.
    • [ ] Update comments in spec to indicate what the default behavior is, and that if someone wants controller OFFLINE capability, they should use "volume in use" error code.
    • [ ] Mark deprecated capabilities in spec with protobuf deprecated annotation
  • For "offline node expand":
    • [ ] Add "Volume in use" error code in spec to NodeExpandVolumeResponse allow offline node expand and readonly case (#431)
    • [ ] Add language in spec clarifying offline NodeExpandVolume for both COs and SPs.
  • The above must be completed by Friday, so the CSI release can be cut next week.
    • [ ] Cut a CSI 1.3 release with release note indicating "Breaking changes/Deprecations"
    • [ ] Incorporate changes (e.g. k8s logic for volume in use) in to Kubernetes v1.19 with a "known issues" release note.

saad-ali avatar May 20 '20 21:05 saad-ali

To follow up on action items:

  1. readonly and NodeExpandVolume changes are in - https://github.com/container-storage-interface/spec/pull/431
  2. online and offline plugin capabilities deprecated - https://github.com/container-storage-interface/spec/pull/429

Currently both builds are failing for unrelated reason - https://github.com/container-storage-interface/spec/issues/434

gnufied avatar May 22 '20 15:05 gnufied