[Clone from Snapshot] CloneCommand never sent
What steps did you take and what happened: When creating a PVC as a clone from a snapshot of another PVC, the clone command never gets sent.
ZFS command log:
2024-05-19.17:46:29 zfs create -o quota=274877906944 -o recordsize=64k -o dedup=off -o compression=zstd-6 -o encryption=off -o mountpoint=legacy NVME/app-pvc/pvc-b44a3c8a-4cae-4d63-aed1-fafe853f8e8b
2024-05-19.17:46:29 zfs create -o quota=1073741824 -o recordsize=64k -o dedup=off -o compression=zstd-6 -o encryption=off -o mountpoint=legacy NVME/app-pvc/pvc-176eff42-3b84-4ce6-8fe9-a06192320869
2024-05-19.17:46:52 zfs snapshot NVME/app-pvc/pvc-b44a3c8a-4cae-4d63-aed1-fafe853f8e8b@snapshot-08240c13-7fcb-4425-838d-79eacfc27407
2024-05-19.18:57:04 zfs clone NVME/app-pvc/pvc-b44a3c8a-4cae-4d63-aed1-fafe853f8e8b@snapshot-08240c13-7fcb-4425-838d-79eacfc27407 NVME/app-pvc/testclone
Note "testclone", which was run manually; the log never showed any attempt at running the clone command, successful or otherwise.
What did you expect to happen:
It should at least send the clone command, regardless of failure or success.
The output of the following commands will help us better understand what's going on:
- All OpenZFS pods and containers are running, healthy, latest, and able to create normal PVC-defined datasets.
- zfsvolume CR for origin PVC is present
- zfssnapshot CR for origin PVC is present
- ZFS snapshot itself for origin PVC is present
- ZFS snapshot itself for origin PVC can be cloned manually
- zfsvolume for target PVC is present
- dataset for target PVC is missing
- clone command never sent
zfsvolume CR for target PVC:
Name: pvc-095889bb-2da5-4b40-838e-d54c499b2ab0
Annotations: <none>
API Version: zfs.openebs.io/v1
Kind: ZFSVolume
Spec:
Capacity: 274877906944
Compression: zstd-6
Dedup: off
Encryption: off
Fs Type: zfs
Owner Node ID: ix-truenas
Pool Name: NVME/app-pvc
Recordsize: 64k
Shared: yes
Snapname: pvc-b44a3c8a-4cae-4d63-aed1-fafe853f8e8b@snapshot-08240c13-7fcb-4425-838d-79eacfc27407
Thin Provision: yes
Volume Type: DATASET
Status:
State: Failed
Events: <none>
The zfsvolume for the origin is all correct; target and origin share the same (known-good) storageClass.
- kubectl logs -f openebs-zfs-controller-f78f7467c-blr7q -n openebs -c openebs-zfs-plugin
  Gist logs are to be added, but the highlight: it's just failing on the dataset existence check after creation of the ZFSVolume CR, and no descriptive error is output in any of the logs; just the fact that volume creation failed, which we can also already see on the zfsvolume CR.
- kubectl logs -f openebs-zfs-node-[xxxx] -n openebs -c openebs-zfs-plugin
  https://gist.github.com/Ornias1993/9cb23dc0df026233e8d64c74d70bd39a
  https://gist.github.com/Ornias1993/991e63fbc41fd68e71a35c2f8f2e3a62
Anything else you would like to add:
Everything else works perfectly fine and the PVC object itself is also created fine; other PVC creation works fine. All OpenEBS pods and components are fine, and volumesnapshot objects are present and correct, as well as the ZFS snapshots.
But regardless of this, at least the clone command should've been sent.
Environment:
- Verified Latest
- Verified Latest
- Platform independent (tested multiple)
- Debian/TrueNAS/and-others
I've also noticed that the clone feature support is not tested in any way either, or at least I don't see it being ran.
That being said, with the lack of logging it's REALLY hard to bugtrace.
I've also noticed that the clone feature support is not tested in any way either, or at least I don't see it being ran.
I took a few notes the other day of what should be done about the lacking tests for the local engines. I'll write it up on a ticket so we can track progress on addressing it. Some e2e-test infra was also lost since archival, so that will have to be replaced somehow.
I took a few notes the other day of what should be done about lacking test for the local engines.
Ahh good to soo activity on that!
I'll write it up on a ticket so we can track progress on addressing it.
Yes that would be great as well!
Some e2e-test infra is also lost since archival, so that will have to be replaced somehow.
I noticed, I think it might be best to fix the tests first and go from there? If this issue is correct, it would show itself when tests get implemented anyway...
Edit/Addition
On another note, there is the lack of verbosity though... A lot of errors come down to "no worky"; it would be nice to at least more easily trace it back to the step in the provisioning/creation process that's failing.
Yes that would be great as well!
I've raised https://github.com/openebs/openebs/issues/3726 Hopefully more folks will pitch in, correct anything I've said wrong and provide more details.
I noticed, I think it might be best to fix the tests first and go from there? If this issue is correct, it would show itself when tests get implemented anyway...
Yes that makes sense. I'd also like to see how much of these local engines' (lvm, zfs...) tests we can "unify". Basic CSI etc. functionality should work exactly the same, so presumably the same basic tests should apply to all?
On another note, there is the lack of verbosity though... A lot of errors come down to "no worky"; it would be nice to at least more easily trace it back to the step in the provisioning/creation process that's failing.
So you're saying even if you increase verbosity there's nothing else being added? Maybe we can track this as an enhancement too. CC @sinhaashish
EDIT: sorry mouse slip...
Is there even a verbosity option?
Doubt verbosity would've helped much in this case btw, as I manually tracked the log messages through the code and it never even hit the function that should've sent the zfs-clone command.
Is there even a verbosity option?
sigh, sorry my mistake for making this assumption, seems there's not..
@jnels124 Can you confirm on here that you also see this behavior as well?
Hey @Ornias1993, I tried to reproduce this issue but it seems it is working for me. I tried on Debian 12, k8s 1.26.15, with zfs-2.1.11-1. I was able to successfully create a clone out of a snapshot which was created on a thin-provisioned, shared-mount volume.
Can I get more specific details so that I can see how to reproduce this issue (what exact steps you performed, with YAML if possible)?
Logs don't seem to be useful here; I just see a bunch of the following repeating lines for different PVCs:
Started PVC processing "ix-authelia/authelia-cnpg-main-1-wal"
No need to resize PVC "ix-wg-easy/wg-easy-config"
Logs don't seem to be useful here
I've not added all logs yet, but the issue is also that there is no verbosity at all. So the logs are, sorry to say, completely useless crap.
Those "this no worky" errors are not usable log output anyway.
I just see a bunch of the following repeating lines for different PVCs
Obviously this is a running system so the logs do contain things that aren't related.
Can i get more specific details so that i can see how to reproduce this issue
Primarily the use is through VolSync, so I cannot provide YAML for you. However, I verified everything on VolSync is green; it's just waiting on the clone-from-snapshot to be created by OpenEBS, which, obviously, never happens as it errors out.
Everything else works perfectly fine, except that the PVC VolSync creates throws a vague error and OpenEBS never executes the clone command. But due to the insane lack of any way of bugtracing it (because there is no verbosity anywhere), I've no way of even trying to fix it myself.
@w3aman Can you share your way of testing this in yaml format?
More logs: https://gist.github.com/Ornias1993/2a9f9b32d82b5720a3ae3917486c48ea
https://gist.github.com/Ornias1993/903229b76a44da17208753519e9b755c
@w3aman Can you share your way of testing this in yaml format?
I tried to replicate the spec from your zfsvolume; these are my yamls (zfs_yaml.txt) and they work perfectly fine.
here is my zfsvolume for target pvc:
Name: pvc-ef52066b-fd78-42c6-ab1f-9a08d1e94da1
Namespace: openebs
Labels: kubernetes.io/nodename=debian-worker
Annotations: <none>
API Version: zfs.openebs.io/v1
Kind: ZFSVolume
Metadata:
Creation Timestamp: 2024-05-29T11:46:47Z
Finalizers:
zfs.openebs.io/finalizer
Generation: 2
Resource Version: 14262
UID: 8d15a359-1a88-440c-9c06-bd43e771ef54
Spec:
Capacity: 2147483648
Compression: zstd-6
Dedup: off
Fs Type: zfs
Owner Node ID: debian-worker
Pool Name: NVME/app-pvc
Recordsize: 64k
Shared: yes
Snapname: pvc-b56509a2-2d22-4419-b6e1-78084883aaa6@snapshot-8c0db664-5cb6-4e9f-93b1-32ff009cfdfb
Thin Provision: yes
Volume Type: DATASET
Status:
State: Ready
Events: <none>
Can you send me the output for this command for the node agent pod on the node ix-truenas?
kubectl logs openebs-zfs-node-[xxxx] -n openebs -c openebs-zfs-plugin | grep pvc-095889bb-2da5-4b40-838e-d54c499b2ab0
I already uploaded those logs: https://gist.github.com/Ornias1993/991e63fbc41fd68e71a35c2f8f2e3a62
It's not my personal system; I went through a troubleshooting session with one of our users.
@Ornias1993 I do see this behavior as well. However, I am creating the snapshots with different tooling. We deploy citusdb (a postgresql extension) via stackgres. Stackgres provides the snapshotting ability.
- We create the snapshots in stackgres... This creates the appropriate k8s volumesnapshot resource, which in turn creates the appropriate zfssnapshot resource. When restoring, the correct zfsvolume definition is created and references the snapshot. Give me some time to get some of the old logs together and provide replication steps.
@Ornias1993 are you setting the openebs.io/nodeid label to something other than the node name?
@Abhinandan-Purkait
In my case, I believe the issue is caused by the openebs.io/nodeid being different than the nodename.
The issue appears to be with driver/controller.go and the CreateVolume method.
This block is problematic, because the value of selected in the first two branches comes from the resource spec's OwnerNodeId, while in the last case selected will be the nodename:

```go
if contentSource != nil && contentSource.GetSnapshot() != nil {
	snapshotID := contentSource.GetSnapshot().GetSnapshotId()
	selected, err = CreateSnapClone(ctx, req, snapshotID)
} else if contentSource != nil && contentSource.GetVolume() != nil {
	srcVol := contentSource.GetVolume().GetVolumeId()
	selected, err = CreateVolClone(ctx, req, srcVol)
} else {
	selected, err = CreateZFSVolume(ctx, req)
}
```
This means that we are able to create fresh volumes correctly, but not from volume clones or snapshot clones, because a few lines down we have:

```go
nodeid, err := zfs.GetNodeID(selected)
```

That method will fail when the value of selected is the OwnerNodeId. The code should be updated so that selected is populated from the correct field.
@Ornias1993 are you setting the openebs.io/nodeid label to something other than the node name?
Interesting, I'm not explicitly doing that, but maybe VolSync is doing that? Is there an issue in that label NOT being set?
I believe the reason for using the OwnerNodeId is to ensure that we select the node where the parent volume's pool is present, since snapshot, restore, and clone all need to go to the same node.
You said that the nodename is different from openebs.io/nodeid's value. From the code it seems like we fetch the node before setting the topology, and if that succeeds we set this to openebs.io/nodeid. Can you point out how we can end up in your situation?
```go
node, err := k8sapi.GetNode(ns.driver.config.Nodename)
if err != nil {
	klog.Errorf("failed to get the node %s", ns.driver.config.Nodename)
	return nil, err
}
if _, ok := topology[zfs.ZFSTopologyKey]; !ok {
	topology[zfs.ZFSTopologyKey] = ns.driver.config.Nodename
}
```
@Abhinandan-Purkait
We pre-label (before starting the zfs components) our nodes with unique identifiers for openebs.io/nodeid, and these identifiers differ from the node name. So in the snippet you provided, topology[zfs.ZFSTopologyKey] will already be set (the ok check succeeds), and therefore it will not be reassigned to ns.driver.config.Nodename. See the block above the one you posted, which initializes the topology map to the node labels when the allowed topologies is "all", as is the case in the default config and our config.
So with this configuration, I am able to create new ZFS volumes without issue. However, when you attempt to create a zfs volume from a snapshot definition it fails.
In the logs I can see failed to get the node {VALUE_OF_OPENEBS_NODEID} which highlights the issue mentioned in my previous comment. This GetNodeID method in volume.go only works when it is passed the nodename.
In the case of creating a new volume this method is called with the nodename, but when it is from a volume clone or a snapshot it gets called with the value of openebs.io/nodeid. This is because of this block in driver/controller.go in the CreateVolume method:
```go
if contentSource != nil && contentSource.GetSnapshot() != nil {
	snapshotID := contentSource.GetSnapshot().GetSnapshotId()
	selected, err = CreateSnapClone(ctx, req, snapshotID)
} else if contentSource != nil && contentSource.GetVolume() != nil {
	srcVol := contentSource.GetVolume().GetVolumeId()
	selected, err = CreateVolClone(ctx, req, srcVol)
} else {
	selected, err = CreateZFSVolume(ctx, req)
}
```
- CreateSnapClone --> returns spec.OwnerNodeId
- CreateVolClone --> returns spec.OwnerNodeId
- CreateZFSVolume --> returns the nodename loop variable
Replication steps:
- Label node(s) with a unique value for openebs.io/nodeid
- Deploy zfs components
- Create a pvc with zfs storage class
- Identify volume created successfully
- Create a snapshot and ensure the appropriate zfs and k8s resources exist
- Create a new pvc defined to be created from snapshot
- Notice the zfs volume gets created, and after 10 minutes it goes to Pending
- Logs will indicate it is unable to find a node with the nodeid value from the owning node
Interesting, I'm not explicitly doing that, but maybe VolSync is doing that? Is there an issue in that label NOT being set?
I reviewed some of your logs that I didn't see before and saw that these two values match in your case. We can ignore that.
@jnels124 Thanks for explaining your use case. It seems pre-labeling the node was not part of the design, so as per the current design it's not supposed to work. Or do you think the way it works is fundamentally wrong and needs refactoring? Can you please create an enhancement or refactoring issue and explain what can be done here, so that it can be worked on?
Having said that, there are two issues here: one is @jnels124's use case and the other is the one reported by @Ornias1993. They seem to have different scope and might need different resolutions.
Unfortunately we haven't been able to reproduce the latter, even with same volume specs and environment, as reported by @w3aman.
Unfortunately we haven't been able to reproduce the latter, even with same volume specs and environment, as reported by @w3aman.
Yeah that's a shame for sure... Haven't had the time to look into it further either, as we're currently putting a lot of work into moving away from OpenEBS ZFS-LocalPV due to this issue.
Worth noting: you can reproduce it with 100% certainty when using VolSync, and all the resources created by VolSync are 100% correct. It simply keeps waiting on OpenEBS ZFS LocalPV to clone, which never happens.
btw @Abhinandan-Purkait I still owe you an apology for my previous remark calling you an intern and incompetent. In hindsight, you shouldn't have been blamed by me solely and personally.
Haven't had the time to look into it further either, as we're currently putting a lot of work into moving away from OpenEBS ZFS-LocalPV due to this issue
That's unfortunate. Alright, we can try this out with VolSync and let you know the findings.
That would be completely awesome, thank you! My best educated guess would be a race condition between... well... something...
I see the request being filed for the clone, but it just... well... doesn't pass through the complete codepath... When you've some spare time, it might be worthwhile having some extra verbosity (flag), so we can more easily trace where in the path issues like this occur next time.
Can you provide some information about the VolSync configuration? Like what replication method is being used, and what copyMethod is being used for it.
- Restic, Snapshot, using the ReplicationDestination Volume Populator
Hi @Ornias1993, I tried using VolSync with Restic + a GCP bucket, and restore using the VolumePopulator with copyMethod Snapshot. I was not able to reproduce it. You can check below:
The configs used: https://gist.github.com/Abhinandan-Purkait/599df18f75c90dbb4042b352ec5a2509
The logs and the pv: https://gist.github.com/Abhinandan-Purkait/2825853a4499f21ff883ae5c031dcaeb
copyMethod
Logically, because the copy method doesn't use snapshots at all. That's why I said snapshot, not copy ;-)
However, that's great information, because it narrows down where the issue is occurring to purely the clone-from-snapshot side of things.