ceph-nvmeof
Removing nvmeof service doesn't delete OMAP entries
Noticed that OMAP entries for GW entities, from subsystem to namespaces, still exist even after removing the entire service from the cluster.
Steps to follow (a rough command sketch is given after this list):
- Deploy the nvmeof service with a single pool.
- Add all required entities to the GW using nvmeof-cli, from subsystem to namespaces.
- Observe the NVMe GW entities in the OMAP object nvmeof.None.state.
- Delete the nvmeof service.
- The user can still see the GW entities in the OMAP.
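A minimal sketch of the flow above, assuming the pool is rbd and a single gateway; the orchestrator positional arguments and the nvmeof-cli subcommand flags differ between releases, so treat these as illustrative rather than exact syntax:

# 1. deploy the gateway service against a single pool (service becomes nvmeof.rbd)
ceph orch apply nvmeof rbd --placement="1 ceph-node5"
# 2. configure the GW entities; nvmeof-cli flags shown here are an assumption and vary by release
nvmeof-cli --server-address <gw-ip> --server-port 5500 subsystem add --subsystem nqn.2016-06.io.spdk:test_cli
nvmeof-cli --server-address <gw-ip> --server-port 5500 namespace add --subsystem nqn.2016-06.io.spdk:test_cli --rbd-pool rbd --rbd-image image1
# 3. inspect the state the GW keeps in the pool's OMAP object
rados -p rbd listomapkeys nvmeof.None.state
# 4. remove the service, then re-check: the OMAP keys are still there
ceph orch rm nvmeof.rbd
rados -p rbd listomapkeys nvmeof.None.state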
[ceph: root@ceph-1sunilkumar-ol18l6-node1-installer /]# ceph orch ls
NAME                       PORTS        RUNNING  REFRESHED  AGE  PLACEMENT
alertmanager               ?:9093,9094  1/1      8m ago     5d   count:1
ceph-exporter                           6/6      8m ago     5d   *
crash                                   6/6      8m ago     5d   *
grafana                    ?:3000       1/1      8m ago     5d   count:1
mgr                                     2/2      8m ago     5d   label:mgr
mon                                     3/3      8m ago     5d   label:mon
node-exporter              ?:9100       6/6      8m ago     5d   *
osd.all-available-devices               16       5m ago     5d   *
prometheus                 ?:9095       1/1      8m ago     5d   count:1
[ceph: root@ceph-1sunilkumar-ol18l6-node1-installer /]# ceph orch ps | grep nvme
[ceph: root@ceph-1sunilkumar-ol18l6-node1-installer /]#
[ceph: root@ceph-1sunilkumar-ol18l6-node1-installer /]# rados -p rbd listomapkeys nvmeof.None.state
host_nqn.2016-06.io.spdk:test_cli_*
listener_nqn.2016-06.io.spdk:test_cli_client.nvmeof.rbd.ceph-1sunilkumar-ol18l6-node5.mnoqha_TCP_10.0.211.131_4420
listener_nqn.2016-06.io.spdk:test_cli_client.nvmeof.rbd.ceph-1sunilkumar-ol18l6-node5.mnoqha_TCP_10.0.211.32_4420
listener_nqn.2016-06.io.spdk:test_cli_client.nvmeof.rbd.ceph-1sunilkumar-ol18l6-node5.mnoqha_TCP_10.0.211.32_4421
listener_nqn.2016-06.io.spdk:test_cli_client.nvmeof.rbd.ceph-1sunilkumar-ol18l6-node6.ueawqa_TCP_10.0.211.22_4420
listener_nqn.2016-06.io.spdk:test_cli_client.nvmeof.rbd.ceph-1sunilkumar-ol18l6-node6.ueawqa_TCP_10.0.213.158_4420
namespace_nqn.2016-06.io.spdk:test_cli_2
omap_version
qos_nqn.2016-06.io.spdk:test_cli_2
subsystem_nqn.2016-06.io.spdk:test_cli
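Until an automatic cleanup exists, the leftover state can be removed by hand. A minimal workaround sketch, assuming the pool is rbd and the only leftover object is nvmeof.None.state as listed above (this wipes the gateway group's state, so it should only be run once the service is really gone):

# delete the leftover gateway state object directly
rados -p rbd rm nvmeof.None.state
# confirm no nvmeof objects remain in the pool
rados -p rbd ls | grep '^nvmeof\.'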
I agree that when we remove a GW through cephadm we should remove all the GW-specific state in the OMAP. I would not remove the entire OMAP when the last GW of a GW group is deleted, but would instead introduce another command that explicitly allows removing a GW group.
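For illustration only, the explicit group removal proposed here might look like the following hypothetical CLI; the gw delete-group subcommand and its flags do not exist today and are purely an assumption about the shape of such a command:

# hypothetical, not implemented: explicitly wipe a GW group's OMAP state
nvmeof-cli --server-address <gw-ip> --server-port 5500 gw delete-group --group None --force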
Isn't the way the service is deleted a behaviour determined by cephadm? If so, this issue needs to be raised under ceph/ceph for discussion with the cephadm maintainers, right? For example, cephadm implements the nvmeof service via a class which already has a post_remove method ... but it's empty :)
Yes @pcuzner, we do need to involve cephadm, but in this discussion we are trying to agree on the expected behavior. Also, I think that post_remove, for example, will need some kind of CLI to perform the required cleanup.
I'm not clear on the CLI requirement for cleanup. For example, if the service is removed with --force (i.e. ceph orch rm nvmeof.gw1 --force), the mgr could just delete the RADOS objects (the class has both post_remove and purge methods).
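In shell terms, the cleanup that purge (or a forced removal) could perform for a group's pool would boil down to something like the sketch below; the nvmeof.* object-name pattern and the rbd pool are assumed from the listing earlier in this issue, and whether this hangs off post_remove or purge is still open:

# forced service removal, as in the example above
ceph orch rm nvmeof.gw1 --force
# ...after which the mgr would drop the group's leftover objects, roughly equivalent to:
for obj in $(rados -p rbd ls | grep '^nvmeof\.'); do
    rados -p rbd rm "$obj"
done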