ceph-nvmeof
ceph-nvmeof copied to clipboard
find_namespace_and_bdev_name
Background
find_namespace_and_bdev_name is used to associate between existing NVME namespace and its SPDK bdev. Currently implemented in terms of SPDK JSON RPC nvmf_get_subsystems sometimes holding omap_lock. The function is used in the following contexts:
- namespace_change_load_balancing_group_safe ( with omap lock )
- namespace_get_io_stats ( with rpc lock )
- namespace_set_qos_limits_safe ( with rpc lock )
- namespace_resize ( gRPC function with NO rpc lock )
- namespace_delete_safe ( with OMAP lock )
Suggestion
- remove find_namespace_and_bdev_name function, implement in terms of subsystem_nsid_bdev dictionary
- namespace_change_load_balancing_group could be removed from gRPC protobuf interface, CLI could be implemented by list the namespace, remove the namespace and recreate it with a new anagrp.
- namespace methods could use nsid as identifier which should be defined mandatory, dropping optional uuid (protobuf messages:
namespace_resize_req
,namespace_get_io_stats_req
,namespace_set_qos_req
andnamespace_delete_req
)
@baum I have few comments:
- About "subsystem_nsid_bdev" in grpc.py. Isn't it strange that we are not using the local state for such things? Can't we refer to the local state as our single source of truth? Why adding another place to manage that?
- namespace_change_load_balancing_group - you are right that we could use ns delete and then ns add with different grp id. But that is less convenient to the end user. He also need to see that the uuid will be the same, so it will not confuse the hosts. I prefer to leave this CLI method.
- I tend to agree that uuid could be dropped from the interface.
@baum I have few comments:
- About "subsystem_nsid_bdev" in grpc.py. Isn't it strange that we are not using the local state for such things? Can't we refer to the local state as our single source of truth? Why adding another place to manage that?
subsystem_nsid_bdev
dictionary represents the SPDK runtime state, as oppose to the OMAP prescribed/desired configuration. The invariant is that subsystem_nsid_bdev is updated when SPDK RPC confirmed runtime state change, so we're sure that the runtime namespace - bdev association and the underlying bdev actually exist.
My intention is code simplification:
bdev = self.subsystem_nsid_bdev[nqn][nsid]
seems easier than find_namespace_and_bdev_name
WDYT?
- namespace_change_load_balancing_group - you are right that we could use ns delete and then ns add with different grp id. But that is less convenient to the end user. He also need to see that the uuid will be the same, so it will not confuse the hosts. I prefer to leave this CLI method.
👍
- I tend to agree that uuid could be dropped from the interface.
👍
I think we can do it. Currently I see that the find_namespace_and_bdev_name is also getting uuid as an argument. If we do this change, it means that the lookup will always be by uuid?
@gbregman do you see any reason not to make this change? If we already have the information cached (after recent changes), then we can save these extra calls?
@caroav , I don't see any reason to do this change.
@gbregman every call to find_ns method is a call to the SPDK get_subsystems, while this change saves this call. That's the reason that I see.
The call in change_load_balancing group is just in case the user didn't pass the namespace NSID and wanted to use UUID instead. We can drop it and force the usage of NSID but I don't see the real benefit from this. In all other places we use this function we need the bdev details as we need to pass them to SPDK. This is not something we currently have in the local state. I don't think it's worth it to write new code to take care of that locally as these CLIs are not common and not time critical. Calling the SPDK is not such a big deal as it uses a local socket and doesn't go out to the network. Holding the OMAP lock is less of a problem then holding the RPC lock and we do that when we call get_subsystems every 2 seconds for the monitor client. This is a much bigger issue.
Why holding the OMAP lock is less than holding the RPC lock?
The only place we call the find function with the OMAP lock taken is in namespace delete. The get stats, get/set QOS and namespace resize do not change the OMAP so they don't take the OMAP lock. In principle we can also call find in change load balancing group but this is only if the user chose to call the command without NSID.
@baum @gbregman IMHO the important thing here is mainly to take care about namespace_get_io_stats which is being called very frequently. The other functions mentioned above are called very rarely IMO.