ceph-iscsi-config
add a cmd_time_out in LUN class to support configurable cmd_time_out attrib
What do you need cmd_time_out > 0 for?
It will only release commands in the kernel if runner dies (the osd op timeout should catch commands that are running in runner), but we do not yet have a way to restart runner safely, so I think you need to reboot the node either way.
We may need to upgrade the iscsi tools in production in the future, and restarting the gw may trigger the bug we discussed before. Rebooting the node is rather expensive, and this is always unacceptable.
Once tcmu/runner is fixed so that it can restart with IO in progress, this will not be needed, right? If so, I think setting cmd_time_out would just be a temporary hack, and I do not think we want to add it upstream.
My issue with cmd_time_out is that it only sort of works, and only sometimes. If it's only a couple of commands and the reason for restarting is not frequent, then you would just slowly leak (the kernel never calls tcmu_cmd_free_data). If it happened with a full ring, or if runner cannot restart, then you end up in a state where the initiator keeps trying to use the path, the cmd fails, and we just keep bouncing between the other gw and this one.
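For context, here is a minimal sketch of what setting the attribute amounts to at the configfs level. It is not the code from this PR: the function name `set_cmd_time_out`, the `core_root` parameter, and the example HBA/device names (`user_0`, `disk1`) are assumptions for illustration, and the path layout assumes the usual LIO configfs mount point.

```python
import os

# Assumed default location of the LIO backstore tree in configfs.
CONFIGFS_CORE = "/sys/kernel/config/target/core"


def set_cmd_time_out(hba, dev_name, seconds, core_root=CONFIGFS_CORE):
    """Write cmd_time_out (in seconds) for a user-backed backstore.

    hba and dev_name are hypothetical examples such as "user_0" and "disk1".
    core_root is overridable so the function can be exercised without root.
    """
    # Reject bools explicitly, since bool is a subclass of int in Python.
    if isinstance(seconds, bool) or not isinstance(seconds, int) or seconds < 0:
        raise ValueError("cmd_time_out must be a non-negative integer")

    attr_path = os.path.join(core_root, hba, dev_name, "attrib", "cmd_time_out")
    with open(attr_path, "w") as attr_file:
        attr_file.write(str(seconds))
    return attr_path
```

On a real gateway this would be called as something like `set_cmd_time_out("user_0", "disk1", 30)`. Note that it only changes when the kernel times out commands; it does nothing about the leak or the path bouncing described above.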