ceph-iscsi-config

Add cmd_time_out to the LUN class to support a configurable cmd_time_out attrib

Open zhuozh opened this issue 7 years ago • 3 comments
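As a rough illustration of what the request amounts to, here is a minimal sketch. It assumes the device exposes a `cmd_time_out` file in its configfs attrib directory. The class name `LunTimeoutSketch`, the `attrib_dir` parameter, and the `apply` method are hypothetical names for illustration only, not the actual ceph-iscsi-config LUN class or its API.

```python
import os


class LunTimeoutSketch:
    """Hypothetical stand-in for a LUN class with a configurable cmd_time_out."""

    def __init__(self, attrib_dir, cmd_time_out=0):
        # attrib_dir is assumed to be the device's configfs attrib directory;
        # the caller supplies it, nothing here discovers it.
        if cmd_time_out < 0:
            raise ValueError("cmd_time_out must be >= 0")
        self.attrib_dir = attrib_dir
        self.cmd_time_out = cmd_time_out

    def apply(self):
        """Write the configured value to the cmd_time_out attrib file."""
        path = os.path.join(self.attrib_dir, "cmd_time_out")
        with open(path, "w") as f:
            f.write(str(self.cmd_time_out))
        return path
```

Defaulting to 0 here keeps the current behaviour unless a caller explicitly asks for a non-zero timeout.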

zhuozh avatar Nov 15 '17 11:11 zhuozh

What do you need cmd_time_out > 0 for?

It will only release commands in the kernel if runner dies (osd op timeout should catch commands that are running in runner), but we do not yet have a way to restart runner safely, so I think you need to reboot the node either way.

mikechristie avatar Nov 16 '17 01:11 mikechristie

We may need to upgrade the iSCSI tools in production in the future, and restarting the gw may trigger the bug we discussed before. Rebooting the node is expensive, and it is often unacceptable.

lxbsz avatar Nov 16 '17 09:11 lxbsz

Once tcmu/runner is fixed so it can restart with IO in progress, this will not be needed, right? If so, setting cmd_time_out would just be a temporary hack, and I do not think we want to add that upstream.

My issue with cmd_time_out is that it just sort of works sometimes. If only a couple of commands are affected and restarts are infrequent, then you just slowly leak (the kernel never calls tcmu_cmd_free_data). If it happens with a full ring, or if runner cannot restart, then the initiator keeps trying to use the path, the cmd fails, and we just keep bouncing between the other gw and this one.

mikechristie avatar Nov 16 '17 21:11 mikechristie