conflux-rust

inflight_keys for GET_BLOCK may be left dangling when GetBlockTxn is discarded after an extremely long timeout.

Open: yangzhe1990 opened this issue 4 years ago • 1 comment

The direct cause is that GetBlockTxn's on_removed() method does nothing, so the keys it holds are never released. The deeper cause is how the inflight_keys of GET_BLOCK are managed: the same key set is shared by GetBlocks, GetCompactBlocks, and GetBlockTxn.

yangzhe1990, Mar 27 '20 14:03

I'm working on this issue.

The temporary fix is to remove the inflight_keys in on_removed(), and to have resend() return a GetBlocks request rather than the GetBlockTxn itself.
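A minimal sketch of that temporary fix, for illustration only. The types below (`Request`, `InflightKeys`, `BlockHash` as a `u64`) are simplified stand-ins and not the actual conflux-rust definitions; the real `on_removed()` and `resend()` signatures may differ.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for a block hash; the real type is not u64.
type BlockHash = u64;

// Hypothetical, simplified request type covering two of the GET_BLOCK requests.
#[derive(Debug, Clone, PartialEq)]
enum Request {
    GetBlocks { hashes: Vec<BlockHash> },
    GetBlockTxn { block_hash: BlockHash, indexes: Vec<usize> },
}

// Hypothetical container for the shared GET_BLOCK in-flight key set.
#[derive(Default)]
struct InflightKeys {
    get_block: HashSet<BlockHash>,
}

impl Request {
    // Block hashes this request holds in the shared GET_BLOCK key set.
    fn keys(&self) -> Vec<BlockHash> {
        match self {
            Request::GetBlocks { hashes } => hashes.clone(),
            Request::GetBlockTxn { block_hash, .. } => vec![*block_hash],
        }
    }

    // Called when the request is discarded: release every key it holds,
    // so nothing is left dangling in the in-flight set.
    fn on_removed(&self, inflight: &mut InflightKeys) {
        for hash in self.keys() {
            inflight.get_block.remove(&hash);
        }
    }

    // A timed-out GetBlockTxn is retried as a plain GetBlocks for the same
    // block; a GetBlocks request is retried unchanged.
    fn resend(&self) -> Request {
        match self {
            Request::GetBlockTxn { block_hash, .. } => Request::GetBlocks {
                hashes: vec![*block_hash],
            },
            other => other.clone(),
        }
    }
}
```

In this sketch the caller is expected to re-insert the keys when it actually sends the request returned by `resend()`, so the key set only ever describes requests that are really in flight.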

I'm thinking about removing the inflight_keys whenever any request among GetBlocks/GetCompactBlocks/GetBlockTxn receives a response or times out. A proper mechanism should also be added to prevent GetBlocks from being issued elsewhere when a GetBlockTxn is about to be sent next.

From the recent issue #1065 we have learned that the management of inflight_keys is critical. I'd like to enforce a one-to-one mapping between each entry in inflight_keys and an unfinished request (or, better, an in-flight request).
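One way to picture the proposed invariant is a tracker in which every key is owned by exactly one live request. The sketch below is an assumption-laden illustration, not the conflux-rust implementation: `InflightTracker`, `Kind`, `begin`, `hand_over`, and `finish` are hypothetical names, and `BlockHash` is a `u64` stand-in. `hand_over` models the follow-up case: the key moves directly from a GetCompactBlocks request to the GetBlockTxn that follows it, so a GetBlocks for the same block cannot slip in between.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// Hypothetical stand-ins; the real types in conflux-rust differ.
type BlockHash = u64;
type RequestId = u64;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    GetBlocks,
    GetCompactBlocks,
    GetBlockTxn,
}

// Each in-flight key maps to exactly one unfinished request.
#[derive(Default)]
struct InflightTracker {
    owner: HashMap<BlockHash, (RequestId, Kind)>,
    next_id: RequestId,
}

impl InflightTracker {
    // Claim every hash that is not already in flight. Returns the new request
    // id and the claimed hashes, or None if nothing could be claimed.
    fn begin(&mut self, kind: Kind, hashes: &[BlockHash]) -> Option<(RequestId, Vec<BlockHash>)> {
        let id = self.next_id;
        let mut claimed = Vec::new();
        for &hash in hashes {
            if let Entry::Vacant(slot) = self.owner.entry(hash) {
                slot.insert((id, kind));
                claimed.push(hash);
            }
        }
        if claimed.is_empty() {
            return None;
        }
        self.next_id += 1;
        Some((id, claimed))
    }

    // Transfer one key from request `from` to a new follow-up request without
    // ever releasing it. Returns None if `from` does not own the key.
    fn hand_over(&mut self, from: RequestId, hash: BlockHash, kind: Kind) -> Option<RequestId> {
        let slot = self.owner.get_mut(&hash)?;
        if slot.0 != from {
            return None;
        }
        let id = self.next_id;
        self.next_id += 1;
        *slot = (id, kind);
        Some(id)
    }

    // Called on response or timeout: release every key the request still owns.
    // Returns the number of keys released.
    fn finish(&mut self, id: RequestId) -> usize {
        let before = self.owner.len();
        self.owner.retain(|_, slot| slot.0 != id);
        before - self.owner.len()
    }

    // Which kind of request currently owns this key, if any.
    fn kind_of(&self, hash: BlockHash) -> Option<Kind> {
        self.owner.get(&hash).map(|slot| slot.1)
    }
}
```

Because `finish` is the only way a key leaves the map and it is keyed by request id, a key cannot outlive its request in this model, whichever of the three request kinds held it.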

yangzhe1990, Mar 27 '20 14:03