risinglight refactor(storage): use dynamic trait for storage interfaces

Signed-off-by: Runji Wang [email protected]

Dynamic async traits are not so bad for performance, as long as they are not called frequently. The advantage is that they are less invasive as they do not require generic parameters everywhere, and they allow for extensible plugins in the future.

This PR uses #[async_trait] to refactor storage traits into dyn traits. It removes <S: Storage> from executors. The only performance critical part is the TxnIterator::next_batch which is called frequently in scanning. We refactor it into a BoxStream to eliminate boxing the future on each call.

Performance comparison on TPCH

tpch	time
q1	+5.9%
q17	+5.6%
q18	+5.4%
q13	+4.1%
q11	+3.9%
q14	+3.9%
q7	+3.7%
q4	+3.2%
q3	+3.1%
q10	+2.2%
q9	+1.8%
q5	+1.4%
q15	+1.2%
q16	+1.2%
q19	+1.2%
q6	+0.4%
q12	-0.4%
q20	-0.6%
q22	-2.5%
q2	-2.9%
q8	-2.9%
q21	-5.4%

Apr 16 '24 06:04 wangrunji0408

Weakly -1

The current dynamic traits have too many limitations, making them seem like unfinished products.

And pass generic parameters are not really very painful.

Apr 17 '24 07:04 TennyZhuang

also objected by @mrcroxx

Apr 17 '24 09:04 wangrunji0408

also objected by @MrCroxx

The #[async_trait] allocates a boxed future on each call. The overhead can be high for memory-only path. e.g. cache hits.

Apr 17 '24 14:04 MrCroxx

Given that the storage part of the system runs in batches (i.e., we always fetch as many as 1024 rows if possible), I don't think using a dynamic trait would cause a large regression. However, I don't see the code is simplified significantly without static dispatch, except that the S: Storage thing gets removed. Therefore, I'm kind of in the middle of approving or turning down this pull request :(

Jun 16 '24 19:06 skyzh