tiered-storage-for-apache-kafka

S3 download speed is too slow when consumer needs to read data from S3 storage

Open HenryCaiHaiying opened this issue 1 year ago • 12 comments

Currently the S3 download speed is low (about 20 Mbit/s). The reason is that we fetch chunks in sequential order. Can we fetch them in parallel instead (like how S3 multi-part upload is done in parallel)?

More details: https://the-asf.slack.com/archives/C05A1NF5SFM/p1687829723272589?thread_ts=1687245261.417759&cid=C05A1NF5SFM
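
To make the request concrete, here is a minimal sketch of fetching fixed-size chunks in parallel and reassembling them in order. It is not the plugin's code: `ParallelChunkFetch`, `RangeReader`, and all parameters are hypothetical names, and the in-memory reader in `main` only stands in for a ranged GET against S3.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelChunkFetch {

    /** Fetches bytes [0, length) in chunkSize pieces on up to `parallelism` threads, reassembling them in order. */
    static byte[] fetch(RangeReader reader, int length, int chunkSize, int parallelism) {
        if (length < 0 || chunkSize <= 0 || parallelism <= 0) {
            throw new IllegalArgumentException("length must be >= 0; chunkSize and parallelism must be > 0");
        }
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            // Submit one ranged read per chunk; the pool bounds how many run at once.
            List<Future<byte[]>> pending = new ArrayList<>();
            int from = 0;
            while (from < length) {
                final int start = from;
                final int end = start + Math.min(chunkSize, length - start);
                pending.add(pool.submit(() -> reader.read(start, end)));
                from = end;
            }
            // Collect in submission order so the output matches the object's byte order.
            ByteArrayOutputStream out = new ByteArrayOutputStream(length);
            for (Future<byte[]> f : pending) {
                byte[] chunk = f.get();
                out.write(chunk, 0, chunk.length);
            }
            return out.toByteArray();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while fetching", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("chunk fetch failed", e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        byte[] object = new byte[10_000];
        for (int i = 0; i < object.length; i++) {
            object[i] = (byte) i;
        }
        // In-memory reader standing in for remote storage (assumption for the demo).
        RangeReader reader = (from, to) -> Arrays.copyOfRange(object, from, to);
        byte[] fetched = fetch(reader, object.length, 1024, 4);
        System.out.println("identical: " + Arrays.equals(object, fetched));
    }
}

/** Hypothetical stand-in for a ranged GET against object storage. */
interface RangeReader {
    /** Returns bytes [from, to) of the object. */
    byte[] read(int from, int to);
}
```

The sketch assumes the total length is known up front. A real implementation would also want to stream chunks to the consumer as they arrive rather than buffer the whole range in memory.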

HenryCaiHaiying avatar Jun 27 '23 01:06 HenryCaiHaiying

Thanks for the issue. We will think about this. However, right now the broker always requests data without specifying an end position, so we cannot really say how much data has to be downloaded or, moreover, how much of it the broker will actually read. My understanding is that parallel fetching will therefore lead to unnecessary downloads, especially when the segments are big. IIRC there was a proposal from @jeqo to change the broker side so that it always requests maxBytes from the plugin. That could help a little, but I'm not sure when or if it will be changed. We have also had some discussions about a slightly different approach: pre-fetching a limited (perhaps configurable) amount of data into the cache. At the same time, we are trying to understand what the common read patterns will be in general (sequential read is obviously one of them) and to come up with related optimisations.
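
The bounded pre-fetching idea can be sketched as simple range arithmetic: given only a start position, fetch the chunk that contains it plus at most a fixed budget of following bytes, and never read past the end of the segment. This is an illustration under assumptions, not the plugin's implementation; `BoundedPrefetch`, `chunkRange`, and `prefetchMaxBytes` are hypothetical names and do not correspond to actual configuration keys.

```java
final class BoundedPrefetch {

    /**
     * Returns {firstChunk, lastChunk} (inclusive chunk indexes) to fetch for a read at startPosition:
     * the chunk containing that position plus at most prefetchMaxBytes of following data.
     */
    static long[] chunkRange(long startPosition, long prefetchMaxBytes, long chunkSize, long segmentSize) {
        if (chunkSize <= 0 || prefetchMaxBytes < 0 || startPosition < 0 || startPosition >= segmentSize) {
            throw new IllegalArgumentException("invalid position or sizes");
        }
        long firstChunk = startPosition / chunkSize;
        long endOfFirstChunk = (firstChunk + 1) * chunkSize;
        // Never read past the end of the segment, however large the prefetch budget is.
        // The comparison is written this way to avoid overflow in endOfFirstChunk + prefetchMaxBytes.
        long endExclusive = (prefetchMaxBytes >= segmentSize - endOfFirstChunk)
                ? segmentSize
                : endOfFirstChunk + prefetchMaxBytes;
        long lastChunk = (endExclusive - 1) / chunkSize;
        return new long[] {firstChunk, lastChunk};
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        // Read at 10 MiB, 16 MiB prefetch budget, 4 MiB chunks, 1 GiB segment (all illustrative values).
        long[] range = chunkRange(10 * mib, 16 * mib, 4 * mib, 1024 * mib);
        System.out.println("fetch chunks " + range[0] + " to " + range[1]);
    }
}
```

With a budget like this, the amount of possibly wasted download per request is capped at `prefetchMaxBytes`, whatever the segment size.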

AnatolyPopov avatar Jun 27 '23 10:06 AnatolyPopov

Just for reference, this came up on the upstream PR introducing remote fetching: https://github.com/apache/kafka/pull/13535#discussion_r1166318900 and was said to be fixed in a following PR -- not sure if it is this one: https://issues.apache.org/jira/browse/KAFKA-14915

jeqo avatar Jun 27 '23 18:06 jeqo

I think the common use case on the consumer side is a bootstrap or catch-up pull: reading from an old offset all the way to the tail of the queue until the consumer has caught up. Those consumers want all the data in remote storage for the topic partition. There might be a few diagnostic use cases where people only want a small segment of the partition data, but I don't think that's common.

In either case, I think you can do some prefetching to make sure the next buffer of the stream is ready to serve to the client; you don't have to pre-read too much. And if you want to be fancy, you can put in an adaptive prefetch algorithm: the first time, prefetch only 4K bytes; when the client comes back to ask for more, prefetch 8K bytes, then 16K, and so on up to a cap (e.g. 10M). That can probably work well for both use cases.
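
The adaptive scheme described above can be sketched as a small state machine that doubles the prefetch window on each sequential follow-up read, up to a cap. The class and method names are hypothetical. Resetting the window when a read does not continue from the previous one is an added assumption, not something stated in the comment.

```java
final class AdaptivePrefetchWindow {
    private final long initialBytes;
    private final long maxBytes;
    private long nextWindow;
    private long expectedNextPosition = -1; // no read seen yet

    AdaptivePrefetchWindow(long initialBytes, long maxBytes) {
        if (initialBytes <= 0 || maxBytes < initialBytes) {
            throw new IllegalArgumentException("need 0 < initialBytes <= maxBytes");
        }
        this.initialBytes = initialBytes;
        this.maxBytes = maxBytes;
        this.nextWindow = initialBytes;
    }

    /** Returns how many bytes to prefetch for a read starting at the given position. */
    long onRead(long position) {
        if (position != expectedNextPosition) {
            // First read or a seek: assume a new access pattern and start small again.
            nextWindow = initialBytes;
        }
        long window = nextWindow;
        expectedNextPosition = position + window;
        // Double for the next sequential read, but never exceed the cap.
        nextWindow = Math.min(maxBytes, window * 2);
        return window;
    }

    public static void main(String[] args) {
        AdaptivePrefetchWindow window = new AdaptivePrefetchWindow(4 * 1024, 10L * 1024 * 1024);
        long position = 0;
        for (int i = 0; i < 5; i++) {
            long bytes = window.onRead(position);
            System.out.println("read at " + position + " prefetches " + bytes + " bytes");
            position += bytes;
        }
    }
}
```

A catch-up consumer reading sequentially reaches the cap after a handful of requests, while a one-off diagnostic read only ever pays for the initial small window.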

HenryCaiHaiying avatar Jun 28 '23 00:06 HenryCaiHaiying

As far as I remember, we briefly discussed exactly this (adaptive fetching) with @ivanyu at some point. I would say that as soon as https://github.com/aiven/tiered-storage-for-apache-kafka/pull/245 is merged, it should be easy to implement at least prefetching of some fixed amount of data, but I think we will consider both. Thanks for bringing this up again and for the additional info about use cases. :+1:

AnatolyPopov avatar Jun 28 '23 12:06 AnatolyPopov

@AnatolyPopov @ivanyu It seems https://github.com/aiven/tiered-storage-for-apache-kafka/pull/245 is already merged. Are you guys going to come back and add some prefetching or caching support to speed up the S3 download?

HenryCaiHaiying avatar Jul 20 '23 06:07 HenryCaiHaiying

Yeah, we're working on it

ivanyu avatar Jul 20 '23 06:07 ivanyu

Thanks, that's a very useful feature.

HenryCaiHaiying avatar Jul 20 '23 06:07 HenryCaiHaiying

@ivanyu Looping back to check the progress on this issue.

HenryCaiHaiying avatar Aug 09 '23 17:08 HenryCaiHaiying

@HenryCaiHaiying, #403 and #411 should help to improve fetching performance. Have a look!

jeqo avatar Oct 03 '23 12:10 jeqo

Thanks for the contribution. I left a small comment on one of the PRs.

HenryCaiHaiying avatar Oct 04 '23 04:10 HenryCaiHaiying

@jeqo with this PR: https://github.com/Aiven-Open/tiered-storage-for-apache-kafka/pull/429, do we still need #403 ?

HenryCaiHaiying avatar Nov 01 '23 01:11 HenryCaiHaiying

@HenryCaiHaiying The former supersedes the latter.

ivanyu avatar Nov 03 '23 14:11 ivanyu