[C++][FS][Azure] Optimise `ObjectAppendStream::DoAppend` in the case of many small appends

Open Tom-Newton opened this issue 1 year ago • 6 comments

Describe the enhancement requested

Optimisation to https://github.com/apache/arrow/issues/38333 Child of https://github.com/apache/arrow/issues/18014

Currently ObjectAppendStream::DoAppend calls block_blob_client_->StageBlock synchronously, meaning that each call to ObjectAppendStream::DoAppend blocks until the data has been successfully written to blob storage. This is very inefficient for large numbers of small writes.

This performance problem is obvious even in small tests against Azurite. The UploadLines function used to create test data uses std::accumulate and writes the data in one call for performance reasons.

With accumulate

[ RUN      ] TestAzuriteFileSystem.OpenInputFileMixedReadVsReadAt
[       OK ] TestAzuriteFileSystem.OpenInputFileMixedReadVsReadAt (1350 ms)

Without accumulate (4096 separate calls to ObjectAppendStream::DoAppend)

[ RUN      ] TestAzuriteFileSystem.OpenInputFileMixedReadVsReadAt
[       OK ] TestAzuriteFileSystem.OpenInputFileMixedReadVsReadAt (25124 ms)

And this is when testing against Azurite on localhost. Against real blob storage, where the latency is going to be much higher, the problem will be exacerbated.

By comparison, the GCS filesystem is able to handle the latter approach without performance issues.

Some options to optimise:

  1. Call block_blob_client_->StageBlock asynchronously and await all the futures in ObjectAppendStream::Flush.
  2. Buffer small writes in memory and make fewer larger calls to block_blob_client_->StageBlock.
  3. Buffer small writes in memory and make batched calls to block_blob_client_->StageBlock.
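A rough sketch of what option 2 could look like, kept independent of the Azure SDK so it compiles on its own. This is not Arrow's actual implementation: BufferedBlockWriter, StageBlockFn and CountStagedBlocks are hypothetical names, and the callback merely stands in for a synchronous call such as block_blob_client_->StageBlock.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for a synchronous upload such as
// block_blob_client_->StageBlock: it receives one block of bytes.
using StageBlockFn = std::function<void(const std::string&)>;

// Hypothetical helper (an assumption for illustration): accumulates small
// appends and hands them to `stage_block` in chunks of `block_size` bytes.
class BufferedBlockWriter {
 public:
  BufferedBlockWriter(std::size_t block_size, StageBlockFn stage_block)
      : block_size_(block_size), stage_block_(std::move(stage_block)) {
    assert(block_size_ > 0);
    buffer_.reserve(block_size_);
  }

  // Copies `data` into the buffer, staging a block each time it fills up.
  void Append(const std::string& data) {
    std::size_t offset = 0;
    while (offset < data.size()) {
      const std::size_t room = block_size_ - buffer_.size();
      const std::size_t n = std::min(room, data.size() - offset);
      buffer_.append(data, offset, n);
      offset += n;
      if (buffer_.size() == block_size_) StageBuffered();
    }
  }

  // Stages whatever is still buffered, even if it is less than a full block.
  void Flush() {
    if (!buffer_.empty()) StageBuffered();
  }

 private:
  void StageBuffered() {
    stage_block_(buffer_);
    buffer_.clear();
  }

  std::size_t block_size_;
  StageBlockFn stage_block_;
  std::string buffer_;
};

// Appends `num_appends` 16-byte lines and returns how many blocks were staged.
std::size_t CountStagedBlocks(std::size_t num_appends, std::size_t block_size) {
  std::size_t staged = 0;
  BufferedBlockWriter writer(block_size,
                             [&staged](const std::string&) { ++staged; });
  for (std::size_t i = 0; i < num_appends; ++i) {
    writer.Append("0123456789abcde\n");
  }
  writer.Flush();
  return staged;
}
```

With 4096 appends of 16 bytes each, a 16 KiB buffer reduces the number of staged blocks from 4096 to 4. The block size here is arbitrary; a real implementation would pick it with the service's block limits and memory use in mind.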

Component(s)

C++

Tom-Newton avatar Feb 11 '24 14:02 Tom-Newton

You should definitely buffer small writes in memory. The S3 filesystem does that.

You should also optionally allow async writes, as in S3, and have Flush ensure all writes have finished.
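For illustration only, here is a minimal sketch of the "async writes, with Flush waiting for all of them" idea using std::async. It is not the S3 or Azure filesystem code: AsyncBlockWriter, StageBlockFn and StageConcurrently are hypothetical names, and the callback stands in for a blocking upload such as block_blob_client_->StageBlock.

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <functional>
#include <future>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a blocking upload call.
using StageBlockFn = std::function<void(const std::string&)>;

// Hypothetical helper (an assumption for illustration): each Append starts
// an upload in the background; Flush waits for every outstanding upload.
class AsyncBlockWriter {
 public:
  explicit AsyncBlockWriter(StageBlockFn stage_block)
      : stage_block_(std::move(stage_block)) {}

  // Returns as soon as the upload has been launched on another thread.
  void Append(std::string data) {
    pending_.push_back(
        std::async(std::launch::async, stage_block_, std::move(data)));
  }

  // Blocks until all launched uploads finish; get() rethrows the first
  // upload error encountered.
  void Flush() {
    std::vector<std::future<void>> pending;
    pending.swap(pending_);
    for (auto& f : pending) f.get();
  }

 private:
  StageBlockFn stage_block_;
  std::vector<std::future<void>> pending_;
};

// Launches `num_appends` simulated uploads and returns how many completed
// by the time Flush returned.
std::size_t StageConcurrently(std::size_t num_appends) {
  std::atomic<std::size_t> staged{0};
  AsyncBlockWriter writer([&staged](const std::string&) {
    // Simulated network latency.
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
    ++staged;
  });
  for (std::size_t i = 0; i < num_appends; ++i) writer.Append("line\n");
  writer.Flush();
  return staged.load();
}
```

The sketch starts one thread per append and puts no bound on in-flight uploads, which a real implementation would need to limit, most likely by combining this with in-memory buffering.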

pitrou avatar Feb 14 '24 11:02 pitrou

Is there anybody working on this issue? It's kind of a blocking issue, as it makes writing to Azure Blob quite slow.

OliLay avatar Jun 04 '24 11:06 OliLay

@OliLay Does using a BufferedOutputStream fix the problem for you?

Though I agree this should probably be handled transparently inside the Azure FS implementation. @felipecrv

pitrou avatar Jun 04 '24 12:06 pitrou

Thanks, I haven't tried that so far, but I guess it won't fully solve the problem: writing to the stream would still block once the buffer size is exceeded, because we then call write on the underlying Azure output stream. So I agree that a non-blocking write mode plus internal buffering (as it is implemented for S3) would probably be the best approach.

OliLay avatar Jun 04 '24 13:06 OliLay

Agreed!

pitrou avatar Jun 04 '24 14:06 pitrou

I opened a PR which addresses this issue: https://github.com/apache/arrow/pull/43096

OliLay avatar Jul 01 '24 14:07 OliLay

Issue resolved by pull request 43096 https://github.com/apache/arrow/pull/43096

pitrou avatar Aug 21 '24 11:08 pitrou