
decompression worker exited early. When Flushing file

jis260 opened this issue 2 years ago · 12 comments

I am downloading files in bulk from a single azure blob storage container using Azure Storage Explorer.

I have connected to the container using a SAS URL.

However, for some files (100 out of 4000) I see the following error in the logs:

2022/07/04 11:49:28 ERR: [P#0-T#4131] DOWNLOADFAILED: <file_URL_redacted>: decompression worker exited early. When Flushing file. X-Ms-Request-Id:

I am able to download the files directly from the Azure Portal. Many of the files have Content-Encoding: deflate, and the setting in Azure Storage Explorer to decompress files is enabled.

This seems to stop the bulk download after ~4000-5000 files. If I start again from where it stopped, I can download another 4000-5000 files before the issue occurs again.

I also see errors like "Error closing file: timed out closing decompression worker". Is this because too many downloads are happening concurrently?

Any recommendations please?


Head of log file produced from Azure Storage Explorer:

2022/07/04 10:34:22 AzcopyVersion 10.14.1
2022/07/04 10:34:22 OS-Environment windows
2022/07/04 10:34:22 OS-Architecture amd64
...
2022/07/04 10:34:22 Number of CPUs: 4
2022/07/04 10:34:22 Max file buffer RAM 2.000 GB
2022/07/04 10:34:22 Max concurrent network operations: 32 (Based on number of CPUs. Set AZCOPY_CONCURRENCY_VALUE environment variable to override)
2022/07/04 10:34:22 Check CPU usage when dynamically tuning concurrency: true (Based on hard-coded default. Set AZCOPY_TUNE_TO_CPU environment variable to true or false override)
2022/07/04 10:34:22 Max concurrent transfer initiation routines: 64 (Based on hard-coded default. Set AZCOPY_CONCURRENT_FILES environment variable to override)
2022/07/04 10:34:22 Max enumeration routines: 16 (Based on hard-coded default. Set AZCOPY_CONCURRENT_SCAN environment variable to override)
2022/07/04 10:34:22 Parallelize getting file properties (file.Stat): false (Based on AZCOPY_PARALLEL_STAT_FILES environment variable)
2022/07/04 10:34:22 Max open files when downloading: 2147483224 (auto-computed)

jis260 avatar Jul 04 '22 13:07 jis260

Thanks @jis260 for reaching out.

Are you able to repro the problem with a single file?

zezha-msft avatar Jul 05 '22 19:07 zezha-msft

I cannot replicate it with only a single file. Files that had the error in the logs can then be downloaded without any issue, so it is not a problem with the decompression itself.

I have tried reducing the concurrency settings for network operations and open files. This has helped, but the issue still occurs and causes the transfer to fail after several hours; previously it was failing in under an hour.

Restarting is difficult because performing Select All requires the process to check which files have already been downloaded and then wait for the overwrite confirmation.

I have not tested this without the decompression option enabled.

jis260 avatar Jul 05 '22 20:07 jis260

We have replicated the issue using an azcopy command with the --decompress flag. This was on Linux this time, whereas the above was on Windows.

Out of 16,451 files, 221 failed. There are 124,313 files to download in total, but we have stopped the process for now. The errors look like this:

  • 2022/07/06 12:16:59 ERR: [P#0-T#109] DOWNLOADFAILED: <url_removed> : 000 : decompression worker exited early. When Flushing file. X-Ms-Request-Id:

Again, picking a file at random and downloading it via the Azure web console works fine, so there is no issue with the deflate encoding itself.
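A quick way to make the same check outside the browser is to fetch a single blob with its SAS URL and inflate it locally. A minimal sketch, assuming the URL placeholder is replaced with a real single-blob SAS URL and that Content-Encoding: deflate means the stored bytes are a zlib-wrapped or raw deflate stream:

```python
# Manual check of one blob: fetch the stored (still-compressed) bytes via the
# SAS URL and inflate them locally. The URL is a placeholder, not a real value.
import urllib.request
import zlib

url = "<blob_sas_url>"  # hypothetical single-blob SAS URL

with urllib.request.urlopen(url) as resp:
    encoding = resp.headers.get("Content-Encoding")
    raw = resp.read()

if encoding == "deflate":
    try:
        data = zlib.decompress(raw)       # zlib-wrapped deflate
    except zlib.error:
        data = zlib.decompress(raw, -15)  # raw deflate (no zlib header)
else:
    data = raw

print(encoding, len(raw), "compressed ->", len(data), "inflated bytes")
```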

The defaults were used without setting any environment variables, and the azcopy command was:

./azcopy copy "<sas_url>" "<output_path>" --overwrite=false --check-md5 FailIfDifferent --from-to=BlobLocal --decompress --recursive --log-level=INFO

Tomorrow we will try without the --decompress flag to see if the error rate improves.

jis260 avatar Jul 06 '22 15:07 jis260

Without the --decompress flag, over 40,000 files have been downloaded without any errors. Obviously this is not ideal, because the files then need to be decompressed as a separate step, but it strongly suggests the issue lies in the deflate routine in azcopy and the workers used for decompression.

Update: all files transferred successfully without the --decompress flag
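For reference, a separate decompression pass of that kind can be sketched roughly as below. This is only an illustration, not the actual script used here; it assumes every downloaded file under a hypothetical downloaded/ folder is a deflate payload (zlib-wrapped or raw) and rewrites each file in place:

```python
# Post-download decompression pass (illustration only). Assumes every file
# under ROOT was stored with Content-Encoding: deflate and is either a
# zlib-wrapped or raw deflate stream; files are rewritten in place.
import pathlib
import zlib

ROOT = pathlib.Path("downloaded")  # hypothetical azcopy output folder

def inflate(data: bytes) -> bytes:
    try:
        return zlib.decompress(data)       # zlib-wrapped deflate
    except zlib.error:
        return zlib.decompress(data, -15)  # raw deflate (no zlib header)

for path in ROOT.rglob("*"):
    if path.is_file():
        path.write_bytes(inflate(path.read_bytes()))
```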

jis260 avatar Jul 07 '22 06:07 jis260

Thank you @jis260 for the detailed description.

It sounds like this problem only occurs when downloading with lots of concurrency. For such performance-related problems, I'll rely on @nakulkar-msft to take a look.

zezha-msft avatar Jul 09 '22 07:07 zezha-msft

Thank you for your reply, @zezha-msft.

Just to add that we did try setting both concurrency settings in Azure Storage Explorer to 4, and the issue still occurred when using the --decompress flag.

[Screenshot: Azure Storage Explorer settings with both concurrency values set to 4]

There is currently no way to complete the transfer using Azure Storage Explorer or azcopy with the --decompress flag. Failures occur on thousands of files at random, and restarting the job from Azure Storage Explorer is very difficult: performing a Select All of the folders is very slow and probably generates a lot of API calls to Azure Blob Storage.

At the moment the only reliable way to transfer the data is to run the copy without the --decompress flag; I have created a script that performs the decompression afterwards. However, this is not very convenient for our end users.

If there are settings we can use to ensure a reliable transfer at the expense of speed, I would prefer that. If lowering the concurrency further is the solution, please advise.
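For reference, the concurrency knobs named in the log head above (AZCOPY_CONCURRENCY_VALUE, AZCOPY_CONCURRENT_FILES, AZCOPY_CONCURRENT_SCAN) can be pinned to low values before invoking azcopy. A rough sketch of driving the same copy command from a wrapper script; the value of 4 for each knob is only a guess at "slow but reliable", not a documented recommendation:

```python
# Rough sketch: run the same azcopy command with the concurrency-related
# environment variables (named in the log head above) pinned to low values.
# The value 4 is a guess at "slow but reliable", not a documented limit.
import os
import subprocess

env = dict(os.environ,
           AZCOPY_CONCURRENCY_VALUE="4",  # max concurrent network operations
           AZCOPY_CONCURRENT_FILES="4",   # max concurrent transfer initiation routines
           AZCOPY_CONCURRENT_SCAN="4")    # max enumeration routines

subprocess.run(
    ["./azcopy", "copy", "<sas_url>", "<output_path>",
     "--overwrite=false", "--check-md5", "FailIfDifferent",
     "--from-to=BlobLocal", "--decompress", "--recursive", "--log-level=INFO"],
    env=env, check=True)
```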

jis260 avatar Jul 09 '22 07:07 jis260

@jis260 Can you forward the logs to my email? Most of the relevant logs are here, but I'd like to take a look at what led to those events.

nakulkar-msft avatar Jul 11 '22 04:07 nakulkar-msft

> @jis260 Can you forward the logs to my email? Most of the relevant logs are here, but I'd like to take a look at what led to those events.

Hopefully, my email containing the logs was received? I used the email address published on your profile.

jis260 avatar Jul 15 '22 14:07 jis260

@jis260 did you ever get in contact with @nakulkar-msft over email?

MRayermannMSFT avatar Aug 15 '22 17:08 MRayermannMSFT

> @jis260 did you ever get in contact with @nakulkar-msft over email?

I sent an email to the email address on @nakulkar-msft GitHub profile, but I never received a reply. That's why I asked for confirmation above on 15 July.

jis260 avatar Aug 15 '22 20:08 jis260

Hello @MRayermannMSFT, has there been any investigation into this issue? It means that Azure Storage Explorer is unusable for downloading files that have a Content-Encoding header set. The decompression needs to be performed as a separate step, which is inconvenient. It would be really helpful to know that this is being looked into.

jis260 avatar Sep 10 '22 15:09 jis260

@jis260 I've just pinged the AzCopy team to figure out what the heck is going on here. Thanks for the patience.

MRayermannMSFT avatar Sep 26 '22 17:09 MRayermannMSFT

@jis260 did they ever work things through with you?

MRayermannMSFT avatar Nov 14 '22 17:11 MRayermannMSFT

Thank you, @MRayermannMSFT. Unfortunately, I have not had any follow-up from anyone at Microsoft apart from your comments on here. All rather disappointing really.

jis260 avatar Nov 14 '22 18:11 jis260