
Uploading large files: files get "stuck" partially uploaded and never complete

kmelmon opened this issue 1 year ago • 33 comments

Preflight Checklist

Storage Explorer Version

1.35.0

Regression From

1.19.1

Architecture

i86

Storage Explorer Build Number

No response

Platform

Windows

OS Version

Windows 11

Bug Description

When uploading large files, the file gets "stuck" after partially uploading. Progress never goes beyond a certain point and the file never finishes uploading. In my most recent scenario the file was around 750 GB, but I have seen this sporadically in the past for files as small as 1 GB.

The issue seems likely related to the use of OAuth for authentication, which was added somewhat recently. If I roll back to version 1.19.1, it uses Active Directory authentication, and I don't see the issue.

The issue reproduces 100% of the time with the 750 GB file.

I've attached an AzCopy log file from the failure case 9978b3ad-4aa4-3944-7646-d12ca3c1bcdc.log

Steps to Reproduce

  1. Launch Azure Storage Explorer
  2. Connect to blob container storage using OAuth
  3. Upload a 750 GB file to blob storage container

Actual Experience

The upload partially completes then gets "stuck" and never makes any further progress.

Expected Experience

File should upload completely

Additional Context

No response

kmelmon avatar Sep 24 '24 17:09 kmelmon

Can you also share your Storage Explorer app logs? How long does it take for the hang to occur?

I can't seem to find them right now, but a few other users have opened issues related to this.

We have an ongoing discussion with the AzCopy developers about an authentication question. Essentially, AzCopy clients are expected to insert an authentication token into the system credential store, and AzCopy reads that token to do its work. For long-running operations, clients also need to refresh the token. As far as we can tell, Storage Explorer does this as expected. What remains to be determined is whether AzCopy is checking the credential store for refreshed tokens at regular intervals.
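
Conceptually, the client side of that contract looks something like the sketch below (TypeScript; writeToCredentialStore is a hypothetical placeholder for the system credential store, not a real API, and the 5-minute refresh margin is just an example):

import { DefaultAzureCredential } from "@azure/identity";

// Hypothetical stand-in for writing the token where AzCopy can read it (the system credential store).
declare function writeToCredentialStore(entry: { token: string; expiresOn: number }): Promise<void>;

const credential = new DefaultAzureCredential();
const scope = "https://storage.azure.com/.default";

async function keepTokenFresh(): Promise<void> {
  for (;;) {
    const accessToken = await credential.getToken(scope); // acquire (or silently refresh) a token
    if (!accessToken) throw new Error("Failed to acquire a storage token");
    await writeToCredentialStore({ token: accessToken.token, expiresOn: accessToken.expiresOnTimestamp });
    // Refresh a few minutes before expiry so a long-running AzCopy job never reads a stale token.
    const sleepMs = Math.max(60_000, accessToken.expiresOnTimestamp - Date.now() - 5 * 60_000);
    await new Promise((resolve) => setTimeout(resolve, sleepMs));
  }
}

The open question, as noted, is whether AzCopy re-reads that entry on its side at regular intervals.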

craxal avatar Sep 24 '24 21:09 craxal

Here are the app logs. I don't know exactly how long it takes for the hang to occur, but it seems like you can compute it from the AzCopy log. It seems to be under an hour from rough observation. AppLogs.zip

kmelmon avatar Sep 24 '24 23:09 kmelmon

Yeah, that sounds about right. Tokens are normally good for about an hour before they need to be refreshed. So, it definitely sounds like AzCopy is not getting the refreshed token.

One way to test this is by running AzCopy directly yourself. You can copy the command Storage Explorer uses from the link in the Activity Log. If AzCopy can't complete on its own, it isn't doing something right.
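
If it helps while testing, here's a quick way to see how long a given bearer token has left, assuming it's a standard JWT (the tokenExpiry helper is purely illustrative and not part of Storage Explorer or AzCopy):

// Decode the "exp" claim of a JWT access token to see when it expires.
function tokenExpiry(jwt: string): Date {
  const payload = JSON.parse(Buffer.from(jwt.split(".")[1], "base64url").toString("utf8"));
  return new Date(payload.exp * 1000); // "exp" is seconds since the Unix epoch
}

// Usage (with a real token): console.log(tokenExpiry(token).toISOString())

Tokens issued for Azure Storage are typically valid for roughly an hour, which lines up with when the hang is observed.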

craxal avatar Sep 25 '24 16:09 craxal

2024/09/25 19:05:28 PERF: primary performance constraint is Disk. States: X:  0, O:  0, M:  0, L:  0, R:  0, D:  1, W:  0, F:  0, B:  1, E:  0, T:  2, GRs:  5
2024/09/25 19:05:28 10.7 %, 0 Done, 0 Failed, 1 Pending, 0 Skipped, 1 Total, 2-sec Throughput (Mb/s): 134.1982 (Disk may be limiting speed)
2024/09/25 19:05:28 ==> REQUEST/RESPONSE (Try=1/306.3515ms, OpTime=469.0319ms) -- RESPONSE SUCCESSFULLY RECEIVED
   PUT https://gfxmltrainingstore1.blob.core.windows.net/cadmus/TestLargeUpload%2Ftest2%2Fdownsample_text_bilinear.hdf5?blockid=MDAwMDBsRMGle8NE2GLVLVZQMZkSMDAwMDAwMDAwMDA0OTUy&comp=block
   X-Ms-Request-Id: [5f0b258a-c01e-0111-037d-0f3e52000000]

2024/09/25 19:05:29 ==> REQUEST/RESPONSE (Try=1/290.7196ms, OpTime=564.8591ms) -- RESPONSE SUCCESSFULLY RECEIVED
   PUT https://gfxmltrainingstore1.blob.core.windows.net/cadmus/TestLargeUpload%2Ftest2%2Fdownsample_text_bilinear.hdf5?blockid=MDAwMDBsRMGle8NE2GLVLVZQMZkSMDAwMDAwMDAwMDA0OTUz&comp=block
   X-Ms-Request-Id: [5f0b2750-c01e-0111-157d-0f3e52000000]

2024/09/25 19:05:30 PERF: primary performance constraint is Disk. States: X:  0, O:  0, M:  0, L:  0, R:  0, D:  1, W:  1, F:  0, B:  0, E:  0, T:  2, GRs:  5
2024/09/25 19:05:30 10.7 %, 0 Done, 0 Failed, 1 Pending, 0 Skipped, 1 Total, 2-sec Throughput (Mb/s): 66.8998 (Disk may be limiting speed)
2024/09/25 19:05:31 ==> REQUEST/RESPONSE (Try=1/228.61ms, OpTime=525.8384ms) -- RESPONSE SUCCESSFULLY RECEIVED
   PUT https://gfxmltrainingstore1.blob.core.windows.net/cadmus/TestLargeUpload%2Ftest2%2Fdownsample_text_bilinear.hdf5?blockid=MDAwMDBsRMGle8NE2GLVLVZQMZkSMDAwMDAwMDAwMDA0OTU0&comp=block
   X-Ms-Request-Id: [5f0b28e6-c01e-0111-7f7d-0f3e52000000]

2024/09/25 19:05:32 PERF: primary performance constraint is Disk. States: X:  0, O:  0, M:  0, L:  0, R:  0, D:  1, W:  0, F:  0, B:  1, E:  0, T:  2, GRs:  5
2024/09/25 19:05:32 10.7 %, 0 Done, 0 Failed, 1 Pending, 0 Skipped, 1 Total, 2-sec Throughput (Mb/s): 129.4926 (Disk may be limiting speed)
2024/09/25 19:05:32 ==> REQUEST/RESPONSE (Try=1/434.9037ms, OpTime=693.251ms) -- RESPONSE SUCCESSFULLY RECEIVED
   PUT https://gfxmltrainingstore1.blob.core.windows.net/cadmus/TestLargeUpload%2Ftest2%2Fdownsample_text_bilinear.hdf5?blockid=MDAwMDBsRMGle8NE2GLVLVZQMZkSMDAwMDAwMDAwMDA0OTU1&comp=block
   X-Ms-Request-Id: [5f0b2b1a-c01e-0111-777d-0f3e52000000]

2024/09/25 19:05:33 ==> REQUEST/RESPONSE (Try=1/307.093ms, OpTime=551.3149ms) -- RESPONSE SUCCESSFULLY RECEIVED
   PUT https://gfxmltrainingstore1.blob.core.windows.net/cadmus/TestLargeUpload%2Ftest2%2Fdownsample_text_bilinear.hdf5?blockid=MDAwMDBsRMGle8NE2GLVLVZQMZkSMDAwMDAwMDAwMDA0OTU2&comp=block
   X-Ms-Request-Id: [5f0b2cbd-c01e-0111-797d-0f3e52000000]

2024/09/25 19:05:34 PERF: primary performance constraint is Disk. States: X:  0, O:  0, M:  0, L:  0, R:  0, D:  1, W:  0, F:  0, B:  1, E:  0, T:  2, GRs:  5
2024/09/25 19:05:34 10.7 %, 0 Done, 0 Failed, 1 Pending, 0 Skipped, 1 Total, 2-sec Throughput (Mb/s): 71.2911 (Disk may be limiting speed)
2024/09/25 19:05:35 ==> REQUEST/RESPONSE (Try=1/324.4066ms, OpTime=623.817ms) -- RESPONSE SUCCESSFULLY RECEIVED
   PUT https://gfxmltrainingstore1.blob.core.windows.net/cadmus/TestLargeUpload%2Ftest2%2Fdownsample_text_bilinear.hdf5?blockid=MDAwMDBsRMGle8NE2GLVLVZQMZkSMDAwMDAwMDAwMDA0OTU3&comp=block
   X-Ms-Request-Id: [5f0b2e6a-c01e-0111-707d-0f3e52000000]

2024/09/25 19:05:35 ==> REQUEST/RESPONSE (Try=1/0s, OpTime=62.7333ms) -- RESPONSE STATUS CODE ERROR
   PUT https://gfxmltrainingstore1.blob.core.windows.net/cadmus/TestLargeUpload%2Ftest2%2Fdownsample_text_bilinear.hdf5?blockid=MDAwMDBsRMGle8NE2GLVLVZQMZkSMDAwMDAwMDAwMDA0OTU4&comp=block
   Accept: application/xml
   Authorization: REDACTED
   Content-Length: 16777216
   Content-Type: application/octet-stream
   User-Agent: AzCopy/10.26.0 azsdk-go-azblob/v1.4.0 (go1.22.5; Windows_NT)
   X-Ms-Client-Request-Id: d4462e6d-5baf-4df7-6977-1171eaf0106e
   x-ms-version: 2023-08-03
   --------------------------------------------------------------------------------
   RESPONSE Status: 401 Server failed to authenticate the request. Please refer to the information in the www-authenticate header.
   Content-Length: 404
   Content-Type: application/xml
   Date: Wed, 25 Sep 2024 19:05:37 GMT
   Server: Microsoft-HTTPAPI/2.0
   Www-Authenticate: Bearer authorization_uri=https://login.microsoftonline.com/72f988bf-86f1-41af-91ab-2d7cd011db47/oauth2/authorize resource_id=https://storage.azure.com
   X-Ms-Error-Code: InvalidAuthenticationInfo
   X-Ms-Request-Id: 5f0b3046-c01e-0111-327d-0f3e52000000
Response Details: <Code>InvalidAuthenticationInfo</Code><Message>Server failed to authenticate the request. Please refer to the information in the www-authenticate header. </Message><AuthenticationErrorDetail>Lifetime validation failed. The token is expired.</AuthenticationErrorDetail>

2024/09/25 19:05:36 ==> REQUEST/RESPONSE (Try=1/66.8347ms, OpTime=66.8347ms) -- RESPONSE STATUS CODE ERROR
   PUT https://gfxmltrainingstore1.blob.core.windows.net/cadmus/TestLargeUpload%2Ftest2%2Fdownsample_text_bilinear.hdf5?blockid=MDAwMDBsRMGle8NE2GLVLVZQMZkSMDAwMDAwMDAwMDA0OTU4&comp=block
   Accept: application/xml
   Authorization: REDACTED
   Content-Length: 16777216
   Content-Type: application/octet-stream
   User-Agent: AzCopy/10.26.0 azsdk-go-azblob/v1.4.0 (go1.22.5; Windows_NT)
   X-Ms-Client-Request-Id: d4462e6d-5baf-4df7-6977-1171eaf0106e
   x-ms-version: 2023-08-03
   --------------------------------------------------------------------------------
   RESPONSE Status: 401 Server failed to authenticate the request. Please refer to the information in the www-authenticate header.
   Content-Length: 404
   Content-Type: application/xml
   Date: Wed, 25 Sep 2024 19:05:37 GMT
   Server: Microsoft-HTTPAPI/2.0
   Www-Authenticate: Bearer authorization_uri=https://login.microsoftonline.com/72f988bf-86f1-41af-91ab-2d7cd011db47/oauth2/authorize resource_id=https://storage.azure.com
   X-Ms-Error-Code: InvalidAuthenticationInfo
   X-Ms-Request-Id: 0ebdb9b9-301e-0047-6a7d-0f31b0000000
Response Details: <Code>InvalidAuthenticationInfo</Code><Message>Server failed to authenticate the request. Please refer to the information in the www-authenticate header. </Message><AuthenticationErrorDetail>Lifetime validation failed. The token is expired.</AuthenticationErrorDetail>

2024/09/25 19:05:36 INFO: [P#0-T#0] Early close of chunk in singleChunkReader with context still active
2024/09/25 19:05:36 ERR: [P#0-T#0] UPLOADFAILED: \\?\Z:\downsample_text_bilinear.hdf5 : 401 : 401 Server failed to authenticate the request. Please refer to the information in the www-authenticate header.. When Staging block. X-Ms-Request-Id: 0ebdb9b9-301e-0047-6a7d-0f31b0000000

   Dst: https://gfxmltrainingstore1.blob.core.windows.net/cadmus/TestLargeUpload/test2/downsample_text_bilinear.hdf5
2024/09/25 19:05:36 PERF: primary performance constraint is Disk. States: X:  0, O:  0, M:  0, L:  0, R:  0, D:  1, W:  0, F:  0, B:  0, E:  0, T:  1, GRs:  5
2024/09/25 19:05:36 10.7 %, 0 Done, 0 Failed, 1 Pending, 0 Skipped, 1 Total, 2-sec Throughput (Mb/s): 130.1875 (Disk may be limiting speed)
2024/09/25 19:05:37 ==> REQUEST/RESPONSE (Try=1/10.7782ms, OpTime=76.9684ms) -- RESPONSE STATUS CODE ERROR
   GET https://gfxmltrainingstore1.blob.core.windows.net/cadmus/TestLargeUpload%2Ftest2%2Fdownsample_text_bilinear.hdf5?blocklisttype=all&comp=blocklist
   Accept: application/xml
   Authorization: REDACTED
   User-Agent: AzCopy/10.26.0 azsdk-go-azblob/v1.4.0 (go1.22.5; Windows_NT)
   X-Ms-Client-Request-Id: 795c7943-17e5-4c71-62f9-e46512d1b621
   x-ms-version: 2023-08-03
   --------------------------------------------------------------------------------
   RESPONSE Status: 401 Server failed to authenticate the request. Please refer to the information in the www-authenticate header.
   Content-Length: 404
   Content-Type: application/xml
   Date: Wed, 25 Sep 2024 19:05:39 GMT
   Server: Microsoft-HTTPAPI/2.0
   Vary: Origin
   Www-Authenticate: Bearer authorization_uri=https://login.microsoftonline.com/72f988bf-86f1-41af-91ab-2d7cd011db47/oauth2/authorize resource_id=https://storage.azure.com
   X-Ms-Error-Code: InvalidAuthenticationInfo
   X-Ms-Request-Id: dc134f59-401e-0000-597d-0f5aeb000000
Response Details: <Code>InvalidAuthenticationInfo</Code><Message>Server failed to authenticate the request. Please refer to the information in the www-authenticate header. </Message><AuthenticationErrorDetail>Lifetime validation failed. The token is expired.</AuthenticationErrorDetail>

2024/09/25 19:05:37 ==> REQUEST/RESPONSE (Try=1/502.9µs, OpTime=502.9µs) -- RESPONSE STATUS CODE ERROR
   GET https://gfxmltrainingstore1.blob.core.windows.net/cadmus/TestLargeUpload%2Ftest2%2Fdownsample_text_bilinear.hdf5?blocklisttype=all&comp=blocklist
   Accept: application/xml
   Authorization: REDACTED
   User-Agent: AzCopy/10.26.0 azsdk-go-azblob/v1.4.0 (go1.22.5; Windows_NT)
   X-Ms-Client-Request-Id: 795c7943-17e5-4c71-62f9-e46512d1b621
   x-ms-version: 2023-08-03
   --------------------------------------------------------------------------------
   RESPONSE Status: 401 Server failed to authenticate the request. Please refer to the information in the www-authenticate header.
   Content-Length: 404
   Content-Type: application/xml
   Date: Wed, 25 Sep 2024 19:05:39 GMT
   Server: Microsoft-HTTPAPI/2.0
   Vary: Origin
   Www-Authenticate: Bearer authorization_uri=https://login.microsoftonline.com/72f988bf-86f1-41af-91ab-2d7cd011db47/oauth2/authorize resource_id=https://storage.azure.com
   X-Ms-Error-Code: InvalidAuthenticationInfo
   X-Ms-Request-Id: dc134f5f-401e-0000-5e7d-0f5aeb000000
Response Details: <Code>InvalidAuthenticationInfo</Code><Message>Server failed to authenticate the request. Please refer to the information in the www-authenticate header. </Message><AuthenticationErrorDetail>Lifetime validation failed. The token is expired.</AuthenticationErrorDetail>

2024/09/25 19:05:37 JobID=a5c1446c-c37b-d844-62d5-2d5650319912, Part#=0, TransfersDone=1 of 1
2024/09/25 19:05:37 all parts of entire Job a5c1446c-c37b-d844-62d5-2d5650319912 successfully completed, cancelled or paused
2024/09/25 19:05:37 is part of Job which 1 total number of parts done 
2024/09/25 19:05:38 PERF: primary performance constraint is Unknown. States: X:  0, O:  0, M:  0, L:  0, R:  0, D:  0, W:  0, F:  0, B:  0, E:  0, T:  0, GRs:  5
2024/09/25 19:05:38 0.0 %, 0 Done, 1 Failed, 0 Pending, 0 Skipped, 1 Total, 
2024/09/25 19:05:38 

Diagnostic stats:
IOPS: 1
End-to-end ms per request: 669
Network Errors: 0.00%
Server Busy: 0.00%


Job a5c1446c-c37b-d844-62d5-2d5650319912 summary
Elapsed Time (Minutes): 71.5707
Number of File Transfers: 1
Number of Folder Property Transfers: 0
Number of Symlink Transfers: 0
Total Number of Transfers: 1
Number of File Transfers Completed: 0
Number of Folder Transfers Completed: 0
Number of File Transfers Failed: 1
Number of Folder Transfers Failed: 0
Number of File Transfers Skipped: 0
Number of Folder Transfers Skipped: 0
Total Number of Bytes Transferred: 0
Final Job Status: Failed

2024/09/25 19:05:38 Closing Log

kmelmon avatar Sep 25 '24 22:09 kmelmon

Yep, error message says the token is expired. So I highly suspect that's why the job is hanging. I'm pretty sure something in AzCopy isn't checking for a refreshed token. I'll keep reaching out to the AzCopy team to get something moving.

craxal avatar Sep 26 '24 18:09 craxal

Any update on this? Our team would really benefit from a fix as we use Azure Storage for all our ML training data.

kmelmon avatar Oct 22 '24 18:10 kmelmon

We have no significant updates. The AzCopy team is aware of the issue and is currently looking into it.

craxal avatar Oct 22 '24 19:10 craxal

My team is facing the same issue with large files in a major customer project. Can you point me to the AzCopy issue related to this bug?

To some extent we were able to work around this OAuth expiration issue by using SAS tokens for authentication. Unfortunately, Storage Explorer appears to default to SAS token authentication only if the user has the storage account level Microsoft.Storage/storageAccounts/listkeys/action permission, which is something we can't grant to our end users. A user can create a SAS token manually without that storage account level permission and use it to complete large transfers successfully, but walking all end users through that manual method is not feasible. Maybe Storage Explorer could default to SAS tokens even without the listkeys permission?

jarkko-siitonen avatar Nov 12 '24 08:11 jarkko-siitonen

We've been using version 1.19.1 as a workaround for now: https://github.com/microsoft/AzureStorageExplorer/releases/tag/V1.19.1

kmelmon avatar Nov 12 '24 17:11 kmelmon

@jarkko-siitonen Keep in mind there are two different kinds of SAS: key-based SAS and user-delegated SAS. Storage Explorer supports both. So you do not need to give out List Keys permissions. If the user has permissions to generate user-delegated SAS, Storage Explorer should be able to use that method instead.
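
For reference, a user-delegated SAS needs only an Entra ID sign-in and no account keys; a rough sketch with @azure/storage-blob (account, container, and blob names are placeholders, and the one-hour expiry is just an example):

import { DefaultAzureCredential } from "@azure/identity";
import { BlobServiceClient, BlobSASPermissions, generateBlobSASQueryParameters } from "@azure/storage-blob";

const accountName = "<storage-account>";   // placeholder
const containerName = "<container>";       // placeholder
const blobName = "<blob>";                 // placeholder

async function createUserDelegationSasUrl(): Promise<string> {
  const serviceClient = new BlobServiceClient(
    `https://${accountName}.blob.core.windows.net`,
    new DefaultAzureCredential() // Entra ID (OAuth) sign-in; no account key or listkeys permission needed
  );

  const startsOn = new Date();
  const expiresOn = new Date(startsOn.getTime() + 60 * 60 * 1000); // one hour, for example
  // The user delegation key is derived from the caller's Entra ID credential.
  const delegationKey = await serviceClient.getUserDelegationKey(startsOn, expiresOn);

  const sas = generateBlobSASQueryParameters(
    { containerName, blobName, permissions: BlobSASPermissions.parse("racw"), startsOn, expiresOn },
    delegationKey,
    accountName
  );
  return `https://${accountName}.blob.core.windows.net/${containerName}/${blobName}?${sas.toString()}`;
}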

@JasonYeMSFT Correct me if I'm wrong?

craxal avatar Nov 12 '24 18:11 craxal

@craxal Yes, but how do you make Storage Explorer actually use SAS tokens without a) generating the SAS token manually and connecting Storage Explorer to a shared access signature URL, or b) granting the listkeys action?

My testing seems to indicate Storage Explorer passes the AZCOPY_CRED_TYPE = "OAuthToken" parameter to AzCopy unless one of those two conditions is true. I would be delighted if there were a third way to make Storage Explorer use SAS.

jarkko-siitonen avatar Nov 12 '24 18:11 jarkko-siitonen

@jarkko-siitonen What Craig said is right. Storage Explorer can generate a user delegation SAS, but it can only do so after you are signed in, because it needs your Azure AD credential. Storage Explorer is implemented to prefer user delegation SAS because it is more flexible. If your users can already sign in, get an OAuth token, and use it to do data transfers, there is a good chance you can take advantage of user delegation SAS to achieve easier access control. If you can tell us more about your scenario, we may be able to give better recommendations, for example:

  • How do your users connect to their storage in Storage Explorer
  • How do you distribute permissions to your users
  • What are the operations your users need
  • Any additional requirements

JasonYeMSFT avatar Nov 13 '24 18:11 JasonYeMSFT

@JasonYeMSFT Short description: as part of a larger project, we are replacing a legacy sftp server with Azure storage account(s) to enforce Entra ID authentication with MFA and geoblocking. The service is used to transfer files with sizes in the hundreds of GB, so transfer times can be quite long (longer than the OAuth token lifetime).

How do your users connect to their storage in Storage Explorer:

  • Users access the storage using Azure Storage Explorer and/or the Azure Portal
    • Storage Explorer is more complex than your average sftp client, but not too much so
    • Using Storage Explorer version 1.19.1 might work, but launching a new service that requires a 2021 version of the client software sounds scary
  • The users connect to storage accounts using Entra ID authentication
  • There are both member user accounts and guest user accounts

What are the operations your users need

  • There are some users with upload and download permission (read+write+delete files in a blob container)
  • Some users have download permission only (read files)
  • There are several blob containers
    • One user might have write access to several containers
    • Another user might have only read permission to one or two of them

How do you distribute permissions to your users

  • Users are created (or invited in case of guest account) to Entra ID and assigned to security groups
  • Each storage accounts has two groups, one with read+write permission, one with read permission only
  • Read+write users currently get the Storage Account Contributor role
    • For a user with the Contributor role, Storage Explorer uses account key-based SAS for uploads and downloads
    • This solved the upload problems but forced us to place each blob container in a separate storage account
  • Read-only users are assigned the Storage Blob Data Reader role (or a custom role if that helps)
    • For these users, Storage Explorer downloads use OAuth authentication, and transfers fail after 60-90 minutes
    • Attempts to solve this with a custom role did not work out: as soon as Storage Explorer started using account key-based SAS, the users also got undesired write permission
    • The read-only users can generate a user delegation SAS manually using the Storage Explorer tool
    • If a user connects Storage Explorer to a self-generated user delegation SAS URL, the transfer succeeds
      • Unfortunately, this adds a lot of extra complexity for the end user
      • Users with an sftp mindset try to download directly, without generating and mounting a user delegation SAS first, and the transfer fails

Is the AzCopy team just aware of the problem or is fixing this bug on the roadmap?

jarkko-siitonen avatar Nov 14 '24 07:11 jarkko-siitonen

@jarkko-siitonen They are actively working on a fix for the issue. They've given us an estimated release date in January.

@JasonYeMSFT

For user with contributor role Storage explorer uses account key-based SAS in uploads and downloads

To my knowledge, assuming they also have permission to create a user-delegated SAS, this should not be happening. Any idea why it would?

craxal avatar Nov 14 '24 18:11 craxal

@jarkko-siitonen As Craig has said, AzCopy is working on a fix and once it's done, direct OAuth access token based transfer should no longer have any issues. However, you might be able to stop relying on account key access today in Storage Explorer.

For your read-only case, I think you should keep giving your users the Storage Blob Data Reader role. For your read-write case, have you considered giving your users the Storage Blob Data Contributor role instead and turning off storage account key access for your storage accounts? For users with the Storage Blob Data Contributor role, Storage Explorer can generate a user delegation SAS, which can have a lifetime much longer than a regular OAuth access token (up to 7 days). It does have a maximum lifetime, though, so it would become an issue if your data transfers take more time than that to complete. You can try setting this up for a new test storage account and uploading a blob to see how it works. We would like to help you stop relying on storage account key access for your users. If you run into any issues, please let us know and we can help.
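
If you want to experiment with turning key access off on a test account programmatically, a sketch with the management SDK could look like this (subscription, resource group, and account names are placeholders; the portal or Azure CLI works just as well):

import { DefaultAzureCredential } from "@azure/identity";
import { StorageManagementClient } from "@azure/arm-storage";

const subscriptionId = "<subscription-id>";   // placeholder
const resourceGroup = "<resource-group>";     // placeholder
const accountName = "<storage-account>";      // placeholder

async function disableSharedKeyAccess(): Promise<void> {
  const client = new StorageManagementClient(new DefaultAzureCredential(), subscriptionId);
  // With allowSharedKeyAccess set to false, only Entra ID (OAuth / user delegation SAS) requests
  // are accepted; account key and key-based SAS requests are rejected.
  await client.storageAccounts.update(resourceGroup, accountName, { allowSharedKeyAccess: false });
}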

JasonYeMSFT avatar Nov 14 '24 19:11 JasonYeMSFT

Thank you @craxal and @JasonYeMSFT for the indication of a bug fix release date. It's a lot easier to decide what to do next when there is a clear understanding that the bug is actively being worked on!

Seven days maximum duration for OAuth file transfers should not be an issue for us.

And yes, we would indeed prefer not to give users storage account key access. However, a new test did not yield any better results for us than before. The guest user has the Storage Blob Data Contributor role at the container level and the Reader role at the storage account level, with an additional custom role containing the "Generate a user delegation key" permission at the storage account level. The user can generate a user delegation SAS manually, but the Storage Explorer upload, download, and delete functions all use OAuth authentication, which would be just fine for us if this token renewal bug didn't exist.

If this behavior is not the expected/desired outcome with the Storage Blob Data Contributor role, maybe I should create a new issue for the Storage Explorer team to inspect?

jarkko-siitonen avatar Nov 15 '24 10:11 jarkko-siitonen

I read the code and found that our upload code doesn't use user delegation SAS, because uploads don't have the limitation that service-to-service copy has, where the source and destination cannot use different OAuth credentials. I've opened https://github.com/microsoft/AzureStorageExplorer/issues/8299 to track adding user delegation SAS to our upload/download code.

JasonYeMSFT avatar Nov 15 '24 17:11 JasonYeMSFT

I see this is now planned for 1.38; I'm assuming the fix is not in 1.37 as of this time?

SammMoney avatar Jan 17 '25 14:01 SammMoney

@SammMoney Correct. As mentioned above, a fix for this depends on an update to AzCopy that we are still waiting for. The earliest we can expect that update is the end of this month, meaning we are expecting a fix to become available for our next release: 1.38.0.

craxal avatar Jan 18 '25 00:01 craxal

@richardMSFT We've now integrated AzCopy 10.28.0. This should have the intended fixes. Let's run some tests to be sure things are working as expected.

craxal avatar Jan 29 '25 23:01 craxal

@kmelmon @SammMoney @jarkko-siitonen

  • Have any of you attempted any long running uploads with 1.37.0? Do you still encounter the issue?
  • In your upload scenarios, do you have permission to generate user delegation keys?

We've been working hard to verify this issue indeed gets resolved. While doing so, we've uncovered some interesting details.

Storage Explorer always tries to generate a SAS for the destination in an upload job, because a SAS makes it easier to maintain an authorized connection for long-running uploads. If Storage Explorer is unable to generate a SAS, it uses OAuth credentials instead. We discovered a bug in 1.36.2 where transfers weren't generating a user delegation SAS where possible (#8299). This was fixed in 1.37.0.
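
In other words, the selection logic amounts to roughly the following (hypothetical names and helpers, not Storage Explorer's actual code):

type TransferCredential =
  | { kind: "sas"; url: string }
  | { kind: "oauth"; token: string };

interface BlobDestination { accountName: string; containerName: string; blobName: string; }

// Hypothetical helpers standing in for Storage Explorer internals.
declare function generateUserDelegationSasUrl(dest: BlobDestination): Promise<string>;
declare function getAccessToken(scope: string): Promise<string>;

async function resolveUploadCredential(dest: BlobDestination): Promise<TransferCredential> {
  try {
    // Preferred: a SAS keeps a long-running upload authorized without mid-transfer token refreshes.
    return { kind: "sas", url: await generateUserDelegationSasUrl(dest) };
  } catch {
    // If the signed-in user cannot generate a user delegation SAS, fall back to OAuth credentials.
    return { kind: "oauth", token: await getAccessToken("https://storage.azure.com/.default") };
  }
}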

Since then, while validating fixes in AzCopy, we found it takes considerable effort to grant access to blob data while blocking the ability to create a user delegation SAS (and thus force an upload via OAuth). All built-in Azure RBAC roles that provide data access also allow user delegation key generation, so we had to use a custom role to block that permission.

In light of these details, we're wondering what scenarios you are operating under, so we can get a better sense of the nature of the issue.

craxal avatar Mar 22 '25 00:03 craxal

Thank you for asking @craxal

We have been using Storage Explorer 1.37.0 successfully for large uploads and downloads.

And yes, we are still waiting for the next version with better OAuth support to test scenarios like revoking user permissions during a transfer. We are also trying to prevent users from sharing user delegation SAS with other people...

jarkko-siitonen avatar Mar 24 '25 07:03 jarkko-siitonen

Would you be willing to try a prerelease build? We want to be sure that the OAuth-related issues in Storage Explorer/AzCopy have been properly fixed and that your intended workflow is unblocked.

Here is the download.

craxal avatar Mar 24 '25 19:03 craxal

Yes, we are interested to test our workflows with prerelease build.

The download link currently gives AuthorizationFailure.

jarkko-siitonen avatar Mar 25 '25 07:03 jarkko-siitonen

Hmm... Try this download instead.

craxal avatar Mar 25 '25 17:03 craxal

Sorry, just getting back to this (was on vacation, then got busy). I'd like to try out the latest, but the links above don't work. Is the fix checked in yet? If not, I'd like to try a prerelease version. Thanks!

kmelmon avatar Apr 23 '25 18:04 kmelmon

Looks like I missed the 2nd download link too - sorry for that. But both links do return a page with AuthenticationFailed anyhow.

The milestone was switched from 1.38.0 to 1.39.0, so apparently it doesn't make much sense to re-test the issue with version 1.38.0, which is out now?

jarkko-siitonen avatar Apr 24 '25 12:04 jarkko-siitonen

Hi @kmelmon and @jarkko-siitonen. The milestone was switched from 1.38.0 to 1.39.0 because, in the last few weeks of our testing, we found that while AzCopy 10.28.0 should now properly obtain refreshed tokens, there is still an issue where long-running uploads hang after an hour when using OAuth credentials. We are currently working with the AzCopy developers to investigate why this is happening.

You can try the 1.38.0 release to see if it works for your use cases or if the issue still reproduces. If it does still reproduce, please let us know, because confirming that it still persists for your workflow is useful to us.

richardMSFT avatar Apr 24 '25 18:04 richardMSFT

Uploading large files with OAuth is my scenario so I think I'd prefer to wait until there's a fix for that before retesting. Please let me know when you have a fix for that.

kmelmon avatar Apr 25 '25 18:04 kmelmon

@kmelmon With AzCopy's help, we've managed to resolve the hanging transfers. Resolving that issue, however, seems to have uncovered other issues related to orchestration and concurrency. The silver lining here is the job will now at least fail gracefully, so you won't waste time waiting for a job that's actually failed to complete.

The concurrency issues can be worked around by adjusting the Network Concurrency setting in Storage Explorer. We've run extensive testing and found that "Adjust dynamically" seems to work best, but "Limit to 1-4" also works well, depending on system resource availability (lower numbers are better for smaller bandwidths).
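
For anyone scripting AzCopy directly, the corresponding knob is the AZCOPY_CONCURRENCY_VALUE environment variable; a sketch below (file paths and account names are placeholders, and the exact mapping to Storage Explorer's setting is an assumption):

import { spawn } from "node:child_process";

// Assumption: Storage Explorer's "Network Concurrency" setting corresponds to AzCopy's
// AZCOPY_CONCURRENCY_VALUE. "AUTO" lets AzCopy tune concurrency dynamically; a small
// number such as "4" caps it, which helps on constrained bandwidth.
const child = spawn(
  "azcopy",
  ["copy", "C:\\data\\bigfile.hdf5", "https://<account>.blob.core.windows.net/<container>/bigfile.hdf5"],
  { env: { ...process.env, AZCOPY_CONCURRENCY_VALUE: "AUTO" }, stdio: "inherit" }
);
child.on("exit", (code) => console.log(`azcopy exited with code ${code}`));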

The AzCopy team is aware of the concurrency issues, and we are working closely with them to get them resolved.

craxal avatar Jun 19 '25 16:06 craxal