datalad `annex.retry` approach still suboptimal

I am on record as not being a fan of the approach of making git-annex retry any get unconditionally three times (#5808). I just had a case of a misspecified credential where datalad took 4min until it told me that it really could not get any of the 48 files. For the entire time I had a nice progress bar, with lots of progress being made. After four minutes I got the report that every download request had failed, and that it had tried each URL for each file (there were also 3 URLs per file) before telling me anything -- after 432 HTTP 403 responses (48 files x 3 URLs x 3 attempts) -- which may well earn one a blacklist entry.

I think this business must stop. If we want to support retries, then we cannot simply let git-annex hammer a service countless times until server or client gives up. It would make a lot more sense to let it try once (which is still once per recorded location, so possibly many times), then report that X failed downloads will be retried (if enabled), and only then give it another try.
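To make the proposed flow concrete, here is a minimal sketch. It is not datalad or git-annex code: `get_with_deferred_retry`, `fetch`, and `notify` are hypothetical names, and `fetch` stands in for whatever performs a single download attempt (covering all recorded locations of a file) and returns whether it succeeded.

```python
from typing import Callable, Iterable, List


def get_with_deferred_retry(
    files: Iterable[str],
    fetch: Callable[[str], bool],          # hypothetical: one attempt per file
    retries: int = 1,                      # extra passes over failed files only
    notify: Callable[[str], None] = print,
) -> List[str]:
    """Try every file once, report failures, then retry only the failures.

    Returns the files that still failed after the last pass.
    """
    pending = list(files)
    for attempt in range(retries + 1):
        # One full pass: each pending file gets exactly one attempt.
        failed = [f for f in pending if not fetch(f)]
        if not failed:
            return []
        if attempt < retries:
            # Tell the user before hitting the service again.
            notify(f"{len(failed)} failed downloads will be retried")
        pending = failed
    return pending
```

The point of the design is that the user learns about failures after the first pass, instead of after every file has silently been attempted several times in a row.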

Feb 25 '22 15:02 mih

In case of a failed authentication, it also leads to the user being asked three times if they want to update their credentials:

Please enter password for encrypted keyring: 
Access to s3://NDAR_Central_3/submission_32142/2001_01_MR/unprocessed/rfMRI_REST1_AP/2001_01_MR_rfMRI_REST1_AP_InitialFrames.nii.gz has failed.
Do you want to enter other credentials in case they were updated? (choices: yes, no): no

Do you want to enter other credentials in case they were updated? (choices: yes, no): no

Do you want to enter other credentials in case they were updated? (choices: yes, no): no

get(ok): 2001_01_MR/unprocessed/rfMRI_REST1_AP/2001_01_MR_rfMRI_REST1_AP_InitialFrames.nii.gz (file)
action summary:
  get (notneeded: 1, ok: 1)

Mar 16 '22 13:03 adswa

> I just had a case of a misspecified credential where datalad took 4min until it told me that it really could not get any of the 48 files.

IIRC at some point in the past the progress bars included a report on the # of errored-out items. Is that no longer the case?

annex.retry would indeed multiply the time/attempts correspondingly, but if it is due to a 403, I think it should not retry at all. Can you confirm that the 403 is directly received by git-annex in your case? Then we should file a TODO for Joey in git-annex.

If it is due to our or some other annex special remote, with annex trying it 3 times: I forget the details of the protocol, but maybe there should be some explicit response that also provides an error code, which would let annex know not to retry at all (as in the case of a 403).
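As a rough illustration of the "do not retry on 403" idea, a client-side check could look like the sketch below. The status sets are my assumption of a reasonable split between permanent and transient failures, and `should_retry` is a hypothetical helper, not an existing git-annex or datalad function.

```python
# Assumed split: these client errors will not go away by asking again.
NON_RETRYABLE = frozenset({400, 401, 403, 404, 410})
# Assumed transient cases among the 4xx codes (timeout, rate limiting).
TRANSIENT_4XX = frozenset({408, 429})


def should_retry(status: int) -> bool:
    """Decide whether an HTTP status is worth another attempt."""
    if status in NON_RETRYABLE:
        return False
    # Retry timeouts, rate limiting, and server-side (5xx) errors only.
    return status in TRANSIENT_4XX or 500 <= status < 600
```

With such a check, the 403 from a misspecified credential would be reported after the first attempt rather than being repeated.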

Mar 16 '22 15:03 yarikoptic

I am closing this report of mine. It is unlikely that it will be addressed. The behavior with datalad-next is different (no retry), and the download() command from https://github.com/datalad/datalad-next/pull/124 has a different 403 handling.

Nov 18 '22 15:11 mih