
bcftools concat retry mechanisms?

Open graphenn opened this issue 1 year ago • 3 comments

Could bcftools concat get a retry mechanism for reading files directly from S3? I am using bcftools concat to concatenate a large number of VCF files straight from S3. Because of the data volume, any network fluctuation during the run tends to abort the connection partway through with a "software caused connection abort" error, so the concatenation rarely completes. Would it be possible to retry failed reads, for example up to three times? That would greatly reduce these failures.

I cannot reproduce the issue at the moment, presumably because the network fluctuations have subsided, and unfortunately I didn't save the log. I recall that an "[E::hts_hopen]" error appeared before the "software caused connection abort" error.

graphenn, Aug 06 '24 10:08

I can imagine this. It is probably best done at the htslib level, and maybe controlled by an environment variable.

pd3, Aug 28 '24 09:08

Just to link it to a related issue in htslib: https://github.com/samtools/htslib/issues/1424

pd3, Sep 12 '24 10:09

I reproduced it. The log is: [E::hts_open_format] Failed to open file "s3://xxxxx" : Software caused connection abort

graphenn, Jan 03 '25 14:01
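
Until retries exist in htslib itself, the "retry three times" idea from the original request can be approximated from the outside by re-running the whole command when it exits with an error. The sketch below is only an illustration of that workaround, not anything bcftools or htslib provides: the helper name `run_with_retries` and its parameters are hypothetical, and it restarts the concatenation from scratch rather than resuming the interrupted read.

```python
import subprocess
import sys
import time


def run_with_retries(cmd, attempts=3, delay_seconds=5.0):
    """Run cmd (a list of arguments), re-running it after a non-zero exit.

    Hypothetical helper: it retries the entire command, so a failed
    bcftools concat starts over from the first input file.
    Returns the exit code of the last attempt.
    """
    returncode = 1
    for attempt in range(1, attempts + 1):
        returncode = subprocess.run(cmd).returncode
        if returncode == 0:
            return 0
        print(
            f"attempt {attempt}/{attempts} failed with exit code {returncode}",
            file=sys.stderr,
        )
        # Pause before the next attempt, but not after the final one.
        if attempt < attempts:
            time.sleep(delay_seconds)
    return returncode
```

It could be called as `run_with_retries(["bcftools", "concat", "-Oz", "-o", "merged.vcf.gz", ...])` with the S3 inputs appended to the argument list. Because each attempt rewrites the output from the beginning, this only helps when a full run has a reasonable chance of finishing; a retry inside htslib, as suggested above, would avoid repeating work that already succeeded.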