use md5sum to check download integrity with vdb-validate as fallback
PR checklist
As explained in https://github.com/ncbi/sra-tools/issues/896, vdb-validate does not detect file corruption if the prefetched files do not contain MD5 checksums. I have repeatedly ended up with corrupt downloads when using the option force_sratools_download (now --download_method sratools). Worse, extracting the files with fasterq-dump does not always produce an error even if the file is corrupt. It is even conceivable that the extracted FastQ file looks perfectly intact, with only some bases or quality values changed. As such, the error may go completely unnoticed.
This PR fixes this by (1) fetching the md5sum from the SRA Data Locator API and (2) performing a manual md5sum check. The current method, vdb-validate, is now used only as a fallback when the md5sum cannot be obtained from the API.
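
The check-then-fallback logic described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function names `md5_of_file`, `run_vdb_validate` and `check_download` are hypothetical, and the expected MD5 is assumed to have already been fetched from the SRA Data Locator API (or to be `None` if the API did not return one).

```python
import hashlib
import subprocess
from pathlib import Path
from typing import Callable, Optional


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def run_vdb_validate(path: Path) -> bool:
    """Fallback: run vdb-validate on the file and report whether it exited with 0."""
    result = subprocess.run(["vdb-validate", str(path)], check=False)
    return result.returncode == 0


def check_download(
    path: Path,
    expected_md5: Optional[str],
    fallback: Callable[[Path], bool] = run_vdb_validate,
) -> bool:
    """Hypothetical helper: prefer the MD5 comparison, else use the fallback."""
    if expected_md5:
        # An MD5 is available, so compare it against the downloaded file.
        return md5_of_file(path) == expected_md5.strip().lower()
    # No MD5 could be obtained, so fall back to the previous validation method.
    return fallback(path)
```

Passing the fallback as a parameter keeps the sketch usable on machines where sra-tools is not installed; in a real pipeline the default `run_vdb_validate` would apply.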