SRA Toolkit update for prefetch and fastq-dump

Open wesleyemason opened this issue 1 year ago • 1 comments

Please update the SRA Toolkit binaries (prefetch and fastq-dump) in the dep folder to the latest version, 3.0.2. The bundled 2.9.6 binaries can no longer download data from NCBI because SSL certificate verification fails. For example:

$ ./Finder/dep/fastq-dump --version

./Finder/dep/fastq-dump : 2.9.6

$ ./Finder/dep/fastq-dump --stdout -X 2 SRR390728
2023-01-23T18:48:27 fastq-dump.2.9.6 sys: connection failed while opening file within cryptographic module - mbedtls_ssl_handshake returned -9984 ( X509 - Certificate verification failed, e.g. CRL, CA or signature check failed )
2023-01-23T18:48:27 fastq-dump.2.9.6 sys: mbedtls_ssl_get_verify_result returned 0x4008 ( !! The certificate is not correctly signed by the trusted CA !! The certificate is signed with an unacceptable hash. )
2023-01-23T18:48:27 fastq-dump.2.9.6 sys: connection failed while opening file within cryptographic module - ktls_handshake failed while accessing '130.14.29.113' from '10.123.131.29'
2023-01-23T18:48:27 fastq-dump.2.9.6 sys: connection failed while opening file within cryptographic module - Failed to create TLS stream for 'trace.ncbi.nlm.nih.gov' (130.14.29.113) from '10.123.131.29'
2023-01-23T18:48:27 fastq-dump.2.9.6 err: item not found while constructing within virtual database module - the path 'SRR390728' cannot be opened as database or table

$ ./fastq-dump --version

./fastq-dump : 3.0.2

$ ./fastq-dump --stdout -X 2 SRR390728
Read 2 spots for SRR390728
Written 2 spots for SRR390728
@SRR390728.1 1 length=72
CATTCTTCACGTAGTTCTCGAGCCTTGGTTTTCAGCGATGGAGAATGACTTTGACAAGCTGAGAGAAGNTNC
+SRR390728.1 1 length=72
;;;;;;;;;;;;;;;;;;;;;;;;;;;9;;665142;;;;;;;;;;;;;;;;;;;;;;;;;;;;;96&&&&(
@SRR390728.2 2 length=72
AAGTAGGTCTCGTCTGTGTTTTCTACGAGCTTGTGTTCCAGCTGACCCACTCCCTGGGTGGGGGGACTGGGT
+SRR390728.2 2 length=72
;;;;;;;;;;;;;;;;;4;;;;3;393.1+4&&5&&;;;;;;;;;;;;;;;;;;;;;<9;<;;;;;464262
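
To check which binary a given install will actually run, the `--version` output shown above can be parsed and compared against 3.0.2. This is only a minimal sketch: the function names `parse_sra_version` and `is_recent_enough` are hypothetical, and it assumes the output keeps the `<path> : <major>.<minor>.<patch>` shape seen in the two transcripts.

```python
import re


def parse_sra_version(version_output):
    """Extract (major, minor, patch) from text such as './fastq-dump : 3.0.2'.

    Assumes the '<path> : X.Y.Z' layout shown in the transcripts above.
    """
    match = re.search(r":\s*(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("no version number found in: %r" % version_output)
    return tuple(int(part) for part in match.groups())


def is_recent_enough(version_output, minimum=(3, 0, 2)):
    """Return True when the reported version is at least `minimum`."""
    return parse_sra_version(version_output) >= minimum
```

The `version_output` string would come from running the binary with `--version` and capturing its standard output.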

In addition, the downloadAndDumpFastqFromSRA.py script in the utils directory needs to be updated for the new prefetch version. With the "-O" option, prefetch now downloads each run's .sra file into its own subdirectory, whereas downloadAndDumpFastqFromSRA.py expects all of the files to land directly in the same parent folder. If the script is changed to use the "-o" option and specify the output filename explicitly, it runs properly. For example:

$ diff -u downloadAndDumpFastqFromSRA.py.orig downloadAndDumpFastqFromSRA.py
--- downloadAndDumpFastqFromSRA.py.orig 2023-01-20 10:47:53.318973000 -0600
+++ downloadAndDumpFastqFromSRA.py 2023-01-20 10:48:06.832207000 -0600
@@ -36,7 +36,7 @@
 
 def downloadSRAFile( allinput ):
     sra, default_path_to_download, output_directory = allinput
-    os.system( "prefetch -X 104857600 -O " + output_directory + "/" + " " + sra + " 2> " + output_directory + "/" + sra + ".error" )
+    os.system( "prefetch -X 104857600 -o " + output_directory + "/" + sra + ".sra" + " " + sra + " 2> " + output_directory + "/" + sra + ".error" )
     cmd = "fastq-dump -X 1 -Z --split-spot " + output_directory + "/" + sra + ".sra|wc -l > " + output_directory + "/" + sra + ".temp"
     os.system( cmd )
     if int( open( output_directory + "/" + sra + ".temp" ).read() ) == 4:
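
As a possible follow-up, the same "-o" call could be built as an argument list and run with `subprocess` rather than a concatenated `os.system` string. This is a sketch, not what the script does today: `build_prefetch_command` and `download_sra_file` are hypothetical names, and only the flags and the 104857600 size limit come from the diff above.

```python
import os
import subprocess


def build_prefetch_command(sra, output_directory, max_size="104857600"):
    """Build the prefetch call with -o so the file lands at <output_directory>/<sra>.sra."""
    sra_path = os.path.join(output_directory, sra + ".sra")
    return ["prefetch", "-X", max_size, "-o", sra_path, sra]


def download_sra_file(sra, output_directory):
    """Run prefetch, sending stderr to <output_directory>/<sra>.error as the script does."""
    error_path = os.path.join(output_directory, sra + ".error")
    with open(error_path, "w") as error_file:
        result = subprocess.run(
            build_prefetch_command(sra, output_directory),
            stderr=error_file,
        )
    return result.returncode
```

Passing a list avoids shell quoting problems if the output directory contains spaces, and the return code gives the caller a direct signal that the download failed.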

wesleyemason, Jan 23 '23 19:01