datasets rehydrate error during download
Hello,
I've successfully downloaded a dehydrated dataset of Campylobacter coli, but I'm getting this error during the rehydration process:
~/D$ ./datasets rehydrate --directory Campylo_coli/
Found 38905 files for rehydration
Completed 467 of 38905 [------------------------------------------------] 1%
Downloading: Campylo_coli/ncbi_dataset/data/GCF_001490195.1/GCF_001490195.1_EC3511_genomic.fna 1.59MB done
Downloading: Campylo_coli/ncbi_dataset/data/GCF_001490415.1/GCF_001490415.1_H042120298_genomic.fna 1.9MB done
Downloading: Campylo_coli/ncbi_dataset/data/GCF_001490335.1/GCF_001490335.1_EC3952_genomic.fna 1.66MB done
Downloading: Campylo_coli/ncbi_dataset/data/GCF_001490395.1/GCF_001490395.1_SS_2234_genomic.fna 1.68MB done
Downloading: Campylo_coli/ncbi_dataset/data/GCF_001490255.1/GCF_001490255.1_EC3525_genomic.fna 1.8MB done
Downloading: Campylo_coli/ncbi_dataset/data/GCF_001490315.1/GCF_001490315.1_CCN257_genomic.fna 1.82MB done
Downloading: Campylo_coli/ncbi_dataset/data/GCF_001490295.1/GCF_001490295.1_EC4297_genomic.fna 1.68MB done
Downloading: Campylo_coli/ncbi_dataset/data/GCF_001490375.1/GCF_001490375.1_SS_2356_genomic.fna 1.68MB done
Downloading: Campylo_coli/ncbi_dataset/data/GCF_001490215.1/GCF_001490215.1_EC3575_genomic.fna 1.59MB done
Downloading: Campylo_coli/ncbi_dataset/data/GCF_001490235.1/GCF_001490235.1_H072820535_genomic.fna 1.71MB done
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x40 pc=0x87db75]

goroutine 13 [running]:
main/datasets/datasets.downloadFileWorker.func2(0xc0009b5f58, 0xc000994000, 0xc0009b5f40, 0xc000200080, 0xc000032480)
    /export/home/tomcat/TeamCity/Agent4/work/c6e6852d9a243866/dataloader/apps/public/Datasets/datasets/datasets/Rehydrate.go:191 +0x375
main/datasets/datasets.downloadFileWorker(0xc000200080, 0xc000032300, 0xc000032480)
    /export/home/tomcat/TeamCity/Agent4/work/c6e6852d9a243866/dataloader/apps/public/Datasets/datasets/datasets/Rehydrate.go:216 +0x107
created by main/datasets/datasets.downloadMultipleFiles
    /export/home/tomcat/TeamCity/Agent4/work/c6e6852d9a243866/dataloader/apps/public/Datasets/datasets/datasets/Rehydrate.go:241 +0x165
I'm using datasets 12.6.0, and my command line was:
./datasets download genome taxon "Campylobacter coli" --exclude-gff3 --exclude-rna --exclude-protein --dehydrated
Is there a fix for this problem? I'm trying to download the whole-genome dataset for this bacterial species for comparative genomic studies, so downloading from the website would be complicated.
Thank you!
Hi VPEMERIDIAN,
Thanks for your feedback. I was unable to reproduce this exact error on my home computer. There is a known bug where the command-line tool reports gateway errors while trying to find sequence_report files that do not exist. However, you should still be able to download all available genomic sequence and annotation files.
We will continue trying to reproduce the error that you encountered and we plan to make time to improve the overall reliability of the tool soon.
Thanks again for your feedback.
-Eric
Eric Cox, PhD [Contractor] (he/him/his) NCBI Datasets Sequence Enhancements, Tools and Delivery (SeqPlus) NIH/NLM/NCBI
I have the same issue. I have to redownload everything because there is no --resume option.
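Before starting over, it may help to see how far the interrupted rehydration actually got. The following is a minimal Python sketch, not part of the datasets tool. It assumes only the directory layout visible in the log above (`<package>/ncbi_dataset/data/<accession>/..._genomic.fna`), and the function name `rehydrated_accessions` is hypothetical. It cannot detect a file that was truncated when the tool crashed, so treat the result as a rough count.

```python
from pathlib import Path


def rehydrated_accessions(package_dir):
    """List accession directories that already contain a non-empty genomic FASTA.

    Assumes the layout seen in the log: <package_dir>/ncbi_dataset/data/<accession>/.
    """
    data_dir = Path(package_dir) / "ncbi_dataset" / "data"
    if not data_dir.is_dir():
        return []  # nothing to inspect (wrong path or package not unzipped)

    done = []
    for acc_dir in sorted(p for p in data_dir.iterdir() if p.is_dir()):
        # A non-empty *_genomic.fna is taken as "downloaded"; a partially
        # written file would also pass this check.
        if any(f.stat().st_size > 0 for f in acc_dir.glob("*_genomic.fna")):
            done.append(acc_dir.name)
    return done
```

For example, `len(rehydrated_accessions("Campylo_coli"))` gives the number of accessions with a FASTA file on disk, which can be compared against the file total that `datasets rehydrate` reports when it starts.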