rentrez

Downloading large data sets

Open kevinchen27 opened this issue 4 years ago • 1 comments

I'm trying to access a large number of records (10,000, to be exact) and followed the tutorial to do so. I run the following code, first to save the web history:

pubmed_search <- entrez_search(db = "pubmed", term = "Case Reports[Filter] AND cardiovascular disease AND English[lang] AND 2009:2019[PDat]", retmax = 792711, use_history = TRUE)

Then I try to download the first 10,000 records:

for (seq_start in seq(1, 10000, 100)) {
  recs <- entrez_summary(db = "pubmed", web_history = pubmed_search$web_history, retmax = 100, retstart = seq_start)
  cat(seq_start + 99, "sequences downloaded\r")
}
length(recs)

But I only get 100 records, not 10,000. Can someone help with this? I'm quite confused about how to use the web history feature.

kevinchen27 avatar Jul 17 '19 00:07 kevinchen27

Hi Kevin,

This for loop overwrites recs on every iteration, so you only get the last 100 records. You will want to append to recs, or use lapply to return a list of lists.

dwinter avatar Jul 17 '19 01:07 dwinter
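
For readers landing on this thread, here is a minimal sketch of the lapply approach suggested above. The helper name collect_batches and its arguments are hypothetical (they are not part of rentrez); it simply calls a user-supplied function once per batch and keeps every result instead of overwriting it. It also starts at offset 0, on the assumption that retstart is zero-based as in the NCBI E-utilities, so the first record is not skipped.

```r
# Hypothetical helper (not part of rentrez): call fetch_batch() once per
# batch and keep every result in a list, rather than overwriting one variable.
collect_batches <- function(fetch_batch, total, batch_size = 100) {
  # Assumes zero-based offsets, so the first batch starts at 0.
  starts <- seq(0, total - 1, by = batch_size)
  lapply(starts, function(s) {
    # The last batch may be smaller than batch_size.
    fetch_batch(retstart = s, retmax = min(batch_size, total - s))
  })
}

# Usage with rentrez (needs network access and the pubmed_search object
# created in the question), left commented out so the block runs offline:
#
# library(rentrez)
# batches <- collect_batches(function(retstart, retmax) {
#   entrez_summary(db = "pubmed", web_history = pubmed_search$web_history,
#                  retmax = retmax, retstart = retstart)
# }, total = 10000)
# sum(lengths(batches))  # total number of summaries across all batches
```

The result is a list of lists, one element per batch. Keeping the batches separate makes it easy to retry a single failed request; they can be flattened afterwards if one long list is more convenient.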