rentrez
Downloading large data sets
I'm trying to access a large number of records (10,000 to be exact) and followed the tutorial to try to do this. First, I run the following code to save the web history:
pubmed_search <- entrez_search(db = "pubmed",
                               term = "Case Reports[Filter] AND cardiovascular disease AND English[lang] AND 2009:2019[PDat]",
                               retmax = 792711,
                               use_history = TRUE)
Then, I try to download the first 10,000 records:
for (seq_start in seq(1, 10000, 100)) {
  recs <- entrez_summary(db = "pubmed",
                         web_history = pubmed_search$web_history,
                         retmax = 100,
                         retstart = seq_start)
  cat(seq_start + 99, "sequences downloaded\r")
}
length(recs)
But I only get 100 records, not 10,000. Can someone help with this? I'm quite confused about how to use the web history feature.
Hi Kevin,
This for loop over-writes recs on every pass, so at the end you are left with only the last batch of 100. You will want to append each batch to a growing result, or use lapply to return a list of batches.
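One way to sketch the append approach, assuming pubmed_search was created with use_history = TRUE as in the question (note also that Entrez's retstart is 0-based, so starting at 0 rather than 1 avoids skipping the first record):

```r
library(rentrez)

all_recs <- list()
for (seq_start in seq(0, 9900, 100)) {
  # Fetch one batch of 100 summaries via the stored web history
  recs <- entrez_summary(db = "pubmed",
                         web_history = pubmed_search$web_history,
                         retmax = 100,
                         retstart = seq_start)
  # Append this batch instead of overwriting the previous one
  all_recs <- c(all_recs, recs)
  cat(seq_start + 100, "records downloaded\r")
}
length(all_recs)
```

With the batches accumulated in all_recs, length(all_recs) should report 10,000 rather than 100. The lapply version would return one list element per batch, which you could then flatten with do.call(c, ...).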