crul
How to retry requests for AsyncQueue?
Could you please add an example to the documentation for retry() that shows how to retry requests made with AsyncQueue()?
library(crul)
library(data.table)
library(jsonlite)
# 5,000 URLs/API calls
tmp = fread("testURLs.txt", header = FALSE)
urls = unlist(tmp)
# build one GET request object per URL
reqlist <- list()
for (i in seq_along(urls)) {
  http_request = HttpRequest$new(urls[[i]])$get()
  reqlist[[i]] <- http_request
}
## rate-limit of 300/min did not have any 429 status for 5000 API calls
## However, there are ~100 "500 status" which vary from run-to-run
## These need to be retried
out <- AsyncQueue$new(.list = reqlist, req_per_min = 300)
start <- Sys.time()
out$request() # make requests
end <- Sys.time()
total_time <- as.numeric(end - start, units = "mins")
print(paste0("Making Requests() took ", total_time, " minutes."))
start <- Sys.time()
out$responses() # list responses
end <- Sys.time()
total_time <- as.numeric(end - start, units = "mins")
print(paste0("Making Responses() took ", total_time, " minutes."))
# Check the status codes: any 429s or 500s?
resp <- out$responses()
x <- c()
for (i in seq_along(resp)) {
  if (resp[[i]]$status_code == 200) {
    # print("200 status returned")
  } else if (resp[[i]]$status_code == 404) {
    # print(paste0("404 status returned for row ", i, ": Results not Found"))
  } else if (resp[[i]]$status_code == 429) {
    # print(paste0("429 status returned for row ", i, ": Rate limit exceeded"))
  } else if (resp[[i]]$status_code == 500) {
    print(paste0("500 status returned for row ", i, ": An internal error has occurred"))
    x[i] <- i  # record the row index of each 500 response
  } else {
    print(paste0(resp[[i]]$status_code, " for row ", i))
  }
}
x <- x[!is.na(x)]  # drop the NA gaps left by rows that did not return 500
length(x) # shows "101"
If I run this code again, a different set of rows returns a 500 status, which means that some of these API calls need to be retried.
### Some of the 500 statuses repeat across runs, but not all of them.
### We need to find out whether these are transient API errors that can be retried
### or consistent errors because information for that uniprotid does not exist.
## The first attempt returned 100 "500 status" errors from 5,000 API calls; the second returned 101.
try1 <- as.data.frame(x)
## re-run the code above, then store the second set of failing rows
try2 <- as.data.frame(x)
library(dplyr)
anti_join(try1, try2)  # rows that returned 500 only in the first run
anti_join(try2, try1)  # rows that returned 500 only in the second run
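The comparison above can also be sketched with base R set operations, which separate rows that fail in both runs (likely genuine errors) from rows that fail in only one (likely transient and worth retrying). The row indices below are made up for illustration and are not the actual results; `run1`, `run2`, `persistent`, and `transient` are hypothetical names.

```r
# Illustrative row indices of 500 responses from two runs (made-up values).
run1 <- c(12L, 57L, 301L, 4410L)
run2 <- c(57L, 301L, 999L)

# Rows that failed in both runs: candidates for "data does not exist".
persistent <- intersect(run1, run2)

# Rows that failed in exactly one run: candidates for a retry.
transient <- union(setdiff(run1, run2), setdiff(run2, run1))

print(persistent)
print(transient)
```

Rows in `transient` are the ones where a retry is most likely to help. Rows in `persistent` may need a closer look at the API's error message.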
The following doesn't work for the resp or out objects: (res_get <- x$retry("GET", path = "status/400")).
How can retry() be run on AsyncQueue() results?
Is retry() currently not supported by AsyncQueue()?
If so, could this please 🙏 be added?
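Without retry() on the queue, one possible workaround is a second pass over only the URLs that failed. This is a minimal sketch that uses only the crul calls already shown in this issue. `failed_indices` and `requeue_failed` are hypothetical helper names, and the sketch assumes that responses() returns results in the same order as the requests, as the loop above also does.

```r
# Positions of the status codes that should be retried.
failed_indices <- function(status_codes, retry_on = 500L) {
  which(status_codes %in% retry_on)
}

# Re-run only the failed URLs through a fresh queue and splice the new
# responses back into place. `urls` must be in the same order as `responses`.
requeue_failed <- function(urls, responses, retry_on = 500L, req_per_min = 300) {
  codes <- vapply(responses, function(r) as.integer(r$status_code), integer(1))
  idx <- failed_indices(codes, retry_on)
  if (length(idx) == 0) {
    return(responses)  # nothing to retry
  }
  reqs <- lapply(unname(urls[idx]), function(u) crul::HttpRequest$new(u)$get())
  queue <- crul::AsyncQueue$new(.list = reqs, req_per_min = req_per_min)
  queue$request()
  # Assumption: responses come back in request order.
  responses[idx] <- queue$responses()
  responses
}

# Usage with the objects from the code above (requires network access):
# resp <- requeue_failed(urls, out$responses())
```

This makes a single extra pass. Calling it in a loop with a small cap on the number of passes would approximate a retry policy, at the cost of waiting for each full pass to finish.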
Yeah, sorry, HttpRequest doesn't have a retry method yet. I think it just needs to be supported there, and then you can use it in AsyncQueue.
- [x] Add retry method to HttpRequest
asked curl maintainer about this, we'll see
@moldach finally this is done. Install from GitHub and try again. There's a brief example in the AsyncQueue docs: https://docs.ropensci.org/crul/reference/AsyncQueue.html#ref-examples
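For readers landing here, a rough sketch of what this could look like follows. It assumes a crul version that includes the new HttpRequest retry method, and it assumes that method takes a lowercase verb plus the same `times` and `retry_only_on` arguments as HttpClient's retry(). Check the linked docs for the exact signature. `build_retry_requests` is a hypothetical helper name and the httpbin.org URLs are placeholders.

```r
# Build request objects that carry retry settings instead of a plain get().
# Assumption: HttpRequest$retry() accepts `times` and `retry_only_on`
# like HttpClient$retry(); verify against the installed crul version.
build_retry_requests <- function(urls, times = 3, retry_only_on = 500) {
  lapply(unname(urls), function(u) {
    crul::HttpRequest$new(u)$retry(
      "get",
      times = times,
      retry_only_on = retry_only_on
    )
  })
}

# Usage (requires network access; placeholder URLs):
# urls <- c("https://httpbin.org/status/500", "https://httpbin.org/get")
# out <- crul::AsyncQueue$new(.list = build_retry_requests(urls), req_per_min = 300)
# out$request()
# vapply(out$responses(), function(r) r$status_code, numeric(1))
```

Restricting retries to status 500 avoids re-requesting rows that return 404, which a retry is unlikely to fix.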