httr with upload_file in body can not be re-created internally in vcr
cc @aaronwolen
In our work to fix how webmockr/vcr handle bodies correctly, I noticed that we cannot re-create requests that have an upload passed directly to body (not in a list). This concerns this code block most importantly https://github.com/ropensci/vcr/blob/master/R/request_handler-httr.R#L102-L109 but also this https://github.com/ropensci/vcr/blob/master/R/request_handler-httr.R#L76-L81 if a user uses the ignored requests feature.
Re-creating what essentially is happening in those linked code blocks above, ...
library(httr)
This works (character string to body)
e <- POST("https://httpbin.org/post", body = "foo bar")
# re-create the request
POST("https://httpbin.org/post", do.call(config, e$request$options))
And this works (upload in a list)
b <- POST("https://httpbin.org/post",
body = list(y = upload_file(system.file("CITATION"))))
# re-create the request
POST("https://httpbin.org/post",
body = list(y = upload_file(system.file("CITATION"))),
do.call(config, b$request$options))
But this doesn't work (it just hangs indefinitely)
d <- POST("https://httpbin.org/post",
body = upload_file(system.file("CITATION")))
# re-create the request
POST("https://httpbin.org/post", do.call(config, d$request$options))
The problem stems from that we CAN get what's passed in the body from the request object created by httr EXCEPT when upload_file is passed directly to body (and there may be other cases I don't know about) . When upload_file is passed directly to body, the details are al in the $options in the request object, and there is no trace of the file path for the upload, or the content type. When an upload is passed in a list, we can get the needed information from the request.
The way webmockr/vcr are setup, I don't think we have access to what was passed to body originally, only what's in the httr $request object
Any thoughts on this @jennybc ?
Having read the above once / quickly, I don't immediately have any insights. I've never tinkered on a part of httr that left with knowledge that helps me. That is, I'd have to study up and do my own analysis to contribute anything.
Afraid I don't have an answer either but I can reliably reproduce the issue. It looks like the recreated httr request hangs on curl::curl_fetch_memory(), which times out after 60s for me. Incidentally, the wait time can be modified:
POST("https://httpbin.org/post", do.call(config, c(d$request$options, timeout_ms = 1000)))
I did notice the two packages handle a NULL connection differently within the body of readfunction:
-
httr returns
raw()if (is.null(con)) { return(raw()) } -
whereas crul assigns the filepath of
bodyif (is.null(con)) con <<- file(filePath, "rb") if (is.null(con)) return(raw())so
return(raw())is never encountered
The recreated request does work if httr is modified to return a connection to the file body
readfunction = function(nbytes, ...) {
if (is.null(con)) {
+ con <- file(body$path, "rb")
- return(raw())
}
bin <- readBin(con, "raw", nbytes)
if (length(bin) < nbytes) {
close(con)
con <<- NULL
}
bin
}
but it's not clear to me whether that's the "right" thing to do or not.
thanks @jennybc - no worries
thanks @aaronwolen for that insight. I imagine something that has to be changed within httr will probably not work as a solution - but if that's the only or best solution then we can at least approach them and see if they're open to the change.
Not sure if it's helpful but I use httr::upload_file() in a function and cannot create the fixture corresponding to that, I get a 500 error. https://github.com/maelle/goodpress/pull/8 I am happy to provide more information (the same code/test works without vcr).
in the case of the API I'm working with, the filename is also passed as a header actually (Content-Disposition).
Thanks @maelle - I'll have another look at this very soon since you are encountering this