Issues with postfieldsize in recent R curl versions
For several years now, I have built my POST requests in R curl like this (full example below):
postfields.opt <- list(post = TRUE, postfields = postfields, postfieldsize = nchar(postfields))
handle_setopt(handle, .list = postfields.opt)
curl_fetch_memory(url, handle = handle)
These lines have served me well over the years. As for postfieldsize, according to the libcurl documentation:
If you want to post static data to the server without having libcurl do a strlen() to measure the data size, this option must be used. When this option is used you can post fully binary data, which otherwise is likely to fail. If this size is set to -1, libcurl uses strlen() to get the size or relies on the CURLOPT_READFUNCTION (if used) to signal the end of data.
So it seems that postfieldsize is necessary for binary data and harmless for other types of data.
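One thing worth keeping in mind about the size itself: libcurl's postfieldsize is a byte count, while nchar() counts characters by default. For the ASCII string used in this post the two agree, so this does not explain the error, but they diverge for multibyte text. A minimal sketch, where the accented payload is a made-up example:

```r
# The payload used throughout this post is pure ASCII.
postfields <- "key1=value1&key2=value2"
nchar(postfields)                  # 23 characters
nchar(postfields, type = "bytes")  # 23 bytes: ASCII, so the counts agree

# Made-up non-ASCII payload: "é" is one character but two bytes in UTF-8.
accented <- enc2utf8("name=Jos\u00e9")
nchar(accented, type = "chars")    # 9
nchar(accented, type = "bytes")    # 10
length(charToRaw(accented))        # 10, the size of the raw request body
```

If a size is passed at all, nchar(postfields, type = "bytes") is the safer expression for text that may contain non-ASCII characters.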
However, with recent versions of curl, my POST requests stopped working. Specifically, this snippet returns a 400 error:
library('curl')
library('jsonlite')
handle <- new_handle()
postfields <- "key1=value1&key2=value2"
postfields.opt <- list(post = TRUE, postfields = postfields, postfieldsize = nchar(postfields))
handle_setopt(handle, .list = postfields.opt)
resp <- curl_fetch_memory("https://httpbin.org/post", handle = handle)
parse_headers(resp$headers)[1] # HTTP/2 400
fromJSON(rawToChar(resp$content))$form # broken
Note that replacing postfields with copypostfields does not help, that is:
postfields.opt <- list(post = TRUE, copypostfields = postfields, postfieldsize = nchar(postfields))
In libcurl, postfields is a pointer, so the developer is responsible for keeping the pointed-to data intact until the transfer finishes, and copypostfields avoids this by copying the data. In R, however, once you call handle_setopt() with POST data, the data is passed by value, not by reference (unless an R environment is involved).
The error disappears if the postfieldsize option is removed. The working snippet is therefore:
library('curl')
library('jsonlite')
handle <- new_handle()
postfields <- "key1=value1&key2=value2"
postfields.opt <- list(post = TRUE, postfields = postfields)
handle_setopt(handle, .list = postfields.opt)
rm(postfields)
rm(postfields.opt)
resp <- curl_fetch_memory("https://httpbin.org/post", handle = handle)
parse_headers(resp$headers)[1] # HTTP/2 200
fromJSON(rawToChar(resp$content))$form
# > $key1
# [1] "value1"
#
# $key2
# [1] "value2"
I have tested this against multiple websites and, given the libcurl documentation quoted above, I wonder whether this is specific to the R curl implementation and possibly a bug.
For example, this snippet, which still uses postfieldsize but is based on the older RCurl package, works like a charm:
library(RCurl)
library('jsonlite')
postfields <- "key1=value1&key2=value2"
postfields.opt <- list(post = TRUE, postfields = postfields, postfieldsize = nchar(postfields))
resp <- getURL("https://httpbin.org/post", .opts = postfields.opt)
fromJSON(resp)$form
Finally, this chunk of code surprisingly works:
library('curl')
library('jsonlite')
handle <- new_handle()
postfields <- "key1=value1&key2=value2"
postfields.opt <- list(post = TRUE, postfieldsize = nchar(postfields), postfields = postfields)
handle_setopt(handle, .list = postfields.opt)
rm(postfields)
rm(postfields.opt)
resp <- curl_fetch_memory("https://httpbin.org/post", handle = handle)
parse_headers(resp$headers)[1]
fromJSON(rawToChar(resp$content))$form
In case you missed it at first glance: I have only swapped the order of the postfields and postfieldsize options.
This should not matter, since R users expect f(x = 1, y = 2) and f(y = 2, x = 1) to return the same output. Maybe the C interface code does not preserve the integrity of the data buffer, as libcurl requires, leaving the buffer inconsistent with the declared length.
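To be fair to R, order is only irrelevant for arguments matched to named formals. Anything collected through ... or passed as a list keeps the order in which it was written, so a function that walks through its options one by one can behave differently. The sketch below uses a hypothetical helper, option_order(), which is not part of the curl package; it only shows that the two orderings are distinguishable. Whether handle_setopt() really hands the options to libcurl in that sequence is an assumption I have not verified.

```r
# Named formals are matched by name, so the call order does not matter.
f <- function(x, y) x - y
identical(f(x = 1, y = 2), f(y = 2, x = 1))  # TRUE

# Hypothetical helper (NOT part of curl): report the order in which options
# arrive when they are collected through `...` and a `.list` argument.
option_order <- function(..., .list = list()) {
  names(c(list(...), .list))
}

# The same two options, written in the two orders used in this post.
size_last  <- option_order(.list = list(postfields = "a=1", postfieldsize = 3))
size_first <- option_order(.list = list(postfieldsize = 3, postfields = "a=1"))

size_last   # "postfields" "postfieldsize"
size_first  # "postfieldsize" "postfields"
```

If the options are indeed applied sequentially, the working and failing snippets differ only in whether the size reaches libcurl before or after the data.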
I would appreciate any enlightenment on this.
I am using curl_5.0.0 with libcurl 8.4.0 under Linux.