BIS
get_datasets() proxy issue
Hello. As in the similar issue for the OECD package (https://github.com/expersso/OECD/issues/11), could you modify the package to support corporate proxies using httr? The solution is to modify the get_datasets() function as below:
get_datasets <- function() {
  url <- complete_url("/statistics/full_data_sets.htm")
  page <- xml2::read_html(httr::GET(url))
  nodes <- rvest::html_nodes(page, xpath = "//a[contains(@href, 'zip')]")
  dplyr::tibble(
    name = rvest::html_text(nodes),
    url = complete_url(rvest::html_attr(nodes, "href"))
  )
}
I don't think adding httr as a dependency is the right course of action here. The package should work fine with a corporate proxy as long as you set your https_proxy environment variable.
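For anyone following along, a minimal sketch of setting the variable from within an R session might look like this. The proxy address is a placeholder assumption, not a real server, and setting the variable does not by itself supply proxy credentials.

```r
# Hypothetical proxy address (an assumption); replace with your own.
proxy_url <- "http://proxy.example.com:8080"

# Set the variables for the current R session only.
Sys.setenv(http_proxy = proxy_url, https_proxy = proxy_url)

# Check the value now stored in the session environment.
Sys.getenv("https_proxy")
```

To make the setting persistent, the same `https_proxy=...` entry can go in `~/.Renviron` instead.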
Thank you for your response. It's not working for me using the http_proxy and https_proxy environment variables: I get a 407 error. We are using NTLM authentication.
datasets <- get_datasets()
Error in open.connection(x, "rb") : Received HTTP code 407 from proxy after CONNECT
The only solution that works for me is to use httr::GET(url).
Thank you :)
Hi,
Apologies for returning to a pretty old issue - but I've just discovered this after a colleague ran into the same problem. As it happens, I'm also the author of the linked issue above in the OECD package.
After testing, I also agree with @ab2dridi - get_datasets() doesn't work even when the proxy server address is configured with an environment variable: that isn't always enough to authenticate with the proxy. So I think the change he suggests, to use httr::GET(), would be very helpful. (Would you be open to a PR?)
(Digging into details a bit - the key thing seems to be that xml2::read_html(url) uses the curl package under the hood, and I don't know of any way to configure that with the proxy server's authentication mode (i.e. NTLM in our case). It doesn't appear that libcurl has an environment variable to set this, sadly - see here.)
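For illustration, here is a rough sketch of how the authentication mode could be stated explicitly on the httr side, using `httr::use_proxy()` with `auth = "ntlm"`. The proxy host, port, the `PROXY_USER` / `PROXY_PASS` variable names, and the `fetch_page()` helper are all hypothetical. Explicit configuration may not be needed everywhere: in the report above, a plain `httr::GET(url)` was enough.

```r
library(httr)

# Hypothetical proxy host, port, and credential variables (all assumptions).
proxy_cfg <- use_proxy(
  url      = "proxy.example.com",
  port     = 8080,
  username = Sys.getenv("PROXY_USER"),
  password = Sys.getenv("PROXY_PASS"),
  auth     = "ntlm"  # ask libcurl to use NTLM when authenticating to the proxy
)

# Hypothetical helper: fetch a page through the proxy and parse it,
# mirroring the xml2::read_html(httr::GET(url)) call proposed in this issue.
fetch_page <- function(url, config = proxy_cfg) {
  resp <- GET(url, config)
  stop_for_status(resp)  # raise an R error on 4xx/5xx, including 407
  xml2::read_html(resp)
}
```

Alternatively, `httr::set_config(proxy_cfg)` would apply the same settings to every later httr request in the session, so the package code itself would not need to know about the proxy.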