[POF] lodown download issue

Open gpompeo opened this issue 4 years ago • 3 comments

I am getting a download error when I try to get the POF catalog. Here are the messages I get:

pof_cat <- lodown( "pof" , pof_cat )
locally downloading pof

downloading from URL 'ftp://ftp.ibge.gov.br/Orcamentos_Familiares/Pesquisa_de_Orcamentos_Familiares_2017_2018/Microdados/Dados.zip' to file 'C:\Users\GUILHE~1\AppData\Local\Temp\RtmpoJHztT\file659463f974d5'

download issue with 'ftp://ftp.ibge.gov.br/Orcamentos_Familiares/Pesquisa_de_Orcamentos_Familiares_2017_2018/Microdados/Dados.zip'

download issue with 'ftp://ftp.ibge.gov.br/Orcamentos_Familiares/Pesquisa_de_Orcamentos_Familiares_2017_2018/Microdados/Dados.zip'

download issue with 'ftp://ftp.ibge.gov.br/Orcamentos_Familiares/Pesquisa_de_Orcamentos_Familiares_2017_2018/Microdados/Dados.zip'

R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] lodown_0.1.0

loaded via a namespace (and not attached):
[1] httr_1.4.1 compiler_3.6.1 R6_2.4.0 tools_3.6.1 curl_4.2 Rcpp_1.0.3 cellranger_1.1.0 readxl_1.3.1 digest_0.6.22

lodown is now exiting unexpectedly. websites that host publicly-downloadable microdata change often and sometimes those changes cause this software to break. if the error call stack below appears to be a hiccup in your internet connection, then please verify your connectivity and retry the download. otherwise, please open a new issue at https://github.com/ajdamico/asdfree/issues with the contents of this error call stack and also the output of your sessionInfo().

[[1]] lodown("pof", pof_cat)

[[2]]
withCallingHandlers(catalog <- load_fun(data_name = data_name, catalog, ...), error = function(e) {
    print(sessionInfo())
    if (grepl("cannot allocate vector of size", e))
        message(memory_note)
    else if (grepl("parameter must be specified", e))
        message(parameter_note)
    else if (grepl("to install", e))
        message(installation_note)
    else {
        message(unknown_error_note)
        print(sys.calls())
    }
})

[[3]] load_fun(data_name = data_name, catalog, ...)

[[4]] cachaca(catalog[i, "full_urls"], tf, mode = "wb")

[[5]] httr_filesize(this_url, attempts, sleepsec)

[[6]] stop(paste0("httr::HEAD( '", url, "' )\nfailed after ", initial.attempts, " attempts"))

[[7]]
.handleSimpleError(function (e) {
    print(sessionInfo())
    if (grepl("cannot allocate vector of size", e))
        message(memory_note)
    else if (grepl("parameter must be specified", e))
        message(parameter_note)
    else if (grepl("to install", e))
        message(installation_note)
    else {
        message(unknown_error_note)
        print(sys.calls())
    }
}, "httr::HEAD( 'ftp://ftp.ibge.gov.br/Orcamentos_Familiares/Pesquisa_de_Orcamentos_Familiares_2017_2018/Microdados/Dados.zip' )\nfailed after 3 attempts", base::quote(httr_filesize(this_url, attempts, sleepsec)))

[[8]] h(simpleError(msg, call))

Error in httr_filesize(this_url, attempts, sleepsec) :
  httr::HEAD( 'ftp://ftp.ibge.gov.br/Orcamentos_Familiares/Pesquisa_de_Orcamentos_Familiares_2017_2018/Microdados/Dados.zip' )
failed after 3 attempts

  full_urls period
1 ftp://ftp.ibge.gov.br/Orcamentos_Familiares/Pesquisa_de_Orcamentos_Familiares_2017_2018/Microdados/Dados.zip 2017_2018
2 ftp://ftp.ibge.gov.br/Orcamentos_Familiares/Pesquisa_de_Orcamentos_Familiares_2008_2009/Microdados/Dados.zip 2008_2009
3 ftp://ftp.ibge.gov.br/Orcamentos_Familiares/Pesquisa_de_Orcamentos_Familiares_2002_2003/Microdados/Dados.zip 2002_2003
  documentation
1 ftp://ftp.ibge.gov.br/Orcamentos_Familiares/Pesquisa_de_Orcamentos_Familiares_2017_2018/Microdados/documentacao.zip
2 ftp://ftp.ibge.gov.br/Orcamentos_Familiares/Pesquisa_de_Orcamentos_Familiares_2008_2009/Microdados/documentacao.zip
3 ftp://ftp.ibge.gov.br/Orcamentos_Familiares/Pesquisa_de_Orcamentos_Familiares_2002_2003/Microdados/Documentacao.zip
  aliment_file output_folder case_count
1 ftp://ftp.ibge.gov.br/Orcamentos_Familiares/Pesquisa_de_Orcamentos_Familiares_2017_2018/Microdados/tradutores.zip D:/OneDrive/Documents/POF/2017_2018 NA
2 ftp://ftp.ibge.gov.br/Orcamentos_Familiares/Pesquisa_de_Orcamentos_Familiares_2008_2009/Microdados/tradutores.zip D:/OneDrive/Documents/POF/2008_2009 NA
3 <NA> D:/OneDrive/Documents/POF/2002_2003 NA

I have 7-Zip installed.

gpompeo, Nov 12 '19 01:11

UPDATE: I tried changing filesize_fun to work around the FTP download, since httr is not meant to work with FTP (r-lib/httr#537). The download started, but it then got stuck in a loop, downloading (and unzipping) the file over and over until it ran out of attempts.
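For anyone reading along, here is a minimal sketch of the general idea: retry the download and validate the archive locally instead of asking the server for its size with a HEAD request. This is not lodown's code. `download_with_retry`, `download_fun` and `verify_fun` are hypothetical names made up for illustration, and the default check (the zip lists at least one entry) is an assumption about what counts as a good download.

```r
# Hypothetical helper, NOT part of lodown: retry a download and validate the
# result locally, without relying on httr::HEAD() to report a file size.
download_with_retry <- function(url, destfile, attempts = 3, sleepsec = 60,
                                download_fun = utils::download.file,
                                verify_fun = function(f) {
                                  # assumption: a usable zip lists >= 1 entry
                                  nrow(utils::unzip(f, list = TRUE)) > 0
                                }) {
  for (i in seq_len(attempts)) {
    ok <- tryCatch({
      download_fun(url, destfile, mode = "wb", quiet = TRUE)
      isTRUE(verify_fun(destfile))
    },
    # treat any error or warning as a failed attempt
    error = function(e) FALSE,
    warning = function(w) FALSE)

    # stop retrying as soon as one attempt passes verification
    if (ok) return(invisible(destfile))

    # wait before the next attempt, but not after the last one
    if (i < attempts) Sys.sleep(sleepsec)
  }
  stop("download failed after ", attempts, " attempts: ", url)
}
```

The function returns as soon as one attempt passes verification. If the verification step can never succeed for a given file, every attempt will download again and then fail, so the check has to match what the server actually delivers.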

gpompeo, Nov 13 '19 02:11

Hey @gpompeo, I also managed to fix the httr::HEAD problem by changing the filesize_fun parameter, and I updated the 2017-2018 link, which was broken. I opened a pull request for that. Even so, after downloading and unzipping the files, my script stops working at the step that unpacks the 7z files. It does unpack one file, but something goes wrong after that.

Have you found a solution to make the package work properly?

gocdata, Sep 30 '20 13:09

@gocdata I haven't managed to get it to work properly. I noticed that there were several changes in the POF (especially in the first releases), so I took a different path and pre-processed the files manually.

gpompeo, Sep 30 '20 14:09