Errors downloading basic monthly CPS
I love the lodown package and have used it to successfully download and parse ACS and CES data. When I tried to use it to download the basic monthly CPS, I got an error. I'm not doing anything fancy with the code:
library(lodown)
lodown( "cpsbasic" , output_dir = file.path( path.expand( "~" ) , "CPSBASIC" ) )
get_catalog("cpsbasic")
Here's the error:
Error in rvest::html_table(xml2::read_html(cps_ftp), fill = TRUE)[[2]] : subscript out of bounds
I looked at the installation code for the package and found that the error was generated by this line in cpsbasic.R:
cps_table <- rvest::html_table( xml2::read_html( cps_ftp ) , fill = TRUE )[[2]]
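A "subscript out of bounds" error on `[[2]]` means the list being indexed has fewer than two elements, which would be the case if the page no longer contains a second table. The following is only a minimal sketch of guarding that lookup. `pick_table` is a hypothetical helper that is not part of lodown, and the one-element list is a stand-in for whatever `rvest::html_table()` returns for the page.

```r
# Hypothetical helper, not part of lodown: return the n-th table from a
# list of tables and stop with a descriptive message when it is missing.
pick_table <- function(tables, n) {
  if (length(tables) < n) {
    stop("expected at least ", n, " table(s) but found ", length(tables))
  }
  tables[[n]]
}

# Stand-in for a page that contains only one table (an assumption made
# for illustration, not the real contents of the census.gov page).
tables <- list(data.frame(Name = "jan22pub.zip", stringsAsFactors = FALSE))

# The first table is available.
first <- pick_table(tables, 1)

# Asking for the second table now fails with a readable message, which is
# captured here instead of aborting the script.
result <- tryCatch(
  pick_table(tables, 2),
  error = function(e) conditionMessage(e)
)
print(result)
```

Checking `length()` first turns a bare subscript error into a message that says how many tables were actually found.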
I removed the subscript [[2]] (I don't see that any functions point to cps_table...) and reinstalled the package manually. It now gives me a much longer error message:
> lodown( "cpsbasic" , output_dir = file.path( path.expand( "~" ) , "CPSBASIC" ) )
building catalog for cpsbasic
locally downloading cpsbasic
downloading from URL
'//www2.census.gov/programs-surveys/cps/datasets/2022/basic/jan22pub.zip'
to file
'C:\Users\EU0122~1\AppData\Local\Temp\2\RtmpyM7PoD\file34308331970'
download issue with
'//www2.census.gov/programs-surveys/cps/datasets/2022/basic/jan22pub.zip'
download issue with
'//www2.census.gov/programs-surveys/cps/datasets/2022/basic/jan22pub.zip'
download issue with
'//www2.census.gov/programs-surveys/cps/datasets/2022/basic/jan22pub.zip'
R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lodown_0.1.0 usethis_1.5.0 devtools_2.0.2 RevoUtils_11.0.3 RevoUtilsMath_11.0.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.1 compiler_3.5.3 prettyunits_1.0.2 remotes_2.0.4 tools_3.5.3 testthat_2.3.2 digest_0.6.18 pkgbuild_1.3.1 pkgload_1.0.2 lattice_0.20-38 memoise_1.1.0
[12] rlang_0.4.5 Matrix_1.2-17 cli_2.0.2 curl_3.3 withr_2.5.0 httr_1.4.2 stringr_1.4.0 xml2_1.2.0 desc_1.4.1 fs_1.4.1 grid_3.5.3
[23] rprojroot_1.3-2 glue_1.4.0 R6_2.4.0 processx_3.4.2 fansi_0.4.0 survival_2.44-1.1 sessioninfo_1.1.1 callr_3.2.0 selectr_0.4-1 magrittr_1.5 splines_3.5.3
[34] backports_1.1.4 ps_1.3.2 assertthat_0.2.1 rvest_0.3.3 survey_3.35-1 stringi_1.4.3 crayon_1.3.4
lodown is now exiting unexpectedly.
websites that host publicly-downloadable microdata change often and sometimes those changes cause this software to break.
if the error call stack below appears to be a hiccup in your internet connection, then please verify your connectivity and retry the download.
otherwise, please open a new issue at `https://github.com/ajdamico/asdfree/issues` with the contents of this error call stack and also the output of your `sessionInfo()`.
[[1]]
lodown("cpsbasic", output_dir = file.path(path.expand("~"),
"CPSBASIC"))
[[2]]
withCallingHandlers(catalog <- load_fun(data_name = data_name,
catalog, ...), error = function(e) {
print(sessionInfo())
if (grepl("cannot allocate vector of size", e))
message(memory_note)
else if (grepl("parameter must be specified", e))
message(parameter_note)
else if (grepl("to install", e))
message(installation_note)
else {
message(unknown_error_note)
print(sys.calls())
}
})
[[3]]
load_fun(data_name = data_name, catalog, ...)
[[4]]
cachaca(catalog[i, "full_url"], tf, mode = "wb")
[[5]]
httr_filesize(this_url, attempts, sleepsec)
[[6]]
stop(paste0("httr::HEAD( '", url, "' )\nfailed after ",
initial.attempts, " attempts"))
[[7]]
.handleSimpleError(function (e)
{
print(sessionInfo())
if (grepl("cannot allocate vector of size", e))
message(memory_note)
else if (grepl("parameter must be specified", e))
message(parameter_note)
else if (grepl("to install", e))
message(installation_note)
else {
message(unknown_error_note)
print(sys.calls())
}
}, "httr::HEAD( '//www2.census.gov/programs-surveys/cps/datasets/2022/basic/jan22pub.zip' )\nfailed after 3 attempts",
quote(httr_filesize(this_url, attempts, sleepsec)))
[[8]]
h(simpleError(msg, call))
Error in httr_filesize(this_url, attempts, sleepsec) :
httr::HEAD( '//www2.census.gov/programs-surveys/cps/datasets/2022/basic/jan22pub.zip' )
failed after 3 attempts
year month X..www2.census.gov.programs.surveys.cps.datasets.2022.basic.2020_Basic_CPS_Public_Use_Record_Layout_plus_IO_Code_list.txt version
2 2022 1 //www2.census.gov/programs-surveys/cps/datasets/2022/basic/2020_Basic_CPS_Public_Use_Record_Layout_plus_IO_Code_list.txt basic
1 2022 2 //www2.census.gov/programs-surveys/cps/datasets/2022/basic/2020_Basic_CPS_Public_Use_Record_Layout_plus_IO_Code_list.txt basic
full_url dd output_filename case_count
2 //www2.census.gov/programs-surveys/cps/datasets/2022/basic/jan22pub.zip <NA> C:\\Users\\EU01221457\\Documents/CPSBASIC/2022 01 cps basic.rds NA
1 //www2.census.gov/programs-surveys/cps/datasets/2022/basic/feb22pub.zip <NA> C:\\Users\\EU01221457\\Documents/CPSBASIC/2022 02 cps basic.rds NA
Here's my sessionInfo():
> sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lodown_0.1.0 usethis_1.5.0 devtools_2.0.2 RevoUtils_11.0.3 RevoUtilsMath_11.0.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.1 compiler_3.5.3 prettyunits_1.0.2 remotes_2.0.4 tools_3.5.3 testthat_2.3.2 digest_0.6.18 pkgbuild_1.3.1 pkgload_1.0.2 lattice_0.20-38 memoise_1.1.0 rlang_0.4.5
[13] Matrix_1.2-17 cli_2.0.2 curl_3.3 withr_2.5.0 httr_1.4.2 stringr_1.4.0 xml2_1.2.0 desc_1.4.1 fs_1.4.1 grid_3.5.3 rprojroot_1.3-2 glue_1.4.0
[25] R6_2.4.0 processx_3.4.2 fansi_0.4.0 survival_2.44-1.1 sessioninfo_1.1.1 callr_3.2.0 selectr_0.4-1 magrittr_1.5 splines_3.5.3 backports_1.1.4 ps_1.3.2 assertthat_0.2.1
[37] rvest_0.3.3 survey_3.35-1 stringi_1.4.3 crayon_1.3.4
>
Any chance for a fix or some troubleshooting? Thanks.
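One detail visible in the log above: every URL that fails starts with `//` and carries no scheme. Assuming that the missing scheme is what makes `httr::HEAD()` fail (nothing in this thread confirms it), a rough sketch of normalizing such URLs before downloading might look like this. `add_scheme` is a hypothetical helper, not a lodown function, and the choice of `https` is also an assumption.

```r
# Hypothetical helper, not part of lodown: give protocol-relative URLs
# (those starting with "//") an explicit scheme and leave URLs that
# already have a scheme untouched.
add_scheme <- function(urls, scheme = "https") {
  ifelse(grepl("^//", urls), paste0(scheme, ":", urls), urls)
}

# The two URLs that appear in the catalog output above.
catalog_urls <- c(
  "//www2.census.gov/programs-surveys/cps/datasets/2022/basic/jan22pub.zip",
  "//www2.census.gov/programs-surveys/cps/datasets/2022/basic/feb22pub.zip"
)

fixed_urls <- add_scheme(catalog_urls)
print(fixed_urls)
```

If the catalog's `full_url` column were rewritten this way before the download step, the requests would at least be addressed to a complete URL. Whether the rest of the parsing still works after removing `[[2]]` is a separate question.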
hi :-) might be some time before i'm able to debug this, a pull request would be excellent if you believe you can fix the issue!
hi! apologies for the long delay. i've made a couple of big updates to asdfree.com that hopefully make the website a bit better, but i've decided to stop maintaining the lodown package, so i probably won't fix the bug you've reported. the new asdfree does have acs, ces, and cps-asec data, but only for the most current year, and unfortunately it doesn't include the cps basic. thanks