gutenbergr
`gutenberg_download()` not working: "Could not download a book at http://aleph.gutenberg.org/x/x/x/x.zip"
I haven't been able to get the `gutenberg_download()` function to work at all. Here are a couple of examples, both drawn from the README:
> gutenberg_download(768)
# A tibble: 0 x 2
# … with 2 variables: gutenberg_id <int>, text <chr>
Warning messages:
1: In .f(.x[[i]], ...) :
Could not download a book at http://aleph.gutenberg.org/7/6/768/768.zip
2: Unknown or uninitialised column: `text`.
> aristotle_books <- gutenberg_works(author == "Aristotle") %>%
+ gutenberg_download(meta_fields = "title")
Error: Problem with `mutate()` column `gutenberg_id`.
ℹ `gutenberg_id = as.integer(gutenberg_id)`.
ℹ `gutenberg_id` must be size 0 or 1, not 7.
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning messages:
1: In .f(.x[[i]], ...) :
Could not download a book at http://aleph.gutenberg.org/1/9/7/1974/1974.zip
2: In .f(.x[[i]], ...) :
Could not download a book at http://aleph.gutenberg.org/2/4/1/2412/2412.zip
3: In .f(.x[[i]], ...) :
Could not download a book at http://aleph.gutenberg.org/6/7/6/6762/6762.zip
4: In .f(.x[[i]], ...) :
Could not download a book at http://aleph.gutenberg.org/6/7/6/6763/6763.zip
5: In .f(.x[[i]], ...) :
Could not download a book at http://aleph.gutenberg.org/8/4/3/8438/8438.zip
6: In .f(.x[[i]], ...) :
Could not download a book at http://aleph.gutenberg.org/1/2/6/9/12699/12699.zip
7: In .f(.x[[i]], ...) :
Could not download a book at http://aleph.gutenberg.org/2/6/0/9/26095/26095.zip
I assume the issue is with the aleph.gutenberg.org mirror, since none of the addresses it generates return anything when pasted into a web browser. FWIW, `gutenberg_works()` works fine on its own for returning queried metadata; only `gutenberg_download()` fails.
I'm running gutenbergr 0.2.0, downloaded from CRAN.
Session info:
R version 4.0.5 (2021-03-31)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7
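For anyone who wants to check a given book against a mirror by hand, the failing URLs in the warnings above follow a visible pattern: every digit of the ID except the last becomes a nested directory, followed by a directory and a zip file named after the full ID. Here is a minimal sketch of that pattern. `build_book_url()` is a hypothetical helper, not part of gutenbergr, and the layout is inferred only from the warning messages in this report, so it may not hold for every ID (single-digit IDs in particular are not shown above).

```r
# Hypothetical helper: rebuild the zip URL for a book ID on a given mirror,
# following the pattern visible in the warning messages above.
# Not part of gutenbergr; single-digit IDs are not covered by those warnings.
build_book_url <- function(gutenberg_id, mirror = "http://aleph.gutenberg.org") {
  id <- as.character(as.integer(gutenberg_id))
  digits <- strsplit(id, "")[[1]]
  # All digits except the last become nested directories.
  dirs <- digits[-length(digits)]
  # Drop any trailing slash so the pieces join cleanly.
  mirror <- sub("/+$", "", mirror)
  paste(c(mirror, dirs, id, paste0(id, ".zip")), collapse = "/")
}

build_book_url(768)
build_book_url(12699, mirror = "http://mirrors.xmission.com/gutenberg/")
```

Pasting the resulting URL into a browser is a quick way to tell whether a particular mirror is serving the file at all.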
Have you tried a different mirror?
library(gutenbergr)
my_mirror <- "http://mirrors.xmission.com/gutenberg/"
gutenberg_download(768, mirror = my_mirror)
#> # A tibble: 12,314 x 2
#> gutenberg_id text
#> <int> <chr>
#> 1 768 "Wuthering Heights"
#> 2 768 ""
#> 3 768 "by Emily Brontë"
#> 4 768 ""
#> 5 768 ""
#> 6 768 ""
#> 7 768 ""
#> 8 768 "CHAPTER I"
#> 9 768 ""
#> 10 768 ""
#> # … with 12,304 more rows
Created on 2021-05-23 by the reprex package (v2.0.0)
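If the default mirror is unreliable, one option is to wrap the call so it tries several mirrors in turn. The following is a rough sketch, not anything gutenbergr provides: `download_with_fallback()` and its `downloader` argument are hypothetical names, and the only mirrors listed are the two that appear in this thread. It relies on the behaviour shown in the original report, where a failed download comes back as a 0-row tibble plus a warning rather than an error.

```r
# Hypothetical helper: try each mirror in turn and return the first
# non-empty result. `downloader` defaults to gutenbergr::gutenberg_download
# but can be swapped out, e.g. to test the logic without network access.
download_with_fallback <- function(gutenberg_id, mirrors,
                                   downloader = gutenbergr::gutenberg_download,
                                   ...) {
  for (m in mirrors) {
    result <- tryCatch(
      suppressWarnings(downloader(gutenberg_id, mirror = m, ...)),
      error = function(e) NULL
    )
    # A failed download showed up in the report as a 0-row tibble plus a
    # warning, so an empty result is treated the same as an error.
    if (!is.null(result) && nrow(result) > 0) {
      return(result)
    }
  }
  stop(
    "Could not download book(s) ", paste(gutenberg_id, collapse = ", "),
    " from any of: ", paste(mirrors, collapse = ", "),
    ". See https://www.gutenberg.org/MIRRORS.ALL for other mirrors.",
    call. = FALSE
  )
}

# Usage (needs gutenbergr installed and network access):
# mirrors <- c("http://aleph.gutenberg.org",
#              "http://mirrors.xmission.com/gutenberg/")
# wuthering <- download_with_fallback(768, mirrors)
```

Note that when several IDs are passed at once, this sketch accepts the first mirror that returns any rows, so it does not detect a partial failure where only some of the books downloaded.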
That works perfectly, thanks!
I'll leave it up to you whether to keep the issue open or not – it works as intended when I use the different mirror (so my specific problem is solved), but without that alteration the behaviour doesn't match what would be expected and throws errors when running the examples.
It's super frustrating that the mirror determined via http://www.gutenberg.org/robot/harvest seems to be just failing all the time now.
My student and I just ran into this issue, after code using the default mirror worked fine a couple of weeks ago! I wonder if a more informative error message for `gutenberg_download()` would be in order, something like:
"You may want to select a different mirror; see https://www.gutenberg.org/MIRRORS.ALL for options."
I figured that was probably the issue, but wasn't sure how to change the mirror. I clicked the link to http://www.gutenberg.org/robot/harvest?filetypes[]=txt in the `gutenberg_get_mirror()` documentation hoping to find the list of mirrors, but it leads to something different.
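On changing the mirror without passing `mirror =` to every call: a session-wide setting may be possible. This sketch assumes that `gutenberg_get_mirror()` checks the `gutenberg_mirror` R option before querying the harvest page. That is an assumption about the package internals, not something confirmed in this thread, so it is worth checking against the source of your installed version.

```r
# Assumption: gutenbergr consults the "gutenberg_mirror" option before
# falling back to the harvest page. Verify this for your installed version.
options(gutenberg_mirror = "http://mirrors.xmission.com/gutenberg/")

# Confirm what is currently set for this session.
getOption("gutenberg_mirror")
```

If the assumption holds, later calls to `gutenberg_download()` in the same session should use this mirror; if not, passing `mirror =` explicitly, as in the reply above, still works.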
my_mirror <- "http://mirrors.xmission.com/gutenberg/"
gutenberg_download(768, mirror = my_mirror)
You saved my life, Julia! I have scoured the web looking for a solution to this problem: Stack, Git, RStudio Community... nothing.
I actually reached out to both you (to connect on LinkedIn, a few days back) and David, the package designer and your co-author. I was VERY stuck on a project, and this worked swimmingly. Cheers!