
`gutenberg_download()` not working: "Could not download a book at http://aleph.gutenberg.org/x/x/x/x.zip"

Open matthew-law opened this issue 3 years ago • 5 comments

I haven't been able to get the gutenberg_download() function to work at all. A couple of examples, both drawn from the examples provided in the readme:

> gutenberg_download(768)
# A tibble: 0 x 2
# … with 2 variables: gutenberg_id <int>, text <chr>
Warning messages:
1: In .f(.x[[i]], ...) :
  Could not download a book at http://aleph.gutenberg.org/7/6/768/768.zip
2: Unknown or uninitialised column: `text`. 
> aristotle_books <- gutenberg_works(author == "Aristotle") %>%
+     gutenberg_download(meta_fields = "title")
Error: Problem with `mutate()` column `gutenberg_id`.
ℹ `gutenberg_id = as.integer(gutenberg_id)`.
ℹ `gutenberg_id` must be size 0 or 1, not 7.
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning messages:
1: In .f(.x[[i]], ...) :
  Could not download a book at http://aleph.gutenberg.org/1/9/7/1974/1974.zip
2: In .f(.x[[i]], ...) :
  Could not download a book at http://aleph.gutenberg.org/2/4/1/2412/2412.zip
3: In .f(.x[[i]], ...) :
  Could not download a book at http://aleph.gutenberg.org/6/7/6/6762/6762.zip
4: In .f(.x[[i]], ...) :
  Could not download a book at http://aleph.gutenberg.org/6/7/6/6763/6763.zip
5: In .f(.x[[i]], ...) :
  Could not download a book at http://aleph.gutenberg.org/8/4/3/8438/8438.zip
6: In .f(.x[[i]], ...) :
  Could not download a book at http://aleph.gutenberg.org/1/2/6/9/12699/12699.zip
7: In .f(.x[[i]], ...) :
  Could not download a book at http://aleph.gutenberg.org/2/6/0/9/26095/26095.zip

I assume that the issue is to do with the aleph.gutenberg.org mirror, since none of the addresses it generates give me anything when pasted into a web browser. FWIW I have been able to use gutenberg_works() fine on its own to return queried metadata, just not gutenberg_download(). I'm running gutenbergr 0.2.0, downloaded from CRAN.
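For anyone who wants to test a mirror by hand, here is a minimal sketch of the URL pattern visible in the warnings above: every digit of the ID except the last becomes a directory, followed by the ID and `<id>.zip`. `book_zip_url()` is a hypothetical helper, not part of gutenbergr, and the `"0"` directory for single-digit IDs is an assumption that the output above does not show.

```r
# Hypothetical helper (not a gutenbergr function): rebuild the zip URL
# pattern seen in the warnings for a given book ID and mirror.
book_zip_url <- function(gutenberg_id, mirror) {
  id <- as.character(as.integer(gutenberg_id))

  # Every digit except the last becomes a directory, e.g. "768" -> "7/6".
  lead <- substr(id, 1, nchar(id) - 1)
  dirs <- if (nchar(lead) == 0) {
    "0"  # assumption: single-digit IDs live under a "0" directory
  } else {
    paste(strsplit(lead, "")[[1]], collapse = "/")
  }

  # Drop any trailing slash so the result has no "//" in the path.
  mirror <- sub("/+$", "", mirror)
  paste(mirror, dirs, id, paste0(id, ".zip"), sep = "/")
}

book_zip_url(768, "http://aleph.gutenberg.org")
```

Pasting the result for a candidate mirror into a browser should show whether that mirror actually serves the file.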


Session info:
R version 4.0.5 (2021-03-31)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

matthew-law, May 23 '21 19:05

Have you tried a different mirror?

library(gutenbergr)
my_mirror <- "http://mirrors.xmission.com/gutenberg/"
gutenberg_download(768, mirror = my_mirror)
#> # A tibble: 12,314 x 2
#>    gutenberg_id text               
#>           <int> <chr>              
#>  1          768 "Wuthering Heights"
#>  2          768 ""                 
#>  3          768 "by Emily Brontë"  
#>  4          768 ""                 
#>  5          768 ""                 
#>  6          768 ""                 
#>  7          768 ""                 
#>  8          768 "CHAPTER I"        
#>  9          768 ""                 
#> 10          768 ""                 
#> # … with 12,304 more rows

Created on 2021-05-23 by the reprex package (v2.0.0)

juliasilge, May 23 '21 19:05

That works perfectly, thanks!

I'll leave it up to you whether to keep the issue open or not – it works as intended when I use the different mirror (so my specific problem is solved), but without that alteration the behaviour doesn't match what would be expected and throws errors when running the examples.

matthew-law, May 24 '21 06:05

It's super frustrating that the mirror determined via http://www.gutenberg.org/robot/harvest seems to be just failing all the time now.

juliasilge, May 24 '21 14:05

My student and I just experienced this issue, after having code with the default mirror work fine a couple weeks ago! I wonder if a more informative error message for gutenberg_download would be in order, like

you may want to select a different mirror, go to https://www.gutenberg.org/MIRRORS.ALL to see options

I figured that was probably the issue, but wasn't sure how to change the mirror (I clicked on the link to http://www.gutenberg.org/robot/harvest?filetypes[]=txt in the gutenberg_get_mirror documentation hoping to find the list of mirrors, but that brings you to something different).
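In the meantime, something like the following could work as a user-side stopgap. It is only a sketch: `download_with_fallback()` is a hypothetical wrapper, not part of gutenbergr, and it assumes a failed download comes back as a zero-row result (as in the original report) or as an error. The only gutenbergr call it relies on is `gutenberg_download()` with the `mirror` argument shown earlier in this thread.

```r
# Hypothetical wrapper (not part of gutenbergr): try each mirror in turn
# and fail with a message that points to the list of mirrors.
download_with_fallback <- function(gutenberg_id, mirrors,
                                   download_fun = gutenbergr::gutenberg_download) {
  for (m in mirrors) {
    result <- tryCatch(
      suppressWarnings(download_fun(gutenberg_id, mirror = m)),
      error = function(e) NULL
    )
    # Assumption: a zero-row result means the mirror did not serve the book.
    if (!is.null(result) && nrow(result) > 0) {
      return(result)
    }
  }
  stop(
    "Could not download book ", gutenberg_id, " from any mirror tried. ",
    "You may want to select a different mirror; see ",
    "https://www.gutenberg.org/MIRRORS.ALL for options.",
    call. = FALSE
  )
}
```

Usage would look like `download_with_fallback(768, c("http://aleph.gutenberg.org", "http://mirrors.xmission.com/gutenberg/"))`. The `download_fun` argument is there only so the logic can be tried with a stub, without touching the network.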

AmeliaMN, Mar 22 '22 19:03

my_mirror <- "http://mirrors.xmission.com/gutenberg/"
gutenberg_download(768, mirror = my_mirror)

You saved my life, Julia! I have scoured the web looking for a solution to this problem. Stack, Git, RStudio Community... nothing. I actually reached out to both you (to connect on LinkedIn, a few days back) and David, the package designer and your co-author. I was VERY stuck on a project, and this worked swimmingly. Cheers!

[Attached image: Getenberg_R_Error]

ReadWalden, Jul 20 '22 14:07