usethis
usethis copied to clipboard
`use_zip` error in unzipping directory with a different name from the zipfile
I am running into a missing value error trying to download and unzip large files with the use_zip
function.
I was able to isolate the issue to an incompatibility with the top_directory
function for extracting the head directory of the zip files in the helper function tidy_unzip
. Here is the reprex with output included:
## usethis::use_zip() issue reprex
# Goal: download and unzip large files stored on a Figshare platform
# long link: https://agdatacommons.nal.usda.gov/ndownloader/files/44576230
# bitly link: https://bit.ly/ctdv2sp
# attempt with existing function
> usethis::use_zip("https://bit.ly/ctdv2sp",
destdir = "C:/temp")
✔ Downloading from 'https://bit.ly/ctdv2sp'
Downloaded: 312.74 MB (100%)
✔ Download stored in 'C:/temp/species_v2_3.zip'
Error in if (length(unique_top) > 1 || !is_directory) { :
missing value where TRUE/FALSE needed
# separate process into its helper functions to source the error
> usethis:::tidy_download("https://bit.ly/ctdv2sp",
destdir = "C:/temp")
Downloaded: 312.74 MB (100%)
> usethis:::tidy_unzip("C:/temp/species_v2_3.zip")
Error in if (length(unique_top) > 1 || !is_directory) { :
missing value where TRUE/FALSE needed
## Dig deeper into `tidy_unzip`
# ABnote: first, define zipfile downloaded with `tidy_download`
> zipfile <- "C:/temp/species_v2_3.zip"
> file.exists(zipfile)
[1] TRUE
> base_path <- path_dir(zipfile)
> print(base_path)
[1] "C:/temp"
> filenames <- utils::unzip(zipfile, list = TRUE)[["Name"]]
> print(filenames) # you can see that the zipfile has a single-folder directory with a different name than the zip file
[1] "species_v2/label_encoder.txt" "species_v2/model_arch.pt" "species_v2/model_weights.pth"
# dropbox particularites do not apply here, skip next two lines
> td <- top_directory(filenames)
Error in if (length(unique_top) > 1 || !is_directory) { :
missing value where TRUE/FALSE needed
## AH HA! Look at `top_directory` function
> in_top <- path_dir(filenames) == "."
[1] FALSE FALSE FALSE
> unique_top <- unique(filenames[in_top])
character(0)
# I think the code should extract the dir `species_v2` from my zipfile here
> is_directory <- grepl("/$", unique_top)
logical(0)
# address conditionals one at a time
> length(unique_top) > 1
[1] FALSE
> !is_directory
logical(0)
# conditional statement is throwing the error b/c directory was not correctly extracted
> length(unique_top) > 1 || !is_directory
[1] NA
> if (length(unique_top) > 1 || !is_directory) {
+ NA_character_
+ } else {
+ unique_top
+ }
Error in if (length(unique_top) > 1 || !is_directory) { :
missing value where TRUE/FALSE needed
The code creating the top directory is not correctly addressing the structure of my zip files and extracting the unique directory within the zipfile.
# current structure
> in_top <- path_dir(filenames) == "."
> all(in_top) == FALSE
[1] TRUE
> unique_top <- unique(filenames[in_top])
character(0)
# the folder name at the top of the directory inside the zip file should be extracted
> path_dir(filenames)
[1] "species_v2" "species_v2" "species_v2"
> unique(path_dir(filenames))
[1] "species_v2"
A revised top_directory
function makes this work:
top_directory <- function(filenames) {
in_top <- path_dir(filenames) == "."
unique_top <- unique(filenames[in_top])
is_directory <- grepl("/$", unique_top)
if(length(unique(path_dir(filenames)))==1 & length(unique_top) == 0){
unique_top <- unique(path_dir(filenames))
} else {
if (length(unique_top) > 1 || !is_directory) {
NA_character_
} else {
unique_top
}
}
}
# run use_zip with the revised helper
use_zip(url = "https://bit.ly/ctdv2sp",
+ destdir = "C:/temp", cleanup=T)
✔ Downloading from <https://bit.ly/ctdv2sp>.
Downloaded: 312.74 MB (100%)
✔ Download stored in C:/temp/species_v2_3.zip.
✔ Unpacking ZIP file into species_v2/ (3 files extracted).
✔ Deleting species_v2_3.zip.
✔ Opening species_v2/ in the file manager.
I have made this change to a new branch in my forked clone of the repository here, and it passes the devtools::check
tests. Please let me know if this change needs to be more generalized or if it is ready for a pull request. Thank you!