Suggest adding new `archive_extract_filter` which uses `fs::path_filter`
I want to loop through many zip files, extracting just files of a specific type (e.g. csv). I have written a helper that combines `archive::archive_extract` with `fs::path_filter`, which I think could be useful to others:
archive_extract_filter <- function(archive, dir = ".", glob = NULL, regexp = NULL, invert = FALSE, ...) {
  # List the archive contents, then keep only the paths matching the filter
  archive_contents <- archive::archive(archive)
  filtered_contents <- fs::path_filter(archive_contents$path, glob = glob, regexp = regexp, invert = invert)
  # Extract only the matching files; `...` is passed on to archive_extract()
  archive::archive_extract(
    archive = archive,
    dir = dir,
    files = filtered_contents,
    ...
  )
}
Or perhaps even better, create an `archive_filter` function that does the first part. Ideally, it would even return an archive object that could be passed directly to `archive_extract`, but even if not, it would simplify current usage, e.g.:
archive_filter <- function(archive, glob = NULL, regexp = NULL, invert = FALSE, ...) {
  # `...` is passed on to archive::archive()
  archive_contents <- archive::archive(archive, ...)
  return(fs::path_filter(archive_contents$path, glob = glob, regexp = regexp, invert = invert))
}
archive_extract(archive, files = archive_filter(archive))
# Ideally, archive_filter would create an archive object that could be passed directly:
# archive |> archive_filter(...) |> archive_extract()
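For the original use case of looping over many zip files, a minimal self-contained sketch might look like the following. The function name `extract_csvs`, its arguments, and the one-subdirectory-per-zip layout are hypothetical choices for illustration, not part of the archive package. The guard against an empty match is there because I have not checked how `archive_extract` treats an empty `files` vector.

```r
library(archive)
library(fs)

# Hypothetical helper: extract only the csv files from every zip in `zip_dir`,
# writing each archive's files into its own subdirectory of `out_dir`.
extract_csvs <- function(zip_dir, out_dir) {
  zips <- fs::dir_ls(zip_dir, glob = "*.zip")
  for (zip in zips) {
    # List the archive contents and keep only paths ending in .csv
    contents <- archive::archive(zip)
    csvs <- fs::path_filter(contents$path, glob = "*.csv")
    # Skip archives with no csv files (assumption: avoid passing an empty vector)
    if (length(csvs) > 0) {
      target <- fs::path(out_dir, fs::path_ext_remove(fs::path_file(zip)))
      fs::dir_create(target)
      archive::archive_extract(zip, dir = target, files = csvs)
    }
  }
  invisible(zips)
}
```

Calling `extract_csvs("zips", "out")` would then place the csv files from `zips/one.zip` under `out/one/`, and so on for each archive.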
Thanks for the suggestion! I decided that I am not going to add this now, but it would be nice to mention it in the manual, and/or have an example along these lines in the manual. Would you like to submit a PR for that? (No pressure at all.)
@gaborcsardi I wasn't sure if you meant to add it just to the `archive_extract` manual or to the README, so I added it to both. If you intended something else, please let me know!