googledrive
Provide workflows for the "missing functions"
There are certain things we don't do right now. And might not ever. These are things that Drive does not directly support but that are possible by composing several operations. We should reveal the workflow, using the functions googledrive has, in an article:
- Copy a Drive folder, recursively. Demo for at least a simple folder that contains only files.
- Upload a local folder, recursively. See also #25 for some discussion.
- ~~List a Drive folder, recursively.~~ done by `drive_ls()` now
- A general pattern for mapping functions that are not vectorized. Example: doing `drive_ls()` for a vector of folders.
- more to come
Update: many of these are less interesting now that `drive_ls()` supports recursion. But will leave this open as a place to collect workflows to incorporate into the website.
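The "mapping functions that are not vectorized" item above could look roughly like the following. This is a minimal sketch, not an official pattern: `ls_each_folder()` and its `ls_fun` argument are made-up names that do not exist in googledrive, and `ls_fun` is only there so the lister can be swapped for a stub when trying this without a Drive connection.

```r
library(googledrive)
library(purrr)

# Hypothetical helper, not part of googledrive.
# `folders` is a character vector of Drive folder IDs or URLs.
ls_each_folder <- function(folders, ls_fun = drive_ls) {
  folders %>%
    set_names() %>%                 # name each result after its folder
    map(~ ls_fun(as_id(.x)))        # one call per folder, one result per call
}
```

Calling `ls_each_folder(c("FOLDER_ID_1", "FOLDER_ID_2"))` (placeholder IDs) should return a named list with one dribble per folder. Keeping the results as a list, rather than binding them, preserves the link between each listing and the folder it came from.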
Fodder for a permissions workflow.
This snippet produces a list with one tibble per file, with one row per permission. The variables are
- role
- type
- emailAddress
- displayName
- domain
- id
It seems like a good start for a workflow (and ... one day function?) that helps people delve more deeply into permissions.
drive_auth("jenny-at-rstudio-noncaching-token.rds")
x <- drive_find(corpus = "domain", n_max = 25)
x <- x %>% drive_reveal("permissions")
make_permissions_tibble <- function(pr) {
  if (is.null(pr)) return(NULL)
  perms <- pr$permissions
  tibble::tibble(
    role = purrr::map_chr(perms, "role"),
    type = purrr::map_chr(perms, "type"),
    emailAddress = purrr::map_chr(perms, "emailAddress", .default = NA_character_),
    displayName = purrr::map_chr(perms, "displayName", .default = NA_character_),
    domain = purrr::map_chr(perms, "domain", .default = NA_character_),
    id = purrr::map_chr(perms, "id", .default = NA_character_)
  )
}
purrr::map(x$permissions_resource, make_permissions_tibble)
When I do this with my RStudio account, I can see good examples of files where I'm not allowed to get permissions (the `NULL` case above) and others with a nice mix of `user`, `group`, and `domain` grants.
I have a personal implementation of `drive_download_dir()` that would address the first bullet point. The main issue is that it uses unchecked recursion, which could theoretically cause a stack overflow for incredibly nested folder structures. :weary:
https://github.com/MilesMcBain/mmmisc/blob/master/R/uitls.R#L210
googledrive also gained the ability to recursively list a folder via `drive_ls()` since we last wrote anything here. That is helpful, but doesn't completely get any of these jobs done, of course.
@MilesMcBain I have been looking for a way to "drive_download" an entire Google Drive folder and I haven't been able to. The link you posted seems to solve it, but the link is broken. I was able to identify the files using `drive_ls()`, but I'm not sure how to download them after identifying them. Any leads @jennybc ?
Quite strange re the link. Try this one: https://github.com/MilesMcBain/mmmisc/blob/master/R/utils.R
(Sent by email in reply to Sebastian Tapia's comment of Sat, 29 Dec. 2018, quoted above.)
That one worked, @MilesMcBain, thank you so much! I will look into it after the new year.
Happy holidays!
Here's a pattern for downloading multiple files from one Google Drive folder that might be useful to include in https://googledrive.tidyverse.org/articles/articles/multiple-files.html
library(googledrive)
library(purrr)
library(here)
# URL (or ID) of our g-drive data folder that contains all the files we want to download
data_folder_on_googl_drv_url <- "xxxx"
# list the folder to get the name and ID of each file in it
data_file_ids_on_googl_drv <-
  drive_ls(as_id(data_folder_on_googl_drv_url))
# download them to our local folder
pwalk(data_file_ids_on_googl_drv,
      ~drive_download(as_id(..2), # ..2 refers to column 2 for the ID
                      # puts the files in /data/raw-data using the same file names
                      # that we see on g-drive, ..1 refers to column 1 of our dribble
                      # where the file name is stored
                      path = here("data", "raw-data", ..1),
                      overwrite = TRUE))
What do you think?
There must be another way to iterate using `id` and `name` instead of the more cryptic `..1` and `..2`.
I just did it this way:
library(googledrive)
library(tidyverse)
url <- "YOUR_FOLDER_URL_GOES_HERE"
x <- url %>%
as_id() %>%
drive_get()
x
ids <- x %>%
drive_ls()
walk2(ids$id, ids$name, ~ drive_download(as_id(.x), .y))
Another snippet for looking at who has what role on a set of files.
library(tidyverse)
library(googledrive)
library(googlesheets4)
#>
#> Attaching package: 'googlesheets4'
#> The following objects are masked from 'package:googledrive':
#>
#> request_generate, request_make
googlesheets4:::sheets_auth_docs()
#> [1] "[email protected]"
#> Logged in as:
#> * displayName: [email protected]
#> * emailAddress: [email protected]
x <- sheets_examples() %>%
drive_reveal("permissions")
x %>%
select(name, file_id = id, permissions_resource) %>%
hoist(permissions_resource, permissions = "permissions") %>%
select(-permissions_resource) %>%
unnest_longer(permissions) %>%
unnest_wider(permissions)
#> # A tibble: 18 x 11
#> name file_id kind id type emailAddress role displayName photoLink
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 gapm… 1U6Cf_… driv… 0868… user jenny@rstud… writ… Jenny Bryan https://…
#> 2 gapm… 1U6Cf_… driv… anyo… anyo… <NA> read… <NA> <NA>
#> 3 gapm… 1U6Cf_… driv… 0581… user googlesheet… owner googleshee… <NA>
#> 4 mini… 1k94ZV… driv… 0868… user jenny@rstud… writ… Jenny Bryan https://…
#> 5 mini… 1k94ZV… driv… anyo… anyo… <NA> read… <NA> <NA>
#> 6 mini… 1k94ZV… driv… 0581… user googlesheet… owner googleshee… <NA>
#> 7 form… 1wPLrW… driv… 0868… user jenny@rstud… writ… Jenny Bryan https://…
#> 8 form… 1wPLrW… driv… anyo… anyo… <NA> read… <NA> <NA>
#> 9 form… 1wPLrW… driv… 0581… user googlesheet… owner googleshee… <NA>
#> 10 cell… 1peJXE… driv… 0868… user jenny@rstud… writ… Jenny Bryan https://…
#> 11 cell… 1peJXE… driv… anyo… anyo… <NA> read… <NA> <NA>
#> 12 cell… 1peJXE… driv… 0581… user googlesheet… owner googleshee… <NA>
#> 13 deat… 1tuYKz… driv… 0868… user jenny@rstud… writ… Jenny Bryan https://…
#> 14 deat… 1tuYKz… driv… anyo… anyo… <NA> read… <NA> <NA>
#> 15 deat… 1tuYKz… driv… 0581… user googlesheet… owner googleshee… <NA>
#> 16 chic… 1ct9t1… driv… 0868… user jenny@rstud… writ… Jenny Bryan https://…
#> 17 chic… 1ct9t1… driv… anyo… anyo… <NA> read… <NA> <NA>
#> 18 chic… 1ct9t1… driv… 0581… user googlesheet… owner googleshee… <NA>
#> # … with 2 more variables: deleted <lgl>, allowFileDiscovery <lgl>
Created on 2019-10-10 by the reprex package (v0.3.0.9000)
Here's how I downloaded a folder's worth of files for an R advent calendar mentioned in this tweet:
https://kiirstio.wixsite.com/kowen/post/the-25-days-of-christmas-an-r-advent-calendar
library(googledrive)
library(fs)
library(purrr)
# this folder is world-readable so you don't have to auth (but you can)
drive_deauth()
# adjust to where YOU want this to go
local_dir <- "~/rrr/2019-advent-calendar"
dir_create(local_dir)
url <- "https://drive.google.com/drive/folders/1eTu5QFSSGUeBjYrnyUnuXqRz8OXyFLy2"
(x <- url %>%
as_id() %>%
drive_get())
(files <- drive_ls(x))
walk2(files$id, files$name, ~ drive_download(as_id(.x), path(local_dir, .y)))
# optional and has nothing to do with googledrive
usethis::create_project(local_dir)
Known deficiency: doesn't deal with eventualities, such as subdirectories.
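One way the subdirectory case might be handled is to walk the folder tree with an explicit queue instead of recursive function calls, which also sidesteps the deep-recursion worry raised earlier in this thread. This is an untested sketch under assumptions: `drive_download_tree()` is a hypothetical name, not a googledrive function, and it ignores shortcuts, duplicate file names, and names containing `/`.

```r
library(googledrive)
library(fs)

# Hypothetical helper, not part of googledrive.
# `folder` is a Drive folder ID or URL; `local_dir` is where the copy goes.
drive_download_tree <- function(folder, local_dir, overwrite = FALSE) {
  # Each queue entry pairs a Drive folder ID with its local directory.
  queue <- list(list(id = folder, dir = local_dir))
  while (length(queue) > 0) {
    current <- queue[[1]]
    queue <- queue[-1]
    dir_create(current$dir)
    items <- drive_ls(as_id(current$id))   # direct children only
    is_dir <- is_folder(items)
    for (i in seq_len(nrow(items))) {
      target <- path(current$dir, items$name[i])
      if (is_dir[i]) {
        # Subfolder: visit it later, mirrored under the current directory.
        queue <- c(queue, list(list(id = items$id[i], dir = target)))
      } else {
        drive_download(items[i, ], path = target, overwrite = overwrite)
      }
    }
  }
  invisible(local_dir)
}
```

With the world-readable folder above, a call such as `drive_download_tree(url, local_dir)` would replace the `drive_ls()` and `walk2()` steps. Native Google files (Docs, Sheets, and so on) are exported by `drive_download()` in a default format, so their local names may gain a file extension.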
Yet another rectangling example for looking at permissions.
library(tidyverse)
library(googledrive)
# hidden chunk here with auth and a spreadsheet id
ssid %>%
drive_get() %>%
drive_reveal("permissions") %>%
hoist(permissions_resource, "permissions") %>%
select(!ends_with("_resource")) %>%
unnest_longer(permissions) %>%
unnest_wider(permissions, names_sep = "_") %>%
select(name, id, permissions_type, permissions_emailAddress, permissions_role,
permissions_displayName)
#> # A tibble: 3 x 6
#> name id permissions_type permissions_ema… permissions_role
#> <chr> <chr> <chr> <chr> <chr>
#> 1 whol… 1EB-… user hadley@rstudio.… writer
#> 2 whol… 1EB-… user [email protected] writer
#> 3 whol… 1EB-… user [email protected]… owner
#> # … with 1 more variable: permissions_displayName <chr>
Created on 2020-08-27 by the reprex package (v0.3.0.9001)
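The rectangling steps in these permissions snippets could be wrapped into a small helper, as a possible start on the "one day function" idea. This is only a sketch: `flatten_permissions()` is a hypothetical name, and it assumes its input already went through `drive_reveal("permissions")`, so that a `permissions_resource` list-column is present alongside `name` and `id`.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper, not part of googledrive.
# Returns one row per (file, permission) pair, with the permission fields
# spread into columns prefixed with "permissions_".
flatten_permissions <- function(x) {
  x %>%
    hoist(permissions_resource, permissions = "permissions") %>%
    select(name, id, permissions) %>%
    unnest_longer(permissions) %>%
    unnest_wider(permissions, names_sep = "_")
}
```

For example, `flatten_permissions(x) %>% filter(permissions_type == "anyone")` should list the files that are shared via an "anyone" grant.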
Thank you all for these wonderful workflows! Might anyone have an implementation for uploading an entire folder?
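For uploading a whole folder, one possible approach is to recreate the directory tree on Drive first and then upload each file into its matching folder. The sketch below is untested against a real Drive and rests on assumptions: `drive_upload_tree()` is a hypothetical name, not a googledrive function; hidden files are skipped because `fs::dir_ls()` ignores them by default; and nothing is done about folders or files that already exist on Drive.

```r
library(googledrive)
library(fs)

# Hypothetical helper, not part of googledrive.
# `drive_parent` is an optional Drive folder (path, as_id() ID, or dribble).
drive_upload_tree <- function(local_dir, drive_parent = NULL) {
  local_dir <- as.character(path_abs(local_dir))
  # Top-level Drive folder, named after the local one.
  root <- drive_mkdir(as.character(path_file(local_dir)), path = drive_parent)
  # Lookup table: local directory path -> dribble of its Drive counterpart.
  created <- list()
  created[[local_dir]] <- root
  # Create subfolders, shallowest first, so a parent always exists already.
  dirs <- dir_ls(local_dir, recurse = TRUE, type = "directory")
  dirs <- as.character(dirs[order(lengths(path_split(dirs)))])
  for (d in dirs) {
    parent <- created[[as.character(path_dir(d))]]
    created[[d]] <- drive_mkdir(as.character(path_file(d)), path = parent)
  }
  # Upload each file into the Drive folder that mirrors its local directory.
  files <- as.character(dir_ls(local_dir, recurse = TRUE, type = "file"))
  for (f in files) {
    drive_upload(f, path = created[[as.character(path_dir(f))]])
  }
  invisible(root)
}
```

A call such as `drive_upload_tree("data/raw-data")` (placeholder path) would make one API request per folder and per file, so large trees will be slow.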
Just in case this is useful, I've written an alternative option for downloading the contents of a Drive folder recursively, maintaining the file structure from Google Drive: https://gist.github.com/h-a-graham/27f3fceca4616cd54809dd3c28b8689b Thanks for the great package!