Auth in Google Colaboratory notebooks
Manually transferring from https://github.com/tidyverse/googledrive/issues/284, since I can't truly do a cross-org transfer.
This is the one about auth in Colaboratory / Jupyter notebooks, where you can't really do normal oob auth, but because interactive() returns FALSE, the normal oauth dance doesn't work either.
See also https://github.com/r-lib/httr/pull/634 for an experiment around that (the interactive() issue).
Following up on this issue:
I was previously using jobdiogenes's workaround to access Google Drive from R in Colaboratory, which changed is_interactive() in httr to always return TRUE.
In the newest release though, I've had to also add options(rlang_interactive = TRUE).
Excluding either of these workarounds gives the non-interactive error:
Error: Can't get Google credentials.
Are you running googledrive in a non-interactive session? ...
There's also a separate, related (?) issue when calling drive_auth with cache = TRUE twice. The first time around, drive_auth works and displays the oauth prompt. But if one calls it again, it seems to not be able to accept user input the second time around:
The googledrive package is requesting access to your Google account. Select a pre-authorised account or enter '0' to obtain a new token. Press Esc/Ctrl + C to abort.
Error: Can't get Google credentials.
Are you running googledrive in a non-interactive session?...
Here's a minimal working version (that also demos the second issue above): https://colab.research.google.com/drive/1OEOk5iMfubV6gawzvlJK6Jz8Xpw_MEVw?usp=sharing
In the newest release though, I've had to also add
options(rlang_interactive = TRUE).
This makes sense in light of changes in the most recent gargle release.
Just from reading, I don't have immediate insight into your second observation. Perhaps a first-hand experience is necessary.
A quick note that this issue was also solved via the method linked above using the same process of overwriting is_interactive()
Note that Google is completely shutting down out-of-band auth by the end of this month, after it had been deprecated for at least a year.
https://developers.google.com/identity/protocols/oauth2/resources/oob-migration
Addressed in #202
I just went to re-familiarize myself with the Colab experience and it seems like newly created notebooks can't use R. Do I have that right @craigcitro (or anyone else who knows)?
I have some pre-existing notebooks that use R that seem to still be executable.
It is still possible to create a new Colab notebook with R, using the following link: https://colab.research.google.com/#create=true&language=r
I am having authentication issues though, and jobdiogenes's method of overwriting is_interactive() is not working for me. What is your current recommendation?
I'm not sure I have one. But hopefully I can use your tip to get back into position to experiment there and that will lead to some progress.
Thanks!
(I'm surprised that the httr patch no longer works, so I would consider trying that again, in case you did something like forget to restart R. Also, make sure you're using current gargle; it recently had a release.)
It may be that the version installed on Colab is older, since it follows Anaconda package schedules. I would check whether you can install the latest version.
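A quick way to check this from inside the notebook is to compare the installed version against a minimum. The sketch below is only an illustration: needs_update() is a hypothetical helper, and the "1.3.0" threshold for gargle is simply the version mentioned in this thread, not an official requirement.

```r
# Hypothetical helper: TRUE if `pkg` is missing or older than `minimum`.
needs_update <- function(pkg, minimum) {
  !requireNamespace(pkg, quietly = TRUE) ||
    utils::packageVersion(pkg) < minimum
}

# Example: report whether gargle should be (re)installed.
if (needs_update("gargle", "1.3.0")) {
  message("gargle is missing or older than 1.3.0; consider install.packages(\"gargle\")")
}
```

Remember to restart the R runtime after installing, so that the new version is the one that gets loaded.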
Update: I did the httr patch before installing and loading googledrive, and this helped, to a point.
When I run drive_auth(use_oob = TRUE), I no longer get the "Can't get Google credentials, Are you running googledrive in a non-interactive session?" error. Instead I'm prompted to "point your browser to the following url" and "Enter authorization code."
However, once I click the link, I get the "Access blocked: Tidyverse API Packages’s request is invalid" 400 error due to the changes in OOB.
I confirmed that the gargle version is 1.3.0. It seems like the conventional OOB flow is being triggered rather than the new pseudo-OOB. I'm guessing the issue is that gargle should be using a "web" client for Colab just like on RStudio Server, Posit Cloud, and Posit Workbench, but it's using the "installed" client on Colab?
I have the same problem. What is the current recommended way to read from google drive in R colab notebooks?
@jennybc @MarkEdmondson1234 Is there a way to specify the tidyverse_client type (web or installed) when calling drive_auth or bq_auth? Looking through the documentation and source code it doesn't seem possible.
You should be able to force the pseudo-oob flow by setting a global option (and many client packages also accept use_oob = TRUE in their auth function).
Set the option with code like this, before any gargle usage:
options(gargle_oob_default = TRUE)
Or you can request it in an explicit auth call, e.g.:
googledrive::drive_auth(use_oob = TRUE)
Trying the following locally in RStudio:
library("googledrive")
library("googlesheets4")
gs4_auth(use_oob = TRUE, cache=FALSE)
brings up a browser window with the message:
Access blocked: Tidyverse API Packages’s request is invalid
You can’t sign in because Tidyverse API Packages sent an invalid request. You can try again later, or contact the [developer](https://accounts.google.com/) about this issue. [Learn more about this error](https://support.google.com/accounts/answer/12379384)
If you are a developer of Tidyverse API Packages, see [error details](https://accounts.google.com/).
Error 400: invalid_request
Or you can request it in an explicit auth call, e.g.:
googledrive::drive_auth(use_oob = TRUE)
This, or the global option, forces the oob flow but not necessarily the pseudo-oob flow. If I'm reading the code for tidyverse_client() correctly, the traditional oob vs pseudo-oob choice is determined by is_rstudio_server() and can't be explicitly called from googledrive::drive_auth(use_oob = TRUE)
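To make the distinction concrete, here is a toy sketch of the decision as I understand it from reading the code. It is not gargle's implementation: choose_auth_flow() and its arguments are hypothetical names, and the only point is that use_oob selects OOB in general, while the client type (which today follows is_rstudio_server()) selects which kind of OOB.

```r
# Hypothetical illustration only; none of these names are part of gargle's API.
choose_auth_flow <- function(use_oob, client_type = c("installed", "web")) {
  client_type <- match.arg(client_type)
  if (!use_oob) {
    # The usual flow, which needs a local browser and redirect.
    return("local browser flow")
  }
  # With OOB requested, the client type decides which variant is used.
  if (client_type == "web") "pseudo-OOB" else "conventional OOB"
}

choose_auth_flow(use_oob = TRUE, client_type = "installed")
choose_auth_flow(use_oob = TRUE, client_type = "web")
```

On Colab we currently land in the "installed" branch, which is the conventional OOB flow that Google now blocks.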
OK yes you folks are right. I will need to develop a way to explicitly request pseudo-oob in this setting. Let me create a PR with some working solution for you to try. Please hold.
In case anyone here knows the answer: Does Colab set an environment variable or provide some other way to detect that code is running in that context?
I will also try to discover the answer myself.
In the meanwhile, I've actually had success using this (maximal) example in Colaboratory, borrowing ideas from jobdiogenes and @leon-seranova:
require(devtools)
# Pin the package versions used in this workaround.
install_version("httr", version = "1.4.4", repos = "http://cran.us.r-project.org")
install_version("R.utils", version = "2.12.2", repos = "http://cran.us.r-project.org")
install_version("gargle", version = "1.3.0", repos = "http://cran.us.r-project.org")
install_version("googledrive", version = "2.0.0", repos = "http://cran.us.r-project.org")
# Only apply the monkey patches when this Colab-specific file exists.
if (file.exists("/usr/local/lib/python3.8/dist-packages/google/colab/_ipython.py")) {
  library(R.utils)
  library(httr)
  # Make httr treat the session as interactive.
  reassignInPackage("is_interactive", pkgName = "httr", function() return(TRUE))
  library(gargle)
  # Make gargle behave as on RStudio Server, so that it uses the pseudo-oob flow.
  reassignInPackage("is_rstudio_server", pkgName = "gargle", function() return(TRUE))
} else {
  stop("Failed to reassign is_interactive and is_rstudio_server!")
}
library(tidyverse)
library(googledrive)
# Also declare the session interactive for rlang-based checks.
options(rlang_interactive = TRUE)
drive_auth(use_oob = TRUE, cache = FALSE)
Thanks @jcccf that's helpful to know, i.e. to confirm exactly where our two blockers are. Perhaps my experimental PR could implement both, to create a working solution for Colab. Not sure I could release such a thing on CRAN (with the httr monkey patch), but let's just get a branch that works, then worry about that.
The existence of the env var COLAB_RELEASE_TAG seems like a good candidate for testing if we're on Colab.
Re: detecting Colab: using either $COLAB_RELEASE_TAG or the existence of /var/colab/hostname are both reliable ways of detecting that you're in a managed Colab backend.
Re: package versions: LMK if there's something I can/should update version-wise on the Colab side.
Re: httr::is_interactive: does httr expose a way to declare ourselves as in an interactive environment? If so, I'm happy to add some code to the Colab .Rprofile so it's always set.
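For anyone who wants to branch on this in their own notebook code, the two signals above can be combined as follows. This is a minimal sketch: is_colab() is a hypothetical helper name, and the arguments exist only so the check can be exercised outside Colab.

```r
# Hypothetical helper: TRUE when either Colab signal is present.
is_colab <- function(tag = Sys.getenv("COLAB_RELEASE_TAG"),
                     hostname_file = "/var/colab/hostname") {
  # Sys.getenv() returns "" for an unset variable, so nzchar() means "is set".
  nzchar(tag) || file.exists(hostname_file)
}

if (is_colab()) {
  message("Running on a managed Colab backend.")
}
```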
I think I will be able to get httr to start using rlang::is_interactive() instead of base::interactive() and then, yes, we can explicitly declare the session to be interactive via an option. Which I will probably be able to do within gargle. I'm not sure you'd want to globally declare Colab to be interactive.
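For reference, this is how the option interacts with rlang::is_interactive(). The snippet is only a small demonstration and assumes rlang is installed; it sets the option temporarily and then restores the previous state, rather than declaring the whole session interactive for good.

```r
# Temporarily declare the session interactive via the rlang option.
old <- options(rlang_interactive = TRUE)
declared <- rlang::is_interactive()   # the option takes precedence over base::interactive()

# Restore whatever was set before (including "not set at all").
options(old)
restored <- getOption("rlang_interactive")
```

Scoping the override like this is closer to what I have in mind for gargle than a global setting in the Colab .Rprofile.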
I just successfully auth'ed on Colab and listed some Google Drive files.
Here's how to install gargle from my branch / draft PR (you could do similar with remotes::install_github()):
install.packages("pak")
pak::pak("r-lib/gargle@google-colab")
install.packages("googledrive")
Note that I also installed googledrive.
With that experimental version of gargle, I can do:
library(googledrive)
drive_auth(cache = FALSE)
drive_find(n_max = 5)
This also works with cache = TRUE. The point is that cache can't be unspecified, because then gargle wants to interact with you about that via utils::menu(), which apparently fails here. As an alternative to specifying cache in the drive_auth() call, one can also express the same via the "gargle_oauth_cache" option.
I'll be interested to hear if others can replicate my success.
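For completeness, here is the option-based variant in sketch form. It assumes the experimental gargle branch described above is installed; list_some_drive_files() is a hypothetical wrapper name used only so that nothing talks to Google until you call it.

```r
# Decide about caching up front, so gargle never needs to ask via utils::menu().
# FALSE means "do not cache tokens"; TRUE (or a path) would enable caching.
options(gargle_oauth_cache = FALSE)

# Hypothetical wrapper: authenticate, then list a few Drive files.
list_some_drive_files <- function(n = 5) {
  googledrive::drive_auth()   # no cache argument needed, the option covers it
  googledrive::drive_find(n_max = n)
}
```

Calling list_some_drive_files() in a Colab cell should then behave like the explicit drive_auth(cache = FALSE) example.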
So if I do the same, but allowing the token to be cached:
library(googledrive)
drive_auth(cache = TRUE)
drive_find(n_max = 5)
If I restart the runtime and say it's OK to auto-discover an existing token, I can use the cached token to list files again:
library(googledrive)
options(gargle_oauth_email = TRUE)
drive_find(n_max = 5)
This worked for me. I also tried bq_auth() and that worked too. Thank you @jennybc !
The necessary version of httr is now on CRAN.
Another thing I want to improve is how cache establishment works on Colab.
We can't use utils::menu() in Jupyter, which is currently how we ask an interactive user what to do about caching. This is why a Colab user has to pro-actively address caching. But I can make that easier.
readline() does work, because it has been shimmed in IRKernel. The (pseudo-)OOB flow already uses readline(), so I plan to do similar re: checking in with a Colab user re: the cache.
PR where the readline() shim was introduced:
https://github.com/IRkernel/IRkernel/pull/452
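To illustrate the idea (this is not the code in gargle), a readline()-based stand-in for utils::menu() could look roughly like the sketch below. ask_choice() and its read_fun argument are hypothetical; read_fun is there only so the prompt can be replaced when no user is present.

```r
# Hypothetical helper: print numbered choices and read the selection with
# readline() instead of utils::menu().
ask_choice <- function(choices, prompt = "Selection: ", read_fun = readline) {
  for (i in seq_along(choices)) {
    cat(sprintf("%d: %s\n", i, choices[[i]]))
  }
  answer <- suppressWarnings(as.integer(trimws(read_fun(prompt))))
  # Anything that is not a valid index counts as "no selection",
  # mirroring the 0 that utils::menu() returns in that case.
  if (is.na(answer) || answer < 1L || answer > length(choices)) {
    return(0L)
  }
  answer
}
```

In a Colab cell, ask_choice(c("Yes", "No")) would show an input box, because IRkernel's readline() shim forwards the prompt to the notebook front end.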
Lots of things make more sense now that I've (partially) read: https://github.com/IRkernel/IRkernel/blob/d0d5ccccee23d798d53b79e14c5ab5935b17f8d8/R/execution.r
OK even the interaction around initiating the cache or selecting from available user tokens works the same on Colab now as it does, e.g., on a local computer (in the google-colab branch).
I had a similar problem and was successful with the code @jcccf presented. However, I am not sure of the cause of the problem, as the authentication did not work with the code provided by @jennybc. Thank you for your precious information. It was very helpful for me.
@nmarusan can you be more specific?
the authentication did not work with the code provided by @jennybc
@jennybc My Colab session info is as follows:
R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.5 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] rstudioapi_0.14 magrittr_2.0.3 rappdirs_0.3.3 tidyselect_1.2.0
[5] uuid_1.1-0 R6_2.5.1 rlang_1.0.6 fastmap_1.1.1
[9] fansi_1.0.4 httr_1.4.5 dplyr_1.1.0 tools_4.2.2
[13] utf8_1.2.3 cli_3.6.0 htmltools_0.5.4 digest_0.6.30
[17] tibble_3.1.8 gargle_1.3.0 lifecycle_1.0.3 crayon_1.5.2
[21] IRdisplay_1.1 purrr_1.0.1 repr_1.1.6 base64enc_0.1-3
[25] vctrs_0.5.2 fs_1.6.1 curl_5.0.0 IRkernel_1.3.2
[29] glue_1.6.2 evaluate_0.20 pbdZMQ_0.3-9 compiler_4.2.2
[33] pillar_1.8.1 generics_0.1.3 googledrive_2.0.0 jsonlite_1.8.3
[37] pkgconfig_2.0.3
First of all, I executed the following code that you suggested.
install.packages("pak")
pak::pak("r-lib/gargle@google-colab")
install.packages("googledrive")
The result is here:
Installing package into ‘/usr/local/lib/R/site-library’ (as ‘lib’ is unspecified)
! Using bundled GitHub PAT. Please add your own PAT using gitcreds::gitcreds_set().
✔ Updated metadata database: 2.71 MB in 6 files.
ℹ Updating metadata database
✔ Updating metadata database ... done
→ Will install 16 packages.
→ Will download 15 CRAN packages (6.02 MB).
→ Will download 1 package with unknown size.
- askpass 1.1 [bld][cmp][dl] (5.73 kB)
- cli 3.6.0 [bld][cmp][dl] (565.15 kB)
- curl 5.0.0 [bld][cmp][dl] (682.05 kB)
- fs 1.6.1 [bld][cmp][dl] (1.19 MB)
- gargle 1.3.0.9000 [bld][cmp][dl] (GitHub: bb94a25)
- glue 1.6.2 [bld][cmp][dl] (106.51 kB)
- httr 1.4.5 [bld][dl] (160.87 kB)
- jsonlite 1.8.4 [bld][cmp][dl] (1.05 MB)
- lifecycle 1.0.3 [bld][dl] (106.85 kB)
- mime 0.12 [bld][cmp][dl] (12.56 kB)
- openssl 2.0.5 [bld][cmp][dl] (1.20 MB)
- R6 2.5.1 [bld][dl] (63.42 kB)
- rappdirs 0.3.3 [bld][cmp][dl] (12.29 kB)
- rlang 1.0.6 [bld][cmp][dl] (742.51 kB)
- sys 3.4.1 [bld][cmp][dl] (20.13 kB)
- withr 2.5.0 [bld][dl] (102.09 kB)
ℹ Getting 15 pkgs (6.02 MB) and 1 pkg with unknown size
✔ Got askpass 1.1 (source) (5.73 kB)
✔ Got mime 0.12 (source) (12.56 kB)
✔ Got rappdirs 0.3.3 (source) (12.29 kB)
✔ Got R6 2.5.1 (source) (63.42 kB)
✔ Got cli 3.6.0 (source) (565.15 kB)
✔ Got lifecycle 1.0.3 (source) (106.85 kB)
✔ Got curl 5.0.0 (source) (682.05 kB)
✔ Got sys 3.4.1 (source) (20.13 kB)
✔ Got rlang 1.0.6 (source) (742.51 kB)
✔ Got glue 1.6.2 (source) (106.51 kB)
✔ Got httr 1.4.5 (source) (160.87 kB)
✔ Got withr 2.5.0 (source) (102.09 kB)
✔ Got jsonlite 1.8.4 (source) (1.05 MB)
✔ Got fs 1.6.1 (source) (1.19 MB)
✔ Got gargle 1.3.0.9000 (source) (403.17 kB)
✔ Got openssl 2.0.5 (source) (1.20 MB)
ℹ Building cli 3.6.0
ℹ Building curl 5.0.0
✔ Built curl 5.0.0 (7.8s)
ℹ Building fs 1.6.1
✔ Built cli 3.6.0 (21.2s)
ℹ Building glue 1.6.2
✔ Built glue 1.6.2 (3s)
ℹ Building jsonlite 1.8.4
✔ Built jsonlite 1.8.4 (9.6s)
ℹ Building mime 0.12
✔ Built mime 0.12 (2.1s)
ℹ Building R6 2.5.1
✔ Built R6 2.5.1 (2.8s)
ℹ Building rappdirs 0.3.3
✔ Built rappdirs 0.3.3 (2.8s)
ℹ Building rlang 1.0.6
✔ Built fs 1.6.1 (54.9s)
ℹ Building sys 3.4.1
✔ Built sys 3.4.1 (2.7s)
ℹ Building withr 2.5.0
✔ Built withr 2.5.0 (4.3s)
✔ Installed cli 3.6.0 (82ms)
✔ Installed curl 5.0.0 (74ms)
✔ Installed fs 1.6.1 (78ms)
✔ Installed glue 1.6.2 (50ms)
✔ Installed jsonlite 1.8.4 (72ms)
✔ Installed mime 0.12 (50ms)
✔ Installed R6 2.5.1 (40ms)
✔ Installed rappdirs 0.3.3 (44ms)
✔ Installed sys 3.4.1 (41ms)
ℹ Building askpass 1.1
✔ Built askpass 1.1 (2s)
✔ Installed askpass 1.1 (43ms)
ℹ Building openssl 2.0.5
✔ Built rlang 1.0.6 (33.4s)
✔ Installed rlang 1.0.6 (95ms)
ℹ Building lifecycle 1.0.3
✔ Built lifecycle 1.0.3 (3.3s)
✔ Installed lifecycle 1.0.3 (75ms)
✔ Installed withr 2.5.0 (1.1s)
✔ Built openssl 2.0.5 (10.3s)
✔ Installed openssl 2.0.5 (59ms)
ℹ Building httr 1.4.5
✔ Built httr 1.4.5 (4.5s)
✔ Installed httr 1.4.5 (1.1s)
ℹ Packaging gargle 1.3.0.9000
✔ Packaged gargle 1.3.0.9000 (577ms)
ℹ Building gargle 1.3.0.9000
✔ Built gargle 1.3.0.9000 (6.1s)
✔ Installed gargle 1.3.0.9000 (github::r-lib/gargle@bb94a25) (34ms)
✔ 1 pkg + 15 deps: added 16, dld 16 (NA B) [1m 52.7s]
Installing package into ‘/usr/local/lib/R/site-library’ (as ‘lib’ is unspecified)
Next, I executed the following code.
library(googledrive)
drive_auth(cache = FALSE)
drive_find(n_max = 5)
The error message is as follows:
Error in drive_auth():
! Can't get Google credentials
ℹ Are you running googledrive in a non-interactive session? Consider:
• drive_deauth() to prevent the attempt to get credentials
• Call drive_auth() directly with all necessary specifics
ℹ See gargle's "Non-interactive auth" vignette for more details:
ℹ https://gargle.r-lib.org/articles/non-interactive-auth.html
Traceback:
- drive_auth(cache = FALSE)
- drive_abort(c("Can't get Google credentials", i = "Are you running googledrive in a non-interactive session? \\n Consider:",
  . "*" = "{.fun drive_deauth} to prevent the attempt to get credentials",
  . "*" = "Call {.fun drive_auth} directly with all necessary specifics",
  . i = "See gargle's \"Non-interactive auth\" vignette for more details:",
  . i = "{.url https://gargle.r-lib.org/articles/non-interactive-auth.html}"))
- cli::cli_abort(message = message, ..., .envir = .envir)
- rlang::abort(message, ..., call = call, use_cli_format = TRUE,
  . .frame = .frame)
- signal_abort(cnd, .file)
However, when I restarted and executed the next code, the authentication went well.
library(googledrive)
options(gargle_oauth_email = TRUE)
drive_find(n_max = 5)