Unable to connect DVC to Google Drive. Access blocked!
Added by @shcheklein :
See details and workaround here - https://github.com/iterative/dvc/issues/10516#issuecomment-2289652067
Failed to authenticate GDrive: "This app is blocked"
Description
When I use DVC commands with a gdrive remote storage configuration, I encounter an issue where it's impossible to authenticate with my Google account.
Reproduce
After initiating the command
dvc get https://github.com/my-data-registry data/samples
a browser window opens for authentication, but upon selecting my Google account, I'm directed to a page displaying the message:
This app is blocked
This app tried to access sensitive info in your Google Account. To keep your account safe, Google blocked this access.
Environment information
Output of dvc doctor:
$ dvc doctor
Platform: Python 3.11.6 on macOS-13.5-arm64-arm-64bit
Subprojects:
dvc_data = 3.15.2
dvc_objects = 5.1.0
dvc_render = 1.0.2
dvc_task = 0.4.0
scmrepo = 3.3.7
Supports:
gdrive (pydrive2 = 1.20.0),
http (aiohttp = 3.10.2, aiohttp-retry = 2.8.3),
https (aiohttp = 3.10.2, aiohttp-retry = 2.8.3)
Config:
Global: /Users/myself/Library/Application Support/dvc
System: /Library/Application Support/dvc
I'm not sure if this is a bug, but any help with this issue would be greatly appreciated!
I have precisely the same issue.
Same problem
TL;DR:
The DVC app (the one DVC uses by default) is blocked by Google because they changed some policies and we need to pass verification again. Nothing bad happened on our end (no security breaches or violations). There is no easy way to pass it. For now the recommended way (and it has always been the recommended way) is to create a custom app. Here is the link. It's not very complicated and should work just fine for everyone.
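For reference, a minimal sketch of wiring a custom app's credentials into a DVC remote. It assumes the gdrive_client_id and gdrive_client_secret remote options described in the DVC Google Drive docs; the remote name myremote and the helper names are made up for illustration, and --local is used only so the secret stays out of the committed config.

```python
import subprocess


def custom_app_commands(remote: str, client_id: str, client_secret: str) -> list:
    """Build the `dvc remote modify` invocations for a custom OAuth client.

    The helper itself is hypothetical; only the dvc option names come from
    the DVC documentation.
    """
    base = ["dvc", "remote", "modify", "--local", remote]
    return [
        base + ["gdrive_client_id", client_id],
        base + ["gdrive_client_secret", client_secret],
    ]


def apply_commands(commands: list) -> None:
    """Run each command and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Calling apply_commands(custom_app_commands("myremote", "<client id>", "<client secret>")) inside the repository should be equivalent to typing the two dvc remote modify commands by hand.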
Longer version
Tue, Nov 14, 2023 - Google reached out with this message:
As part of our commitment to user privacy and security, Google requires developers that use our APIs to demonstrate that their apps comply with our policies. We have identified that your app’s use of Restricted Drive API scopes may require additional verification steps.
The DVC app indeed depends on the drive.files OAuth scope (which gives full access to all files and directories in Google Drive), since we don't know in advance which directory users will need as remote storage, and it is also needed for things like dvc import-url and dvc import (if a different remote is used).
All the tokens are stored locally, we don't use any servers, the DVC team doesn't see them, etc. To our mind that is safe enough for the default mode; otherwise it is of course better to use the custom app, as mentioned above.
Anyway, it would be better to have more granular permissions. It seems Google understands this, and we like the idea too. The only issue is that there is no API or any other way to let users pick a specific directory in the CLI. Here is the relevant ticket for this, but it's not resolved yet.
So we're kind of stuck in limbo with this: we can't pass verification (they are requesting a video explainer that makes clear why we need drive.file), and we can't implement granular scope management for the default app at the moment.
I'm open to any ideas on this.
Also a relevant discussion on the rclone forum - https://forum.rclone.org/t/google-drive-builtin-app-verification/43919/5 .
@shcheklein, thank you for the clarification! We will proceed with the custom app option.
Same problem here
Same problem here
@tharhtetsan Find the answer here - https://github.com/iterative/dvc/issues/10516#issuecomment-2289652067
:angry: google disgraceful policy
the custom app using the Google cloud option works, but would have preferred the older way of authenticating with gdrive, which was fairly easy
Even the Google Cloud option didn't work for me: dvc push failed with ERROR: unexpected error - Failed to authenticate GDrive: 'access_token'
You can authenticate with Google on your own. First, you need to create an OAuth client ID (like here). Then download the client ID JSON and use the following code:
from oauth2client.client import flow_from_clientsecrets
from oauth2client.file import Storage
from oauth2client.tools import run_flow

# Path to your OAuth2 client_id.json file
CLIENT_SECRET_FILE = 'client_secret.json'

# The scopes for the Google Drive API
SCOPES = [
    'https://www.googleapis.com/auth/drive',
    'https://www.googleapis.com/auth/drive.appdata',
]


def get_token_oauth2client():
    # Load the client secrets from the JSON file
    flow = flow_from_clientsecrets(CLIENT_SECRET_FILE, scope=SCOPES)

    # Run the authentication flow (opens a browser) and retrieve credentials;
    # run_flow also saves them to the Storage file
    storage = Storage('token_oauth2client.json')
    credentials = run_flow(flow, storage)

    # Write a second copy of the credentials for DVC to use
    with open('generated_token.json', 'w') as token_file:
        token_file.write(credentials.to_json())
    print("Token information saved to generated_token.json")


if __name__ == '__main__':
    get_token_oauth2client()
Then, you need to move generated_token.json to the path set as gdrive_user_credentials_file (look here).
Profit!
Yes, it's a kludge, but it's the only way I've found so far.
cp: @kell18
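Since the failure reported above was a bare 'access_token' error, it may help to sanity-check the generated file before pointing gdrive_user_credentials_file at it. This is only a sketch: the list of required fields is an assumption based on what oauth2client writes, not a confirmed list of what DVC or PyDrive2 reads, and the function names are made up.

```python
import json
from pathlib import Path

# Assumption: fields a usable user-credentials file is expected to carry.
# oauth2client writes these, but the exact set DVC needs is not confirmed here.
REQUIRED_FIELDS = ("access_token", "refresh_token", "client_id", "client_secret")


def missing_token_fields(token: dict, required=REQUIRED_FIELDS) -> list:
    """Return the required fields that are absent or empty in the token dict."""
    return [name for name in required if not token.get(name)]


def check_token_file(path: str) -> list:
    """Load a token JSON file and report which required fields are missing."""
    token = json.loads(Path(path).read_text())
    return missing_token_fields(token)
```

An empty list from check_token_file("generated_token.json") means every field assumed above is present; anything else names the fields worth looking at first.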
Thanks for the answer, @RodionfromHSE. It would work if I needed to do it only once for myself, but this is for everyone on the team... I hope DVC will fix this issue soon!
The solution from @RodionfromHSE is the only one that worked for me, thanks! Related question/problem: according to the DVC documentation, the Google token expires after 7 days. Any ideas how to extend this? Re-authentication is always a hassle on my headless machines.
I am considering migrating to git LFS due to this problem.
@SchindlerTo @ryukinix take a look here https://github.com/iterative/PyDrive2/issues/184#issuecomment-1200081628 . I think that was a relevant discussion.
Hi, thanks for the suggested fix.
I would like to be able to share uploaded DVC files from Google Drive with users who can't set this up. The files are too large for HuggingFace, and Google Drive is easy to set up. To clarify, these files are all publicly accessible. Is there a way to generate, within the API, a link that users can use to download a file with requests or gdown? https://dvc.org/doc/command-reference/get-url and other ways to download the file are unfortunately also blocked. It should be relatively easy to save the exact URL and also to enable get-url and downloading of public DVC-backed files.
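For what it's worth, a rough sketch of the URL side of this, assuming the Drive file ID of the publicly shared file is already known (DVC does not print it, so it would have to come from the Drive web UI or API). The uc?export=download pattern is a widely used convention rather than something confirmed in this thread, and the function name is hypothetical.

```python
from urllib.parse import urlencode


def public_download_url(file_id: str) -> str:
    """Build a direct-download URL for a publicly shared Drive file.

    Assumption: the file is shared as "anyone with the link".
    """
    query = urlencode({"export": "download", "id": file_id})
    return f"https://drive.google.com/uc?{query}"
```

For large files Google shows a confirmation page instead of the content, so a plain HTTP GET on this URL may not be enough; tools like gdown exist to handle that step.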
" ... google-drive is easy to set up"
Was easy.
@shcheklein Do you have any suggestion on how to output the URL instead of the file path from dvc on gdrive?
Have there been any updates on this? I'm still seeing a verification error, including when using a custom GC project.
Just ran into this problem; any fixes on the horizon?
I was not able to follow the instructions at https://dvc.org/doc/user-guide/data-management/remote-storage/google-drive#using-a-custom-google-cloud-project-recommended and get this working.
edit: I did get this working eventually
After dvc push, a browser page pops up with
{client-name} has not completed the Google verification process. The app is currently being tested and can only be accessed by developer-approved testers. If you think you should have access, contact the developer. If you are a developer of {client-name}, see error details.
I tried to add myself as a tester, but I can't seem to do this - the OAuth consent screen is redirecting me to Google Auth Platform / Overview. I found someone else with this issue here https://www.reddit.com/r/devops/comments/1isamvc/cant_configure_a_consent_screen_clicking_on_oauth/.
I'm a bit out of my depth here and unsure what to do next 😓. Google Drive seemed like it would be the simplest remote to set up.
edit: I was able to add my email as a Test user at https://console.cloud.google.com/auth/audience and it works!
DVC app indeed depends on the drive.files OAuth scope (that gives the full access to all the files / directories in the Google Drive). Since we don't know in advance which directory users would need to use a remote storage + for things like dvc import-url, dvc import (if a different remote is used).
@shcheklein what if dvc import-url and dvc import are not supported in the most basic usage? I'm brand new to DVC so don't understand usage patterns, but I don't think these are required? I thought I'd just be doing a dvc pull to retrieve data
@jack-mcivor (I don't remember all the details by now.) Yes, the import commands are not part of the basic workflow. We still need some way for a user to pick the directory to access, and I'm not sure that is possible in a CLI. I might be wrong, since it's been a while since I looked into this.
Do you have a plan to solve this problem? I know there is a workaround, but I don't want to use GCP for a ‘workaround’. I would use GCP if I wanted to. You may want to consider removing Google Drive completely from DVC. At least it won't mislead people.