Unable to connect DVC to Google Drive. Access blocked!

Open psaboia opened this issue 1 year ago • 23 comments

Added by @shcheklein :

See details and workaround here - https://github.com/iterative/dvc/issues/10516#issuecomment-2289652067


Failed to authenticate GDrive: "This app is blocked"

Description

When I use DVC commands with a gdrive remote storage configuration, I encounter an issue where it's impossible to authenticate with my Google account.

Reproduce

After initiating the command

dvc get https://github.com/my-data-registry data/samples

a browser window opens for authentication, but upon selecting my Google account, I'm directed to a page displaying the message:

This app is blocked

This app tried to access sensitive info in your Google Account. To keep your account safe, Google blocked this access.

Environment information

Output of dvc doctor:

$ dvc doctor

Platform: Python 3.11.6 on macOS-13.5-arm64-arm-64bit
Subprojects:
	dvc_data = 3.15.2
	dvc_objects = 5.1.0
	dvc_render = 1.0.2
	dvc_task = 0.4.0
	scmrepo = 3.3.7
Supports:
	gdrive (pydrive2 = 1.20.0),
	http (aiohttp = 3.10.2, aiohttp-retry = 2.8.3),
	https (aiohttp = 3.10.2, aiohttp-retry = 2.8.3)
Config:
	Global: /Users/myself/Library/Application Support/dvc
	System: /Library/Application Support/dvc

I'm not sure if this is a bug, but any help with this issue would be greatly appreciated!

psaboia avatar Aug 10 '24 01:08 psaboia

I have precisely the same issue.

fabricionarcizo avatar Aug 14 '24 08:08 fabricionarcizo

Same problem

JohnConnor123 avatar Aug 14 '24 15:08 JohnConnor123

TL;DR:

The DVC app (used by DVC by default) is blocked by Google because they changed some policies and we need to pass verification again. Nothing bad happened on our end (no security breaches or violations). There is no easy way to pass it. For now the recommended way (and it was always the recommended way) is to create a custom app. Here is the link. It's not very complicated and should work just fine for everyone.

Longer version

Tue, Nov 14, 2023 - Google reached out with this message:

As part of our commitment to user privacy and security, Google requires developers that use our APIs to demonstrate that their apps comply with our policies. We have identified that your app’s use of Restricted Drive API scopes may require additional verification steps.

The DVC app indeed depends on the drive.files OAuth scope (which gives full access to all the files and directories in Google Drive), since we don't know in advance which directory users will need as remote storage, and it is also needed for things like dvc import-url and dvc import (if a different remote is used).

  • All tokens are stored locally, we don't use any servers, the DVC team doesn't see them, etc. To our mind that is safe enough for the default mode, though it's of course better to use a custom app, as mentioned above.

Anyway, it would be better to have more granular permissions. It seems Google understands this, and we like the idea too. The only issue is that there is no API or any other way to let users pick a specific directory from the CLI. Here is the relevant ticket for this. But it's not resolved yet.

So we're kind of stuck in limbo with this - we can't pass verification (since they are requesting a video explainer that makes clear why we need drive.file), and we can't implement granular scope management for the default app at the moment.

I'm open to any ideas on this.

Also a relevant discussion on the rclone forum - https://forum.rclone.org/t/google-drive-builtin-app-verification/43919/5 .

shcheklein avatar Aug 14 '24 19:08 shcheklein
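
For the custom app route recommended above, here is a minimal sketch of pointing a DVC remote at your own OAuth client. It assumes the `gdrive_client_id` and `gdrive_client_secret` remote options described in the DVC Google Drive docs; the remote name `myremote`, the placeholder credentials, and the helper names are hypothetical. It only prints the commands unless `dry_run=False` is passed.

```python
import shlex
import subprocess
from typing import List


def gdrive_client_commands(
    remote: str, client_id: str, client_secret: str, local: bool = False
) -> List[List[str]]:
    """Build the `dvc remote modify` calls that set a custom OAuth client."""
    base = ["dvc", "remote", "modify"]
    if local:
        # --local writes to .dvc/config.local, which is not tracked by Git
        base.append("--local")
    return [
        base + [remote, "gdrive_client_id", client_id],
        base + [remote, "gdrive_client_secret", client_secret],
    ]


def apply(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print the commands, or run them when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder values (hypothetical); replace with your own client details
    apply(gdrive_client_commands("myremote", "<client-id>", "<client-secret>"))
```

Keeping the command construction separate from execution makes it easy to review exactly what would be written to the DVC config before running anything.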

@shcheklein, thank you for the clarification! We will proceed with the custom app option.

psaboia avatar Aug 15 '24 21:08 psaboia

Same problem here

tharhtetsan avatar Aug 20 '24 06:08 tharhtetsan

Same problem here

@tharhtetsan Find the answer here - https://github.com/iterative/dvc/issues/10516#issuecomment-2289652067

psaboia avatar Aug 20 '24 14:08 psaboia

:angry: google disgraceful policy

ryukinix avatar Sep 09 '24 23:09 ryukinix

The custom app using the Google Cloud option works, but I would have preferred the older way of authenticating with gdrive, which was fairly easy.

Drakunal avatar Sep 17 '24 03:09 Drakunal

Even the Google cloud option didn't work for me, it failed with ERROR: unexpected error - Failed to authenticate GDrive: 'access_token' during dvc push

kell18 avatar Oct 22 '24 11:10 kell18

You can authenticate with Google on your own. First, you need to create an OAuth client ID (like here). Then, download the client ID JSON and use the following code:

from oauth2client.client import flow_from_clientsecrets
from oauth2client.file import Storage
from oauth2client.tools import run_flow

# Path to the OAuth client ID JSON file downloaded from Google Cloud
CLIENT_SECRET_FILE = 'client_secret.json'

# Scopes requested for the Google Drive API
SCOPES = ['https://www.googleapis.com/auth/drive', 'https://www.googleapis.com/auth/drive.appdata']

def get_token_oauth2client():
    # Build the OAuth flow from the client secrets file
    flow = flow_from_clientsecrets(CLIENT_SECRET_FILE, scope=SCOPES)

    # Run the browser-based flow; run_flow also caches the credentials in storage
    storage = Storage('token_oauth2client.json')
    credentials = run_flow(flow, storage)

    # Save the credentials as JSON for use with gdrive_user_credentials_file
    with open('generated_token.json', 'w') as token_file:
        token_file.write(credentials.to_json())

    print("Token information saved to generated_token.json")

if __name__ == '__main__':
    get_token_oauth2client()

Then, you need to move generated_token.json to the path set in gdrive_user_credentials_file (look here).

Profit!

Yes, it's a crutch, but it's the only way I found so far.

cc: @kell18

RodionfromHSE avatar Oct 25 '24 14:10 RodionfromHSE
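
Before moving generated_token.json into place, it may help to check that the file contains the fields a later refresh would rely on. The sketch below is only a sanity check under assumptions: the list of required fields is a guess based on what oauth2client writes in `to_json()`, not a documented DVC or PyDrive2 contract, and the helper names are hypothetical.

```python
import json
from pathlib import Path
from typing import Dict, List

# Assumed set of fields; oauth2client's to_json() output includes these
# among others, but this list is not an official requirement.
REQUIRED_FIELDS = ("access_token", "refresh_token", "client_id", "client_secret")


def missing_token_fields(token: Dict[str, object]) -> List[str]:
    """Return the assumed-required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not token.get(name)]


def check_token_file(path: str) -> List[str]:
    """Load a token JSON file and report which assumed fields are missing."""
    token = json.loads(Path(path).read_text(encoding="utf-8"))
    return missing_token_fields(token)
```

For example, `check_token_file("generated_token.json")` returning an empty list means every assumed field is present; a missing or empty refresh_token would suggest the browser flow has to be repeated once the access token expires.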

Thanks for the answer @RodionfromHSE, it would work if I needed to do it only once for myself, but it's for everyone on the team... I hope DVC will fix this issue soon!

kell18 avatar Oct 26 '24 17:10 kell18

The solution from @RodionfromHSE is the only one that worked for me, thanks! Related question/problem: according to the DVC documentation, the Google token expires after 7 days. Any ideas how to extend this? Re-authentication is always a hassle for my headless machines.

SchindlerTo avatar Nov 24 '24 20:11 SchindlerTo

I am considering migrating to git LFS due to this problem.

ryukinix avatar Nov 25 '24 11:11 ryukinix

@SchindlerTo @ryukinix take a look here https://github.com/iterative/PyDrive2/issues/184#issuecomment-1200081628 . I think that was a relevant discussion.

shcheklein avatar Dec 06 '24 18:12 shcheklein

Hi, thanks for the suggested fix. I would like to be able to share uploaded DVC files from Google Drive with users who can't set this up. The files are too large for HuggingFace, and Google Drive is easy to set up. To clarify, these files are all publicly accessible. Is there a way to generate, through the API, a link from which users can download a file using requests or gdown? https://dvc.org/doc/command-reference/get-url and other ways to download the file are unfortunately also blocked. It should be relatively easy to save the exact URL and also to enable get-url and downloads of public DVC-backed files.

canergen avatar Dec 12 '24 23:12 canergen
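
As a rough illustration of the kind of link asked about above: for a file that is shared publicly, a direct-download URL can be built from its Google Drive file ID using the commonly used `uc?export=download` endpoint. This is a sketch under assumptions, not a DVC feature: the helper name is hypothetical, you need to know the file ID already (how to map a DVC-tracked file to its Drive ID is not covered here), and large files may first return a confirmation page instead of the content.

```python
from urllib.parse import urlencode


def gdrive_download_url(file_id: str) -> str:
    """Build a direct-download URL for a publicly shared Drive file ID."""
    if not file_id:
        raise ValueError("file_id must be a non-empty string")
    query = urlencode({"export": "download", "id": file_id})
    return "https://drive.google.com/uc?" + query
```

The resulting URL can then be passed to a plain HTTP client or to a tool such as gdown, subject to the caveats above.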

" ... google-drive is easy to set up"

Was easy.

ryukinix avatar Dec 13 '24 13:12 ryukinix

@shcheklein Do you have any suggestion on how to output the URL instead of the file path from dvc on gdrive?

canergen avatar Jan 16 '25 18:01 canergen

Have there been any updates on this? I'm still seeing a verification error, including when using a custom GC project.

henry-zwart avatar Feb 15 '25 19:02 henry-zwart

Just ran into this problem; any fixes on the horizon?

jpvelez avatar Feb 18 '25 20:02 jpvelez

I was not able to follow the instructions at https://dvc.org/doc/user-guide/data-management/remote-storage/google-drive#using-a-custom-google-cloud-project-recommended and get this working.

edit: I did get this working eventually

After dvc push, a browser page pops up with

{client-name} has not completed the Google verification process. The app is currently being tested and can only be accessed by developer-approved testers. If you think you should have access, contact the developer. If you are a developer of {client-name}, see error details.

I tried to add myself as a tester, but I can't seem to do this - the OAuth consent screen is redirecting me to Google Auth Platform / Overview. I found someone else with this issue here https://www.reddit.com/r/devops/comments/1isamvc/cant_configure_a_consent_screen_clicking_on_oauth/.

I'm a bit out of my depth here and unsure what to do next 😓 . Google Drive seemed like it would be the simplest remote to set-up.

edit: I was able to add my email as a Test user at https://console.cloud.google.com/auth/audience and it works!

jack-mcivor avatar Feb 26 '25 07:02 jack-mcivor

The DVC app indeed depends on the drive.files OAuth scope (which gives full access to all the files and directories in Google Drive), since we don't know in advance which directory users will need as remote storage, and it is also needed for things like dvc import-url and dvc import (if a different remote is used).

@shcheklein what if dvc import-url and dvc import are not supported in the most basic usage? I'm brand new to DVC so don't understand usage patterns, but I don't think these are required? I thought I'd just be doing a dvc pull to retrieve data

jack-mcivor avatar Feb 26 '25 07:02 jack-mcivor

@jack-mcivor (I don't remember all the details by now.) Yes, the import commands are not part of the basic workflow. We still need a way for the user to pick the directory to access somehow. I'm not sure that is possible in the CLI. I might be wrong at this point, since it's been a while since I looked into this.

shcheklein avatar Mar 01 '25 21:03 shcheklein

Do you have a plan to solve this problem? I know there is a workaround, but I don't want to use GCP for a ‘workaround’. I would use GCP if I wanted to. You may want to consider removing Google Drive completely from DVC. At least it won't mislead people.

burakai avatar Aug 29 '25 13:08 burakai