Create a new end-point to authenticate with Google Drive using OAuth

Open · jon-betts opened this issue 3 years ago · 1 comment

The conclusions of this spike:

  • Nothing is certain with Google, it's all guesswork
  • We can do two-legged OAuth 2 quite easily with the Google libraries
  • With service accounts we seem to hit limits well below the stated quota
  • Round-robining across several service accounts appears to yield a higher total quota
  • After enabling billing, a single account appears to get a much higher quota (we don't know whether billing is the cause)
  • Our caching is not effective with Google Drive because its responses carry Vary: Origin and Vary: X-Origin headers
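One plausible reading of the last point, sketched below: a cache that honours `Vary` stores a separate copy of a response for every distinct value of the request headers it names, so requests arriving from many different origins rarely share a cache entry. The `cache_key` helper and the example origins are hypothetical and are not taken from the spike code; the file ID in the URL is a placeholder.

```python
from typing import Mapping, Tuple


def cache_key(url: str, request_headers: Mapping[str, str], vary: str) -> Tuple:
    """Build a cache key roughly the way a Vary-aware cache would.

    `vary` is the value of the response's Vary header, e.g. "Origin, X-Origin".
    This is an illustration, not the behaviour of any specific proxy.
    """
    names = sorted(name.strip().lower() for name in vary.split(",") if name.strip())
    lowered = {key.lower(): value for key, value in request_headers.items()}
    # Every header named in Vary becomes part of the key.
    return (url,) + tuple((name, lowered.get(name, "")) for name in names)


# Placeholder file ID; only the shape of the URL matters here.
URL = "https://www.googleapis.com/drive/v3/files/FILE_ID?alt=media"

request_a = {"Origin": "https://a.example"}
request_b = {"Origin": "https://b.example"}

# With Vary: Origin, X-Origin the same file gets a different key per origin.
key_a = cache_key(URL, request_a, "Origin, X-Origin")
key_b = cache_key(URL, request_b, "Origin, X-Origin")

# Without a Vary header both requests map to the same cache entry.
shared = cache_key(URL, request_a, "") == cache_key(URL, request_b, "")
```

Here `key_a` and `key_b` differ even though the URL is identical, while `shared` is true once no `Vary` header is involved.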

Recommendations:

  • Add a new Python endpoint to Via 3 that uses the Google libraries to download Google Drive files
  • Use a single account
  • Emit headers to ensure the results can be cached
  • Keep the old method behind a feature flag or other easily toggled change, so we can roll back if we hit issues
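A minimal sketch of how these recommendations might fit together, and not the code from the spike. It assumes the `google-auth` and `google-api-python-client` packages for the download. The flag name `USE_OAUTH_DOWNLOAD`, the function names, the `application/pdf` content type and the one-hour `max-age` are all hypothetical choices made for illustration.

```python
import io
from typing import Callable, Dict, Tuple

# Hypothetical flag; the issue only asks for something easy to toggle.
USE_OAUTH_DOWNLOAD = True


def download_with_google_libraries(file_id: str, key_file: str) -> bytes:
    """Download a Drive file using one service account (two-legged OAuth 2)."""
    # Imported lazily so the rest of the sketch runs without these packages.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build
    from googleapiclient.http import MediaIoBaseDownload

    credentials = service_account.Credentials.from_service_account_file(
        key_file, scopes=["https://www.googleapis.com/auth/drive.readonly"]
    )
    service = build("drive", "v3", credentials=credentials)
    buffer = io.BytesIO()
    downloader = MediaIoBaseDownload(buffer, service.files().get_media(fileId=file_id))
    done = False
    while not done:
        _status, done = downloader.next_chunk()
    return buffer.getvalue()


def cacheable_headers(max_age: int = 3600) -> Dict[str, str]:
    """Headers that let a downstream cache keep the result; no Vary is emitted."""
    return {
        "Content-Type": "application/pdf",  # assumed: the proxied files are PDFs
        "Cache-Control": f"public, max-age={max_age}",
    }


def serve_drive_file(
    file_id: str,
    new_method: Callable[[str], bytes],
    old_method: Callable[[str], bytes],
    use_new: bool = USE_OAUTH_DOWNLOAD,
) -> Tuple[bytes, Dict[str, str]]:
    """Pick the download method by flag so the old path stays available."""
    fetch = new_method if use_new else old_method
    return fetch(file_id), cacheable_headers()
```

Passing the two download methods in as callables keeps the rollback a one-line flag change, and lets the selection logic be exercised without touching Google at all.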

Notes:

  • Spike code here: https://github.com/hypothesis/via3/pull/410

jon-betts · Mar 12 '21 17:03

It's unclear to me from this ticket (and the linked spike https://github.com/hypothesis/via/issues/405) why we should create a new Python endpoint to proxy Google Drive PDFs using OAuth-based authentication instead of keeping our NGINX endpoint's current API key-based authentication. Presumably we think that would make Google's "403 Your computer or network may be sending automated queries" error pages go away. But why do we think it will do that?

seanh · Aug 25 '21 17:08