google-auth-library-python icon indicating copy to clipboard operation
google-auth-library-python copied to clipboard

Stripping "Z" from expiry date parsing in Credentials may lead to incorrect expires_at timestamp

Open IslandJohn opened this issue 1 year ago • 11 comments

Environment details

  • OS: Mac OS 13.2.1
  • Python version: 3.10.8
  • pip version: 22.2.2
  • google-auth version: 2.17.0

Steps to reproduce

  1. Retrieve an initial access token via InstalledAppFlow with no TZ set and system time zone set to UTC-X (e.g. EDT)
  2. Observe that expire_at is set beyond 1 hour (it should be returned with 1 hour + abs(time zone difference to UTC))
  3. Refresh will not occur as expected and the access token expires

Notes

I believe the culprit may be here. The symptom investigations are here.

IslandJohn avatar Mar 30 '23 00:03 IslandJohn

Hey @IslandJohn,

Thank you for the detailed and high effort investigation.

Can you provide detailed reproduction steps for this issue? Either a bash or python script would be very useful to help investigate this.

Thanks!

Carl

clundin25 avatar Apr 10 '23 21:04 clundin25

I did not discover any particular reason this is being done after digging in the git history. There seems to be an inconsistency with how strptime is beiing used.

$ rg "strptime" -C 3 -g '!*tests*'
google/oauth2/credentials.py
395-        # access token expiry (datetime obj); auto-expire if not saved
396-        expiry = info.get("expiry")
397-        if expiry:
398:            expiry = datetime.strptime(
399-                expiry.rstrip("Z").split(".")[0], "%Y-%m-%dT%H:%M:%S"
400-            )
401-        else:

google/auth/external_account_authorized_user.py
315-        """
316-        expiry = info.get("expiry")
317-        if expiry:
318:            expiry = datetime.datetime.strptime(
319-                expiry.rstrip("Z").split(".")[0], "%Y-%m-%dT%H:%M:%S"
320-            )
321-        return cls(

google/auth/impersonated_credentials.py
105-    try:
106-        token_response = json.loads(response_body)
107-        token = token_response["accessToken"]
108:        expiry = datetime.strptime(token_response["expireTime"], "%Y-%m-%dT%H:%M:%SZ")
109-
110-        return token, expiry
111-

clundin25 avatar Apr 10 '23 21:04 clundin25

I don't have a script to test this explicitly, but here's my setup:

  1. Running MacOS 13.2.1, set to the EDT time zone
  2. Installed homebrew and using Python 3.10.8 through that installation
  3. Running gphotos-sync as described in local installation

What appears to be happening is that the time being parsed includes a "Z" to indicate UTC time (e.g. 22:00Z). Since this is stripped, the time is interpreted in the time zone the script is running in. In my case, 22:00 EDT, which is +4 hours from when the token should actually expire. If I force my local time zone to UTC (export TZ=UTC) it works correctly, because the the time is being parsed in the same zone.

IslandJohn avatar Apr 13 '23 16:04 IslandJohn

RFC 3339 seems to indicate the use of Z (or other offset) for the time zone and it should not be ignored. Based on this, none of the implementations above seem correct since they either assume "Z" (which it may not be, since it could be a +/- offset), or Z is assumed and ignored.

IslandJohn avatar Apr 13 '23 16:04 IslandJohn

Sorry for the delay. Picking this back up.

Here is a simple script for a repro.

from google.auth import default
from google.auth.transport.requests import Request
from google.oauth2 import credentials
 
orig_cred, expiry = default()
orig_cred.refresh(Request())
 
filename = "cred.json"
 
with open(filename, "w") as f:
    f.write(orig_cred.to_json())
 
parsed_cred = credentials.Credentials.from_authorized_user_file(filename)
 
print(f"{orig_cred.expiry=} {parsed_cred.expiry=}")
assert orig_cred.expiry == parsed_cred.expiry

The assertion fails, which isn't desirable since the token was not refreshed when it is deserialized from the file system.

clundin25 avatar May 04 '23 21:05 clundin25

Following up with a simple repro that is more accurate to the reported symptoms.

➜ bat repro.py
───────┬───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       │ File: repro.py
───────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1   │ from google.auth import default
   2   │ from google.auth.transport.requests import Request
   3   │ from google.oauth2 import credentials
   4   │ import datetime
   5   │
   6   │ FILENAME = "cred.json"
   7   │
   8   │
   9   │ def export_cred():
  10   │     c, _ = default()
  11   │     c.refresh(Request())
  12   │
  13   │     with open(FILENAME, "w") as f:
  14   │         f.write(c.to_json())
  15   │     print(f"Serialized credential that expires in {c.expiry - datetime.datetime.now()}")
  16   │
  17   │
  18   │ def print_existing_cred():
  19   │     c = credentials.Credentials.from_authorized_user_file(FILENAME)
  20   │     print(f"Deserialized credential expires in {c.expiry - datetime.datetime.now()}")
───────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
clundin in 🌐 clundin-glinux in via  v3.11.0 (env) on ☁️  [email protected]
➜ bat repro.sh
───────┬───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       │ File: repro.sh
───────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1   │ #!/bin/sh
   2   │
   3   │ set -eu
   4   │
   5   │ printf "============\nUTC\n============\n"
   6   │ export TZ=UTC
   7   │ echo "Creating credential in UTC timezone."
   8   │ python -c "import repro;repro.export_cred()"
   9   │
  10   │ echo "Checking time to expiration in UTC"
  11   │
  12   │ python -c "import repro;repro.print_existing_cred()"
  13   │
  14   │ printf "============\nNew York\n============\n"
  15   │
  16   │ export TZ=America/New_York
  17   │
  18   │ echo "Checking time to expiration against credential created in UTC."
  19   │
  20   │ python -c "import repro;repro.print_existing_cred()"
  21   │
  22   │ echo "Creating credential in New York timezone."
  23   │ python -c "import repro;repro.export_cred()"
  24   │
  25   │ echo "Checking time to expiration against credential created in New_York TZ."
  26   │ python -c "import repro;repro.print_existing_cred()"
  27   │
───────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
clundin in 🌐 clundin-glinux in via  v3.11.0 (env) on ☁️  [email protected]
➜ ./repro.sh
============
UTC
============
Creating credential in UTC timezone.
Serialized credential that expires in 0:59:58.998304
Checking time to expiration in UTC
Deserialized credential expires in 0:59:57.975370
============
New York
============
Checking time to expiration against credential created in UTC.
Deserialized credential expires in 4:59:57.696211
Creating credential in New York timezone.
Serialized credential that expires in 4:59:58.998128
Checking time to expiration against credential created in New_York TZ.
Deserialized credential expires in 4:59:58.113078

I will now investigate next steps.

clundin25 avatar May 09 '23 18:05 clundin25

Okay. Digging in further and I do not believe this is the cause of your issue. While I do believe the code should be cleaned up, and not strip the timezone data (and serialize without an actual timezone), that is a separate issue.

I pushed the code I used to investigate here in case there are any gaps.

I'll leave this open for now but this should not cause any issues for your reported flow.

I will dig into google-oauth-library-python next.

clundin25 avatar May 11 '23 23:05 clundin25

Do you believe this is a time zone parsing issue somewhere else?

IslandJohn avatar May 12 '23 01:05 IslandJohn

and serialize without an actual timezone

Actually, I'd say it's working correctly now:

  • whenever a timestamp is sent over the network or stored on disk, it SHOULD have the timezone info, so that it can be unambiguously identified as a point in time
  • this library internally uses python naive datetime objects and assumes they're all in UTC, but that's purely within this lib; it follows well the python datetime documentation:

    A naive object does not contain enough information to unambiguously locate itself relative to other date/time objects. Whether a naive object represents Coordinated Universal Time (UTC), local time, or time in some other timezone is purely up to the program, just like it is up to the program whether a particular number represents metres, miles, or mass.

Stripping the Z when loading from a file is an easy way to ensure these two invariants hold.

Maybe the main problem is that this library doesn't document that its datetimes are in UTC, so all the readers have to guess (and can guess incorrectly and cause bugs?).

So, the bug must be somewhere else 🤷

juzna avatar Jun 10 '23 06:06 juzna

Cross-posting: https://github.com/gilesknap/gphotos-sync/issues/413#issuecomment-1585537627

TL;DR: The gphotos-sync tool takes expiry attribute and assumes it's in local time, which causes the bug (their code).

However, this mistake is very easy to make, probably even very likely that most would do the same. I tried a couple of variations with Copilot which also always produced this bug. Just FYI.

juzna avatar Jun 10 '23 07:06 juzna

Thanks for digging through this.

IslandJohn avatar Jun 10 '23 13:06 IslandJohn