GitPython
GitPython copied to clipboard
GitPython parses RFC 2822 dates with timezone incorrectly
Bug Report
Description
The parse_date
function returns a wrong result, when using an RFC 2822 date with timezone.
In [24]: parse_date('Thu, 20 May 2021 13:38:44 +0300')
Out[24]: (1621517924, -10800)
This output is wrong.
1621517924
is supposed to be UTC time (10:38:44
).
But instead, it is my time (13:38:44
)
Source Code Investigation
The function calls are as following:
-
repo.index.commit
takesauthor_date
as input, which is anRFC 2822
string. -
repo.index.commit
callsCommit.create_from_tree
withauthor_date
. -
Commit.create_from_tree
callsparse_date(author_date)
to parse the date intoauthor_time
andauthor_offset
. -
Commit.create_from_tree
callscls
(akaCommit.__init__
) passingauthor_time
andauthor_offset
. -
Commit.__init__
savesauthor_time
andauthor_offset
inauthored_date
andauthor_tz_offset
.
Thus, the fault should lie with the parse_date
function.
Environment
Tried with versions 2.1.11
and 3.1.14
.
Thanks for the report!
With the given example, I could reproduce the issue and validate that GitPython essentially does the opposite. It returns the original time and inverts the offset, so that original_time + offset
yields the UTC time.
The respective source is here, and I think it's very old too. I am also very sure that a lot of code depends on this behaviour so fixing it to match the docs will be a breaking change.
Instead of that I propose to change the documentation instead. What do you think?
Hi @Byron,
Thanks a lot for the quick response!
It returns the original time and inverts the offset, so that original_time + offset yields the UTC time.
I am using GitPython to make a commit and specify a date in RFC 2822 (date taken from another commit).
The original date (as shown in git log
) is: Fri May 21 12:45:30 2021 +0300
The resulting date (as shown in git log
) is: Fri May 21 15:45:30 2021 +0300
Thus, when creating a commit with an RFC 2822 date, the resulting date is incorrect, it's not that the timezone has changed to balance the changed time.
The respective source is here, and I think it's very old too. I am also very sure that a lot of code depends on this behaviour so fixing it to match the docs will be a breaking change.
I'm not sure about the utctz_to_altz
function and where it is used.
However, I think we should fix parse_date
to correctly parse RFC 2822 dates, because that is clearly broken.
We can do that without changing the helper function utctz_to_altz
, if you believe it will break other places.
To make my case more clear, here is a minimum reproducer:
import os
import git
import tempfile
from pathlib import Path
from email import utils as email_utils
tmp_dir = tempfile.mkdtemp()
print("Repo folder: '%s'" % tmp_dir)
repo = git.Repo.init(tmp_dir)
Path(os.path.join(repo.working_dir, "example_file")).touch()
repo.index.add(["example_file"])
author = git.Actor("user", "[email protected]")
date = "Thu, 20 May 2021 13:38:44 +0300"
print("Input date: '%s'" % date)
repo.index.commit("Wrong date", author=author, committer=author,
author_date=date, commit_date=date)
print("Output date: '%s'" %
email_utils.format_datetime(repo.head.commit.authored_datetime))
Which outputs:
Repo folder: '/tmp/tmp2x6_v43g'
Input date: 'Thu, 20 May 2021 13:38:44 +0300'
Output date: 'Thu, 20 May 2021 16:38:44 +0300'
Thanks for making the issue easily reproducible. I have taken a look and noticed that the date in the commit arrives correctly, that's a pleasant surprise, and it can be expected that reading the same date both from the newly created commit (by re-reading it or by using the return value of commit(…)
).
My initial desire to not fix such a long-standing bug in fear of breakage makes the assumption that a lot of code depends on it. A way to find it would be to fix this issue and see how many new issues show up - in the worst case the release can be yanked to find another way of introducing a fix.
Thanks a lot for your engagement here to help making this change, I do hope you would be willing to help fixing it too. Please note that if you do and no existing test breaks, it's desirable to add a new test for that.