
analyze_commits_in_parallel error

Open cdolfi opened this issue 8 months ago • 3 comments

On our instance we are getting hundreds of copies of the same error. This fills up our logs and causes problems across our entire system.

logs.txt

06/06/2025 update:

I am seeing a few of these errors again. The magnitude is much lower: 3 out of 500+.

Repos with this error:

  • https://github.com/tianocore/edk2 - KeyError('author_timestamp')
  • https://github.com/tianocore/edk2-1 - KeyError('author_timestamp')
  • https://github.com/tianocore/edk2-codereview - TypeError('sequence item 1: expected a bytes-like object, NoneType found')

cdolfi avatar Apr 24 '25 21:04 cdolfi

Talking tonight, we think the issue is resolvable by correcting git logs that are malformed with regard to the time zone.

import datetime

def parse_git_commit_date(epoch_str, tz_offset_str):
    """Return a timezone-aware datetime, or None if the inputs are malformed."""
    try:
        timestamp = int(epoch_str)
        # Validate timezone format as ±HHMM: an explicit sign, then four digits
        if (len(tz_offset_str) != 5 or tz_offset_str[0] not in "+-"
                or not tz_offset_str[1:].isdigit()):
            raise ValueError("Malformed TZ offset")
        # Convert ±HHMM to timedelta
        sign = 1 if tz_offset_str[0] == "+" else -1
        hours = int(tz_offset_str[1:3])
        minutes = int(tz_offset_str[3:])
        if minutes > 59:
            raise ValueError("Malformed TZ offset")
        offset = datetime.timedelta(hours=hours, minutes=minutes) * sign
        # timezone() raises ValueError for offsets of 24 hours or more (e.g. -3407)
        return datetime.datetime.fromtimestamp(timestamp, tz=datetime.timezone(offset))
    except Exception as e:
        print(f"Invalid timestamp: {epoch_str} {tz_offset_str} ({e})")
        return None

sgoggins avatar Apr 29 '25 22:04 sgoggins

Closing as we are no longer seeing this error, thanks!! @Ulincsys

cdolfi avatar Jun 02 '25 19:06 cdolfi

@IsaacMilarky : can you look at this?

sgoggins avatar Jun 08 '25 22:06 sgoggins

I've seen the KeyError('author_timestamp') error on these repos:

  • tianocore/edk2
  • tianocore/edk2-1
  • tianocore/edk2-codereview

Not sure where the TypeError('sequence item 1: expected a bytes-like object, NoneType found') is coming from, though.

All of the KeyError('author_timestamp') ones come with HUGE stack traces relating to the timezone displacement (very much like the ones included in https://github.com/chaoss/augur/issues/3228)

MoralCode avatar Jul 17 '25 18:07 MoralCode

problematic date values:

  • "2018-01-24 22:36:22 -3407" (confidential-containers/edk2 and the other edk derived repos)
    • offending commit: https://github.com/tianocore/edk2/commit/630cb8507b2f1d7d7af3ac0f992d40f209dc1cee (seems like GH interprets the 3000 as a number of hours and therefore shows the timestamp as a few days sooner)
  • "2011-08-29 06:50:23 +51800" (github/rails)
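
A minimal sketch of how values like these could be flagged before they reach the database, assuming the date strings always have the shape shown above (date, time, then a signed offset). The helper name `offset_is_sane` and the 14-hour bound are assumptions for illustration (real-world UTC offsets currently span -12:00 to +14:00); this is not Augur's actual code.

```python
import re

# Hypothetical helper, not part of Augur: checks the trailing UTC offset of a
# date string shaped like "YYYY-MM-DD HH:MM:SS ±HHMM".
OFFSET_RE = re.compile(r"[+-](\d{2})(\d{2})")

# Assumed bound: real-world UTC offsets currently span -12:00 to +14:00.
MAX_OFFSET_MINUTES = 14 * 60


def offset_is_sane(date_str):
    """Return True if the last whitespace-separated field is a plausible ±HHMM offset."""
    parts = date_str.split()
    if not parts:
        return False
    match = OFFSET_RE.fullmatch(parts[-1])
    if match is None:
        return False  # wrong length or missing sign, e.g. "+51800"
    hours, minutes = int(match.group(1)), int(match.group(2))
    if minutes > 59:
        return False
    return hours * 60 + minutes <= MAX_OFFSET_MINUTES  # rejects e.g. "-3407"


for value in (
    "2018-01-24 22:36:22 -3407",
    "2011-08-29 06:50:23 +51800",
    "2018-01-24 22:36:22 -0500",
):
    print(value, "->", offset_is_sane(value))
```

Both problematic values above are rejected (the first on range, the second on length), while an ordinary offset such as -0500 passes.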

MoralCode avatar Jul 17 '25 19:07 MoralCode

2025-10-17 22:57:24 kate analyze_commits_in_parallel[551620] DEBUG Analyzing commit e2b0828ea9fe7071ce5b0dc1a35de23ca95dd8f1 for repo_id=299384
2025-10-17 22:57:24 kate analyze_commits_in_parallel[551626] ERROR Ran into issue when trying to insert commits 
 Error: (psycopg2.errors.InvalidTimeZoneDisplacementValue) time zone displacement out of range: "2011-08-29 06:50:23 +51800"
LINE 1: ...', 1, 0, 1, 'actionpack/CHANGELOG', '2011-08-29', '2011-08-2...
                                                             ^

@sgoggins said he saw this in his logs during this week's maintainers meeting (this was posted a few days late).

The commit in the DEBUG line above is from this repo: https://github.com/oroinc/orocommerce/commit/e2b0828ea9fe7071ce5b0dc1a35de23ca95dd8f1

When we looked into this, we realized this is not an actual unhandled exception; it's a log statement that prints the exception every time it is caught, as the code recurses down the list of commits to find the problematic one.

So yeah, wildly extraneous repeat logging of something that looks like an error but is not actually a problem, because it is being handled.
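
To illustrate the pattern described here, a rough sketch of a batch insert that splits on failure and logs only once, when the offending record has been isolated. All names (`insert_with_bisect`, `fake_insert`) are hypothetical, and the stand-in insert function is an assumption for the demo; this is not the code from Augur or from any particular PR.

```python
import logging

logger = logging.getLogger("bisect_insert_sketch")


def insert_with_bisect(records, insert_batch):
    """Insert records in bulk; on failure, split the batch to isolate bad records.

    `insert_batch` is a hypothetical callable that raises if any record is bad.
    Returns the list of records that could not be inserted.
    """
    if not records:
        return []
    try:
        insert_batch(records)
        return []
    except Exception as exc:
        if len(records) == 1:
            # Log once, at the point where the offending record is known.
            logger.warning("Skipping record %r: %s", records[0], exc)
            return list(records)
        # Failures of intermediate batches are expected while narrowing down,
        # so nothing is logged at ERROR level here.
        mid = len(records) // 2
        return (insert_with_bisect(records[:mid], insert_batch)
                + insert_with_bisect(records[mid:], insert_batch))


def fake_insert(batch):
    # Stand-in for a database insert: rejects the malformed offset seen above.
    if any(row.endswith("+51800") for row in batch):
        raise ValueError("time zone displacement out of range")


rows = [
    "2011-08-29 06:50:23 +0000",
    "2011-08-29 06:50:23 +51800",
    "2011-08-30 10:00:00 -0700",
]
failed = insert_with_bisect(rows, fake_insert)
print(failed)
```

With this shape, one bad commit produces a single warning rather than one error line per level of recursion.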

Logging issue solved by https://github.com/chaoss/augur/pull/3324. All timezone displacement issues should now be resolved, and any new ones discovered warrant a new issue. Closing.

MoralCode avatar Oct 24 '25 02:10 MoralCode