github-pullrequest-resource icon indicating copy to clipboard operation
github-pullrequest-resource copied to clipboard

Prevent `check` from missing versions

Open timrchavez opened this issue 7 years ago • 11 comments

There are situations where new versions of the PR resource will can get lost. This happens because the versions that are returned are sorted by when they were last updated. Comments on PRs can affect the update time. This means that between checks, if the version Concourse passes to the check script is commented on, Concourse will only consider anything updated after this PR a valid version. This change puts the version Concourse passes to the check script at the start of the list of versions it returns. This ensures that a valid version does not get skipped.

timrchavez avatar Jan 25 '18 22:01 timrchavez

Is this whole point to put the input.version at the beginning of the list? If so, there's a cleaner way. I just making sure I'm grasping what's going on.

jtarchie avatar Jan 26 '18 03:01 jtarchie

@jtarchie Yep... yeah my Ruby is... well yeah.

timrchavez avatar Jan 26 '18 14:01 timrchavez

Right now the PRs and SHAs are the version object used in concourse. I wonder if I should add updated_at date, that way, given a version, all PRs that have been updated since the last known version would be returned.

This would be a breaking change though.

jtarchie avatar Jan 27 '18 18:01 jtarchie

What has this change, in the PR, fixed for you?

jtarchie avatar Jan 27 '18 18:01 jtarchie

@jtarchie It prevented the scenario where the input version update time changed and caused us to lose legitimate PR #'s... here's an example (* denotes input version in list):

Today

<push commit B>
<check C>
V=[A,C*,B]
new_versions = [B]
<push commit C>
(V_1=[A,B*,C])
<comment B>
(V_2=[A,C,B*])
<check B>
V=[A,C,B*]
new_versions = [] <--- this is the problem, the pushed commit to C is lost

With Change

<push commit B>
<check C>
V=[C*,A, B]
new_versions = [A,B]
<push commit C>
(V_1=[B*,A,C])
<comment B>
(V_2=[B*,A,C])
<check B>
V=[B*,A,C]
new_versions = [A,C]

Concourse will skip over versions it's already seen when get occurs or at least this is what appears to be happening in my testing (I won't pretend I understand fully the algorithm :)) -- in this case A will not build multiple times because it's already been seen, thus letting us achieve the desire result.

timrchavez avatar Jan 29 '18 16:01 timrchavez

The more I look at this, I don't think it actually solves the problem. It might appear to, but in the long run we are just masking the issue.

If the updated timestamp was added to the version for all PRs, then we could filter does all new PRs greater the previous updated timestamp.

It would be a breaking change though. Some peeps might need to reset their pipelines to flush the previous version history of PRs.

jtarchie avatar Feb 12 '18 20:02 jtarchie

The more I investigate this, the only solution is going to be using the github GraphQL API. It allows minimizing the API requests, but getting all the information needed to sort and filter PRs correctly.

I just don't have the time to do this. :-/

jtarchie avatar Feb 22 '18 16:02 jtarchie

The more I look at this, I don't think it actually solves the problem. It might appear to, but in the long run we are just masking the issue.

How does this solution mask the issue?

The more I investigate this, the only solution is going to be using the github GraphQL API. It allows minimizing the API requests, but getting all the information needed to sort and filter PRs correctly.

How would using GraphQL API solve this problem? My understanding is that the problem isn't inherent in how the data is retrieved from GitHub; the problem is one of ordering.

bhcleek avatar Mar 07 '18 20:03 bhcleek

The resource was a hack that mapped PRs on to the resource model of concourse. The order of PRs comes in on the most recent update. Removing the most recently known version doesn't actually solve a problem. Basically a hack already on a hack. I'm not going to explain this again. Please go through the issue history for more info.

The GraphQL would solve the issue because I could actually determine versions of a PR based on the latest commit pushed date. Rather than the mess of the updated_at of the PR can be (comments, labels, etc.)

jtarchie avatar Mar 07 '18 22:03 jtarchie

FWIW it's not removing the most recently known version, it's placing it at the front of the list. This means that Concourse will scan over the entire list on every check, which seems to be okay as it'll skip over versions it's already seen (they won't retrigger). I do agree it's a hack on a hack, but it's one that prevents us from missing PRs, so from a UX perspective I think it's :+1: Anyway...

timrchavez avatar Mar 08 '18 00:03 timrchavez

Just FYI: I was also hit by this bug and tried this PR for quite awhile but then stumbled upon another problem: some PRs were tested multiple times (I'm not sure why – maybe comments trigger it.) At the same time, some PRs still get lost sometimes. So maybe @jtarchie is right, and the only viable solution is to switch to the new GraphQL API, but it sounds like a lot of work...

leshik avatar Apr 25 '18 03:04 leshik