kargo icon indicating copy to clipboard operation
kargo copied to clipboard

add "since" option (or similar tbd feature) to git subscriptions

Open krancour opened this issue 4 months ago • 4 comments

Discussed in https://github.com/akuity/kargo/discussions/5074

Originally posted by shamsalmon September 18, 2025 Git is extremely slow pulling from monorepos with large commit histories. I have determined the primary reason for this not directly repo size, but rather how many git log operations are being run and the history of the repo itself.

The main issue is with the logic defined here: https://github.com/akuity/kargo/blob/main/internal/controller/git/work_tree.go#L395

There is no way to set filters within this work tree function. My proposal is to add several options here:

  1. Set the git log path to only look at folders within the IncludePath to reduce history
  2. Ability to set a --since argument ("1 year ago" for ex) to reduce history

There are more potential improvements here, such as also making ExcludePaths transition into the pathspec format for git log.


More context. We've nailed down the root cause and it's this:

https://github.com/akuity/kargo/discussions/5074#discussioncomment-14457660

This is the tentative solution:

Your --since idea might be the most viable thing here, as when you are creating a subscription involving new apps or environments in your monorepo, it should probably be relatively easily to say, "hey, this stuff didn't exist before this date, so if you go back that far and haven't hit the discovery limit, you're done."

I'd like @hiddeco and myself to discuss that further before we commit to that approach, but I'm tentatively including this in v1.9.0 because the ratio of pain relief to effort is very high, so I'd like to see us address it soon-ish.

krancour avatar Sep 19 '25 16:09 krancour

I added this to a fork we run where I modified the selectCommits function to do this:

			if len(selectedCommits) == 0 && commit.CommitDate.Before(n.sinceDate) {
				return nil, fmt.Errorf("no commits found in branch before %s", n.sinceDate.Format("2006-01-02"))
			}

			if len(selectedCommits) >= 1 && commit.CommitDate.Before(n.sinceDate) {
				return trimSlice(selectedCommits, n.discoveryLimit), nil
			}

Would you like me to submit a PR with the crd / function changes or do you want to discuss internally more about the best way to approach this?

shamsalmon avatar Sep 29 '25 20:09 shamsalmon

@shamsalmon, if you want to give the cliff notes version of your changes by first showing tentative changes to API types, that will be a good springboard for further conversation. Just be aware that some patience will be in order. We're juggling a lot of high priority issues right now.

krancour avatar Sep 30 '25 00:09 krancour

@shamsalmon, if you want to give the cliff notes version of your changes by first showing tentative changes to API types, that will be a good springboard for further conversation. Just be aware that some patience will be in order. We're juggling a lot of high priority issues right now.

thats no problem we are operating off a fork right now so there is no rush to get this in. would be nice to have by 1.9. ill submit a draft PR with the API types.

shamsalmon avatar Sep 30 '25 00:09 shamsalmon

Hi @krancour - Added my draft PR here https://github.com/akuity/kargo/pull/5359 so we could attempt to get this into 1.9.0 if still feasible. Let me know what you think of the implementation. It is slightly different than we have running in our fork because we never allow git to look past 01/01/2025 so I changed the type to a pointer to check if its null or not.

shamsalmon avatar Nov 06 '25 18:11 shamsalmon