
Change repo uniqueness to be based on repo_src_id not url

Open cdolfi opened this issue 9 months ago • 6 comments

When a repository moves, for example from https://github.com/openai/triton to https://github.com/triton-lang/triton, both URLs can be added, and then neither fully completes collection. If the check performed before adding a repository were based on repo_src_id rather than the URL, I believe this problem could be prevented.
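A minimal sketch of the idea, assuming a simplified in-memory registry rather than Augur's actual schema or insertion code. `RepoRegistry`, `add_repo`, and the numeric ID are hypothetical names and values used only for illustration. As far as I know, GitHub keeps a repository's numeric ID stable across renames and transfers, so keying the check on that ID should catch a moved repo even when the URL differs:

```python
from dataclasses import dataclass, field


@dataclass
class RepoRegistry:
    """Toy stand-in for the repo table, keyed on repo_src_id instead of URL."""

    # Maps repo_src_id -> the URL the repo was first registered under.
    by_src_id: dict = field(default_factory=dict)

    def add_repo(self, repo_src_id: int, repo_git: str) -> bool:
        """Return True if the repo was added, False if it is a duplicate."""
        if repo_src_id in self.by_src_id:
            # Same upstream repo already tracked, possibly under an old URL.
            return False
        self.by_src_id[repo_src_id] = repo_git
        return True


registry = RepoRegistry()
TRITON_ID = 123456  # hypothetical ID, not the real GitHub ID

first = registry.add_repo(TRITON_ID, "https://github.com/openai/triton")
second = registry.add_repo(TRITON_ID, "https://github.com/triton-lang/triton")
print(first, second)
```

A URL-keyed check would accept both calls, since the two strings differ; the ID-keyed check rejects the second one.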

cdolfi avatar Mar 14 '25 14:03 cdolfi

This is an issue on instances that started before May 2024. It has been addressed and is fixable using the scripts here: https://github.com/chaoss/augur-utilities/tree/main/more_cowbell

sgoggins avatar Mar 14 '25 16:03 sgoggins

@ABrain7710: I think this is possibly not 100% fixed by our Augur patch. @cdolfi is reporting that a new repository that was a duplicate got added in December. I really thought this was patched, and I know it was tested. So perhaps an edge case was missed?

https://github.com/openai/triton to https://github.com/triton-lang/triton

sgoggins avatar Mar 14 '25 16:03 sgoggins

I verified this occurred on Padres in January of this year:

select * from repo where repo_git like '%triton'; 
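To look for duplicates in general rather than by name, one option is to group on repo_src_id. The following is only a sketch: it uses an in-memory SQLite table so it runs standalone (Augur itself uses PostgreSQL), it includes only the `repo_git` and `repo_src_id` columns mentioned in this thread, and the IDs are made up.

```python
import sqlite3

# In-memory stand-in for the repo table; the real schema has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repo (repo_git TEXT, repo_src_id INTEGER)")
conn.executemany(
    "INSERT INTO repo (repo_git, repo_src_id) VALUES (?, ?)",
    [
        ("https://github.com/openai/triton", 111),       # made-up ID
        ("https://github.com/triton-lang/triton", 111),  # same repo, new URL
        ("https://github.com/chaoss/augur", 222),        # made-up ID
    ],
)

# Any repo_src_id that appears on more than one row is a duplicate.
duplicates = conn.execute(
    """
    SELECT repo_src_id, COUNT(*) AS n
    FROM repo
    WHERE repo_src_id IS NOT NULL
    GROUP BY repo_src_id
    HAVING COUNT(*) > 1
    """
).fetchall()

for src_id, n in duplicates:
    urls = [
        row[0]
        for row in conn.execute(
            "SELECT repo_git FROM repo WHERE repo_src_id = ?", (src_id,)
        )
    ]
    print(src_id, n, urls)
```

The `IS NOT NULL` filter is there on the assumption that older rows may not have a source ID recorded yet; those would need to be backfilled before this query can see them.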

sgoggins avatar Mar 14 '25 16:03 sgoggins

@sgoggins I haven't heard about this in a while. Where are we at on this?

ABrain7710 avatar Jul 08 '25 02:07 ABrain7710

Repo source ID is now applied to frontend repo additions in https://github.com/chaoss/augur/pull/2929

This issue should be resolved as a result.

MoralCode avatar Oct 31 '25 19:10 MoralCode

Still an issue with CLI additions

Per @sgoggins's analysis:

augur/tasks/frontend.py has the method called first (add_new_github_repos). augur/application/cli/db.py seems to handle the insertion of repos at the command line (add_repos).

A brief examination of the code, [...] suggests to me that you possibly added repositories using the command line tools? I say that because it "looks like" we actually already have the GH ID in the table, and we are checking for duplicates in that interface.

We should compare the add_new_github_repos and add_repos functions to see how they differ, then refactor them to use the same methods to add repositories.

MoralCode avatar Oct 31 '25 19:10 MoralCode