incubator-devlake
feat #6615, replace libgit2 with go-git.
⚠️ Pre Checklist
Please complete ALL items in this checklist, and remove before submitting
- [x] I have read through the Contributing Documentation.
- [x] I have added relevant tests.
- [x] I have added relevant documentation.
- [x] I will add labels to the PR, such as pr-type/bug-fix, pr-type/feature-development, etc.
Summary
What does this PR do?
To replace libgit2 with go-git, this PR adds a new config option, USE_GO_GIT_IN_GIT_EXTRACTOR, to env.sample. When USE_GO_GIT_IN_GIT_EXTRACTOR is set to 1, the GitExtractor plugin uses go-git to collect the repo's data.
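For illustration, here is a minimal sketch of how such a toggle could be read. It assumes the setting arrives as a plain environment variable and is parsed with `strconv.ParseBool`; the helper `useGoGit` is hypothetical, and the way DevLake actually loads its config may differ.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// envKey is the setting named in this PR.
const envKey = "USE_GO_GIT_IN_GIT_EXTRACTOR"

// useGoGit reports whether the raw setting value enables the go-git backend.
// Hypothetical helper: an unset or unparsable value falls back to libgit2.
func useGoGit(raw string) bool {
	enabled, err := strconv.ParseBool(raw)
	return err == nil && enabled
}

func main() {
	// Read the flag from the process environment (an assumption for this sketch).
	if useGoGit(os.Getenv(envKey)) {
		fmt.Println("gitextractor backend: go-git")
	} else {
		fmt.Println("gitextractor backend: libgit2")
	}
}
```

Falling back to libgit2 on an empty or malformed value keeps the existing behavior as the default, which seems the safer choice while both backends coexist.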
Does this close any open issues?
Closes #6615
Screenshots
Include any relevant screenshots here.
Other Information
Any other information that is important to this PR.
In gitextractor, libgit2 produces a table named commit_line_change, but go-git cannot fetch that information.
commit_line_change is not used in any dashboards or processes, so the go-git path simply skips this table.
My thoughts:
- I think `GitRepo` should be turned into an interface with two impls: GoGitRepo and LibGitRepo. Refactor the appropriate methods. This creates better abstraction and less coupling in the code.
- Write a temporary main function that uses LibGitRepo to grab data from a repo and perform the extraction functions. Capture the output in CSV files (like you're doing).
- Write a test suite that uses GoGitRepo to run the same functions on the same repo as above. Test the same functions against the output CSV files from above (i.e. compare results).
- Once all is good, get rid of the LibGitRepo impl and the temporary main function.
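A rough sketch of the shape this suggestion describes, not the actual DevLake code: the `Commit` type, the `CollectCommits` method, and the `sameCommits` helper are hypothetical names, both implementations are stubs returning canned data, and the comparison is done in memory rather than against CSV files.

```go
package main

import (
	"fmt"
	"reflect"
)

// Commit is a simplified, hypothetical record standing in for extracted data.
type Commit struct {
	SHA     string
	Message string
}

// GitRepo is the proposed abstraction; the method set here is an assumption.
type GitRepo interface {
	CollectCommits() ([]Commit, error)
}

// LibGitRepo stands in for the libgit2-backed implementation (stub).
type LibGitRepo struct{ commits []Commit }

func (r *LibGitRepo) CollectCommits() ([]Commit, error) { return r.commits, nil }

// GoGitRepo stands in for the go-git-backed implementation (stub).
type GoGitRepo struct{ commits []Commit }

func (r *GoGitRepo) CollectCommits() ([]Commit, error) { return r.commits, nil }

// sameCommits runs both backends and reports whether their outputs match exactly.
func sameCommits(a, b GitRepo) (bool, error) {
	ca, err := a.CollectCommits()
	if err != nil {
		return false, err
	}
	cb, err := b.CollectCommits()
	if err != nil {
		return false, err
	}
	return reflect.DeepEqual(ca, cb), nil
}

func main() {
	data := []Commit{{SHA: "a1", Message: "init"}}
	ok, err := sameCommits(&LibGitRepo{commits: data}, &GoGitRepo{commits: data})
	fmt.Println("outputs match:", ok, "error:", err)
}
```

Because callers depend only on `GitRepo`, dropping `LibGitRepo` later would not require changes on the calling side.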
Turning GitRepo into an interface is a good idea! I'll follow your advice.
@keon94 @klesh Please review this PR. I have updated the main part, and won't add new features.
I'm glad the libgit2 dependency will be eliminated at last. It has caused too much inconvenience to the developers.
Yes, go-git isn't fully equivalent to libgit2 (commit_line_change cannot be collected with go-git). I hope go-git will satisfy DevLake's requirements.
@mindlesscloud Would you like to take a look at the PR when you find time?😊
@mindlesscloud If there are no additional comments, please approve this PR. Thx.