libgit2sharp
repo.Commits.QueryBy(filename) slow on large repos
Reproduction steps
1) Clone a large repo.
2) Run this function on that repo with some random file:
public IEnumerable<string> TestSlow(string filename)
{
    using (var repo = new Repository(repoRoot))
    {
        // Convert the absolute file name to a repo-relative path with forward slashes.
        string path = filename.Substring(repoRoot.Length + 1).Replace("\\", "/");
        foreach (LogEntry entry in repo.Commits.QueryBy(path))
        {
            yield return entry.Commit.Author.ToString();
        }
    }
}
3) Run this command on the same file: "git log --follow --oneline -- "
Expected behavior
I expect TestSlow and the git log command above to take a similar amount of time.
Actual behavior
The git log command finishes in about 1.6 ms on my repo. The TestSlow function takes about 70 seconds.
Here is what I see in my profiler:

Version of LibGit2Sharp (release number or SHA1)
0.27.0-preview-0017, 0.26.1, 0.24.1
Operating system(s) tested; .NET runtime tested
.NET Framework 4.7.2 on Windows 10
As a note: I tried different sorting options, but was limited because FileHistory doesn't support None or Reverse:
System.ArgumentException: Unsupported sort strategy. Only 'Topological', 'Time', or 'Topological | Time' are allowed.
Parameter name: queryFilter
at LibGit2Sharp.Core.FileHistory..ctor(Repository repo, String path, CommitFilter queryFilter) in C:\projects\libgit2sharp\LibGit2Sharp\Core\FileHistory.cs:line 76
I just submitted a proposed solution. The change to FileHistory makes it run in 14 seconds instead of 80, and the change in Tree got it down to about 8 seconds.
I see that continuous integration failed, but that looks like a CI problem, not a problem with my code. Here is the error (Linux only):
========================== Starting Command Output ===========================
[command]/bin/bash --noprofile --norc /home/vsts/work/_temp/4d8a90ac-c758-402b-94d3-740f95fba16d.sh
/usr/share/dotnet/sdk/2.2.105/NuGet.targets(499,5): error : Could not find a part of the path '/tmp/NuGetScratch/e31463d7-84e6-4141-aa64-0e5166476164'. [/home/vsts/work/1/s/LibGit2Sharp/LibGit2Sharp.csproj]
##[error]Bash exited with code '1'.
##[section]Finishing: CmdLine
Awesome work! I hit the same issue and had to wrap git.exe and use git log to speed up reading the log.
@Blueve Could you share your code for parsing the git.exe output?
We read the command-line output of a git log command, such as: git --no-pager log --date-order --no-merges --no-renames --pretty=format:@/%H/ --stat=512 -- {0} where {0} is the formatted filter. Other commands and parameters can be used to satisfy different needs.
We then parse the output string line by line. The output has the following format:
<empty line>
<commit SHA line, starting with @/ and ending with />
<file path> | <changed lines>
<file path> | <changed lines>
...
<file path> | <changed lines>
Sorry, I can't share the full code since it is only visible internally.
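For anyone who wants a starting point, here is a minimal sketch of a parser for the output format described above. It is not the code referred to in this comment. The names StatEntry and GitLogStatParser are hypothetical, and the sketch assumes the lines have already been read from the standard output of the git process. It also assumes that any line without a " | " separator (for example the "N files changed" summary that --stat prints) can be skipped.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical result type: one changed file within one commit.
public sealed class StatEntry
{
    public string Sha;      // commit hash taken from the "@/<sha>/" marker line
    public string Path;     // file path as printed by --stat
    public string Changes;  // text after the " | " separator, e.g. "3 ++-"
}

// Hypothetical helper, not part of LibGit2Sharp.
public static class GitLogStatParser
{
    public static List<StatEntry> Parse(IEnumerable<string> lines)
    {
        var result = new List<StatEntry>();
        string currentSha = null;

        foreach (string raw in lines)
        {
            string line = raw.Trim();
            if (line.Length == 0)
            {
                continue; // blank separator between commits
            }

            // Commit marker line: starts with "@/" and ends with "/".
            if (line.Length > 3
                && line.StartsWith("@/", StringComparison.Ordinal)
                && line.EndsWith("/", StringComparison.Ordinal))
            {
                currentSha = line.Substring(2, line.Length - 3);
                continue;
            }

            // File line: "<file path> | <changed lines>".
            int bar = line.LastIndexOf(" | ", StringComparison.Ordinal);
            if (currentSha == null || bar < 0)
            {
                continue; // assumed to be a summary line or other noise
            }

            result.Add(new StatEntry
            {
                Sha = currentSha,
                Path = line.Substring(0, bar).Trim(),
                Changes = line.Substring(bar + 3).Trim()
            });
        }

        return result;
    }
}
```

Keeping the parser independent of how git is launched means it can be checked against a few captured lines before it is wired to a real process.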