go-git icon indicating copy to clipboard operation
go-git copied to clipboard

Help diagnosing Blame concurrency problems

Open TheDahv opened this issue 7 years ago • 3 comments

Go Version: go version go1.8 darwin/amd64 go-git Version: fresh install with go get gopkg.in/src-d/go-git.v4/...

First off, thank you for this project! I was very excited when I saw an alternative to git2go that didn't have a dependency on libgit2 and that had a blame implementation. For context, I'm working on a problem that involves scanning files that were changed in a branch, blaming each of them, and then determining experience by committer via git-blame to recommend a reviewer for a pull request.

This one will be odd to write because I don't know what exactly is triggering this problem. For the example code, I have 1 simple repo in which this works and another more complex one where it doesn't. So I apologize in advance; I'll do my best to describe what I'm seeing and hopefully you can guide me toward providing better info.

Description

I'm unable to Blame multiple files concurrently at the same commit. My motivation here was to run the blames in parallel in light of known performance issues .

Specifically I get 2 kinds of crashes:

  • fatal error: concurrent map writes from gopkg.in/src-d/go-git.v4/plumbing/format/packfile.(*Decoder).decodeByHeader
  • fatal error: concurrent map read and map write from gopkg.in/src-d/go-git.v4/storage/filesystem.(*ObjectStorage).findObjectInPackfile

Here's a stack trace for the bug from the repro program below.

The former would fire immediately in serial contexts. The latter I got in a concurrent context after adding locks to the Decoder methods to guard the hashToOffset and offsetToHash maps.

Here's a little diff patch gist to show what I tried changing to protect against the crashes. Unfortunately this slowed performance to a crawl.

Frankly I don't feel like sprinkling locks in without a firmer understanding of the design goals is an effective approach for me anyway.

Repro

Here's a little snippet from my larger code that illustrates how I isolated the bugs.

In terms of which kinds of repo sets this up, I'm not really sure I know what the main difference is. Perhaps you have some guidance on how I can set up test scenarios to isolate that; I admit I don't have much background in setting up phony git objects and files.

The branch I'm working with has a file rename in it as well as code that has moved from files to the new file. Hard to say if that's a red herring or not.

Open Questions

  • I saw that you're reworking the Blame functionality now. Is concurrency expressly a design goal?
  • Would I be better served making independent instances of repo or commit objects to isolate the resources when calculating blames? It might solve my concurrency problem but seems wasteful.
  • I'm new to the project and saw there are different options for Storer and FileSystem options. I wasn't able to pick up from the documentation how to choose from among them, but I assumed I wanted the FileSystem option. Will I gain anything by exploring these options further?

TheDahv avatar Jul 01 '17 19:07 TheDahv

@TheDahv Concurrency using the filesystem backend is currently not supported. It is a limitation which we should document. We do not plan to work on supporting concurrency on that side at the moment, but we will accept PRs improving support if they are not disruptive on the public API.

smola avatar Jul 04 '17 08:07 smola

Understood! Thank you for responding, and I think adding a note to the documentation is a perfectly acceptable response.

TheDahv avatar Jul 05 '17 22:07 TheDahv

@smola Just submitted a pull request to address this

lxjhk avatar Dec 30 '19 09:12 lxjhk