Add() a single file is very slow

Open rustyx opened this issue 4 years ago • 5 comments

Worktree.Add(file) calls Status(), which walks the entire worktree. In addition, Add() takes only a single file, so adding multiple files in a large repo becomes an O(N²) operation (excruciatingly slow).

rustyx avatar Dec 11 '19 09:12 rustyx
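
To make the quadratic cost in the report above concrete, here is a minimal cost model in Go. It does not call go-git; `statusScanCost` and `addOneByOneCost` are hypothetical names, and the model assumes only what the report states: each Add() triggers one full status pass that visits every file in the worktree.

```go
package main

import "fmt"

// statusScanCost models one full-worktree status pass:
// one visit per file in the worktree. (Hypothetical helper.)
func statusScanCost(worktreeFiles int) int {
	return worktreeFiles
}

// addOneByOneCost models adding filesToAdd files one at a time,
// where every single add triggers a full status pass. (Hypothetical helper.)
func addOneByOneCost(worktreeFiles, filesToAdd int) int {
	visits := 0
	for i := 0; i < filesToAdd; i++ {
		visits += statusScanCost(worktreeFiles)
	}
	return visits
}

func main() {
	// Adding every file in a worktree of n files costs n*n visits in this model.
	for _, n := range []int{100, 1000, 10000} {
		fmt.Printf("files=%d visits=%d\n", n, addOneByOneCost(n, n))
	}
}
```

In this model the number of file visits grows with the product of the worktree size and the number of files added, which is why a repo with thousands of files feels so slow.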

This is killing me too. I was wondering if the status could be cached in the worktree. I intend to tinker with this, but if someone has time to work on this, please pick it up.

maguro avatar Jan 24 '20 05:01 maguro

Caching status won't work that cleanly, it seems. In reality, all it's being used for is to check whether the file is unmodified. Rather than boiling the ocean for that simple check, I'll replace it with an on-demand check.

maguro avatar Jan 24 '20 15:01 maguro
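
One way to picture an on-demand check like the one proposed above is to hash only the file being added and compare it with the hash already recorded for that path. The sketch below is an assumption, not go-git's actual implementation: `isUnmodified` and the `index` map are hypothetical stand-ins for the real index. Only the blob hash format (SHA-1 over `blob <size>\0` followed by the content) is standard Git behavior.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// blobHash returns the SHA-1 that Git assigns to a blob:
// sha1("blob <size>\x00" + content).
func blobHash(content []byte) string {
	h := sha1.New()
	fmt.Fprintf(h, "blob %d\x00", len(content))
	h.Write(content)
	return hex.EncodeToString(h.Sum(nil))
}

// isUnmodified reports whether path is already recorded in index with the
// same blob hash as content. It touches one file only, never the whole
// worktree. The index map (path -> blob hash) is a hypothetical stand-in
// for a real Git index.
func isUnmodified(index map[string]string, path string, content []byte) bool {
	recorded, ok := index[path]
	return ok && recorded == blobHash(content)
}

func main() {
	index := map[string]string{"README.md": blobHash([]byte("v1\n"))}

	fmt.Println(isUnmodified(index, "README.md", []byte("v1\n"))) // same content
	fmt.Println(isUnmodified(index, "README.md", []byte("v2\n"))) // changed content
	fmt.Println(isUnmodified(index, "new.txt", []byte("v1\n")))   // untracked path
}
```

The cost of this check depends only on the size of the one file being added, not on the number of files in the repository.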

Interesting, I don't think there are tests for adding files that are already tracked.

maguro avatar Jan 24 '20 15:01 maguro

This is crushing me as well. I have a repo with thousands of files. I've done a lot of work to make my workflow entirely in memory, using libasciidoc to get really fast document processing, but then this Add() performance basically steals back all the gains I achieved.

gdamore avatar Mar 18 '20 06:03 gdamore

Hey, we encountered this issue as well in Kubernetes release engineering. We're trying to add a single file to kubernetes/kubernetes, which takes roughly 5 minutes to succeed. Do you have any idea how to overcome this obstacle?

saschagrunert avatar Mar 25 '20 16:03 saschagrunert