Deleted file never vanishes

Open Liz4v opened this issue 9 years ago • 13 comments
I generated a Gource video of one repository of mine: https://github.com/leandigo/django-oneall

It seems to assume that the file oneall/django_app/models.py is still around at the end. Alas, it was removed at revision d34833f

Used Gource v0.43 on Mac 10.11.4 Homebrew.

stale

Liz4v avatar May 14 '16 22:05 Liz4v

Hi! Looking at the logfile generated from your repo and selecting all related info...

1431577825|Ekevoo|A|/oneall/django_app/models.py
1433849539|Ekevoo|M|/oneall/django_app/models.py
1433904622|Ekevoo|M|/oneall/django_app/models.py
1433987454|Ekevoo|M|/oneall/django_app/models.py
1433989141|Ekevoo|M|/oneall/django_app/models.py
1433991898|Ekevoo|M|/oneall/django_app/models.py

That file never gets deleted
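To check this kind of thing without reading the log by eye, here is a minimal sketch that replays a custom log and reports which paths are still present at the end. It assumes the pipe-separated layout shown above (timestamp, user, action, path); the function name `surviving_files` is made up for illustration and is not part of Gource.

```python
def surviving_files(log_lines):
    """Return the set of paths whose last recorded action is not a delete."""
    alive = set()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        # Assumed layout: timestamp|user|action|path (extra fields are ignored).
        parts = line.split("|")
        if len(parts) < 4:
            continue
        action, path = parts[2], parts[3]
        if action == "D":
            alive.discard(path)
        else:
            alive.add(path)
    return alive


sample = [
    "1431577825|Ekevoo|A|/oneall/django_app/models.py",
    "1433991898|Ekevoo|M|/oneall/django_app/models.py",
]
print(surviving_files(sample))
```

With the entries quoted above, the path stays in the result because no D line for it ever appears in the log.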

mathieu-aubin avatar May 16 '16 12:05 mathieu-aubin

Hi Mathieu, would that be an upstream git bug then?

I'm not sure what the command to generate logs is. git log displays only authors and messages; if I add --dirstat it looks markedly different than what you posted.

Liz4v avatar May 17 '16 04:05 Liz4v

I generated this log with gource, using the command from the README:

gource --output-custom-log my-project-custom.log

maybe use gitk to browse

mathieu-aubin avatar May 17 '16 18:05 mathieu-aubin

Well, that's still on gource then.

Liz4v avatar May 17 '16 18:05 Liz4v

That's very true. Hence the "maybe use gitk to browse", meaning look at your git logs; gource did not invent the file. Could it be coming from a merged branch? I wish I could be of more help.

mathieu-aubin avatar May 17 '16 19:05 mathieu-aubin

You can see the command used by gource to generate the input log file from git:

gource --git-log-command

Currently:

git log --pretty=format:user:%aN%n%ct --reverse --raw --encoding=UTF-8 --no-renames

You can run your own command and save the output to a file. Provided the file is in the same format, Gource will read it. If there is a more accurate command it could use, it would be good to know.

acaudwell avatar May 20 '16 23:05 acaudwell

Okay, I've investigated a lot, and here are the missing pieces.

There were two branches during June last year. There was a lot of activity in the green develop branch, and there was a bugfix for the models.py file in the black/purple master branch. [image: Develop] One of the first things done in the develop branch was a directory move, which gource handled as expected. The intertwined activity in the master branch then made the models.py file reappear, which is a bit weird but completely understandable.

However, down the road, there's a merge commit 0d32878 and it includes a delete of that file (oneall/django_app/models.py) along with several other modifications. [image: Merge]

Still, that command (which I modified to display the commit hash) does not list a single modified file for this particular commit! Only the previous one (4ec4c6e) and the next one (da69603).

$ git log --pretty=format:user:%aN%n%ct\ %H --reverse --raw --encoding=UTF-8 --no-renames
(…snip…)
user:Ekevoo
1438223791 4ec4c6eb8a88e11a46790d7f3d5492f7d31c6c84
:100644 100644 1a8328f... 09c6d8e... M  oneall/django_oneall/management/commands/legacyimport.py

user:Ekevoo
1438224809 0d328789f4ea0f802de4dbbafea3605184d5c72c
user:Ekevoo
1441078837 da6960312cfa8a601ceff1a4a7378384f1a372ef
:100644 100644 eb27d58... 8ab8ca4... M  oneall/django_oneall/auth.py
:100644 100644 cd83385... 3e4cc68... M  oneall/django_oneall/templates/oneall/login.html
:100644 100644 6bc1701... 3f7adf9... M  oneall/django_oneall/views.py
(…snip…)

I'm not sure what to suggest now.
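For anyone who wants to find such commits in their own repository, here is a rough sketch that scans output in the format above and lists commits with no file entries. It assumes the modified format string that prints the hash after the timestamp; `commits_without_changes` is a made-up helper name.

```python
def commits_without_changes(log_text):
    """Return hashes of commits that list no ':'-prefixed raw entries."""
    empty = []
    current = None         # hash of the commit currently being read
    changes = 0            # raw entries seen for that commit
    expect_header = False  # True right after a "user:" line
    for line in log_text.splitlines():
        if line.startswith("user:"):
            if current is not None and changes == 0:
                empty.append(current)
            current, changes, expect_header = None, 0, True
        elif expect_header:
            # Assumed layout: "<timestamp> <hash>"
            parts = line.split()
            current = parts[1] if len(parts) > 1 else None
            expect_header = False
        elif line.startswith(":"):
            changes += 1
    if current is not None and changes == 0:
        empty.append(current)
    return empty


sample = """user:Ekevoo
1438223791 4ec4c6eb8a88e11a46790d7f3d5492f7d31c6c84
:100644 100644 1a8328f... 09c6d8e... M\toneall/django_oneall/management/commands/legacyimport.py

user:Ekevoo
1438224809 0d328789f4ea0f802de4dbbafea3605184d5c72c
user:Ekevoo
1441078837 da6960312cfa8a601ceff1a4a7378384f1a372ef
:100644 100644 eb27d58... 8ab8ca4... M\toneall/django_oneall/auth.py
"""
print(commits_without_changes(sample))
```

On the excerpt above this reports only the merge commit 0d32878..., which matches what the raw output shows.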

Liz4v avatar May 22 '16 17:05 Liz4v

I have a little bodged script that passes through the log generated by gource's default git log command to make sure no deleted files get modified, thus re-added:

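files = set()  # paths currently known to exist

def test_file(file, action):
	"""Return True if this log entry should be kept."""
	if action == 'A':
		files.add(file)
		return True
	elif action == 'D':
		# Drop deletes of files that were never added.
		try:
			files.remove(file)
		except KeyError:
			return False
		return True
	elif action == 'M':
		# Drop modifications of files that are not (or no longer) present.
		return file in files
	# Any other status letter is kept unchanged.
	return True

# Write to a fresh file so repeated runs do not append to old output.
with open("log.txt", "r") as src, open("new_log.txt", "w") as dst:
	for line in src:
		l = line.split("\t")
		if len(l) == 2:
			# l[0] ends with the status letter, l[1] is the path.
			if test_file(l[1], l[0][-1]):
				dst.write(line)
		else:
			dst.write(line)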

tienne-B avatar Dec 23 '19 23:12 tienne-B

This also happens on my repository.

[image] This file has not existed in my repository for over a year now. It's some weird Microsoft FrontPage junk file.

Same thing goes for this file, whose name was unfortunately so poor that I had to blur it. The file's parent folder was renamed, and the file was later deleted. [image]

PF94 avatar Oct 13 '20 21:10 PF94

The best method to solve this problem that I found is to just linearize your git history first with

git filter-branch --parent-filter 'cut -f 2,3 -d " "'

before you run gource. This will just avoid any kind of problem with files not disappearing due to merge commits. ATTENTION: Do this with a fresh checkout, not with something you are working on!

FlorianWilhelm avatar Jan 04 '21 17:01 FlorianWilhelm

There must be a better solution than rewriting history.

andyquinterom avatar Jan 21 '22 21:01 andyquinterom

From my perspective the git log output is insufficient because it omits merge commits. This problem would not exist if merge commits were considered.

Liz4v avatar Jan 21 '22 21:01 Liz4v

If we add the option --first-parent to git log the problem seems to solve itself. I will open a PR with the changes.
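Until that lands, a possible workaround is to generate the log yourself with the extra flag and hand the file to Gource. The sketch below only assembles the command quoted earlier in this thread plus --first-parent; the helper names `git_log_command` and `write_log` are made up, and whether the result is correct for every history is an assumption, not something verified here.

```python
import subprocess

# The command reported by `gource --git-log-command` earlier in this thread.
BASE_COMMAND = [
    "git", "log",
    "--pretty=format:user:%aN%n%ct",
    "--reverse", "--raw", "--encoding=UTF-8", "--no-renames",
]


def git_log_command(first_parent=True):
    """Return the git log argument list, optionally with --first-parent."""
    cmd = list(BASE_COMMAND)
    if first_parent:
        cmd.append("--first-parent")
    return cmd


def write_log(repo_dir, out_path, first_parent=True):
    """Run git log inside repo_dir and save its output to out_path."""
    result = subprocess.run(
        git_log_command(first_parent),
        cwd=repo_dir,
        check=True,
        capture_output=True,
    )
    with open(out_path, "wb") as out:
        out.write(result.stdout)


print(" ".join(git_log_command()))
```

Calling `write_log(".", "first-parent.log")` from a fresh checkout would produce a log file in the same format as before, which, as noted above, Gource should be able to read.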

andyquinterom avatar Jan 21 '22 21:01 andyquinterom