Why is the parent of a commit not a Commit?

Open ttben opened this issue 7 years ago • 3 comments

I was wondering why the parent field is a String instead of a Commit. I understand that building a Commit object for every commit in the analysed Git repository would be too heavy, but a getCommit(String id) call could return a Commit object with another Commit instance as its parent. WDYT?

ttben avatar Dec 17 '17 14:12 ttben

Yes, @ttben, it's definitely due to performance reasons. The way RepoDriller works, we always need the parent commit to perform the diff. However, creating a Commit object for each parent can be too expensive (and, in most cases, not needed).

So, if you need it, you can get the parent Commit through the SCM, which is also available in your visitor.

Does that answer your question?

mauricioaniche avatar Dec 17 '17 14:12 mauricioaniche

Yes, it confirms that it is only for performance reasons.

The issue is that, for now, RepoDriller seems to handle isolated analyses very well, e.g. finding every commit that contains a given WORD, every commit by a given author, etc.

But if you want to analyse branches or the relation between old and new commits, or to integrate the notion of evolution, you have to go back and forth between Commit and SCM.

I have also worked with Rugged and its Walker. There, commits have ids, but a parent method can return a Commit object.

Maybe it's just a utility method that performs the call to the SCM for the user? Or maybe different modes: isolated (the current mode) or linked?
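To make the "utility method" idea concrete, here is a minimal sketch in Java. It does not use RepoDriller's actual classes: `CommitSource`, `Commit`, `InMemorySource`, and the `parent()` method are all hypothetical stand-ins invented for illustration. The commit still stores only its parent's id (the "isolated" representation), and the parent object is built through the source on first request and then cached. It assumes a single parent per commit, so merge commits are out of scope here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class LinkedCommitSketch {

    /** Hypothetical stand-in for an SCM that can build a commit on demand. */
    interface CommitSource {
        Commit getCommit(String id);
    }

    /** Hypothetical commit that keeps its parent as an id and resolves it lazily. */
    static final class Commit {
        private final String id;
        private final String parentId;     // null for a root commit
        private final CommitSource source; // used only when parent() is called
        private Commit parent;             // cache, filled on first successful lookup

        Commit(String id, String parentId, CommitSource source) {
            this.id = id;
            this.parentId = parentId;
            this.source = source;
        }

        String getId() { return id; }

        String getParentId() { return parentId; }

        /** Utility method: asks the source for the parent so the caller does not have to. */
        Optional<Commit> parent() {
            if (parentId == null) {
                return Optional.empty();
            }
            if (parent == null) {
                parent = source.getCommit(parentId);
            }
            return Optional.ofNullable(parent);
        }
    }

    /** In-memory source for the demo; counts how many commits it had to build. */
    static final class InMemorySource implements CommitSource {
        private final Map<String, String> parentOf = new HashMap<>();
        int lookups = 0;

        void add(String id, String parentId) {
            parentOf.put(id, parentId);
        }

        @Override
        public Commit getCommit(String id) {
            lookups++;
            if (!parentOf.containsKey(id)) {
                return null; // unknown id
            }
            return new Commit(id, parentOf.get(id), this);
        }
    }

    public static void main(String[] args) {
        InMemorySource scm = new InMemorySource();
        scm.add("c1", null);
        scm.add("c2", "c1");
        scm.add("c3", "c2");

        // Walk from the newest commit back to the root without touching the source directly.
        Optional<Commit> current = Optional.ofNullable(scm.getCommit("c3"));
        while (current.isPresent()) {
            System.out.println(current.get().getId());
            current = current.get().parent();
        }
    }
}
```

With this shape, a parent Commit is only built when someone asks for it, so isolated analyses would pay nothing extra, while history-oriented analyses could walk the chain without switching between Commit and SCM.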

This is deeply rooted in the notion of a Commit: do we consider that a Commit doesn't know its branch or its repository? Does it embed all the information about where it came from, or does the user have to go to the SCM to ask for additional information about a given commit?

Sorry for the long text. I'll try to be more concise next time ^^"

ttben avatar Dec 17 '17 14:12 ttben

Thanks for your clear and concise explanation!

That's definitely doable! We could think of implementing it, maybe in a similar way to my new PR, #111, where one can configure which data is loaded. The parent as a Commit object could be one of those options!

mauricioaniche avatar Dec 17 '17 14:12 mauricioaniche