repodriller
Why is the parent of a commit not a Commit?
I was wondering why the parent field is a String instead of a Commit.
I understand that building a Commit object for every commit in the analysed Git repository would be too heavy, but a getCommit(String id) call could return a Commit object with another Commit instance as its parent.
WDYT?
Yes, @ttben, it's definitely due to performance reasons. The way RepoDriller works, we always need the parent commit to perform the diff. However, creating a Commit object for each parent can be too expensive (and, in most cases, not needed).
So, if you need it, you can get the Commit using the SCM that you also have in your visitor.
Does that answer your question?
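For readers following along, the lookup described above could be sketched roughly as follows. This is a minimal, self-contained illustration: the Commit, Scm and InMemoryScm types here are hypothetical stand-ins written for this example, not RepoDriller's actual classes, and only the getCommit(String id) call is taken from the discussion.

```java
import java.util.HashMap;
import java.util.Map;

class ParentLookupSketch {

    // Hypothetical stand-in for a commit: the parent is kept as a plain id.
    static final class Commit {
        final String id;
        final String parentId; // null for a root commit
        final String message;

        Commit(String id, String parentId, String message) {
            this.id = id;
            this.parentId = parentId;
            this.message = message;
        }
    }

    // Hypothetical stand-in for the SCM: returns a Commit only when asked.
    interface Scm {
        Commit getCommit(String id);
    }

    // In-memory SCM, present only to make the sketch runnable.
    static final class InMemoryScm implements Scm {
        private final Map<String, Commit> commits = new HashMap<>();

        void add(Commit commit) {
            commits.put(commit.id, commit);
        }

        @Override
        public Commit getCommit(String id) {
            return commits.get(id);
        }
    }

    // What a visitor would do: resolve the parent on demand through the SCM.
    static Commit parentOf(Commit commit, Scm scm) {
        if (commit.parentId == null) {
            return null; // root commit, nothing to look up
        }
        return scm.getCommit(commit.parentId);
    }

    public static void main(String[] args) {
        InMemoryScm scm = new InMemoryScm();
        scm.add(new Commit("a1", null, "initial commit"));
        scm.add(new Commit("b2", "a1", "add feature"));

        Commit parent = parentOf(scm.getCommit("b2"), scm);
        System.out.println(parent.id + ": " + parent.message);
    }
}
```

The point of the design is that the cost of building the parent is paid only by the analyses that actually ask for it.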
Yes, as it confirms that it is only for performance reasons.
The issue is that, for now, RepoDriller seems to handle isolated analyses very well.
E.g., find every commit that has a WORD in it, every commit by a given author, etc.
But if you want to analyse branches, study the relation between old and new commits, or integrate the notion of evolution, you have to go back and forth between Commit and SCM.
I also worked with Rugged and its Walker. There, a Commit has an id, but its parent method can return a Commit object.
Maybe it's just a utility method that performs the call to the SCM for the user?
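A rough sketch of what such a utility method might look like is given below. Everything in it is an assumption made for illustration: the Commit class, its parent() method and the Function used as the SCM lookup are hypothetical, and the sketch assumes a single parent per commit (merge commits would need a list).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

class LazyParentSketch {

    // Hypothetical commit that keeps the parent id but can resolve it on request.
    static final class Commit {
        final String id;
        final String parentId;                      // null for a root commit
        private final Function<String, Commit> scm; // stand-in for the SCM lookup
        private Commit cachedParent;                // filled by the first successful lookup

        Commit(String id, String parentId, Function<String, Commit> scm) {
            this.id = id;
            this.parentId = parentId;
            this.scm = scm;
        }

        // Utility method: performs the SCM call for the user and caches the result.
        Optional<Commit> parent() {
            if (parentId == null) {
                return Optional.empty();
            }
            if (cachedParent == null) {
                cachedParent = scm.apply(parentId);
            }
            return Optional.ofNullable(cachedParent);
        }
    }

    // Follows parent() links back to the root and counts the commits visited.
    static int depth(Commit commit) {
        int depth = 1;
        Optional<Commit> current = commit.parent();
        while (current.isPresent()) {
            depth++;
            current = current.get().parent();
        }
        return depth;
    }

    public static void main(String[] args) {
        Map<String, Commit> store = new HashMap<>();
        Function<String, Commit> lookup = store::get;
        store.put("a1", new Commit("a1", null, lookup));
        store.put("b2", new Commit("b2", "a1", lookup));
        store.put("c3", new Commit("c3", "b2", lookup));

        System.out.println(depth(store.get("c3"))); // c3, b2, a1 -> prints 3
    }
}
```

With this shape the id stays cheap to carry around, and no parent object is built until parent() is first called.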
Maybe different modes: isolated (current mode) or linked?
This is deeply rooted in the notion of a Commit: do we consider that a Commit knows neither its branch nor its repository? Does it embed all the information about where it came from, or does the user have to go to the SCM to ask for additional information about a given commit?
Sorry for the long text. I'll try to be more concise next time ^^"
Thanks for your clear and concise explanation!
That's definitely doable! We could think of implementing it, maybe in a similar way to my new PR, #111, where one can configure which data is loaded, and parent as a Commit object could be one of the options!
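To make the idea concrete, here is one way an opt-in "linked" mode could look. It is only a sketch under stated assumptions: the Load enum, the load method and the Commit class are hypothetical names invented for this example and do not describe what PR #111 actually implements.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class LoadOptionsSketch {

    // Hypothetical switches for optional data; other switches would go here.
    enum Load { PARENT_COMMIT }

    static final class Commit {
        final String id;
        final String parentId; // always available and cheap
        Commit parent;         // only set when Load.PARENT_COMMIT is requested

        Commit(String id, String parentId) {
            this.id = id;
            this.parentId = parentId;
        }
    }

    // Hypothetical loader: history maps a commit id to its parent id (null for a root).
    static Map<String, Commit> load(Map<String, String> history, Set<Load> options) {
        Map<String, Commit> commits = new HashMap<>();
        for (Map.Entry<String, String> entry : history.entrySet()) {
            commits.put(entry.getKey(), new Commit(entry.getKey(), entry.getValue()));
        }
        if (options.contains(Load.PARENT_COMMIT)) {
            // Linked mode: wire each commit to its parent object.
            for (Commit commit : commits.values()) {
                commit.parent = commit.parentId == null ? null : commits.get(commit.parentId);
            }
        }
        return commits;
    }

    public static void main(String[] args) {
        Map<String, String> history = new HashMap<>();
        history.put("a1", null);
        history.put("b2", "a1");

        // Isolated mode (the default): only the parent id is known.
        Map<String, Commit> isolated = load(history, EnumSet.noneOf(Load.class));
        System.out.println(isolated.get("b2").parentId);

        // Linked mode: the parent is a Commit object.
        Map<String, Commit> linked = load(history, EnumSet.of(Load.PARENT_COMMIT));
        System.out.println(linked.get("b2").parent.id);
    }
}
```

Keeping the isolated behaviour as the default would mean existing studies pay no extra cost unless they ask for linked parents.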