
Clone with branches leads to infinite loop

Open Sterbsli opened this issue 10 years ago • 13 comments

Hi all

I encountered a problem while cloning a TFS Repository into a Git Repository with Full History.

A simplified view of my TFS Repository looks like this:

             MAIN
             /     \
           /         \
  Dev/Dev       Release/UAT       <--- C5315 (Caused the problem)

I cloned the Repository with following command

D:\GitTfs-0.23.0\git-tfs.exe clone --with-branches --with-labels -d <tfsServer> $/k2.DWH/Main D:\GitRepo\

During the clone process, changeset C5315 on branch Release/UAT causes an infinite loop: it is a "merge changeset" that also includes a rename of the branch from "UAT" to "Release/UAT".

Somehow it keeps trying to clone itself indefinitely...

How can I avoid this issue while cloning the TFS Repository with full history?

As the history is extremely long, we decided that we don't need all of it. I therefore also tried to clone the repository starting from changeset C12322, so that the troublesome changeset would not be part of the history.

I used the following command:

D:\GitTfs-0.23.0\git-tfs.exe quick-clone --with-branches --with-labels -changeset=12322 -d <TFSServer> $/k2.DWH/Main D:\GitRepo\

After that initial clone I used a pull to get the rest of the history:

D:\GitTfs-0.23.0\git-tfs.exe pull -i default --all -d

Unfortunately, this process also runs into the troublesome changeset C5315.

Can I somehow clone the repository starting at changeset C12322 and ignore every changeset older than C12322?

Thanks a lot for your efforts and Best Regards,

Sterbsli

Sterbsli, Nov 16 '15 18:11

Hello,

I can also reproduce the problem by deleting (and re-creating?) a branch.

Repro Steps

  1. Add trunk "$/TeamProject/Trunk".
  2. Create branch "$/TeamProject/Branch" from "$/TeamProject/Trunk".
  3. Delete branch "$/TeamProject/Branch".
  4. Create branch "$/TeamProject/Branch" from "$/TeamProject/Trunk" again.

Debug output (repeats endlessly)

Looking for changeset 68744 in git repository...
Changesets fetched!
Cleaning...
Fetching remote :default
Try fetching changesets...
info: refs/remotes/tfs/default: Getting changesets from 68834 to -1 ...
Looking for changeset 68744 in git repository...
info: refs/remotes/tfs/Branch: Getting changesets from 67012 to 68744 ...
Setting up a TFS workspace at .git\~w
TFS Workspace 'git-tfs-cde6789e-ff61-448f-927f-450d5f331d85;User Name' was removed.
Cleaning...
Looking for changeset 68744 in git repository...
Changesets fetched!
Cleaning...
Fetching remote :default
Try fetching changesets...
info: refs/remotes/tfs/default: Getting changesets from 68834 to -1 ...
Looking for changeset 68744 in git repository...
info: refs/remotes/tfs/Branch: Getting changesets from 67012 to 68744 ...
Setting up a TFS workspace at .git\~w
TFS Workspace 'git-tfs-91bd1279-b9b6-49fa-a594-59d567d3910d;User Name' was removed.
Cleaning...

Debugging Information

// Class: Sep.Git.Tfs.Core.GitTfsRemote
// Method: FetchWithMerge(int, bool, int, IRenameResult, string[])

// DEBUG: changeset.IsRenameChangeset = true
// DEBUG: isFirstCommitInRepository = false
if (changeset.IsRenameChangeset && !isFirstCommitInRepository)
{
  // DEBUG: renameResult = null
  if (renameResult == null || !renameResult.IsProcessingRenameChangeset)
  {
    fetchResult.IsProcessingRenameChangeset = true;
    fetchResult.LastParentCommitBeforeRename = MaxCommitHash;
    // DEBUG: Here the fetch-loop is aborted.
    return fetchResult;
  }
  renameResult.IsProcessingRenameChangeset = false;
  renameResult.LastParentCommitBeforeRename = null;
}

And this is what the caller does with the result:

// Class: Sep.Git.Tfs.Core.GitTfsRemote
// Method: FindRemoteAndFetch(int, bool, bool, IRenameResult, [out] string)

try
{
  var fetchResult = ((GitTfsRemote) tfsRemote).FetchWithMerge(-1, stopOnFailMergeCommit, parentChangesetId, renameResult);
}
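The behaviour described in the two snippets above can be illustrated with a toy model. This is only a sketch in Python, not git-tfs code: `fetch_branch`, `fetch_until`, the `carry_state` switch and changeset 68000 as the rename are all hypothetical, and the assumption that the caller starts each new attempt with `renameResult == null` is inferred from the debug notes, not confirmed. Under that assumption, the inner fetch aborts at the rename changeset on every round and never reaches the changeset the outer fetch is waiting for.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class FetchResult:
    is_processing_rename: bool = False
    fetched: List[int] = field(default_factory=list)


def fetch_branch(pending: List[int], rename_ids: Set[int],
                 rename_result: Optional[FetchResult]) -> FetchResult:
    """Fetch pending changesets in order, aborting at a rename changeset
    unless the caller signals that a rename is already being processed."""
    fetched: List[int] = []
    for cs in pending:
        if cs in rename_ids:
            if rename_result is None or not rename_result.is_processing_rename:
                # Abort here and expect to be called again with this result.
                return FetchResult(is_processing_rename=True, fetched=fetched)
            rename_result.is_processing_rename = False
        fetched.append(cs)
    return FetchResult(is_processing_rename=False, fetched=fetched)


def fetch_until(target: int, changesets: List[int], rename_ids: Set[int],
                carry_state: bool, max_rounds: int = 5) -> Optional[int]:
    """Return the number of rounds needed to fetch `target`,
    or None if no progress is made within `max_rounds`."""
    done: List[int] = []
    rename_result: Optional[FetchResult] = None
    for round_no in range(1, max_rounds + 1):
        pending = [cs for cs in changesets if cs not in done]
        result = fetch_branch(pending, rename_ids, rename_result)
        done.extend(result.fetched)
        if target in done:
            return round_no
        # Hypothetical switch: either pass the result on or drop it.
        rename_result = result if carry_state else None
    return None


branch = [67012, 68000, 68744]  # 68000 is a made-up rename changeset
print(fetch_until(68744, branch, {68000}, carry_state=True))
print(fetch_until(68744, branch, {68000}, carry_state=False))
```

When the state is carried over, the second round gets past the rename; when it is dropped, every round stops at the same changeset, which is consistent with the repeating log above.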

Gareon, Nov 27 '15 20:11

Thanks for the reproduction case! ...but I won't be able to work on that anytime soon! :worried: If you could continue and find a solution, it would be a real pleasure!

pmiossec, Nov 29 '15 16:11

Hello, I'm having the same issue as Gareon when migrating the full history. The branch was created and, after 1 commit, deleted and re-created. Is there a way I can bypass this branch? Or ignore the old branch creation and only take the latest one?

Thanks, Rodrigo

rrivem, Dec 18 '15 04:12

First of all, I have to say I am very impressed with the power of this tool. I've found it very useful and extensive!

After spending some time with the code, I didn't find a fix, but I did find a hack that worked to clone the whole history. Perhaps someone else can make the proper change in the code.

Here's my scenario:

        trunk b1
7370    *        commit1 in trunk
7371    |\----*  b1 created from 7370
7372    *     |  commit2 in trunk
7373    |\    x  deleting b1
7374    | \---*  b1 created from 7372
...     |     |
7378    |     *  commit3 in b1
7380    *----/   merged commit3 in trunk

The problem arises when a branch is reused and merged into trunk. To fetch the whole history of trunk, the program walks through the changesets of that branch until it finds a merge. There, it tries to pull the history of the branch that the merge commit came from, so it looks for the root changeset of that branch by picking the first changeset on it. In this case that is C7371, but I wanted it to pick C7374. My idea was to pick the latest branch root changeset by changing TfsHelperBase::GetRootChangesetForBranch to:

var changesets = VersionControl.QueryHistory(tfsPathBranchToCreate, VersionSpec.Latest, 0, RecursionType.Full, null, null, null, int.MaxValue, true, false, false, true).Cast<Changeset>();
var firstChangesetInBranchToCreate = changesets.LastOrDefault(x => x.Changes.Any(c => c.ChangeType.HasFlag(ChangeType.Branch) && c.Item.ItemType == ItemType.Folder && c.Item.ServerItem == tfsPathBranchToCreate));

Here, I'm looking for the last changeset that performs a branch operation on the branch root folder.
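The difference between picking the first and the last branch-creation changeset can be seen on the scenario above with a small model. This is a Python sketch, not the TFS API: the `Changeset` record, its fields and both helper functions are hypothetical, and the history is assumed to be sorted by ascending changeset id.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Changeset:
    changeset_id: int
    path: str
    is_branch_creation: bool


def first_branch_root(history: List[Changeset], branch_path: str) -> Optional[Changeset]:
    """Oldest changeset that creates `branch_path` (the problematic choice)."""
    matches = [c for c in history if c.is_branch_creation and c.path == branch_path]
    return matches[0] if matches else None


def latest_branch_root(history: List[Changeset], branch_path: str) -> Optional[Changeset]:
    """Most recent changeset that creates `branch_path` (the choice the hack makes)."""
    matches = [c for c in history if c.is_branch_creation and c.path == branch_path]
    return matches[-1] if matches else None


# History of b1 from the diagram above, in ascending order.
b1 = "$/TeamProject/b1"  # hypothetical path
history = [
    Changeset(7371, b1, True),   # b1 created from 7370
    Changeset(7373, b1, False),  # b1 deleted
    Changeset(7374, b1, True),   # b1 created again from 7372
    Changeset(7378, b1, False),  # commit3 in b1
]

print(first_branch_root(history, b1).changeset_id)   # 7371
print(latest_branch_root(history, b1).changeset_id)  # 7374
```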

Given that changeset, the program searches for the parent commit from which the branch was created and uses its changeset id as the starting point for fetching the branch; in this case that is C7372.

But when it fetches the changesets for the branch, in GitTfsRemote::FetchChangesets, it starts from the parent's changeset (C7372) + 1 and therefore stops at C7373, where the branch is deleted.

int lowerBoundChangesetId;
if(properties.InitialChangeset.HasValue)
    lowerBoundChangesetId = Math.Max(MaxChangesetId + 1, properties.InitialChangeset.Value);
else
    lowerBoundChangesetId = MaxChangesetId + 1;

So my hack was to identify the branches in this situation and, in GitTfsRemote::FetchChangesets, replace the starting changeset for the branch with the one I wanted (C7374 in this case).

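// Class-level field: maps the parent changeset id to the changeset to start fetching from.
private static readonly Dictionary<int, int> replaceChangesets = new Dictionary<int, int>
{
    { 7372, 7374 },
};

// Inside GitTfsRemote::FetchChangesets, after lowerBoundChangesetId has been computed:
if (replaceChangesets.ContainsKey(lowerBoundChangesetId - 1))
{
    lowerBoundChangesetId = replaceChangesets[lowerBoundChangesetId - 1];
}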
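The lower-bound computation and the override can be checked against the scenario's numbers with a small Python model. It only mirrors the arithmetic of the snippets above; the function names and parameters are hypothetical and nothing here calls git-tfs.

```python
from typing import Dict, Optional


def lower_bound(max_changeset_id: int, initial_changeset: Optional[int] = None) -> int:
    """First changeset id to fetch: one past the last known changeset,
    or the configured initial changeset if that is higher."""
    next_id = max_changeset_id + 1
    if initial_changeset is not None:
        return max(next_id, initial_changeset)
    return next_id


def lower_bound_with_override(max_changeset_id: int,
                              overrides: Dict[int, int],
                              initial_changeset: Optional[int] = None) -> int:
    """Same as lower_bound, but a hand-maintained map can redirect
    a given parent changeset to a different starting changeset."""
    bound = lower_bound(max_changeset_id, initial_changeset)
    return overrides.get(bound - 1, bound)


overrides = {7372: 7374}

print(lower_bound(7372))                           # 7373, where b1 is deleted
print(lower_bound_with_override(7372, overrides))  # 7374, where b1 is re-created
```

Because the map is keyed by concrete changeset ids, it has to be filled in by hand for every repository, so this remains a per-repository workaround.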

I believe MaxChangesetId should be initialized with the first changeset to pull, but I wasn't sure how this change would affect the rest of the process.

I hope this helps someone make the change in the code.

Thanks, Rodrigo

rrivem, Dec 21 '15 13:12

@rrivem or @icnocop if you have a chance could you make a PR with the code changes in the previous comment?

spraints, Mar 10 '16 20:03

@spraints, it seems the code changes by @rrivem are specific to his changeset ids, so they can't just be taken as-is.

icnocop, Mar 10 '16 20:03

Exactly, what I did was bypass the commits that belonged to branches that were re-created with the same name.

If you're experiencing the same issue, you can replace the 3 code snippets I described above, and you should be able to replicate the entire history.

Let me see if I can upload the changes to GitHub.

rrivem, Mar 10 '16 20:03

I tried to create a unit test to test the behavior but I get an unexpected/unrelated exception. See pull request #927. Any ideas?

icnocop, Mar 11 '16 05:03

There might be a workaround for "lighter" clones, that is, clones where the full history will not be available.

If possible, start the clone after the initial "reused" branch is already dead by using "-c 7374" (7374 is from the example above; replace it with your changeset id).

This works for me on a test project I have set up as @rrivem described above. I am trying to verify it on my real project as well, but that will take quite some time....

I might add that I did

git tfs clone -c 7374 "tfsurl" "mainbranch" "gitrepo"
cd "gitrepo"
git tfs branch --init --all

polyzois, Mar 29 '16 13:03

No luck with my real project, which actually contains multiple renames. I think there are two issues at play here: one for renames, as @Sterbsli points out, and the other for dead branches, as @rrivem and @Gareon point out. For renames, "-c" does not work because the "init" of a new branch goes all the way back to the beginning, as seen here in GetRootChangesetForBranch: https://github.com/git-tfs/git-tfs/blob/9399d4dd08976a2f654fe3089ff9fb189c036058/GitTfs.VsCommon/TfsHelper.Common.cs#L299

polyzois, Mar 30 '16 09:03

I would be very happy if someone explained the purpose of this (https://github.com/git-tfs/git-tfs/blob/9399d4dd08976a2f654fe3089ff9fb189c036058/GitTfs.VsCommon/TfsHelper.Common.cs#L316)

if (renameFromBranch != null)
    GetRootChangesetForBranch(rootBranches, renameFromBranch);

This is causing my problems for renaming (branches have changed names back and forth). I added an extra condition to it and then it works for my renames:

 if (renameFromBranch != null && !renameFromBranch.Equals(tfsParentBranch))

I am guessing that this code is responsible for the infinite recursion, as it is calling itself...
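Why following "renamed from" links can recurse forever when branches have changed names back and forth, and how a guard stops it, can be sketched in Python. Note that this sketch swaps in a more general "visited set" guard in place of the one-line comparison against the parent branch shown above; the function, the `renamed_from` map and the branch paths are hypothetical and are not taken from git-tfs.

```python
from typing import Dict, List, Optional


def collect_rename_chain(branch: str, renamed_from: Dict[str, str]) -> List[str]:
    """Follow 'renamed from' links starting at `branch`,
    stopping as soon as a path shows up a second time."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = branch
    while current is not None and current not in seen:
        seen.add(current)
        chain.append(current)
        current = renamed_from.get(current)
    return chain


# A branch renamed back and forth: without the `seen` check the walk
# would alternate between the two paths indefinitely.
renamed_from = {
    "$/TeamProject/Release/UAT": "$/TeamProject/UAT",
    "$/TeamProject/UAT": "$/TeamProject/Release/UAT",
}

print(collect_rename_chain("$/TeamProject/Release/UAT", renamed_from))
```

Each path is visited at most once, so the walk terminates even when the rename history contains a cycle.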

polyzois, Mar 30 '16 09:03

possibly related: issue #871

timabell, Apr 19 '16 10:04

I'm also having infinite recursion issues, and I have the problem well defined and pinned down in a test at #1074. It is caused by a merge changeset that has a rename changeset as a parent.

There seem to be multiple fixes or workarounds, but I'm not sure which is the right one. Would any of you who might be able to lend insight please take a look? Thanks!

jnm2, Jun 03 '17 00:06