Support view mappings for branches

Open rpetti opened this issue 2 years ago • 3 comments

The docs show that this only works for single branches and even then, only branches containing single paths. Perforce allows branches to be located in any location, and even have different mappings for different branches.

Proposal: Allow the specification of a configuration file that describes the branches and the views used by them to sync:

[main]
//depot/main/config/my-component/... config/...
//depot/main/pkg/my-subcomponent-a/... pkg/my-subcomponent-a/...
//depot/main/pkg/my-subcomponent-b/... pkg/my-subcomponent-b/...

[dev]
//depot/comp/my-component/branches/dev/... ...
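
One way such a file could be read is sketched below. This is only an illustration of the proposed format, not anything p4-fusion implements: the function name `parse_branch_views` is hypothetical, and it assumes one mapping per line with no spaces inside either path.

```python
def parse_branch_views(text):
    """Parse '[branch]' headers followed by '<depot path> <git path>' lines.

    Hypothetical helper for the proposed format; assumes paths contain no spaces.
    """
    views = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines only separate sections
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()
            views[current] = []
            continue
        if current is None:
            raise ValueError(f"mapping outside of a branch section: {line!r}")
        parts = line.split(None, 1)
        if len(parts) != 2:
            raise ValueError(f"expected '<depot path> <git path>': {line!r}")
        views[current].append((parts[0], parts[1].strip()))
    return views
```

Each branch name maps to an ordered list of (depot path, Git path) pairs, so a branch assembled from several depot locations, like [main] above, stays a single Git branch.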

rpetti avatar Sep 14 '22 15:09 rpetti

This is definitely the point where we consider how much of Perforce we support and translate to Git.

It may not be possible to support every feature that Perforce provides, such as p4 streams and branch view mappings, simply because of the scope of the problem that p4-fusion is solving.

Do you have a specific use case where you'd want this kind of functionality?

p4-fusion is intended to be used across the entire VCS (or at least the most relevant sub directory) and that is why we initially only built it to support a single depot path to perform all of the conversion.

twarit-waikar avatar Sep 15 '22 07:09 twarit-waikar

The use case would be to do a conversion that preserves all branches and merge history for a project, irrespective of how the branches are structured in Perforce. Being able to specify the mapping between git branches and p4 depots seemed like the most straightforward way to accomplish this...

I can't find any other tool capable of doing a proper conversion in this manner. Git Fusion could do it, but Perforce doesn't make that available anymore.

If it's too difficult to add to this tool or you believe it is out of scope then I understand.

rpetti avatar Sep 15 '22 13:09 rpetti

I understand the use case. Ideally, p4-fusion would create branches in Git that match the ones in Perforce and then also create merge commits on the main branch when integration CLs are encountered in the history. I do believe that's what p4-fusion should ideally be.

However, adding that kind of functionality needs some extra research. That's not to say it's impossible, just that it needs an in-depth understanding of the branch and view mapping in Perforce to be able to convert that into Git.

twarit-waikar avatar Sep 16 '22 06:09 twarit-waikar

If we limit the investigation to the Perforce streaming model (branches in the form //depot/branchname), then some simplifying assumptions can be made:

  1. Each "branchname" is its own branch (in general, each branch map could define its own branch, and the stream branch model is one specific form of that, but overlapping branch maps would be tricky and potentially won't do what the end user wants in all cases);
  2. Each changelist only affects a single branch (with classic depots, where changelists can spread across multiple locations, a single changelist could be broken into one change per branch).
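
A minimal sketch of the second assumption, breaking one changelist's files into one group per branch. The helper names and the `depot_root` parameter are hypothetical, and the layout is assumed to be exactly `<depot_root>/<branchname>/...`.

```python
from collections import defaultdict

def branch_of(depot_path, depot_root):
    """Return the branch name for '<depot_root>/<branch>/...', else None."""
    prefix = depot_root.rstrip("/") + "/"
    if not depot_path.startswith(prefix):
        return None
    rest = depot_path[len(prefix):]
    branch, sep, _ = rest.partition("/")
    # A path directly under the depot root is not inside any branch.
    return branch if sep and branch else None

def split_by_branch(depot_paths, depot_root):
    """Group a changelist's depot paths by branch (one future commit per group)."""
    groups = defaultdict(list)
    for path in depot_paths:
        branch = branch_of(path, depot_root)
        if branch is not None:
            groups[branch].append(path)
    return dict(groups)
```

For example, `split_by_branch(["//a/dev/b.txt", "//a/main/b.txt"], "//a")` puts each file in its own group, which is the "one change per branch" split described for classic depots.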

This now leaves us inspecting the per-file action type for the branch, move/add, move/delete, and integrate logic (I'm leaving out import and archive, which aren't exactly branching commands). I'm referencing the action types from the fstat command.

Simple integration is the equivalent of a Git cherry-picked merge. There isn't much complication here. It would be complicated if we want a formal merge history, as that would require inspecting the formal merge file list and ensuring it matches up with the integration changelist's file list.

Integration with edit is more difficult. It's a Git cherry-picked merge with a conflict edit (my knowledge of the Git API is limited).

In most cases, the move/add and move/delete pair should be only in a single branch, but there's nothing stopping it from spreading out. This logic is much more complicated. If done cross-branch, it's the Git equivalent of merge -> delete.

groboclown avatar Nov 29 '22 15:11 groboclown

More thoughts on this issue:

  • When a Perforce branch's first changelist is an integration from another branch, that other branch must be created first. This adds a dependency tree to the construction.
  • A branch can exist in Perforce history, but the head revision can be deleted. Should there be special logic to skip the branch construction logic for these? The commits can be reproduced in Git, but should the extra effort of putting them in other branches continue? I'm guessing yes, as it's a single algorithm.
  • I'm thinking that the general algorithm should:
    1. Walk through the changelists in the complete list of branches, in increasing order.
    2. For the changelist, look at each changed file.
       1. If the file is one of the integration type actions, use the branch spec to extract the source and destination branch. Record an association between the source/destination branches (there could be multiple, if branch specs overlap) and the file.
       2. If the file is not an integration type action, record an association between the file's branches (there could be multiple) and the file.
    3. After examining each file, create a git commit for each separate branch group.
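
The three steps can be sketched roughly as follows. This is an illustration only, not the proof-of-concept code: the dictionary shapes, `branch_roots` (branch name to depot prefix), and the function names are assumptions, and the action set is taken from the action types listed earlier in the thread.

```python
from collections import defaultdict

# Integration-type actions named earlier in the thread (assumed set).
INTEGRATION_ACTIONS = {"branch", "integrate", "move/add", "move/delete"}

def branches_for(path, branch_roots):
    """All branches whose depot prefix contains the path (may overlap)."""
    return [name for name, root in branch_roots.items() if path.startswith(root)]

def plan_commits(changelists, branch_roots):
    """Return (changelist, source branch, target branch, paths), one per branch group."""
    plans = []
    # Step 1: walk the changelists in increasing order.
    for cl in sorted(changelists, key=lambda c: c["number"]):
        groups = defaultdict(list)
        # Step 2: associate each changed file with its branch(es).
        for f in cl["files"]:
            targets = branches_for(f["path"], branch_roots)
            if f["action"] in INTEGRATION_ACTIONS and f.get("source"):
                sources = branches_for(f["source"], branch_roots) or [None]
            else:
                sources = [None]  # plain edit/add/delete: no source branch
            for target in targets:
                for source in sources:
                    groups[(source, target)].append(f["path"])
        # Step 3: one planned Git commit per separate branch group.
        for (source, target), paths in groups.items():
            plans.append((cl["number"], source, target, paths))
    return plans
```

The sketch only plans commits; actually writing them, and ordering branch creation so that a source branch exists before its first integration, is left out.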

groboclown avatar Nov 30 '22 15:11 groboclown

I'm putting a proof-of-concept for this together.

The out-of-scope scenarios so far:

  • Perforce Streams for anything other than a simple path mapping. Only a plain "..." mapping is handled; excludes, imports, and other edge cases are silently ignored.
  • Integrating within a branch. Perforce is able to maintain the integration history here. Git doesn't. Old p4-fusion behavior is kept.
  • Integrating to another branch with a file name change. Proper branching equivalency here would split the changelist into 2 commits (merge + move). I'm keeping this as a simple "add" in Git.

groboclown avatar Nov 30 '22 22:11 groboclown

Another tricky scenario. This is the kicker that will slow everything down.

  • Integrating from one branch to another, but from a changelist of the source that is not its head revision.

Dealing with this scenario (which is a must) means storing a mapping between changelists and Git commit hashes.
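
A minimal sketch of such a mapping, assuming commits are recorded per branch in increasing changelist order. The class and method names are hypothetical; the lookup returns the commit that represents a branch's state at or before a given changelist, which is what an integration from an older revision needs.

```python
import bisect

class CommitIndex:
    """Per-branch map from changelist number to Git commit hash (hypothetical)."""

    def __init__(self):
        self._cls = {}     # branch -> sorted list of changelist numbers
        self._hashes = {}  # (branch, changelist) -> commit hash

    def record(self, branch, changelist, commit_hash):
        cls = self._cls.setdefault(branch, [])
        if cls and changelist <= cls[-1]:
            raise ValueError("changelists must be recorded in increasing order")
        cls.append(changelist)
        self._hashes[(branch, changelist)] = commit_hash

    def commit_at_or_before(self, branch, changelist):
        """Commit for the newest recorded changelist <= the given one, or None."""
        cls = self._cls.get(branch, [])
        i = bisect.bisect_right(cls, changelist)
        if i == 0:
            return None
        return self._hashes[(branch, cls[i - 1])]
```

The binary search keeps each lookup logarithmic in the number of commits on the branch, but the index itself grows with the full history.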

groboclown avatar Nov 30 '22 23:11 groboclown

Integrations in Perforce from a revision that is not the head make this an incredibly difficult problem. In most cases in the history of the depot, we'll be dealing with a "yes, it's a branch" situation of:

  • CL 1 - file //a/dev/b.txt added (revision 1, which has CL 1)
  • CL 2 - file //a/dev/c.txt added (revision 1, which has CL 2)
  • CL 3 - file //a/dev/... integrated to //a/main/...
  • CL 4 - file //a/dev/b.txt modified

In this scenario, CL 3 is a valid merge, but inspecting the log history of //a/main/...@3 shows that b.txt was merged from CL 1 and c.txt was merged from CL 2. They don't have the same source changelist.

In the case of a pre-history merge:

  • CL 1 - file //a/dev/b.txt added (revision 1, which has CL 1)
  • CL 2 - file //a/dev/b.txt edited (revision 2, which has CL 2)
  • CL 3 - file //a/dev/... integrated to //a/main/...
  • CL 4 - file //a/dev/b.txt#1 integrated to //a/main/...

In this situation, CL 4 is merging from an older revision than the state of the "dev" branch when CL 4 happened.

My current approach to deal with these situations is a combination of discovery then action.

  • Discovery: Should we classify this changelist as a Git merge?
    • The changelist contains only integration style actions (move/add, integrate, branch, import).
    • The changelist files have a source depot path from one and only one branch, and a target depot path in one and only one branch, with each file keeping the same name (//a/dev/a.txt will move to //a/main/a.txt, not //a/main/b.txt).
  • Action: How to perform the Git action?
    • Cherry-pick the source files in the source branch commit into the target branch.
    • Run the changelist through the classic single branch method.
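
The discovery check could look roughly like this. It is a sketch under assumptions, not the code in the branch: the file records, `branch_roots` (branch name to depot prefix), and function names are hypothetical, and same-branch integrations are deliberately rejected so they fall through to the classic single-branch method.

```python
# Integration-style actions from the discovery rule above.
MERGE_ACTIONS = {"move/add", "integrate", "branch", "import"}

def split_branch(path, branch_roots):
    """Return (branch, path relative to the branch root), or None if unmapped."""
    for name, root in branch_roots.items():
        if path.startswith(root):
            return name, path[len(root):]
    return None

def classify_merge(files, branch_roots):
    """Return (source branch, target branch) if the changelist is a merge, else None."""
    pairs = set()
    for f in files:
        # Rule 1: only integration-style actions, each with a known source.
        if f["action"] not in MERGE_ACTIONS or not f.get("source"):
            return None
        src = split_branch(f["source"], branch_roots)
        dst = split_branch(f["path"], branch_roots)
        # Rule 2: both ends are in a branch and the relative name is unchanged.
        if src is None or dst is None or src[1] != dst[1]:
            return None
        if src[0] == dst[0]:
            return None  # integration within one branch: not a Git merge
        pairs.add((src[0], dst[0]))
    # Exactly one source branch and one target branch across all files.
    return next(iter(pairs)) if len(pairs) == 1 else None
```

A `None` result means the changelist is handled as an ordinary commit; a pair tells the action step which branch to cherry-pick from and into.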

groboclown avatar Dec 01 '22 18:12 groboclown

My branching method seems to be mostly working:

https://github.com/groboclown/p4-fusion/tree/branch-support-poc

I'm tracking down a bug where a commit happens to the wrong branch, or possibly to multiple branches. It looks like a possible issue with my reference handling, or maybe it's just a real bug.

groboclown avatar Dec 04 '22 08:12 groboclown

Got it working! PR incoming.

I know my comments on this issue have turned into a developer blog. I hope I haven't been too noisy with posting this.

groboclown avatar Dec 04 '22 23:12 groboclown