rspec-metagem
Merge rspec repos
Proof of concept for merging the other repos into this repo while keeping all history (see https://github.com/rspec/rspec-core/issues/2509). I had some success with the steps described here (https://thoughts.t37.net/merging-2-different-git-repositories-without-losing-your-history-de7a06bba804):
mkdir rspec-mono
cd rspec-mono
git clone git@github.com:rspec/rspec.git
git clone git@github.com:rspec/rspec-core.git
git clone git@github.com:rspec/rspec-expectations.git
git clone git@github.com:rspec/rspec-mocks.git
cd rspec-core
mkdir rspec-core
git mv -k * rspec-core
git rm .gitignore
git rm .document
git commit -m 'Moving repo into its own subdirectory'
cd ..
cd rspec-expectations
mkdir rspec-expectations
git mv -k * rspec-expectations
git rm .gitignore
git rm .document
git rm .rspec
git rm .rubocop.yml
git rm .rubocop_rspec_base.yml
git rm .travis.yml
git rm .yardopts
git commit -m 'Moving repo into its own subdirectory'
cd ..
cd rspec-mocks
mkdir rspec-mocks
git mv -k * rspec-mocks
git rm .gitignore
git rm .document
git rm .rspec
git rm .rubocop.yml
git rm .rubocop_rspec_base.yml
git rm .travis.yml
git rm .yardopts
git commit -m 'Moving repo into its own subdirectory'
cd ..
cd rspec
git remote add rspec-core ../rspec-core
git remote add rspec-expectations ../rspec-expectations
git remote add rspec-mocks ../rspec-mocks
git fetch rspec-core
git fetch rspec-expectations
git fetch rspec-mocks
git checkout -b merge-rspec-repos
git merge --allow-unrelated-histories rspec-core/master
git merge --allow-unrelated-histories rspec-expectations/master
git merge --allow-unrelated-histories rspec-mocks/master
git commit -m 'Import sub repos'
We do already have a separate proof-of-concept repo for this. There's a lot to do to make this happen: we need to think about builds (multiple), how we deal with issues, multiple gemspecs, etc.
@JonRowe Yes, I've seen https://github.com/rspec/rspec-monorepo-prototype. But that doesn't keep the commit history as mentioned in https://github.com/rspec/rspec-core/issues/2509. What would you propose for the next step?
Honestly I think what is needed is a script that does the relevant repo merging that can be reviewed, and used on a sample repo so we can peruse the results. We'd need this resulting repo to have all the builds that we currently run, and the common rubocop setup.
Once merged into the rspec-dev repo, this could be used to generate the final new repo and check that everything runs. That would then become the new rspec repo, and the others would be archived with all their issues transferred to the new repo.
The final generation should be done by someone on the core team after all due diligence on the process, as the PRs resulting from doing it by hand and merging are far too big to review...
@p8 How is it going? I believe this is valuable work, since having a monorepo might simplify development.
Just out of curiosity, how does this merge script deal with branches? E.g. if there are 3.9-maintenance and 3.8-maintenance branches in two original repos and they are combined, is it still possible to check out either of those branches and have the code in the state corresponding to that branch in the original repo, but in separate folders? The same question applies to tags.
@pirj, I've created a more generic script:
Repo = Struct.new(:name, :url, :branch)
default_branch = 'master'
mono_repo = Repo.new('rspec', 'git@github.com:rspec/rspec.git', default_branch)
repos = [
  Repo.new('rspec-core', 'git@github.com:rspec/rspec-core.git', default_branch),
  Repo.new('rspec-expectations', 'git@github.com:rspec/rspec-expectations.git', default_branch),
  Repo.new('rspec-mocks', 'git@github.com:rspec/rspec-mocks.git', default_branch)
]
# These files exist in multiple repos with minor differences resulting in
# merge conflicts.
conflicting_files = %w[
LICENSE.md
.autotest
.gitignore
.document
.rspec
.rubocop.yml
.rubocop_rspec_base.yml
.travis.yml
.yardopts
]
# Checkout the mono repo
%x(
mkdir rspec-mono
cd rspec-mono
git clone #{mono_repo.url}
cd #{mono_repo.name}
git checkout #{mono_repo.branch}
git checkout -b merge-rspec-repos
)
# Merge other repos into the mono repo while keeping the commit history.
repos.each do |repo|
%x(
cd rspec-mono
git clone #{repo.url}
cd #{repo.name}
git checkout #{repo.branch}
mkdir #{repo.name}
git mv -k * #{repo.name}
git rm --ignore-unmatch #{conflicting_files.join(' ')}
git commit -m 'Moving #{repo.name} into its own subdirectory'
cd ../#{mono_repo.name}
git remote add #{repo.name} ../#{repo.name}
git fetch #{repo.name}
git merge --allow-unrelated-histories --ff #{repo.name}/#{repo.branch}
)
end
# Resolve conflicts
%x(
cd rspec-mono
cd #{mono_repo.name}
git add .document
git add .gitignore
git add Rakefile
git rm License.txt
git rm rspec-expectations/LICENSE.md
git rm rspec-mocks/LICENSE.md
git commit -m 'Import sub repos'
)
The script only merges the master branches into the mono repo for now, but it seems to work for other branches as well. I don't think existing tags can be migrated, but they could be recreated by creating new branches.
Please correct me if I'm wrong, but based on the presence of the list of conflicting files, is the directory structure of the result similar to https://github.com/rspec/rspec-monorepo-prototype, or do you merge lib structure as well?
The idea of monorepo is to keep it all in one repo, while still being able to publish separate gems easily.
@pirj Sorry for the late reply. I was indeed under the impression that everything was to be merged into a single gem. I've changed the script to keep the repos as separate gems in separate folders, so the folder structure is now similar to the mono-repo-prototype:
mono_repo/
mono_repo/rspec/
mono_repo/rspec/README.md
mono_repo/rspec/lib/
mono_repo/rspec/...
mono_repo/rspec-core
mono_repo/rspec-core/README.md
mono_repo/rspec-core/lib/
mono_repo/rspec-core/...
...
It can now also create branches for different versions.
#!/usr/bin/env ruby
working_dir = 'working'
Repo = Struct.new(:name, :url, :branch)
repos = [
  Repo.new('rspec', 'git@github.com:rspec/rspec.git'),
  Repo.new('rspec-core', 'git@github.com:rspec/rspec-core.git'),
  Repo.new('rspec-expectations', 'git@github.com:rspec/rspec-expectations.git'),
  Repo.new('rspec-mocks', 'git@github.com:rspec/rspec-mocks.git'),
  Repo.new('rspec-support', 'git@github.com:rspec/rspec-support.git')
]
# merge everything into the rspec-monorepo-prototype repo for now
mono_repo = Repo.new('rspec-mono', 'git@github.com:rspec/rspec-monorepo-prototype.git')
# Clone the mono repo
%x(
mkdir #{working_dir}
cd #{working_dir}
git clone #{mono_repo.url} #{mono_repo.name}
)
# Merge sub repos into the mono repo while keeping the commit history.
# 1bf79d2bf is the initial commit of the mono-repo without the sub repos
# to prevent merge conflicts
%w[v3.8.0 v3.9.0].each do |branch|
%x(
cd #{working_dir}
cd #{mono_repo.name}
git checkout 1bf79d2bf
git checkout -b #{branch}
)
repos.each do |repo|
repo.branch = branch
# Check out the sub repo and move all files to a sub directory
# with the same name as the repo.
# Merge the sub repo into the mono repo while keeping history.
%x(
cd #{working_dir}
rm -rf #{repo.name}
git clone #{repo.url} #{repo.name}
cd #{repo.name}
mkdir #{repo.name}
git fetch --tags
git checkout -b #{repo.branch} #{repo.branch}
git mv -k * #{repo.name}
git mv -k {.[!.]*,..?*} #{repo.name}
git commit -m 'Moving #{repo.name} into its own subdirectory'
cd ../#{mono_repo.name}
git remote add #{repo.name} ../#{repo.name}
git fetch #{repo.name}
git merge --allow-unrelated-histories --ff #{repo.name}/#{repo.branch}
)
# rspec-expectations and rspec-mocks both moved License.txt to LICENSE.md.
# Resolve this merge conflict by removing License.txt and adding the
# LICENSE.md for both sub repos.
%x(
cd #{working_dir}
cd #{mono_repo.name}
git rm License.txt
git add rspec-expectations/LICENSE.md
git add rspec-mocks/LICENSE.md
git commit -m 'Fix merge conflict'
)
end
end
I'll see if I can make a PR for the mono-repo-prototype and get the build running.
Thanks for pushing this forward!
I can now run the specs for all sub repos in the rspec-monorepo-prototype: https://github.com/rspec/rspec-monorepo-prototype/pull/1 Some specs are failing though...
Hmm, weird I'm not seeing Travis in the checks anymore. Here is a link to a job that ran the specs: https://travis-ci.org/rspec/rspec-monorepo-prototype/jobs/643931521
Wow, nice, good job! I actually see a link to a build job in the checks. https://travis-ci.org/rspec/rspec-monorepo-prototype/builds/643953326
Yes, the problem was that my branch could not merge to master. I now have some green jobs on my fork which used the script mentioned above: https://travis-ci.org/p8/rspec-monorepo-prototype/builds/643957490
script/update_rubygems_and_install_bundler: line 8: is_ruby_23_plus: command not found
I guess the cause of the red builds might be this little thing.
Yes, setting the dist to trusty fixes it on travis: https://github.com/rspec/rspec-monorepo-prototype/pull/1/files
1.1) Failure/Error: let(:file_1) { File.open(File.join("tmp", "file_1"), "w").tap { |f| f.sync = true } }
Errno::ENOENT:
No such file or directory @ rb_sysopen - tmp/file_1
Maybe tmp is not present?
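If the missing directory is indeed the cause, creating it before opening the file would avoid the `Errno::ENOENT`. A minimal sketch, assuming the spec runs from a working directory in which `tmp` may not exist. The helper name `open_synced_tmp_file` is hypothetical and not part of rspec:

```ruby
require 'fileutils'

# Hypothetical helper (not part of rspec): opens a file for writing under
# `dir`, creating the directory first so File.open does not raise
# Errno::ENOENT when `dir` is missing.
def open_synced_tmp_file(name, dir: 'tmp')
  FileUtils.mkdir_p(dir) # no-op when the directory already exists
  File.open(File.join(dir, name), 'w').tap { |f| f.sync = true }
end
```

In the failing spec this would stand in for the body of the `let(:file_1)` block. Whether the directory should instead be created once by the build script is a separate choice.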
@p8 Have you had a chance to take a look at the tmp issue?
@pirj I'll try to have a look this week.
@pirj The script now merges the repos, adds the Travis config, and runs script/run_build in every sub repo, so Rubocop, Cucumber and other checks are run as well. https://gist.github.com/p8/33563f7378376218a9ce078578b6c095
And the build is green: https://travis-ci.org/github/p8/rspec-monorepo-prototype/builds/679697190
Awesome!
So I've created a branch called setup on https://github.com/rspec/rspec-monorepo-prototype which contains this script, and have run it to create master-20200503.
Some thoughts I have so far: we are going to lose a large amount of our history in the form of PR references and overlaps, but I'm not sure we can do much about that.
We're also going to need to work on all of our tags to get them into line, otherwise they will overwrite one another...
We'd also need to transfer all open issues from the other repos to the monorepo.
I've reviewed @p8's build and it works, but it would need tweaking / optimising afterwards. Specifically, I think we might actually want to break out the individual specs in a matrix. Maybe this would also be a good opportunity to use GitHub Actions, but my branches seem not to want to run for some reason.
Nice!
work on all of our tags
I guess we can add a repo prefix to tags for all repos before merging them.
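A rough sketch of what such a prefixing step could look like, to be run inside each sub repo clone before the merge. The naming scheme `<repo>-<tag>` and the helper names are assumptions made for illustration; nothing in this thread fixes them:

```ruby
# Assumed naming scheme (hypothetical): "<repo>-<tag>", e.g. "rspec-core-v3.9.0".
def prefixed_tag(repo_name, tag)
  "#{repo_name}-#{tag}"
end

# Returns git argument lists that re-create each tag under its prefixed name
# and then delete the original tag. "<tag>^{}" peels an annotated tag down to
# the object it points at, so the new tag is a lightweight tag on that object
# (the original tag message is not carried over).
def retag_commands(repo_name, tags)
  tags.flat_map do |tag|
    [
      ['git', 'tag', prefixed_tag(repo_name, tag), "#{tag}^{}"],
      ['git', 'tag', '-d', tag]
    ]
  end
end

# Print the commands instead of running them so they can be reviewed first.
retag_commands('rspec-core', %w[v3.8.0 v3.9.0]).each { |cmd| puts cmd.join(' ') }
```

Each argument list could be passed to `system(*cmd)` once the output looks right. Passing arguments as a list avoids having to shell-quote `^{}`.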
lose a large amount of our history in the form of PR references
That might be solvable, but I don't have a good idea how exactly. Before hub existed I used a trick to also fetch PR branches by tweaking the local repo's .git/config. Fetching looks like:
$ git fetch origin
From github.com:joyent/node
* [new ref] refs/pull/1000/head -> origin/pr/1000
* [new ref] refs/pull/1002/head -> origin/pr/1002
And even though those pull requests won't exist in the resulting repo, the pull request heads will.
Maybe also prefix them as refs/pull/core-1000 refs/pull/expectations-1002 to avoid clashing?
It makes sense to add those two tweaks to the merge script.
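As a sketch of the pull request part: the `.git/config` tweak amounts to adding an extra fetch refspec per remote. The version below namespaces the fetched heads by remote name (for example `rspec-core/pr/1000`) rather than using a `core-1000` style name, so that `*` stays a whole path component in the refspec. It assumes each remote points at the GitHub repository (a local clone would not expose `refs/pull/*`), and the helper names are hypothetical:

```ruby
# Builds a fetch refspec mapping GitHub's pull request heads
# (refs/pull/<number>/head) to remote-tracking refs under the given remote,
# e.g. refs/remotes/rspec-core/pr/1000. Using the remote name as a namespace
# keeps pull request numbers from different repos from clashing.
def pull_request_refspec(remote_name)
  "+refs/pull/*/head:refs/remotes/#{remote_name}/pr/*"
end

# Returns the git argument list that appends the refspec to the remote's
# fetch configuration; a later `git fetch <remote>` then also fetches PR heads.
def add_pr_refspec_command(remote_name)
  ['git', 'config', '--add', "remote.#{remote_name}.fetch", pull_request_refspec(remote_name)]
end

%w[rspec-core rspec-expectations rspec-mocks].each do |remote|
  puts add_pr_refspec_command(remote).join(' ')
end
```

The commands are only printed here. Running them with `system(*cmd)` inside the mono repo would be the next step if this approach is adopted.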
@p8 Good job!
It's more that we have commits which reference the pull request number, which would not exist in the new repo. But the old repos will continue to exist; we'll just archive them.
maybe this would also be a good opportunity to use Github action
I can start again my investigation on this. :)
For the tags, if we choose to prefix them, I am wondering how we will manage the GitHub release page.
This is how Rails handles it: https://github.com/rails/rails/releases They just release all versions in lock step. Maybe that's easier; we could adopt that from rspec 4.
This is how Rails handles it: https://github.com/rails/rails/releases They just release all versions in lock step. Maybe that's easier; we could adopt that from rspec 4.
I am in favor of this.
all versions in lock step, we could adopt that from rspec 4
All for doing it this way 👍
@p8 I can take it from here, add tag prefixes and keep PR references, if you don't want to continue.
@pirj Sure, glad I could help :)
Just in case https://github.com/tj/git-extras/blob/master/Commands.md#git-merge-repo
It'd be great to get this over the line. TBH I feel it's a huge pain to work with so many git repos, issue trackers, separate projects for PRs, etc. This probably also drives potential contributors away, because to contribute to RSpec one needs a far more complicated setup than about any other gem.