[WIP] Add reinsert functionality
VERY MUCH [WIP] AT THE MOMENT
I expect that I will be moving some commits around here and there, possibly into other branches to reduce this branch's size, and hopefully make the code look a lot better... but this is just a starting place for some feedback and discussion to get started.
Also of note: The first couple of commits are currently based on my previous unmerged PRs:
- https://github.com/juliancheal/code-extractor/pull/9
- https://github.com/juliancheal/code-extractor/pull/10
Simpler viewing can be found here:
https://github.com/NickLaMuro/code-extractor/compare/convert-to-module..extract-to-existing-repo
Changes
This is effectively the "undo" feature of the original project. By that, it allows a user to take a portion (or all) of a previously extracted project, and re-insert it back into the original project from which it came from.
This becomes tricky, as we need to filter out the existing commits that existed previously in the old ("target") repository. As a result, 3 filter-branch passes are done to achieve this change:
- After determining all file names that existed for the current extractions, filter commits to down to ones that only include the files we wish to commit. In addition, create a script to move any of those files into a temporary directory, maintaining the directory structure as we go.
- Take the temporary directory that was created and do a second filter that makes the new root of the project the temporary directory we created in the previous filter.
- After adding the target repository as a remote, filter the commits out that already exist in that remote. Also in this step, a commit is re-written that is shared between both repos that will be the "re-inject" commit, which all of the injected commits will be based off of.
From there, cherry-pick commits onto a new branch that is based off the existing HEAD of the current target remote's branch (most likely master) that will be receiving the "injected" commits. This allows the target repo to have the commits that were created "post extraction" that are based off a commit that re-adds the code in a state where it was originally extracted from.
Just an FYI: I tried running this on bits of the manageiq-gems-pending code that I planned on running this on, and unfortunately part of it doesn't work since that project was extracted prior to this project existing, and some assumptions were made to filter out existing commits that doesn't work without the commit messages including the (transferred from ManageIQ/manageiq@-----) bit.
I have an idea for a fallback, but it is taking me a bit to find a good way to handle that. I will be adding that as a follow up commit as well with a different test case to support the fallback, which will add to the time it will take to impilent.
@NickLaMuro hard to review because it's too many things in one PR (though I understand as you said you'd be moving things around and things are based on different PRs), so I'll break down my thoughts:
-
I am :+1: on gemifying...I'll review the other PR.
-
I am :+1: on testing, but that should be a standalone PR (also hard to tell which tests are for the old stuff or the new stuff)
-
I am 〰️ on refactoring the original code. It was pretty simple (77 lines) to begin with, and very easy to follow, so moving things out into classes seems like overkill for a simple tool, unless those classes are going to be reused for the new functionality.
-
I am not sure about reinsert functionality. I can't think of a use-case for it, and even so I'm not sure why we need the complicated logic of matching the tree history vs just grafting in the repo's master branch (or master branch plus a commit to moves things into the right directory locations). I'm more trying to understand what problem you are trying to solve by introducing this direction.
Separately, I'm not sure this feature belongs in this repo. The code-extractor is very single responsibility at the moment. I could see a separate code-inserter tool in the same vein as this repo.
I am 👍 on testing, but that should be a standalone PR (also hard to tell which tests are for the old stuff or the new stuff)
@Fryguy Just an FYI the testing and even part of the "extract into classes" bit were part of the other two PRs mentioned. Best to to use the compare link that I provided in the PR description to see what this PR specifically adds (which admittedly, is still a frick'n lot...)
As I got going with the main purpose of this PR, I realized I had some more refactoring to do to share code between the two processes, but decided not to make too much more "PR-ception" than I already had before I started getting some form of feedback.
I am not sure about reinsert functionality. I can't think of a use-case for it...
As mentioned here, I have one with manageiq-gems-pending and putting some code back into ManageIQ/manageiq. But more on that below.
Separately, I'm not sure this feature belongs in this repo. The code-extractor is very single responsibility at the moment. I could see a separate code-inserter tool in the same vein as this repo.
I am actually fine with that, though I wouldn't mind if you then just scrap all three of these since I don't really see a point in even Gemifying (aka: "making it more complex") if we don't want to add any further scope to the project.
Honestly, the main reason I did this is I wanted a place to put the process that I would be taking with the effort of moving out he "object storage" and miq_ftp_lib.rb that should not exist in the manageiq-gems-pending repo, but don't really deserve standalone gems either. That is the current use case I am working on, but I also attempted something similar past in this PR:
- https://github.com/ManageIQ/manageiq/pull/15358
And even documented it in one of the commits:
- https://github.com/ManageIQ/manageiq/commit/3b79b79
But it had a poor result, and was eventually scrapped. Besides, eventually ManageIQ/managiq-loggers came into being, but that was almost 1.5 years later. However, I do realize this feature has very limited uses, but I really don't want others to have to try and figure this out for themselves like I did if we ever have to do it again, and codifying it is the easiest way to do that I think.
the main reason I did this is I wanted a place to put the process that I would be taking with the effort of moving out he "object storage" and miq_ftp_lib.rb that should not exist in the manageiq-gems-pending repo, but don't really deserve standalone gems either. That is the current use case I am working on
Ah ok. Thanks for the history behind this. That makes more sense to me now. I still think that it deserves a separate tool/repo, and I still like gemifying and testing, and will likely merge those in some form.
I really don't want others to have to try and figure this out for themselves like I did if we ever have to do it again, and codifying it is the easiest way to do that I think.
That's cool. I'm interested to see where this goes, so feel free to rev on it in this PR / repo space.
So a few force pushes later, and I have I think got this working... (still need to test it on my use case though)
Worth noting: Since this is based off of the #9 and #10 PRs, I found a few mistakes in those that I have since fixed and updated in those respective branches, and did some (not all) of the moving around that I said I would with this PR. A few commits were moved into other branches, and I might make an entirely new PR for some of the test_helper.rb/Rakefile changes I have done and re-ordered already. Hard to see with GitHub, since they sort commits by date and not ancestor order, but here is a pretty git log of the current order:
$ git log ...
* 8e0fc61 Nick LaMuro - (HEAD -> extract-to-existing-repo, origin/extract-to-existing-repo) [CodeExtractor] Support merge commits (23 hours ago)
* 52d2a40 Nick LaMuro - [CodeExtractorReinsertTest] Support custom commits (23 hours ago)
* 2059b75 Nick LaMuro - Add Fallback for when code_extractor wasn't used (30 hours ago)
* cef82a2 Nick LaMuro - Add code_extractor_reinsert_test.rb (5 days ago)
* dcc269f Nick LaMuro - [code_extractor_test.rb] Add .update_extraction_hash (5 days ago)
* c20177b Nick LaMuro - [code_extractor_test] Add .apply_new_commits_on_extracted_repo (5 days ago)
* d4c2e80 Nick LaMuro - [code_extractor_test] Add .perform_merges_of_extracted_code (5 days ago)
* a77c492 Nick LaMuro - [code_extractor_test.rb] Add .create_base_repo (5 days ago)
* 2943b5b Nick LaMuro - [CodeExtractor] Add reinsert functionality (7 days ago)
* 06f02bf Nick LaMuro - [test_helper.rb] Add TestRepo.merge (24 hours ago)
* 49d5de8 Nick LaMuro - [test_helper.rb] Add TestRepo#execute (24 hours ago)
* 5b35c07 Nick LaMuro - [test_helper.rb] Add TestRepo#checkout_b (7 days ago)
* cacf177 Nick LaMuro - [test_helper.rb] Add TestRepo.clone_at helper (7 days ago)
* 5351667 Nick LaMuro - [test_helper.rb] Add .set_destination_dir (7 days ago)
* eaaca5c Nick LaMuro - [test_helper.rb] Add assert_commits helper (7 days ago)
* e55817d Nick LaMuro - [Rakefile] Add test:debug task (7 days ago)
* 19ad15f Nick LaMuro - (origin/convert-to-module, convert-to-module) [CodeExtractor] Break out GitProject (3 weeks ago)
* 6020bdf Nick LaMuro - [CodeExtractor] Break out Config to it's own class (3 weeks ago)
* 12da249 Nick LaMuro - [CodeExtractor] Convert to module (3 weeks ago)
* 75534dc Nick LaMuro - (origin/gemify, gemify) [README.md] Update install/usage instructions (3 weeks ago)
(note: commits with [Rakefile] and [test_helper.rb] would get their own PR)
As another aside: All of this digging into git I have done for this PR has resulted in me finding out a better way to rebase "onto" branches I am working off of and moving commits between the two (the tl;dr for the lengthy blog post I would write about it is git rebase --onto).
Anyway, I will be testing out what I currently have against manageiq-gems-pending to so how that goes. Hopefully I have covered all the bases at this point.
👍 @NickLaMuro this looks like an awesome feature!