codecharta
Merge multiple project files into multiple project outputs (MIMO)
Feature request
Improve merge so that it handles the typical microservice case better, where multiple files from multiple repositories have to be merged.
Description
As an auditor, I want to merge multiple repositories automatically so that merging multiple microservice cc.jsons is much easier.
Context
Let's say you have an audit where a team has divided their code up into 20 repositories. Then you'll have 20 .sonar.cc.json and 20 .git.cc.json files which you have to merge. Perhaps these files are organized like this:
├📁 sonar
├─📄 prj1.sonar.cc.json
├─📄 ...
├─📄 prj20.sonar.cc.json
├📁 git
├─📄 prj1.git.cc.json
├─📄 ...
├─📄 prj20.git.cc.json
├📁 raw # perhaps you have another folder with raw metrics
├📁 merge # empty right now, but this could be the folder where the merge result would be
Or perhaps these files are organized like this:
├📁 projects
├─📄 prj1.sonar.cc.json
├─📄 prj1.git.cc.json
├─📄 ...
├─📄 prj20.sonar.cc.json
├─📄 prj20.git.cc.json
├📁 merge # empty right now, but this could be the folder where the merge result would be
Note that in both cases the project names match exactly; it is just the git/sonar/raw part that differs.
It would be great if you could merge these projects via the command line so that the result looks like:
├📁 merge
├─📄 prj1.merge.cc.json
├─📄 ...
├─📄 prj20.merge.cc.json
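The pairing implied by both folder layouts can be sketched as follows. This is only an illustration under stated assumptions, not how ccsh is implemented: the function names `dot_prefix`, `group_by_dot_prefix` and `plan_outputs` are hypothetical, the prefix is assumed to be everything before the first dot, and only files ending in `.cc.json` directly inside the given folders are considered.

```python
from __future__ import annotations

from collections import defaultdict
from pathlib import Path


def dot_prefix(file_name: str) -> str:
    """Return the part of a file name before its first dot (assumed project name)."""
    return file_name.split(".", 1)[0]


def group_by_dot_prefix(input_dirs: list[Path]) -> dict[str, list[Path]]:
    """Collect the *.cc.json files of all input folders, keyed by their dot prefix."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for folder in input_dirs:
        # sorted() only makes the order within one folder deterministic
        for path in sorted(folder.glob("*.cc.json")):
            groups[dot_prefix(path.name)].append(path)
    return dict(groups)


def plan_outputs(groups: dict[str, list[Path]], output_dir: Path) -> dict[Path, list[Path]]:
    """Map every prefix with at least two files to its <prefix>.merge.cc.json output."""
    return {
        output_dir / f"{prefix}.merge.cc.json": files
        for prefix, files in groups.items()
        if len(files) > 1  # a single file has nothing to be merged with
    }
```

The same two functions cover both layouts: `group_by_dot_prefix([Path("sonar"), Path("git"), Path("raw")])` for the first one and `group_by_dot_prefix([Path("projects")])` for the second. Prefixes that end up with only one file are left out of the plan, so they can be reported separately.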
Acceptance criteria
- One new command-line argument exists to merge multiple files with the same name into multiple output files. For example:
- ccsh merge sonar/ git/ raw/ -mimo MATCH_BY_DOT_PREFIX -o merge/
- ccsh merge projects/ -mimo MATCH_BY_DOT_PREFIX -o merge/
- MIMO = Multiple Inputs & Multiple Outputs
- Removed for now: MATCH_BY_DOT_PREFIX. It was meant to leave room for other matching strategies that might make sense in the future, with the strategy proposed here named MATCH_BY_DOT_PREFIX. If no other strategy is proposed, it could become the default so that it no longer has to be specified.
- Files are matched based on the prefix before the first dot in their name.
- prj1.sonar.cc.json, prj1.git.cc.json and prj1.raw.cc.json would be matched because they share the prefix prj1.
- If a file could not be matched to any other file, a warning is reported to the error stream.
- If a file could not be matched, files that have a Levenshtein distance of less than 3 are suggested.
- This is done so typos are easy to identify.
- For example, mailbox.sonar.cc.json exists and does not have a match. There is also a malbox.git.cc.json; notice that the i is missing. The prefix malbox then has a Levenshtein distance of 1 to the prefix mailbox, so malbox.git.cc.json should be suggested as a merge target.
- It is up to the user to fix file typos
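The unmatched-file criteria above could look roughly like this. It is a minimal sketch under assumptions, not the ccsh implementation: the helper names `levenshtein`, `suggest_similar` and `warn_unmatched` and the wording of the warning are hypothetical, and the distance is assumed to be computed on the dot prefixes rather than on the full file names.

```python
import sys


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def dot_prefix(file_name: str) -> str:
    """Part of the file name before its first dot."""
    return file_name.split(".", 1)[0]


def suggest_similar(unmatched_file, all_files, max_distance=2):
    """Files whose prefix differs from the unmatched one by a distance of less than 3."""
    prefix = dot_prefix(unmatched_file)
    return [
        other for other in all_files
        # distance 0 is the file itself, so it is excluded
        if 0 < levenshtein(prefix, dot_prefix(other)) <= max_distance
    ]


def warn_unmatched(unmatched_file, all_files, stream=None):
    """Write a warning for an unmatched file to the error stream and return it."""
    stream = sys.stderr if stream is None else stream
    message = f"Warning: no match found for {unmatched_file}"
    suggestions = suggest_similar(unmatched_file, all_files)
    if suggestions:
        message += "; did you mean " + ", ".join(suggestions) + "?"
    print(message, file=stream)
    return message
```

With the files from the example, `suggest_similar("mailbox.sonar.cc.json", [...])` would return malbox.git.cc.json because the prefixes mailbox and malbox are one deletion apart. The sketch only reports the suggestion and merges nothing, in line with leaving the fix to the user.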
Development notes (optional Task Breakdown)
- [ ]
- [ ]
- [ ]