Move utility programs out of the main coreruleset repository
Since ages ago, probably when there was only one repo assigned in the OWASP organization, we pushed all the utilities related to CRS in the same repository as the rules.
This has lead to having lots of different tools and scripts in the same place, making it more difficult to test properly, and to perform updates on tools independent from the rules themselves.
After considering it in our October 2024 monthly chat, we decided to move away from this pattern.
The idea is then to split the tools and scripts in different repos. This will be the epic ticket to start the move.
The (updated) proposal is:
Move
- [x] #3892
- [x] APPROVED_TAGS file is used by crs-rules-check. Must be moved with the script
- [x] ✅ move the
rule_ctldirectory to its own repository. Done: https://github.com/coreruleset/rule-ctl - [x] ✅ #3880
Add to crs-toolchain as feature
- [x] https://github.com/coreruleset/crs-toolchain/issues/181
- [ ] https://github.com/coreruleset/crs-toolchain/issues/204
Remove
- [x] #3870
- [x] #3868
- [x] #3890
- [x] #3888
- [x] #3882
- [x] #3876
- [x] #3878
- [x] #3874
- [x] #3872
- [x] #3886
- [x] #3884
- [x] ~
regexp-tricks+ add documentation https://github.com/coreruleset/crs-toolchain/pull/202~ It turns out that we actually never used that code? So we leave one behind.
Do we want to keep the names? Eg. is the crs-rules-check is a good name or do we want to continue with rules-check? Or may be coreruleset-check (just a stupid idea - we can choose anything...)?
Question 1: How many of the utilities in /util are no longer useful and can simply be sunset at this time? (I couldn't see all of them listed, hence asking this question.)
Question 2: How many of the utilities that we want to keep will break if they no longer live in /util? E.g. hard coded paths and similar? (E.g. it might be easier to simply sunset rather than try and rewrite/fix some of them if moving them will cause them to break?)
Question 2: How many of the utilities that we want to keep will break if they no longer live in /util? E.g. hard coded paths and similar? (E.g. it might be easier to simply sunset rather than try and rewrite/fix some of them if moving them will cause them to break?)
In our CI workflow we use crs-check-rules script. I don't know about other tools which is used.
Do we want to keep the names? Eg. is the
crs-rules-checkis a good name or do we want to continue withrules-check? Or may becoreruleset-check(just a stupid idea - we can choose anything...)?
I think we use is as our main linter, so I'll call this one crs-linter?
Question 1: How many of the utilities in /util are no longer useful and can simply be sunset at this time? (I couldn't see all of them listed, hence asking this question.)
Question 2: How many of the utilities that we want to keep will break if they no longer live in /util? E.g. hard coded paths and similar? (E.g. it might be easier to simply sunset rather than try and rewrite/fix some of them if moving them will cause them to break?)
Answer 1: Updated the list to cover all files and directories. Proposed removal, move and we need to know what to do with the unknowns.
Answer 2: probably not much, as we call them from our ci/cd or we provide it for people to use as standalone scripts. So we can remove without looking back.
My take where I do not agree with original proposal.
-
move av-scanning to its own repository.Why? We have a plugin and the plugin is better than this cruft. -
move the rule_ctl directory to its own repository.This is strongly tied to our rules. If we move this, we run the risk to lose the functionality. I did not use it myself like ever, but it has its uses. -
virtual-patching move to its own repo. I think we can start adding more products2secrules here, so there is space for improvement.Is this maintained? Is it useful? Can't we simply remove it? -
honeypot-sensor update for v4, create plugin?Remove and point to honeypot project in a message -
find-max-datalen-in-tests ?Not sure this works with recent YAML format changes. Remove or store far away from users (but so we can find it again if use it in like 27 years out of the blue) -
find-rules-without-test ?Is this still relevant? -
fp-finder? move to its own repo? add feature to crs-toolchain?We are going with the go implementation of the quantiatative testing, don't we? The functioanlity is somewhat overlapping if I get this one correctly. But even if it's not, I think we want to integrate everything into the toolchain. -
join-multiline-rules add feature to crs-toolchain?No longer necessary. Remove. -
php-dictionary-gen add feature to crs-toolchain? move to own repo?I'd rather have it as part of the toolchain. -
regexp-tricks add as crs-toolchain feature?This is a folder containing one such trick. That one is marked as experimental and lacks a description of what it really does. Hard to say what the course of action should be. -
file [send-payload-pls.sh](https://github.com/coreruleset/coreruleset/blob/main/util/send-payload-pls.sh). Don't know its utility. Remove?This is fairly useful. It's an adoption of Andrea of one of my own scripts. I use it when assessing a payload after somebody claims he / she has found a bypass. Could be done via a script wrapping a sandbox call. As it stands maybe remove it. -
[verify.rb](https://github.com/coreruleset/coreruleset/blob/main/util/verify.rb) and [d-range](https://github.com/coreruleset/coreruleset/blob/main/util/id-range) looks like they work together. Don't know what they do, I'll just remove those.The work together indeed. Contributed by the FlameEyes project, a former ModSec rule project. It checks rule IDs for duplicates and if they fit into the assigned range of the project. Let's remove this.
I'll add this to the above decision then.
About the rule_ctl: all tools are strongly tied to our rules. go-ftw, crs-toolchain, etc. This is one similar tool, that adds linting to the rules, with our specific requirements. So it is not different from others.
Moving it to a new repository would allow us to write proper python tests, and eliminate the undesired behavior that we have now that is sending a patch when something changes, and "just reading the code" to see if it works.
You wrote above:
APPROVED_TAGS file is used by crs-rules-check. Must be moved with the script
I think this is not a good idea.
Consider someone wants to add new tag(s) into rule(s). Until now it was enough to add the new tag(s) to utils/APPROVED_TAGS file. But from now the contributor needs to add a PR to the another repository too.
I vote to keep this file inside of coreruleset repository.
Makes sense. Also, the script might not need to know which tags are approved.
the script might not need to know which tags are approved.
That's why there is the -t option. We can pass the file with that which contains the approved tags.