Improve processing logic
- Improve and unify logic of downloaded allowlist and blocklist processing
- Improve console output with highlight colors and spacers
- Implement config validation and parsing
- Change config format
- Implement config migration
- Improve run state checks and reporting
- Remove support for running custom commands specified in config with eval
- Support specifying in config a custom script to source and run functions report_sucess(), report_failure() from
- Support custom dnsmasq directory when specified in UCI
- Append a custom entry to the blocklist and use it to verify that the blocklist was successfully loaded
(I may have forgotten something)
@Wizballs just tagging you so that you see this impressive looking PR.
Gave everything a quick test run only and is working great. I'll run more detailed tests when I can, but this is looking really good. So many improvements and added functions!
^
- parse_config(): detect and prevent previously undetected invalid entries
- parse_config(): support reading and comparing current and default config format versions
- print_def_config(): add config format version
- write_config(): validate new config before moving to permanent storage
- load_config(): don't validate fixed config (now validated by write_config)
- gen_config(): don't validate new config (now validated by write_config)
- load_config(), parse_config(), gen_config(): simplify communication between functions
@lynxthecat Would you prefer the commits squashed into one?
No please keep them all! But could you maybe capitalise the one beginning “fix.. “ to make it “Fix..” for consistency?
^ Fixed the 'fix'
^
- parse_config(): compact some code
- parse_config(): don't print legacy keys
- load_config(): save old config before fixing it
- update(): call load_config() upon success
@Wizballs I’ll also run a couple of tests and if ours both passes then we can merge in. So far so good for my ordinary day to day runs.
I'm on now also. Testing and will report back...
Surprised (good way) by the number of commits every time I look ;)
Working:
- start/stop/pause/resume
- rouge check (block and allow)
Unsure:
- gen_config is just overwriting the config file to defaults. It's not reusing existing valid config entries. If this is intended all good.
And two items that are for later maybe (one discussed previously):
-
Discussed previously, but will the custom script have a default path, and have a config entry disable by default? This would save editing the path on a gen_config, could just switch to a YES/1 or whatever the config entry would be.
-
If a cron job is called while an invalid config exists, it correctly exists. Should it report_failure so people can receive a message?
user.info Fix your config file '/root/adblock-lean/config' or generate default config using 'service adblock-lean gen_config'.
user.err adblock-lean: Error: Failed to load config.
- gen_config is just overwriting the config file to defaults. It's not reusing existing valid config entries
"It's not a bug, it's a feature". In what scenario would you want gen_config() to reuse existing values? My logic is this: if user's config is technically compliant to the format then there is no need to fix it. If it's not compliant then load_config() will pick this up and suggest fixes. If the user is unhappy with their config for any reason, or if the config file is missing, or if the config file has issues which can not be automatically fixed, then they can discard the old config and have it replaced with the default one - this is what gen_config() is for.
, it's a feature
Perfect got it. I'll check the load_config part....
Alright I got it now. Intentionally deleted lines and messed up my config. And it recommended the missing lines and highlighted entries that shouldn't be there. Worked great on multiple different test.
Very smooth.
- will the custom script have a default path, and have a config entry disable by default? This would save editing the path on a gen_config, could just switch to a YES/1 or whatever the config entry would be.
This makes sense. Pro's: lower chance of user error when copy-pasting the path, slightly easier process for the user. Con's: adds another config option. I don't mind either way.
- If a cron job is called while an invalid config exists, it correctly exists. Should it report_failure so people can receive a message?
Checking on this... Will comment a bit later.
I think, leave this as is, keeps config options lower as you highlighted. Anyone using a custom script should be able to manage this.
Also @Ree9 the default recommended path for this in the config file from antonk is
# Recommended path is '/usr/libexec/abl_custom-script.sh' which the luci app has permission to access
Mabye worth changing from your Luci default /root/adblock-lean/custom-script location?
- If a cron job is called while an invalid config exists, it correctly exists. Should it report_failure so people can receive a message?
So the challenge here is this: if the config is invalid, can we trust it enough to source the custom script and call the report_failure() function from it? Current approach relies on valid config as 'root-of-trust' for the matter of safely using any config entries. Technically we could implement an exclusion for the custom_script option but I'm not sure if this is a good idea. Not saying that it's not. Just not sure.
"It's not a bug, it's a feature". In what scenario would you want gen_config() to reuse existing values? My logic is this: if user's config is technically compliant to the format then there is no need to fix it. If it's not compliant then load_config() will pick this up and suggest fixes. If the user is unhappy with their config for any reason, or if the config file is missing, or if the config file has issues which can not be automatically fixed, then they can discard the old config and have it replaced with the default one - this is what gen_config() is for.
Just so I understand, valid old entries will be retained, but invalid old entries will be deleted and any needed replacements adds as defaults, right? So a custom config element without double quotes will get lost, but a custom config element with double quotes will be retained? I think that's probably just fine.
can we trust it enough to source the custom script and call the report_failure()
Looking at this another way also, if a custom notification does not come through, that should be reason alone to investigate. Leave as, I actually don't think it's overly important or worth additional lines of code. Thanks for bouncing idea's back and forth, sometimes it just helps to put a reason behind it.
Seems super daring and ambitious to have config elements automatically changed, but it does seem to be working. Nice one.
config elements automatically changed, but it does seem to be working. Nice one
It works really well. I messed my config up various different ways, and it always fixed missing elements, or reported "rogue" entries ;)
While we're at it, maybe now it's a good time to stop using /root and move stuff to where it should normally be? I.e. probably to /etc/adblock-lean/.
Seems super daring and ambitious to have config elements automatically changed
It looks more impressive than it actually is. The logic is very simple. We check for critical issues which would deem the config unusable - if found, we throw an error and quit. If not found then we load the values for expected keys from the existing config, while registering missing and unexpected keys, or entries which don't have double quotes around the values. Then we report of any found issues to the user and ask their permission to fix them. The fix is ultra-simple: we take the default config file and replace the values in it with values of the known-good entries in the current config. Then we validate the result. If it passes validation, we write it out. That's it.
Well I've thrown everything at this, and it's come up spot on. Handles every situation / config as it should!
It's really awesome what you've done/doing with this project.
The only shortcoming of this method (which I'm aware of) is that custom comments are not preserved after fixing the config. You essentially get a vanilla config with your existing values for valid keys. Moving the custom comments would be really challenging, because the comments are probably attached to entries in some way which a simple algorithm has no means to figure out.
The only shortcoming of this method
Such a minor thing in the grand scheme of things.
Just so I understand, valid old entries will be retained, but invalid old entries will be deleted and any needed replacements adds as defaults, right? So a custom config element without double quotes will get lost, but a custom config element with double quotes will be retained? I think that's probably just fine.
Oops, just noticed this comment.
valid old entries will be retained
Yes
invalid old entries will be deleted
The code makes a distinction between "invalid" and "unexpected" entries. Invalid = bad format (missing '=', missing key, the key has invalid characters, or the value is not enclosed in double-quotes while also including the # character). Unexpected = entry has a valid format but the key is not found in the new default config. When an invalid entry is encountered, load_config() throws an error and exits. When an unexpected key is encountered, removing it will be suggested.
any needed replacements adds as defaults, right?
The fix-related code essentially replaces default values with existing values extracted from entries which are both valid and expected in the existing config (including values extracted from entries in "legacy format", i.e. missing double-quotes). Missing entries are not replaced, i.e. they are copied over from the default config to the fixed config as-is.
a custom config element without double quotes will get lost
An entry which has its value not enclosed in double-quotes is considered "legacy format". Provided that the key is valid and expected (i.e. included in the new default config), the value is still retained when the config is fixed but if any legacy entries are detected, the config must be fixed before it can be used.
An exception to this are entries which both have their value not enclosed in double quotes, and the value contains the # character, because in this situation we don't have a reliable way to determine whether # is a beginning of a comment or a part of a string (i.e. URL). Such an entry would be considered invalid and it doesn't have an automatic fix, so if detected, load_config() will error out and exit.
An exception to the exception (I know, it's a bit confusing) is a special case of 2 entries which exist in our older default config, don't have double-quotes and have in-line comments. Specifically:
compress_blocklist=1 # enable (1) or disable (0) blocklist compression
initial_dnsmasq_restart=0 # enable (1) or disable (0) initial dnsmasq restart
The following code in parse_entry() specifically addresses these entries so they are flagged as legacy rather than as invalid:
# Following 'case' is a temporarily solution to allow easy config migration - remove a few months from now (Aug 2024)
case "${entry}" in
"compress_blocklist="?" #"*|"initial_dnsmasq_restart="?" #"*)
legacy_entries="${legacy_entries}${entry}"$'\n'; legacy_keys="${legacy_keys}${key} "
test_keys="${test_keys%%"${key}|"*}${test_keys#*"${key}|"}"
val=${val%% *}
return 0
esac
^
- parse_config(): set ${luci_conf_needs_fix}
- parse_config(): fix a couple typos
- parse_config(): remove ${legacy_keys) (unused variable)
- parse_config(): catch another previously uncaught case of invalid entries
- fix_config(): move config-fixing code to a separate function
- load_config(): accept the '-f' argument to force fixing config (bypassing console dialog and /dev/tty check)
- load_config(): call fix_config()
The main change here is exposing the list of required fixes to the luci RPC script via ${luci_conf_needs_fix} variable, and added functionality of forced config fix via a call to load_config() with the argument -f. Both are intended to allow the luci app to have a similar dialog with the user as currently implemented via console, and initiate the fixes. The idea is something like:
- call parse_config() (needs path to the config file in 1st argument)
- check return code - if it's 2 then fixes are needed
- pull suggested fixes from the ${luci_conf_needs_fix} variable, report to the user
- to initiate config fix, call
load_config -f - check the return code of
load_config()and report success of failure
Not sure if @rickparrish would want to use this but I thought it could be a good idea to make the functionality available.
Not sure if @rickparrish would want to use this but I thought it could be a good idea to make the functionality available.
Sounds like an excellent idea, I'll definitely make use of it.
The luci app does have logic for updating a config file while keeping settings (eg remapping min_blocklist_file_part_line_count to min_blocklist_part_line_count), but now that you have the logic in adblock-lean itself it's redundant to have it in the luci app. So removing it will streamline the code a bit, and minimize the amount of testing needed for future config changes.
I now have a Reset button appearing when parse_status returns 1 that will call gen_config, and an Update button appearing when parse_status returns 2 that will call load_config "-f"
I tried several different config files, and most worked, but I did run into an issue with one. You should be able to replicate it by doing:
-
service adblock-lean gen_config - Edit the config file to remove the
# config_format=v2line -
service adblock-lean start
This is the output I get:
root@OpenWrt:~/adblock-lean# service adblock-lean start
Warning: Config format version is unknown or invalid.
Perform following automatic changes?
1. Update config format version
y|n: y
Error: fix_config(): invalid argument.
/etc/rc.common: line 445: report_failure: not found
Fix your config file '/root/adblock-lean/config' or generate default config using 'service adblock-lean gen_config'.
Error: Failed to load config.