syncMyMoodle
Add file name exclusion using UNIX filename pattern matching
Exclude files using a combination of `fnmatch` and `braceexpand` in order to give the full BASH file pattern matching experience.
For example, with the `exclude_files` parameter in `config.json` it is now possible to exclude all of the files listed below by simply using `Lecture Videos_{zoom,video}*.mp4`. This makes sense in my case, where the lecture videos are also uploaded with a semantic name in an extra folder called Video Download:
Lecture Videos_zoom_0_NWFjN2RlMD.mp4
Lecture Videos_video1680865997.mp4
Lecture Videos_video1119115261.mp4
Lecture Videos_video1230373776.mp4
Lecture Videos_video1599374246.mp4
In other words, it is a more powerful version of the existing `exclude_filetypes` config parameter.
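To illustrate the matching described above, here is a minimal sketch using only the standard library. The helper `expand_braces` is a hypothetical, simplified stand-in for the `braceexpand` dependency (it handles non-nested `{a,b}` groups only, no ranges or escapes), and `is_excluded` is an assumed name, not the function used in this PR.

```python
import fnmatch
import re


def expand_braces(pattern):
    """Expand non-nested {a,b,...} groups into a list of plain patterns.

    Simplified stand-in for the braceexpand package: no nesting,
    no ranges such as {1..3}, no escaping.
    """
    match = re.search(r"\{([^{}]*,[^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        # Recurse so that later brace groups in the tail are expanded too.
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def is_excluded(filename, exclude_files):
    """Return True if filename matches any brace-expanded fnmatch pattern."""
    for raw_pattern in exclude_files:
        for pattern in expand_braces(raw_pattern):
            # fnmatchcase keeps the comparison case-sensitive on every platform.
            if fnmatch.fnmatchcase(filename, pattern):
                return True
    return False
```

With `exclude_files` set to `["Lecture Videos_{zoom,video}*.mp4"]`, all five file names listed above would be excluded, while a file in another folder with a different name would be kept.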
Current limitations are:
- [ ] It's not possible to supply `exclude_files` using the command line arguments
- [ ] You can only give a pattern for the filename, not the full path (i.e. whole subfolders cannot be excluded)
- [ ] Added a new dependency, `braceexpand`, in order to give the BASH file pattern matching experience
  - UNIX file patterns alone don't define the `{a,b}` expansion
- [ ] `exclude_filetypes` still exists for compatibility reasons, even though it could be completely replaced by `exclude_files`
  - `exclude_filetypes` could be merged into `exclude_files` during the config loading phase, removing https://github.com/Romern/syncMyMoodle/blob/e928bab71223b4ec176fd0c3c9f0574056be23cd/syncmymoodle/main.py#L765-L768
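The merge mentioned in the last item could look roughly like the sketch below. It assumes that `exclude_filetypes` holds bare extensions such as `"mp4"`; the function name `merge_exclusions` is hypothetical and not part of syncMyMoodle.

```python
def merge_exclusions(config):
    """Return a copy of config with exclude_filetypes folded into exclude_files.

    Assumption: exclude_filetypes is a list of extensions like "mp4" or ".pdf".
    """
    merged = dict(config)
    patterns = list(merged.get("exclude_files", []))
    for filetype in merged.pop("exclude_filetypes", []):
        # Turn an extension into an equivalent filename pattern, e.g. "*.mp4".
        pattern = "*." + filetype.lstrip(".")
        if pattern not in patterns:
            patterns.append(pattern)
    merged["exclude_files"] = patterns
    return merged
```

After such a step only `exclude_files` would need to be checked while syncing. Whether the match should be case-sensitive is a design choice this sketch leaves open.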
I would prefer a filter using the `fnmatch` or `re` module instead of adding a new dependency. Regexes also allow for quite powerful matching capabilities.
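For comparison, a regex-based filter for the example pattern from the description might look like this. It is only a sketch: `EXCLUDE_REGEX` and `excluded_by_regex` are hypothetical names, and the expression is one possible translation of `Lecture Videos_{zoom,video}*.mp4`.

```python
import re

# Hypothetical regex equivalent of "Lecture Videos_{zoom,video}*.mp4".
# The dot before the extension must be escaped, and "*" becomes ".*".
EXCLUDE_REGEX = re.compile(r"Lecture Videos_(zoom|video).*\.mp4")


def excluded_by_regex(filename):
    """Return True if the whole filename matches the exclusion regex."""
    return EXCLUDE_REGEX.fullmatch(filename) is not None
```

Using `fullmatch` mirrors the whole-name semantics of `fnmatch`, so a name with a trailing suffix such as `.part` would not be excluded.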
I get that. Sadly, `fnmatch` is a bit limiting because it has no brace expansion, which could lead to a lot of duplication in `exclude_files` (or would have in my case).
I also thought about regular expressions, but they are far more complex to use than BASH patterns and not user friendly for something like filtering files. I think the syntax is also harder to learn than BASH file matching. (And yeah, regex is probably even more powerful than BASH file matching. But still.)
But tbh the dependency is quite small, and the code could just be copied into its own function inside the main file or an extra file (less than 150 lines).
I think 95% of the use cases can be solved with `fnmatch`, and unless one performs 3 brace expansions it is also trivial to copy and paste the pattern. I say let us go with pure `fnmatch`, and if that turns out to be insufficient we can always expand the functionality in a backwards compatible manner.
@arandomliz could you share your filters? Maybe it helps to understand why we need fnmatch, and some might be useful as defaults.
I currently use this filter in order to exclude lecture videos that are also in the Lecture Video Download directory with a much clearer and more descriptive name:
"exclude_files": ["Lecture Videos_{video,zoom,untitled}*.mp4"]
Using only `fnmatch` would result in:
"exclude_files": ["Lecture Videos_video*.mp4", "Lecture Videos_zoom*.mp4", "Lecture Videos_untitled*.mp4"]
Luckily they don't switch between file types here (yet).
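As a rough sketch of the pure `fnmatch` variant, the three spelled-out patterns can be checked with `any`. The names `EXCLUDE_FILES` and `is_excluded` are assumptions for illustration, not the identifiers used in syncMyMoodle.

```python
import fnmatch

# The brace pattern written out by hand, one entry per alternative.
EXCLUDE_FILES = [
    "Lecture Videos_video*.mp4",
    "Lecture Videos_zoom*.mp4",
    "Lecture Videos_untitled*.mp4",
]


def is_excluded(filename, patterns=EXCLUDE_FILES):
    """Return True if filename matches at least one fnmatch pattern."""
    return any(fnmatch.fnmatchcase(filename, pattern) for pattern in patterns)
```

The behaviour is the same as with the single brace pattern; the cost is one list entry per alternative, which grows multiplicatively if a pattern contains several brace groups.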
> I currently use this filter right now in order to exclude lecture videos that are also in the Lecture Video Download directory with a much clearer and descriptive name:
> "exclude_files": ["Lecture Videos_{video,zoom,untitled}*.mp4"]
Why not simply `Lecture Videos_*.mp4`? Also, regarding your case, I think a better approach would be to implement blocking of a specific section or module.