core
Clone filter
- New class `OcrdMetsFilter` in `ocrd_models` that represents restrictions on files (currently include/exclude by fileGrp and mimetype)
- `ocrd workspace clone` supports `--fileGrp-include`, `--fileGrp-exclude`, `--mimetype-include`, `--mimetype-exclude`
Proposed by @bertsky in #506
This was initially a very rushed implementation because we needed this feature right away. The implementation has since been improved.
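To illustrate the idea of a filter object that restricts files by fileGrp and mimetype, here is a minimal sketch. It is not the actual `OcrdMetsFilter` API: the names `FileEntry`, `SimpleMetsFilter` and `matches` are hypothetical, and only exact string matching is shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FileEntry:
    """Hypothetical stand-in for a file entry in a METS document."""
    ID: str
    fileGrp: str
    mimetype: str


@dataclass
class SimpleMetsFilter:
    """Illustrative include/exclude filter (not the real OcrdMetsFilter)."""
    fileGrp_include: Optional[List[str]] = None   # None means "no restriction"
    fileGrp_exclude: List[str] = field(default_factory=list)
    mimetype_include: Optional[List[str]] = None
    mimetype_exclude: List[str] = field(default_factory=list)

    def matches(self, f: FileEntry) -> bool:
        """Return True if the file passes every include and exclude rule."""
        for attr in ("fileGrp", "mimetype"):
            value = getattr(f, attr)
            include = getattr(self, attr + "_include")
            exclude = getattr(self, attr + "_exclude")
            if include is not None and value not in include:
                return False
            if value in exclude:
                return False
        return True


files = [
    FileEntry("IMG_0001", "OCR-D-IMG", "image/tiff"),
    FileEntry("GT_0001", "OCR-D-GT-PAGE", "application/vnd.prima.page+xml"),
    FileEntry("OCR_0001", "OCR-D-OCR", "application/vnd.prima.page+xml"),
]
flt = SimpleMetsFilter(
    fileGrp_exclude=["OCR-D-OCR"],
    mimetype_include=["application/vnd.prima.page+xml"],
)
kept = [f.ID for f in files if flt.matches(f)]
print(kept)  # ['GT_0001']
```

In this sketch an include list of `None` means no restriction, so a filter with no arguments keeps every file.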
Codecov Report
Merging #582 into master will increase coverage by 0.66%. The diff coverage is 99.03%.
@@ Coverage Diff @@
## master #582 +/- ##
==========================================
+ Coverage 84.60% 85.27% +0.66%
==========================================
Files 49 50 +1
Lines 2813 2933 +120
Branches 550 577 +27
==========================================
+ Hits 2380 2501 +121
Misses 332 332
+ Partials 101 100 -1
| Impacted Files | Coverage Δ | |
|---|---|---|
| ocrd_utils/ocrd_utils/__init__.py | 100.00% <ø> (ø) | |
| ocrd_models/ocrd_models/ocrd_mets_filter.py | 97.70% <97.70%> (ø) | |
| ocrd/ocrd/cli/workspace.py | 76.13% <100.00%> (-0.34%) | :arrow_down: |
| ocrd/ocrd/decorators.py | 95.78% <100.00%> (+4.12%) | :arrow_up: |
| ocrd/ocrd/resolver.py | 96.66% <100.00%> (+0.11%) | :arrow_up: |
| ocrd_models/ocrd_models/__init__.py | 100.00% <100.00%> (ø) | |
| ocrd_models/ocrd_models/ocrd_mets.py | 93.14% <100.00%> (+<0.01%) | :arrow_up: |
| ocrd_models/ocrd_models/ocrd_xml_base.py | 93.33% <100.00%> (+2.02%) | :arrow_up: |
| ocrd_utils/ocrd_utils/str.py | 90.81% <100.00%> (+1.28%) | :arrow_up: |
| ... and 1 more | | |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 8dafbac...9fb27c0.
Any preferences on the command line interface?
Option 0:
--ID, --id PAT ID to include, string/regex/comma-separated
--not-ID, --not-id PAT ID to exclude, string/regex/comma-separated
--mimetype PAT mimetype to include, string/regex/comma-separated
--not-mimetype PAT mimetype to exclude, string/regex/comma-separated
--pageId, --pageid PAT pageId to include, string/comma-separated
--not-pageId, --not-pageid PAT pageId to exclude, string/regex/comma-separated
--fileGrp, --filegrp PAT fileGrp to include, string/regex/comma-separated
--not-fileGrp, --not-filegrp PAT fileGrp to exclude, string/regex/comma-separated
Option 1:
--id PAT ID to include, string/regex/comma-separated
--not-id PAT ID to exclude, string/regex/comma-separated
--mimetype PAT mimetype to include, string/regex/comma-separated
--not-mimetype PAT mimetype to exclude, string/regex/comma-separated
--pageid PAT pageId to include, string/comma-separated
--not-pageid PAT pageId to exclude, string/regex/comma-separated
--filegrp PAT fileGrp to include, string/regex/comma-separated
--not-filegrp PAT fileGrp to exclude, string/regex/comma-separated
Option 2:
--id-include PAT ID to include, string/regex/comma-separated
--id-exclude PAT ID to exclude, string/regex/comma-separated
--mimetype-include PAT mimetype to include, string/regex/comma-separated
--mimetype-exclude PAT mimetype to exclude, string/regex/comma-separated
--pageid-include PAT pageId to include, string/comma-separated
--pageid-exclude PAT pageId to exclude, string/regex/comma-separated
--filegrp-include PAT fileGrp to include, string/regex/comma-separated
--filegrp-exclude PAT fileGrp to exclude, string/regex/comma-separated
Option 3:
--id PAT ID to include, string/regex/comma-separated
--not-ID PAT ID to exclude, string/regex/comma-separated
--mimetype PAT mimetype to include, string/regex/comma-separated
--not-mimetype PAT mimetype to exclude, string/regex/comma-separated
--pageid PAT pageId to include, string/comma-separated
--not-pageId PAT pageId to exclude, string/regex/comma-separated
--filegrp PAT fileGrp to include, string/regex/comma-separated
--not-fileGrp PAT fileGrp to exclude, string/regex/comma-separated
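All of the variants above accept a `PAT` that may be a plain string, a regex, or a comma-separated list. The following is a rough sketch of how such a pattern could be interpreted. It assumes that a `//` prefix marks a regex, which the listings above do not specify, and the helper names `compile_pattern` and `keep` are hypothetical.

```python
import re
from typing import Callable, Optional

REGEX_PREFIX = "//"  # assumption: this prefix marks a regex pattern


def compile_pattern(pat: str) -> Callable[[str], bool]:
    """Turn a PAT argument into a predicate on a single value."""
    if pat.startswith(REGEX_PREFIX):
        regex = re.compile(pat[len(REGEX_PREFIX):])
        # The regex has to match the whole value, not just a prefix
        return lambda value: regex.fullmatch(value) is not None
    # Otherwise treat PAT as one string or a comma-separated list of strings
    allowed = {part.strip() for part in pat.split(",") if part.strip()}
    return lambda value: value in allowed


def keep(value: str,
         include: Optional[str] = None,
         exclude: Optional[str] = None) -> bool:
    """Apply an include pattern and an exclude pattern to one value."""
    if include is not None and not compile_pattern(include)(value):
        return False
    if exclude is not None and compile_pattern(exclude)(value):
        return False
    return True


print(keep("OCR-D-IMG", include="OCR-D-IMG,OCR-D-GT"))
print(keep("OCR-D-OCR-TESS", exclude="//OCR-D-OCR-.*"))
```

Whichever option names are chosen, the include and exclude variants of one attribute can share a single pattern parser like this one.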
> Any preferences on the command line interface?
I fail to see the difference between 1 and 3. But I would prefer the --not-* scheme over *-exclude/*-include.
What about --not (as a separate option negating the follow-up option), though?
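One way to picture a standalone `--not` that negates the option following it is to rewrite the argument list before regular parsing. This is only a sketch using `argparse` from the standard library rather than the project's actual CLI code; the helper `fold_not` and the parser below are hypothetical.

```python
import argparse
from typing import List


def fold_not(argv: List[str]) -> List[str]:
    """Rewrite '--not --opt VALUE' into '--not-opt VALUE'."""
    out = []
    negate = False
    for arg in argv:
        if arg == "--not":
            negate = True
            continue
        if negate:
            if not arg.startswith("--"):
                raise ValueError("--not must be followed by a long option")
            arg = "--not-" + arg[2:]
            negate = False
        out.append(arg)
    if negate:
        raise ValueError("--not must be followed by a long option")
    return out


# Hypothetical parser with paired include/exclude options
parser = argparse.ArgumentParser()
for name in ("mimetype", "file-grp"):
    parser.add_argument("--" + name)
    parser.add_argument("--not-" + name)

args = parser.parse_args(
    fold_not(["--file-grp", "OCR-D-IMG", "--not", "--mimetype", "image/tiff"])
)
print(args.file_grp, args.not_mimetype)
```

With this approach the parser itself still only knows the `--not-*` spellings, so both styles could coexist on the command line.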
Also, I think it would be better to use the same identifiers as the other workspace CLI commands:
- `-i | --file-id`
- `-m | --mimetype`
- `-g | --page-id`
- `-G | --file-grp`
The relevant part is filtering by file group, which has now been implemented in #1139 in a simpler way than the more generic approach proposed here.
Since that implementation only targets file groups, and `--not-file-grp`/`--file-grp` would conflict with the regular `--file-grp` option, it uses `-Q`/`--exclude-file-grps` and `-q`/`--include-file-grps`.