WIP: Exclude rules

Open osma opened this issue 9 months ago • 2 comments

Aims to eventually implement #844

osma avatar Apr 03 '25 11:04 osma

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests.
:white_check_mark: Project coverage is 99.65%. Comparing base (6bae2e5) to head (003d962).
:warning: Report is 40 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #846      +/-   ##
==========================================
+ Coverage   99.64%   99.65%   +0.01%     
==========================================
  Files          99      102       +3     
  Lines        7349     7629     +280     
==========================================
+ Hits         7323     7603     +280     
  Misses         26       26              

codecov[bot] avatar Apr 03 '25 11:04 codecov[bot]

E.g. I tried to include some RDF types which did not exist in the graph. Could there be an error or a warning for such misuse?

I implemented that in commit e01196f77fb0fd119aa72064055bcca88a376b10.
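For readers following along, the kind of check discussed here can be sketched roughly as below. This is not the code from the commit: the triples are plain Python tuples rather than a real RDF graph, and the helper names `unknown_types` and `warn_unknown_types` are made up for illustration.

```python
import logging

logger = logging.getLogger(__name__)

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def unknown_types(triples, configured_types):
    """Return the configured type URIs that never occur as an rdf:type object."""
    # Hypothetical helper: collect every type actually used in the graph.
    types_in_graph = {o for (_s, p, o) in triples if p == RDF_TYPE}
    # Keep the configured order so the warnings come out predictably.
    return [t for t in configured_types if t not in types_in_graph]


def warn_unknown_types(triples, configured_types):
    """Log one warning per configured type that the graph does not use."""
    missing = unknown_types(triples, configured_types)
    for uri in missing:
        logger.warning("RDF type %s is not used in the vocabulary", uri)
    return missing


# Toy graph (assumption): a single concept typed as skos:Concept.
triples = [
    (
        "http://example.org/c1",
        RDF_TYPE,
        "http://www.w3.org/2004/02/skos/core#Concept",
    ),
]
missing = warn_unknown_types(
    triples,
    [
        "http://www.w3.org/2004/02/skos/core#Concept",
        "http://example.org/NoSuchType",
    ],
)
```

Whether a mismatch should be a warning or a hard error is a design choice; the sketch only warns, so a typo in one rule does not block loading the rest.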

Also I wonder if the project configurations can become too messy when there are many rules with long URIs... An alternative place for the rule definitions could be a vocabs.cfg file, which could be used to create (or allow access to) pruned versions of the (base) vocabularies (the ones loaded with annif load-vocab). But then again, for simple use that creates configuration overhead.

You are right, this can quickly get messy... We could try to support shortened URIs (CURIEs) in the configuration, like yso:p12345. The only problem is that there is no way to define which prefixes are in use; that would have to come from within the vocabulary file (in Turtle). I'll see if there's a way to do that without much hassle...
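As a rough illustration of the idea (a sketch under assumptions, not the implementation in this PR): the prefix declarations could be read from the Turtle text and then used to expand values like yso:p12345. The regular expression below only handles simple `@prefix` / `PREFIX` lines and is not a real Turtle parser; `read_prefixes` and `expand_curie` are hypothetical names.

```python
import re

# Simplified assumption: only matches declarations such as
#   @prefix yso: <http://www.yso.fi/onto/yso/> .
#   PREFIX yso: <http://www.yso.fi/onto/yso/>
PREFIX_RE = re.compile(
    r"^\s*@?prefix\s+([A-Za-z][\w-]*)?:\s*<([^>]*)>",
    re.IGNORECASE | re.MULTILINE,
)


def read_prefixes(turtle_text):
    """Map each declared prefix (empty string for the default) to its namespace."""
    return {m.group(1) or "": m.group(2) for m in PREFIX_RE.finditer(turtle_text)}


def expand_curie(value, prefixes):
    """Expand prefix:local to a full URI; return full URIs unchanged."""
    prefix, sep, local = value.partition(":")
    # No colon at all, or a scheme followed by "//": treat as already expanded.
    if not sep or local.startswith("//"):
        return value
    if prefix not in prefixes:
        # Fail loudly so that a mistyped prefix in the configuration is noticed.
        raise ValueError(f"unknown prefix in {value!r}")
    return prefixes[prefix] + local


turtle_text = """\
@prefix yso: <http://www.yso.fi/onto/yso/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
"""
prefixes = read_prefixes(turtle_text)
expanded = expand_curie("yso:p12345", prefixes)
```

One caveat of this sketch: values such as `urn:...` look like CURIEs with an undeclared prefix and would raise an error, so they would need separate handling.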

osma avatar Aug 06 '25 13:08 osma

Support for shortened URIs in the exclude/include configuration was added in 003d962fab4412f53bccbc10388ac2513a9c4528.

I think this is now good enough, so I'm merging it. Any possible enhancements can be done later in separate PRs.

osma avatar Aug 06 '25 14:08 osma