extraction-framework
temporary PR for Dev
This is a temporary pull request to check how well older commits from the dev branch can be merged into the current master.
Summary by CodeRabbit
- New Features
  - Improved abstract extraction with new Plain and HTML extractors and an option to auto-clean broken brackets in abstracts.
  - Validator groups (LEFT/RIGHT/DEFAULT) for finer-grained construct validation.
- Improvements
  - Default configurations updated to use the HTML abstracts extractor; new plain abstracts config available.
  - Enhanced CI/validation with additional triggers, generators, and coverage for abstracts and IRIs.
- Documentation
  - Updated README links/formatting.
  - Overhauled issue template for clearer reporting.
- Tests
  - Added SHACL rules and instances; expanded CI test coverage.
  - New utilities for test grouping and dataset file handling.