textacy
Improved quote detection and attribution
Description
- Pairwise, incremental quote detection looks for specific pairs of opening and closing characters and no longer requires an even number of quotation marks to work.
- Attribution window expanded and adjusted to improve accuracy and prevent some false positives.
- Code added to prep/standardize text before quote detection.
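
To illustrate the first and third points, here is a minimal sketch of pairwise, incremental scanning in plain Python. It is not the code in this PR: `QUOTE_PAIRS`, `prep_text`, and `find_quotes` are hypothetical names, the set of character pairs is an assumption, and the whitespace collapsing merely stands in for whatever standardization the PR actually performs.

```python
from __future__ import annotations

# Hypothetical opener -> closer pairs; the pairs used in the PR may differ.
QUOTE_PAIRS = {
    '"': '"',
    "\u201c": "\u201d",  # left/right curly double quotes
    "\u00ab": "\u00bb",  # guillemets
}


def prep_text(text: str) -> str:
    """Collapse runs of whitespace (an assumed stand-in for the prep step)."""
    return " ".join(text.split())


def find_quotes(text: str) -> list[tuple[int, int, str]]:
    """Scan left to right, pairing each opener with the next matching closer.

    Returns (start, end, content) tuples, where start/end are character
    offsets that include the quotation marks themselves.
    """
    quotes = []
    i = 0
    while i < len(text):
        closer = QUOTE_PAIRS.get(text[i])
        if closer is None:
            i += 1
            continue
        end = text.find(closer, i + 1)
        if end == -1:
            # Unmatched opener: skip it instead of failing on an odd count.
            i += 1
            continue
        quotes.append((i, end + 1, text[i + 1:end]))
        i = end + 1  # resume scanning after the closing mark
    return quotes


sample = prep_text(
    'He said "hello" and left a stray " mark.\n'
    "\u201cFine,\u201d she replied."
)
for start, end, content in find_quotes(sample):
    print(start, end, content)
```

Because each opener is matched on its own, the stray third `"` in the sample is skipped and the curly-quoted span is still found. A stray mark that precedes a later well-formed pair of the same character would still be mispaired by this sketch, so it should be read as an outline of the idea and not as a robust detector.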
Motivation and Context
This is part of a larger project to build a package that combines quote detection and attribution with coreference resolution. The package will be used to analyze several thousand newspaper articles.
How Has This Been Tested?
A/B testing on random samples of those newspaper articles, with tests created after each major change.
(New tests added as well.)