textacy

Improved quote detection and attribution

Open afriedman412 opened this issue 1 year ago • 0 comments

Description

  • Quote detection is now pairwise and incremental: it looks for specific pairs of opening and closing characters and no longer requires an even number of quotation marks to work.
  • The attribution window has been expanded and adjusted to improve accuracy and prevent some false positives.
  • Code has been added to prepare and standardize text for quote detection.
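
To illustrate the first and third points, here is a minimal sketch of pairwise, incremental detection with a small standardization step. It is not the code from this pull request and not textacy's API: `QUOTE_PAIRS`, `standardize`, and `find_quotes` are hypothetical names, and the choice of quote characters is an assumption made for the example.

```python
# Hypothetical sketch only; names and quote pairs are assumptions,
# not taken from textacy or from this pull request.

# Each opening character maps to the closing character it must pair with.
QUOTE_PAIRS = {
    '"': '"',            # straight double quotes
    "\u201c": "\u201d",  # curly double quotes
    "\u00ab": "\u00bb",  # guillemets
}


def standardize(text):
    """Normalize TeX-style quote digraphs to a straight double quote."""
    return text.replace("``", '"').replace("''", '"')


def find_quotes(text):
    """Return (start, end) spans of quotes, scanning left to right.

    Each opener is matched with the next occurrence of its own closer.
    An opener with no closer is skipped, so an odd number of quotation
    marks does not prevent the other quotes from being found.
    """
    spans = []
    i = 0
    while i < len(text):
        closer = QUOTE_PAIRS.get(text[i])
        if closer is not None:
            j = text.find(closer, i + 1)
            if j != -1:
                spans.append((i, j + 1))
                i = j + 1  # resume scanning after the closing mark
                continue
        i += 1
    return spans


sample = standardize('He said "hello" and then \u201cgoodbye\u201d but left a stray " mark.')
quotes = [sample[start:end] for start, end in find_quotes(sample)]
```

With the sample above, the two complete quotes are found and the trailing unmatched mark is ignored. A stray mark that appears before a real quote would still be paired incorrectly in this sketch; handling that case needs extra heuristics that are outside the scope of the example.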

Motivation and Context

This is part of a larger project to create a package to combine quote detection and attribution with coreference resolution, which will be used for the analysis of several thousand newspaper articles.

How Has This Been Tested?

A/B testing on random samples of those articles, plus new tests written after major changes.

(New tests added as well.)

afriedman412, Jun 20 '23 15:06