`StringStore` refactoring

Open shadeMe opened this issue 2 years ago • 4 comments

Description

This PR introduces the following changes:

  • Add type checks to ensure unexpected input raises a TypeError.
  • Add items() method to iterate over hashes and strings in a pair-wise fashion.
  • Add type hints.
  • Clean up vestigial code from the Python 2 era and remove support for bytes as inputs.
  • Reorganize functions.
  • Update usage docs and tests.
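
To make the first two bullets concrete, here is a minimal sketch of the behaviour they describe. It is not spaCy's implementation. `ToyStringStore` is a hypothetical pure-Python stand-in, the SHA-256-based hash is a placeholder for whatever hash the real store uses, and the (string, hash) ordering of the pairs yielded by `items()` is an assumption, since the description only says the iteration is pair-wise.

```python
import hashlib
from typing import Dict, Iterator, Tuple, Union


class ToyStringStore:
    """Hypothetical stand-in for illustration; not spaCy's StringStore."""

    def __init__(self) -> None:
        # Maps hash -> string; dicts preserve insertion order.
        self._by_hash: Dict[int, str] = {}

    @staticmethod
    def _hash(string: str) -> int:
        # Placeholder hash (assumption): first 8 bytes of SHA-256.
        digest = hashlib.sha256(string.encode("utf8")).digest()
        return int.from_bytes(digest[:8], "little")

    def add(self, string: str) -> int:
        # Type check: anything other than str (including bytes) is rejected.
        if not isinstance(string, str):
            raise TypeError(f"expected str, got {type(string).__name__}")
        key = self._hash(string)
        self._by_hash[key] = string
        return key

    def __getitem__(self, key: Union[str, int]) -> Union[int, str]:
        if isinstance(key, str):
            return self._hash(key)      # string -> hash
        if isinstance(key, int):
            return self._by_hash[key]   # hash -> string (KeyError if unknown)
        raise TypeError(f"expected str or int, got {type(key).__name__}")

    def items(self) -> Iterator[Tuple[str, int]]:
        # Pair-wise iteration; the (string, hash) order is an assumption.
        for key, string in self._by_hash.items():
            yield string, key


store = ToyStringStore()
apple = store.add("apple")
for string, key in store.items():
    print(string, key)

try:
    store.add(b"apple")  # bytes are no longer accepted
except TypeError as err:
    print("rejected:", err)
```

Raising a TypeError at the boundary makes a caller that passes bytes or None fail at the call site rather than producing a bad lookup further down.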

At the moment, this PR breaks a couple of Matcher tests which were passing unexpected input to the API (related).

Types of change

Refactoring

Checklist

  • [x] I confirm that I have the right to submit this contribution under the project's MIT license.
  • [x] I ran the tests, and all new and existing tests passed.
  • [x] My changes don't require a change to the documentation, or if they do, I've added all required information.

shadeMe, Aug 19 '22 13:08

The Morphology-related change could, and probably should, be cherry-picked into its own PR after this one is merged. But we can leave it in if no one minds.

shadeMe, Aug 19 '22 13:08

I'll reintroduce the Morphology-related changes (and the fix for __reduce__) in another PR once this one is merged into v4.

shadeMe, Sep 12 '22 10:09

@explosion-bot please test_slow

shadeMe, Sep 19 '22 11:09

🪁 Successfully triggered build on Buildkite

URL: https://buildkite.com/explosion-ai/spacy-slow-tests/builds/223

explosion-bot, Sep 19 '22 11:09