
Rule-based Matcher Explorer does not show matches with length pattern

Open FabianHertwig opened this issue 2 years ago • 2 comments

How to reproduce the behaviour

Go to the Rule-based Matcher Explorer and set a length-based rule (included in the link).

No matches are shown. I would expect a match for every token of length 2.

When recreating the same scenario in code, matches are found:

import spacy
from spacy.matcher import Matcher

text = "A match is a tool for starting a fire. Typically, modern matches are made of small wooden sticks or stiff paper. One end is coated with a material that can be ignited by frictional heat generated by striking the match against a suitable surface. Wooden matches are packaged in matchboxes, and paper matches are partially cut into rows and stapled into matchbooks."
nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)

matcher.add("LENGTH", [[{'LENGTH': 2}]])

doc = nlp(text)
matches = matcher(doc)

for match_id, start, end in matches:
    print(doc[start:end])

is
of
or
is
be
by
by
in
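
As a rough cross-check that does not depend on spaCy, the expected result can be approximated with the standard library alone. This is only a sketch: the regex split below is an assumed stand-in for spaCy's tokenizer, not the same algorithm, and the name `two_char_tokens` is made up for illustration. On this particular text it yields the same two-character tokens as the `LENGTH` pattern above.

```python
import re

text = (
    "A match is a tool for starting a fire. Typically, modern matches are "
    "made of small wooden sticks or stiff paper. One end is coated with a "
    "material that can be ignited by frictional heat generated by striking "
    "the match against a suitable surface. Wooden matches are packaged in "
    "matchboxes, and paper matches are partially cut into rows and stapled "
    "into matchbooks."
)

# Assumption: approximate tokenization as runs of word characters,
# with each punctuation mark as its own token. spaCy's tokenizer differs.
tokens = re.findall(r"\w+|[^\w\s]", text)

# Counterpart of the {'LENGTH': 2} pattern: keep tokens of exactly two characters.
two_char_tokens = [tok for tok in tokens if len(tok) == 2]

for tok in two_char_tokens:
    print(tok)
```

If the explorer were handling the pattern correctly, it should highlight these same tokens.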

Info about spaCy

  • spaCy version: 3.1.0
  • Platform: Darwin-20.5.0-x86_64-i386-64bit
  • Python version: 3.7.3
  • Pipelines: en_core_web_sm (3.1.0)

FabianHertwig, Jul 18 '21 16:07

Thanks for the report; that does indeed seem to be broken.

Note that the backends of the public demos haven't been updated in a while, so they're still running the v2.2 series of spaCy. Even in the older versions that attribute should work though... We'll take a look at it.

polm, Jul 19 '21 03:07

Checking this again, it still seems to be an issue. It looks like LENGTH patterns specifically cause some kind of server-side error, usually but not always.

polm, Nov 17 '21 04:11

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

github-actions[bot], Oct 03 '22 00:10