spaCy
Rule-based Matcher Explorer does not show matches with length pattern
How to reproduce the behaviour
Go to the Rule-based Matcher Explorer and set a length-based rule (included in the link).
No matches are shown. I would expect a match for every token of length 2.
When recreating the same scenario in code, matches are found:
import spacy
from spacy.matcher import Matcher
text = "A match is a tool for starting a fire. Typically, modern matches are made of small wooden sticks or stiff paper. One end is coated with a material that can be ignited by frictional heat generated by striking the match against a suitable surface. Wooden matches are packaged in matchboxes, and paper matches are partially cut into rows and stapled into matchbooks."
nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)
matcher.add("LENGTH", [[{'LENGTH': 2}]])
doc = nlp(text)
matches = matcher(doc)
for match_id, start, end in matches:
    print(doc[start:end])
Output:
is
of
or
is
be
by
by
in
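The expected result can also be cross-checked without spaCy. The following is only a rough sketch: it assumes a simple regex tokenizer (runs of word characters, or single punctuation marks), which is not spaCy's tokenizer, and `simple_tokens` and `length_matches` are hypothetical helper names made up for illustration. For this particular text it yields the same eight two-character tokens as the Matcher output above.

```python
import re

# Same sample text as in the report above.
text = (
    "A match is a tool for starting a fire. Typically, modern matches are "
    "made of small wooden sticks or stiff paper. One end is coated with a "
    "material that can be ignited by frictional heat generated by striking "
    "the match against a suitable surface. Wooden matches are packaged in "
    "matchboxes, and paper matches are partially cut into rows and stapled "
    "into matchbooks."
)


def simple_tokens(s):
    """Split into runs of word characters and single punctuation marks.

    Assumption: this only approximates spaCy's tokenizer and will differ
    on contractions, URLs, hyphenated words and similar cases.
    """
    return re.findall(r"\w+|[^\w\s]", s)


def length_matches(s, n):
    """Return every token whose character length equals n."""
    return [tok for tok in simple_tokens(s) if len(tok) == n]


for tok in length_matches(text, 2):
    print(tok)
```

Since the same tokens come out of both the local Matcher run and this independent check, the missing matches in the explorer are unlikely to be caused by the pattern itself.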
Info about spaCy
- spaCy version: 3.1.0
- Platform: Darwin-20.5.0-x86_64-i386-64bit
- Python version: 3.7.3
- Pipelines: en_core_web_sm (3.1.0)
Thanks for the report. That does indeed seem not to be working.
Note that the backends of the public demos haven't been updated in a while, so they're still running the v2.2 series of spaCy. Even in the older versions that attribute should work though... We'll take a look at it.
Checking this again, it still seems to be an issue. It looks like LENGTH patterns specifically cause some kind of server-side error, usually but not always.