replaCy

Automatically infer syntactic match refinements

Open sam-writer opened this issue 4 years ago • 2 comments

When positive/negative matches are included, we should be able to infer POS information for the spaCy match.

This is cool by itself, but there is at least one more thing we can do. If the match is a verb, and we notice that it is used transitively (resp. intransitively) in the positive match and intransitively (resp. transitively) in the negative match, we could automatically match only when the verb is used transitively (resp. intransitively).
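A minimal sketch of that inference, assuming the sentences have already been parsed. The `Token` class, `features_of`, and `infer_refinements` are hypothetical names, not part of replaCy, and the hand-written annotations stand in for what a spaCy parse would supply (`pos_`, `dep_`, and the head index). A verb is treated as transitive when one of its children carries a direct-object label.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Token:
    text: str
    pos: str   # coarse part of speech, e.g. "VERB"
    dep: str   # dependency label, e.g. "dobj"
    head: int  # index of the syntactic head within the sentence


# Direct-object labels; "dobj" is the label assumed for English parses here.
OBJECT_DEPS = {"dobj", "obj"}

Example = Tuple[List[Token], int]  # (parsed sentence, index of the matched token)


def features_of(sentence: List[Token], index: int) -> Dict[str, str]:
    """Collect the syntactic features of the matched token."""
    token = sentence[index]
    feats = {"pos": token.pos}
    if token.pos == "VERB":
        has_object = any(
            t.head == index and t.dep in OBJECT_DEPS for t in sentence
        )
        feats["transitivity"] = "transitive" if has_object else "intransitive"
    return feats


def infer_refinements(
    positive: List[Example], negative: List[Example]
) -> Dict[str, str]:
    """Keep features shared by every positive example and by no negative one."""
    pos_feats = [features_of(s, i) for s, i in positive]
    neg_feats = [features_of(s, i) for s, i in negative]
    if not pos_feats or not neg_feats:
        return {}
    refinements = {}
    for name, value in pos_feats[0].items():
        shared = all(f.get(name) == value for f in pos_feats)
        excluded = all(f.get(name) != value for f in neg_feats)
        if shared and excluded:
            refinements[name] = value
    return refinements


# "She runs a company" (transitive) vs. "She runs every morning" (intransitive)
transitive_use = [
    Token("She", "PRON", "nsubj", 1),
    Token("runs", "VERB", "ROOT", 1),
    Token("a", "DET", "det", 3),
    Token("company", "NOUN", "dobj", 1),
]
intransitive_use = [
    Token("She", "PRON", "nsubj", 1),
    Token("runs", "VERB", "ROOT", 1),
    Token("every", "DET", "det", 3),
    Token("morning", "NOUN", "npadvmod", 1),
]

print(infer_refinements([(transitive_use, 1)], [(intransitive_use, 1)]))
```

Because both examples use "runs" as a verb, the part of speech does not separate them and only the transitivity constraint survives. This naive check would label a passive such as "Proof of ID will be required." as intransitive, so a real implementation would need to handle passive subjects as well.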

sam-writer, Feb 27 '20 20:02

Hmm... I kind of like having the test matches be true, i.e. test.positive sentences are sentences which are CURRENTLY matched, and test.negative sentences are ones that don't trigger a match.

Maybe this would be a separate file, replacy_from_examples.json, which would look like:

{
  "require": {
    "positive": [
      ["I require more food", "I need more food"],
      ["Proof of ID will be required.", "Proof of ID will be needed."]
    ],
    "negative": [
      "But I satisfy all the requirements in the job posting!"
    ],
    "features": ["pos_", "tag_", "dep_"],
    "allowed_hooks": ["hook names", "this might be crazy though, for v1 we probably can't support hooks, possibly ever"]
  }
}

Then we would run python -m replacy.builder replacy_from_examples.json and it would auto-generate a match_dict.json?
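As a rough sketch of the first step such a builder might take, the snippet below reads the proposed examples format and works out which words were replaced in each positive pair. Nothing here is existing replaCy code: `replaced_spans` and `load_replacements` are hypothetical helpers, and the word-level diff uses Python's standard `difflib` rather than spaCy tokenization.

```python
import json
import string
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def words(sentence: str) -> List[str]:
    """Split on whitespace and strip surrounding punctuation from each word."""
    stripped = (w.strip(string.punctuation) for w in sentence.split())
    return [w for w in stripped if w]


def replaced_spans(before: str, after: str) -> List[Tuple[str, str]]:
    """Return (original, replacement) word spans that differ between the two."""
    a, b = words(before), words(after)
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag == "replace"
    ]


def load_replacements(examples: dict) -> Dict[str, List[Tuple[str, str]]]:
    """Map each rule name to the replacements seen in its positive pairs."""
    result = {}
    for rule_name, spec in examples.items():
        pairs = []
        for before, after in spec.get("positive", []):
            pairs.extend(replaced_spans(before, after))
        result[rule_name] = pairs
    return result


# The positive pairs from the proposed replacy_from_examples.json
EXAMPLES = json.loads("""
{
  "require": {
    "positive": [
      ["I require more food", "I need more food"],
      ["Proof of ID will be required.", "Proof of ID will be needed."]
    ],
    "negative": ["But I satisfy all the requirements in the job posting!"]
  }
}
""")

print(load_replacements(EXAMPLES))
```

The replaced spans show which surface forms the generated pattern has to cover and what the suggestions should be. Turning them into lemma-based patterns with POS or dependency constraints would still need a parse of each sentence.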

sam-writer, May 01 '20 19:05

Could use https://github.com/cyclecycle/spacy-pattern-builder for some of this

sam-writer, May 01 '20 19:05