iepy
iepy copied to clipboard
Information Extraction in Python
Specifically, those facts that are seed facts. For instance: disease,botulism,symptom,paralysis,CAUSES,,,,,1
The tokenizer is not following standard contraction tokenization [0], expected by the Stanford POS tagger. Contractions are not splitted and should be. Also, the apostrophe character ´ is not handled....
Features such as bag_of_words, bag_of_pos, bag_of_wordpos and their bigram versions shouldn't include the two entities involved in the evidence.
Right now our LiteralNER is _very_ literal, so in some cases is not working. Example: an entry like this ``` takayasu's arteritis ``` Is never found because the documents will...
Once the BootstrappedIEPipeline starts running the Evidence instances generated are (kind of) read-only. The same evidences will be generated over and over in stage 1. Therefore, the vectorization needed for...
Kelly