neuralcoref
✨Fast Coreference Resolution in spaCy with Neural Networks
Hi there, I've been running this against many trivial examples and it seems to miss obvious coreferences. Here's an example of what I mean: Input: "My dogs love the beach. They...
neuralcoref works well for the majority of our use cases, but we're trying to eke out whatever remaining bits of performance we can. I noticed that in https://github.com/huggingface/neuralcoref/blob/master/neuralcoref/train/training.md there is a reference...
I just want to ask a question about the dataset format I need for training. Seeing that all the code necessary for training, evaluation, and everything else is already there, I...
This is a way to fix #340. It avoids slicing and uses coordinate indexing for the assignment instead.
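The PR summary above does not show the changed code, so the following is only a generic sketch of the two assignment styles it names, written against plain Python lists. The helper names `assign_with_slice` and `assign_with_indices` are hypothetical and are not taken from neuralcoref.

```python
from typing import List


def assign_with_slice(row: List[int], start: int, values: List[int]) -> List[int]:
    """Write `values` into a copy of `row` using slice assignment."""
    out = list(row)
    out[start:start + len(values)] = values
    return out


def assign_with_indices(row: List[int], start: int, values: List[int]) -> List[int]:
    """Write `values` into a copy of `row` one coordinate at a time."""
    out = list(row)
    for offset, value in enumerate(values):
        out[start + offset] = value  # raises IndexError if out of range
    return out


row = [0, 0, 0, 0]
in_range_slice = assign_with_slice(row, 1, [7, 8])
in_range_index = assign_with_indices(row, 1, [7, 8])

# Past the end of the list, slice assignment silently grows the list,
# while per-index assignment fails loudly.
overflow_slice = assign_with_slice(row, 3, [9, 9])
```

When the target range is in bounds the two styles give the same result. They differ at the edges: for Python lists, a slice that runs past the end extends the list, whereas indexing each position raises an error. Whether #340 involves this particular edge case is not stated in the summary.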
I think there is a small bug in dataset.py that affects the building of the Mention Type one-hot vectors of antecedent mentions in the pair features during training. Due to...
This closes #338. The implementation mirrors [get_document_embedding](https://github.com/huggingface/neuralcoref/blob/60338df6f9b0a44a6728b442193b7c66653b0731/neuralcoref/train/document.py#L534-L542) from neuralcoref.train.document, the method that calculates document embeddings during training.
Fix #336
Document embeddings are not calculated during inference in [neuralcoref.pyx](https://github.com/huggingface/neuralcoref/blob/60338df6f9b0a44a6728b442193b7c66653b0731/neuralcoref/neuralcoref.pyx), but they are left at zeros. https://github.com/huggingface/neuralcoref/blob/60338df6f9b0a44a6728b442193b7c66653b0731/neuralcoref/neuralcoref.pyx#L717 This causes a mismatch between inference and training input features (doc embeddings during training...
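To make the reported mismatch concrete, here is a minimal sketch with made-up numbers. It assumes, purely for illustration, that the training-time document embedding is an element-wise mean of some input vectors; the issue text above does not say how it is computed, and `mean_vector` is a hypothetical helper, not a neuralcoref function.

```python
from typing import List


def mean_vector(vectors: List[List[float]], dim: int) -> List[float]:
    """Element-wise mean of `vectors`; a zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


DIM = 3
# Toy vectors standing in for whatever the training code averages (assumption).
toy_vectors = [[1.0, 0.0, 2.0], [3.0, 2.0, 0.0]]

# What the model would see for this feature during training (under the assumption above).
doc_embedding_training = mean_vector(toy_vectors, DIM)

# What the issue says the model sees at inference: the slot is left at zeros.
doc_embedding_inference = [0.0] * DIM

features_match = doc_embedding_training == doc_embedding_inference
```

For the same document, the feature slot holds real values in one setting and zeros in the other, so the model is evaluated on inputs it was not trained on.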
The average embeddings can be wrongly calculated during inference due to a small bug in neuralcoref.pyx: https://github.com/huggingface/neuralcoref/blob/60338df6f9b0a44a6728b442193b7c66653b0731/neuralcoref/neuralcoref.pyx#L896 `PUNCTS` is a list of strings, while `token.lower` is an integer hash. This...
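The type mismatch described above can be reproduced without spaCy. In spaCy, `Token.lower` is an integer hash and `Token.lower_` is the string form. The sketch below imitates that pair with a hypothetical `FakeToken` class and a stand-in hash (`zlib.crc32`, not spaCy's actual hash); the `PUNCTS` contents are also an assumed subset, since the real list is not shown here.

```python
import zlib

# Assumed subset for illustration; the real PUNCTS list is not reproduced here.
PUNCTS = [".", ",", "!", "?"]


def fake_hash(text: str) -> int:
    """Stand-in for a string-to-integer hash (not spaCy's hash function)."""
    return zlib.crc32(text.encode("utf-8"))


class FakeToken:
    """Imitates the two attribute flavours: `lower` (int) and `lower_` (str)."""

    def __init__(self, text: str) -> None:
        self.lower_ = text.lower()
        self.lower = fake_hash(self.lower_)


def is_punct_buggy(token: FakeToken) -> bool:
    # An int is never equal to a str, so this membership test is always False.
    return token.lower in PUNCTS


def is_punct_fixed(token: FakeToken) -> bool:
    # Comparing the string form against a list of strings works as intended.
    return token.lower_ in PUNCTS


comma = FakeToken(",")
word = FakeToken("Beach")
```

With the integer attribute, no token is ever recognised as punctuation, so a filter built on that test would let punctuation through. Comparing `lower_` (or comparing hashes against hashes) would be one way to make the types agree.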
Try: https://huggingface.co/coref/?text=Wi-Fi `Wi-Fi` is mysteriously changed to `Wi-fuck it`.