spacyface
Align the token outputs from spaCy and Hugging Face to help understand what language structures transformers see
Hi @bhoov, thanks for your great work! It unifies the popular libraries Hugging Face and spaCy. I would like to know the whole list of POS tags, i.e., **all categories** one word can...
There are cases when the default options of the transformers tokenizer don't meet our needs, but the tokenizer is wrapped inside Aligner, e.g. we would like the `sentence_to_input` function to also return...
Any idea on how to assign weights from `transformers-interpret` to spaCy tokens? https://github.com/cdpierse/transformers-interpret https://stackoverflow.com/questions/70107997/mapping-huggingface-tokens-to-original-input-text
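One possible approach to the question above, sketched in plain Python: if both tokenizations can be expressed as character spans over the same text, each wordpiece weight can be summed into the word whose span it overlaps. This is not spacyface's own method. The function name `aggregate_weights` is hypothetical, and the spans and weights below are hand-written illustrative values. In practice the word spans could come from spaCy's `token.idx` and `len(token)`, and the wordpiece spans from a Hugging Face fast tokenizer called with `return_offsets_mapping=True`.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def aggregate_weights(
    word_spans: List[Span],
    piece_spans: List[Span],
    piece_weights: List[float],
) -> List[float]:
    """Sum each wordpiece weight into the first word whose span overlaps it.

    Hypothetical helper for illustration; not part of spacyface or
    transformers-interpret.
    """
    totals = [0.0] * len(word_spans)
    for (p_start, p_end), weight in zip(piece_spans, piece_weights):
        # Assumption: special tokens such as [CLS] carry an empty span.
        if p_start == p_end:
            continue
        for i, (w_start, w_end) in enumerate(word_spans):
            # Two half-open intervals overlap when each starts before the other ends.
            if p_start < w_end and w_start < p_end:
                totals[i] += weight
                break
    return totals


# Illustrative values only: "unbelievable" split into three wordpieces.
text = "unbelievable results"
word_spans = [(0, 12), (13, 20)]
piece_spans = [(0, 0), (0, 2), (2, 9), (9, 12), (13, 20), (0, 0)]
piece_weights = [0.0, 0.25, 0.5, 0.125, 0.125, 0.0]

word_weights = aggregate_weights(word_spans, piece_spans, piece_weights)
```

Summation is only one choice; taking the mean or the maximum of the wordpiece weights per word is equally plausible, depending on how the attributions are to be read.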
Tokenization is perfectly aligned for many English sentences, but breaks whenever a SPACY_EXCEPTION is part of a larger, hyphenated word. For example, "whatve-you-dont" would produce two different tokenizations: ``` alnr...