spacyface

Align the token outputs from spaCy and Hugging Face to help understand what language structures transformers see.

Results 4 spacyface issues

Hi @bhoov, thanks for your great work! It unifies the popular libraries of HuggingFace and spaCy. I'm wondering about the whole list of POS tags, i.e., **all categories** one word can...

There are cases when the default options of transformers tokenizer don't meet our demands, but the tokenizer is wrapped inside Aligner, i.e. we wish the `sentence_to_input` function to also return...

Any idea on how to assign weights from `transformers-interpret` to spaCy tokens? https://github.com/cdpierse/transformers-interpret https://stackoverflow.com/questions/70107997/mapping-huggingface-tokens-to-original-input-text
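One possible way to approach the question above is to align on character offsets and sum the wordpiece scores that fall inside each word. The following is only a minimal sketch, not part of spacyface or `transformers-interpret`: the function name `aggregate_weights` and the sample spans and scores are hypothetical. It assumes you already have character spans for the words (for spaCy, `token.idx` and `token.idx + len(token)`) and for the wordpieces (for example from a fast Hugging Face tokenizer called with `return_offsets_mapping=True`), plus one score per wordpiece.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start_char, end_char), end exclusive


def aggregate_weights(
    word_spans: List[Span],
    piece_spans: List[Span],
    piece_weights: List[float],
) -> List[float]:
    """Sum wordpiece weights into the word whose character span they overlap.

    Hypothetical helper for illustration; each wordpiece is credited to the
    first word it overlaps.
    """
    if len(piece_spans) != len(piece_weights):
        raise ValueError("need exactly one weight per wordpiece span")

    totals = [0.0] * len(word_spans)
    for (p_start, p_end), weight in zip(piece_spans, piece_weights):
        if p_start == p_end:
            # Empty spans usually belong to special tokens; skip them.
            continue
        for i, (w_start, w_end) in enumerate(word_spans):
            # Two half-open intervals overlap if each starts before the other ends.
            if p_start < w_end and w_start < p_end:
                totals[i] += weight
                break
    return totals


# Made-up example for the text "Tokenization works".
words = [(0, 12), (13, 18)]                           # word-level spans
pieces = [(0, 0), (0, 5), (5, 12), (13, 18), (0, 0)]  # wordpiece spans, with two special tokens
scores = [0.0, 0.25, 0.5, 0.125, 0.0]                 # one score per wordpiece
word_scores = aggregate_weights(words, pieces, scores)
```

Summing is just one choice; taking the mean or the maximum per word would work with the same alignment, and which is appropriate depends on how the scores are meant to be read.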

Tokenization is perfectly aligned for many English sentences, but breaks whenever a SPACY_EXCEPTION is part of a larger, hyphenated word. For example, "whatve-you-dont" would produce two different tokenizations: ``` alnr...