Stuart Axelbrooke

6 comments by Stuart Axelbrooke

Hey @kootenpv, thanks for the suggestion! Care to put up a PR?

For context, in NLP tasks it can be important to retain contractions, since their usage can be informative about other aspects of the text's author.
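
To illustrate the point about contractions, here is a minimal sketch of a tokenizer that keeps them as single tokens. It is not taken from the library under discussion; the pattern `TOKEN_RE` and the function `tokenize_keep_contractions` are hypothetical names chosen for this example, and the regex only handles ASCII letters and the straight apostrophe.

```python
import re

# Hypothetical pattern (an assumption for illustration, not any library's rule):
# a run of letters/digits, optionally followed by an apostrophe and more letters
# (so "don't" and "I'm" stay whole), or a single non-space punctuation character.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?|[^\sA-Za-z0-9]")


def tokenize_keep_contractions(text):
    """Split text into tokens without breaking contractions apart."""
    return TOKEN_RE.findall(text)


tokens = tokenize_keep_contractions("I'm sure they don't mind.")
print(tokens)
```

Keeping "don't" as one token preserves the signal that the author chose the contracted form, which would be lost if the tokenizer expanded or split it.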

@kootenpv I'm fine with people having different preferences on how they'd like those split up. My biggest concern here is changing the interface, which is a breaking change that would...

I'm not familiar with the `encoder.json` that GPT-2 uses, or with its tokenization. Could you link me to the relevant documentation?

A standardized tokenization implementation! Tokenization fills the role of "turn the text into fixed vectors" that you'd feed into standard models. As an NLP practitioner and Rust user, tokenization is...

Problem is even worse on mobile Chrome :sweat: