huggingface-tokenizer-in-cxx
seems to be slow
Good work! But I ran some tests and found that this C++ implementation seems slow: less than 10 tokens per millisecond. Do you have any other tests or findings?
I changed it to use
Instead of re2, what are you using?
I think human trafficking activities are related to this. Sometimes you can see it make mistakes, revealing that web content is mutated. I wonder if shenfe sees the dropped "regex" word in this issue thread.