Nicolas Patry
You're running a Python version that's too old (or too new). That's the only reason you'd need to build from source; everything else should be prebuilt.
Post-processors AND decoders are now sequential, so it would definitely be doable right now! I'll try to tackle it in the not-too-distant future.
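To illustrate what "sequential" means here, below is a minimal Python sketch of chaining decoder steps in order. This is not the actual `tokenizers` API, just the idea: each step receives the token list produced by the previous step.

```python
# Hedged sketch (not the tokenizers API): decoder steps applied in sequence.
def replace_step(tokens):
    # Replace the SentencePiece "▁" word-boundary marker with a space.
    return [t.replace("▁", " ") for t in tokens]

def fuse_step(tokens):
    # Fuse all tokens into a single string.
    return ["".join(tokens)]

def decode(tokens, steps):
    # Run each decoding step in order over the current token list.
    for step in steps:
        tokens = step(tokens)
    return tokens[0]

print(decode(["▁Hello", "▁world"], [replace_step, fuse_step]))
# → " Hello world"
```

Because each step is independent, composing a new pipeline is just a matter of reordering or appending steps to the list.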
https://github.com/huggingface/tokenizers/pull/1183
@sarrahbbh Please re-open an issue with the appropriate details to reproduce it
Should be good after rebase.
> > ... The logs are rather poor compared to the regular endpoints.
> >
> > ```
> > 2024-04-16T10:42:49.931556Z INFO text_generation_router::server: router/src/server.rs:500: Success
> > ```
>
> ...
Hey, you're trying to convert the model; there are other scripts for the tokenizer. I haven't finished it yet (it just requires more testing). For dependencies you can use no-default -...
The tokenizer is ready here: https://huggingface.co/hf-internal-testing/tiny-random-llama/tree/main. But it does require `tokenizers@main`, which is not released yet. Will try to do a release next week (there are still a few needed updates...
`tokenizers==0.13.3` is released and can be used. The tokenizer is here: https://huggingface.co/hf-internal-testing/llama-tokenizer (tokenizer.json).

```rust
let tokenizer = Tokenizer::from_file("tokenizer.json").unwrap();
let encoded = tokenizer.encode("This is a test");
// None is the optional...
```
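For context, `tokenizer.json` is a single self-contained JSON file describing the whole pipeline. Here's a hedged Python sketch that inspects a minimal in-memory example of that structure (field names match the serialized format as I understand it; the real file is much larger):

```python
import json

# Minimal stand-in for a tokenizer.json payload: a BPE model with a
# tiny vocab. This is an illustration, not the real llama-tokenizer file.
minimal = {
    "version": "1.0",
    "model": {"type": "BPE", "vocab": {"<s>": 0, "a": 1}, "merges": []},
}

# Round-trip through JSON, as you would when reading the file from disk.
data = json.loads(json.dumps(minimal))
print(data["model"]["type"])         # → BPE
print(len(data["model"]["vocab"]))   # → 2
```

Because everything (model, vocab, merges) lives in one file, `Tokenizer::from_file` needs no other assets.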
> We are considering a potential integration of BLOOM and RWKV in the future. Would it be possible to use this library to tokenize input for those models?

Bloom is...