Arthur
Yes! Actually, the best way to do this is to use the converters from `transformers`; see here: https://github.com/huggingface/transformers/blob/2965b204593df9d5652313386ec280ffbfd1753b/src/transformers/convert_slow_tokenizer.py#L1340 . In Rust we would need to read and parse the `.model`...
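For context, a minimal sketch of what that conversion looks like on the Python side, using `convert_slow_tokenizer` from the file linked above (the model id is illustrative; it assumes a checkpoint that ships a sentencepiece-backed slow tokenizer):

```python
# Minimal sketch: convert a slow (sentencepiece-backed) tokenizer into a fast one
# via the converters in transformers/convert_slow_tokenizer.py.
from transformers import AutoTokenizer, PreTrainedTokenizerFast
from transformers.convert_slow_tokenizer import convert_slow_tokenizer

# Load the slow tokenizer (model id is illustrative).
slow = AutoTokenizer.from_pretrained("t5-small", use_fast=False)

# Dispatches to the matching converter class and returns a `tokenizers.Tokenizer`.
backend = convert_slow_tokenizer(slow)

# Wrap the converted backend so it behaves like any other fast tokenizer.
fast = PreTrainedTokenizerFast(tokenizer_object=backend)
print(fast.tokenize("hello world"))
```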
- @vody-am the support for a fast SIGLIP tokenizer is on its way, and should actually be pretty straightforward. https://github.com/huggingface/transformers/pull/29969
@EricLBuehler we actually shipped this in `transformers`, but sure, I can...
cc @Wauplin
CUDA graphs are supported in `transformers` for models that support a static KV cache
The compile should be run on the forward, not generate, for now! https://github.com/huggingface/transformers/pull/30788 will add end-to-end support
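A minimal sketch of the pattern those two comments describe, assuming a recent `transformers`/PyTorch (the model id is illustrative): the static KV cache keeps tensor shapes fixed so `torch.compile`'s `reduce-overhead` mode can capture CUDA graphs, and compilation is applied to `forward` rather than `generate`:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # illustrative; any model with static-cache support
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

# Static cache => fixed tensor shapes, which CUDA-graph capture requires.
model.generation_config.cache_implementation = "static"

# Compile the forward pass only; generate() is not end-to-end compilable yet.
model.forward = torch.compile(model.forward, mode="reduce-overhead", fullgraph=True)

inputs = tokenizer("Hello", return_tensors="pt").to("cuda")
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```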
Sorry, did not have time to finish
PR IS DONE!
sounds good
Damn that's impressive! Reviewing now!