For beginners, this library is practically useless without a viable tokenizer in the Dart ecosystem
To run inference with a model, you need to preprocess the input into tensors using a tokenizer. Currently, there is no viable tokenizer available in the Dart ecosystem, which makes this library practically useless for most beginners.
Unless you want to write your own tokenizer, don't waste your time searching for examples of how to use this library. Alternatively, you can go back to Python or JavaScript, which have well-known tokenizers like the ones maintained by HuggingFace:
- Python https://huggingface.co/docs/transformers/en/index
- JavaScript https://huggingface.co/docs/transformers.js/en/index
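For anyone wondering what "preprocess the input into tensors" means in practice, here is a minimal sketch. It is a toy whitespace tokenizer, not what HuggingFace or any real model uses. The names `ToyTokenizer` and `Encoded`, the reserved ids 0 and 1, and the fixed `max_len` are all assumptions made up for illustration. A real model needs the exact vocabulary and algorithm (BPE, WordPiece, etc.) it was trained with.

```rust
use std::collections::HashMap;

/// Token ids plus attention mask, both padded to the same fixed length.
/// (Hypothetical type, for illustration only.)
pub struct Encoded {
    pub input_ids: Vec<i64>,
    pub attention_mask: Vec<i64>,
}

/// Toy whitespace tokenizer (hypothetical, not a real model's tokenizer).
pub struct ToyTokenizer {
    vocab: HashMap<String, i64>,
    unk_id: i64,
    pad_id: i64,
}

impl ToyTokenizer {
    /// Builds a vocabulary from a word list.
    /// Ids 0 and 1 are reserved for padding and unknown words (an assumption).
    pub fn new(words: &[&str]) -> Self {
        let vocab = words
            .iter()
            .enumerate()
            .map(|(i, w)| (w.to_string(), i as i64 + 2))
            .collect();
        ToyTokenizer { vocab, unk_id: 1, pad_id: 0 }
    }

    /// Splits on whitespace, lowercases, maps words to ids,
    /// then truncates or pads to exactly `max_len` entries.
    pub fn encode(&self, text: &str, max_len: usize) -> Encoded {
        let mut input_ids: Vec<i64> = text
            .split_whitespace()
            .map(|w| *self.vocab.get(&w.to_lowercase()).unwrap_or(&self.unk_id))
            .take(max_len)
            .collect();
        let real_tokens = input_ids.len();
        input_ids.resize(max_len, self.pad_id);

        // 1 marks a real token, 0 marks padding.
        let mut attention_mask = vec![1; real_tokens];
        attention_mask.resize(max_len, 0);

        Encoded { input_ids, attention_mask }
    }
}
```

With the vocabulary `["hello", "world"]`, encoding `"Hello big world"` with `max_len` 5 gives `input_ids` of `[2, 1, 3, 0, 0]` and an `attention_mask` of `[1, 1, 1, 0, 0]`. Those two flat integer arrays are the kind of data you would then copy into the model's input tensors.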
Perhaps we can ask an AI (ChatGPT or DeepSeek) to help us complete the above task.
I would highly recommend using the HuggingFace tokenizers Rust lib via the Flutter/Rust Bridge. We use this internally and it works great.
Hey, I have been trying this for ages now. I am not sure what I am doing wrong on the Rust side. The crate is not scanned correctly, so the generated bindings come out wrong. Is there any way you could help out or show me your public API for the tokenizers? :)
I would appreciate it A LOT.
Sorry, it's a private project so I can't share the code. Are you trying to generate bindings for the entire tokenizers lib? If so, I would recommend just writing a simple wrapper Rust file that only includes the required functionality from tokenizers. Then just generate bindings for that one file.
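To make the wrapper idea concrete, here is a rough sketch of the shape such a file could take. This is not the private project's code and it does not call the real tokenizers crate: the `big_library` module is a made-up stand-in so the sketch compiles on its own, and the function names `encode_text` and `count_tokens` are hypothetical. The point is only that the wrapper exposes a couple of free functions with plain types (`String`, `Vec<u32>`, `u32`), so the binding generator never has to look at the library's own types.

```rust
// Stand-in for a large third-party tokenizer library (hypothetical).
// In a real project this role would be played by the tokenizers crate.
mod big_library {
    pub struct Tokenizer {
        pub lowercase: bool,
    }

    pub struct Encoding {
        pub ids: Vec<u32>,
    }

    impl Tokenizer {
        // Toy behavior: one id per byte of the (optionally lowercased) text.
        pub fn encode(&self, text: &str) -> Encoding {
            let normalized = if self.lowercase {
                text.to_lowercase()
            } else {
                text.to_string()
            };
            Encoding {
                ids: normalized.bytes().map(u32::from).collect(),
            }
        }
    }
}

// The wrapper API: only these functions would be handed to the
// binding generator. They take and return plain types only.

/// Encodes `text` and returns just the token ids.
pub fn encode_text(text: String) -> Vec<u32> {
    let tokenizer = big_library::Tokenizer { lowercase: true };
    tokenizer.encode(&text).ids
}

/// Returns how many tokens `text` encodes to.
pub fn count_tokens(text: String) -> u32 {
    encode_text(text).len() as u32
}
```

In a real wrapper, the bodies of `encode_text` and `count_tokens` would call into tokenizers instead of the stand-in, while the signatures stay just as simple.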
Hey Brian, thanks for getting back! I managed to implement it yesterday. Thank you anyway, appreciate you :)