[WIP] Minimal Tokenizer Implementation
## Overview
In response to #262 and #263, and building off of #271, I've been working on
a minimal tokenizer for exo. I initially aimed to remove the `transformers` dependency as well,
but discovered it's a transitive dependency of `mlx_lm`, so removing it will require more extensive changes.
So far I've tested on (both with the MLX inference engine):
- llama-3.2-1b
- qwen-2.5-coder-1.5b
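For context, here's a minimal sketch of the encode/decode interface a tokenizer like this would expose. The `MinimalTokenizer` class and its byte-level scheme are illustrative placeholders, not the actual implementation: the real wrapper delegates tokenization to `tiktoken` and chat templating to `Jinja2` rather than mapping raw UTF-8 bytes.

```python
class MinimalTokenizer:
    """Illustrative stand-in for the tokenizer interface.

    Uses byte-level "tokenization" (one token id per UTF-8 byte) so the
    example is self-contained; the real wrapper would delegate to
    tiktoken/Jinja2 instead.
    """

    def encode(self, text: str) -> list[int]:
        # Each UTF-8 byte becomes one token id, e.g. "exo" -> [101, 120, 111]
        return list(text.encode("utf-8"))

    def decode(self, token_ids: list[int]) -> str:
        # Inverse of encode: reassemble bytes and decode back to a string
        return bytes(token_ids).decode("utf-8")


tok = MinimalTokenizer()
ids = tok.encode("hello exo")
assert tok.decode(ids) == "hello exo"  # round-trip is lossless
```

The key property an inference engine relies on is the lossless round-trip shown in the last assertion; any backend (byte-level, BPE via `tiktoken`, etc.) can sit behind the same two methods.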
## Questions
- #263 mentions removing both the `Jinja2` and `tiktoken` dependencies. Since my implementation is currently a wrapper around these packages, I wanted to confirm that removing them is still the desired direction before proceeding with changes.
- Should we tackle the `transformers` dependency in a separate PR, given that it's a transitive dependency of `mlx_lm`?
This is awesome! Much needed addition.
I'm going to assign a $500 retrospective bounty for this if we can get a minimal tokenizer implementation working for all models without any dependency on `tokenizers`.
To answer your questions:
- `Jinja2` and `tiktoken` are fine to keep.
- Yeah, we can tackle getting rid of `transformers` completely in a separate PR.
Thanks so much for your contribution and for taking the time to open this PR.
Since this repository has been fully rewritten and the license has changed, I’m closing all existing open PRs to avoid confusion and to align with the new codebase.
I really appreciate your interest in the project. You're very welcome to open a new PR against the updated version if you'd like, and we look forward to reviewing it!