[WIP] Minimal Tokenizer Implementation
## Overview
In response to #262 and #263, and building off of #271, I've been working on
a minimal tokenizer for exo. I initially aimed to remove the `transformers` dependency as well,
but discovered it's a transitive dependency of `mlx_lm`, so removing it will require more extensive changes.
So far I've tested on (both with the MLX inference engine):
- llama-3.2-1b
- qwen-2.5-coder-1.5b
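For context, here's a minimal sketch of the encode/decode interface a tokenizer like this would expose. The `MinimalTokenizer` class and its byte-level scheme are illustrative placeholders, not the actual implementation: the real wrapper delegates tokenization to `tiktoken` and chat templating to `Jinja2` rather than mapping raw UTF-8 bytes.

```python
class MinimalTokenizer:
    """Illustrative stand-in for the tokenizer interface.

    Uses byte-level "tokenization" (one token id per UTF-8 byte) so the
    example is self-contained; the real wrapper would delegate to
    tiktoken/Jinja2 instead.
    """

    def encode(self, text: str) -> list[int]:
        # Each UTF-8 byte becomes one token id, e.g. "exo" -> [101, 120, 111]
        return list(text.encode("utf-8"))

    def decode(self, token_ids: list[int]) -> str:
        # Inverse of encode: reassemble bytes and decode back to a string
        return bytes(token_ids).decode("utf-8")


tok = MinimalTokenizer()
ids = tok.encode("hello exo")
assert tok.decode(ids) == "hello exo"  # round-trip is lossless
```

The key property an inference engine relies on is the lossless round-trip shown in the last assertion; any backend (byte-level, BPE via `tiktoken`, etc.) can sit behind the same two methods.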
## Questions
- #263 mentions removing both the `Jinja2` and `tiktoken` dependencies. Since my implementation is currently a wrapper around these packages, I wanted to confirm that removing them is still the desired direction before proceeding with changes.
- Should we tackle the `transformers` dependency in a separate PR, given that it's a transitive dependency of `mlx_lm`?
This is awesome! Much needed addition.
I'm going to assign a $500 retrospective bounty for this if we can get a minimal tokenizer implementation working for all models without any dependency on `tokenizers`.
To answer your questions:
- `Jinja2` and `tiktoken` are fine to keep.
- Yeah, we can tackle getting rid of `transformers` completely in a separate PR.
Thanks so much for your contribution and for taking the time to open this PR.
Since this repository has been fully rewritten and the license has changed, I’m closing all existing open PRs to avoid confusion and to align with the new codebase.
I really appreciate your interest in the project. You're very welcome to open a new PR against the updated version if you'd like, and we look forward to reviewing it!