Alex Cheema
The default model should be something smaller that downloads faster. We also want https://github.com/exo-explore/exo/issues/16 to work, which would make the download faster still.
- A lot of people are confused when exo shows 0 TFLOPS because their device has not been explicitly added to the lookup.
- This is just a visual bug,...
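One way to avoid the confusing "0 TFLOPS" display is to distinguish "device not in the lookup" from a real zero. The sketch below is only an illustration: `DeviceFlops`, `KNOWN_DEVICES`, `format_tflops`, and the placeholder numbers are hypothetical and are not taken from exo's code.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class DeviceFlops:
    """Peak throughput in TFLOPS for a few precisions."""
    fp32: float
    fp16: float
    int8: float


# Hypothetical lookup table. The entry and its values are placeholders,
# not measurements of any real device.
KNOWN_DEVICES: Dict[str, DeviceFlops] = {
    "Example Chip A": DeviceFlops(fp32=1.0, fp16=2.0, int8=4.0),
}


def lookup_flops(model: str) -> Optional[DeviceFlops]:
    """Return the table entry, or None when the device is not listed."""
    return KNOWN_DEVICES.get(model)


def format_tflops(model: str) -> str:
    """Render a label that never shows a misleading zero."""
    flops = lookup_flops(model)
    if flops is None:
        return "unknown TFLOPS (device not in lookup table)"
    return f"{flops.fp16:.2f} TFLOPS (fp16)"


if __name__ == "__main__":
    print(format_tflops("Example Chip A"))
    print(format_tflops("Some Unlisted Device"))
```

Returning `None` rather than a zero-filled record keeps the "unknown" case explicit, so the UI can decide how to present it.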
See: https://github.com/exo-explore/exo/issues/14 Right now, every device downloads the entire model, which is unnecessary. exo's design makes this particularly difficult because it is peer-to-peer and all nodes are equal.
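As a rough sketch of what downloading only part of a model could look like, the function below picks the weight files a node would need for a contiguous range of layers. It assumes a Hugging Face style `weight_map` (tensor name to file name, as found in `model.safetensors.index.json`) and tensor names containing `layers.<n>.`. The function name `files_for_layers` and the policy of always keeping non-layer tensors are assumptions for illustration, not exo's implementation.

```python
import re
from typing import Dict, Set

# Matches "layers.<n>." at the start of a name or after a dot.
LAYER_RE = re.compile(r"(?:^|\.)layers\.(\d+)\.")


def files_for_layers(weight_map: Dict[str, str],
                     start_layer: int,
                     end_layer: int) -> Set[str]:
    """Return the file names holding tensors for layers in
    [start_layer, end_layer], inclusive.

    Tensors that do not belong to a numbered layer (embeddings,
    final norm, output head) are always kept. This is a conservative
    assumption; a real implementation could restrict them to the
    first or last shard.
    """
    needed: Set[str] = set()
    for tensor_name, file_name in weight_map.items():
        match = LAYER_RE.search(tensor_name)
        if match is None:
            needed.add(file_name)  # shared, non-layer tensor
        elif start_layer <= int(match.group(1)) <= end_layer:
            needed.add(file_name)
    return needed


if __name__ == "__main__":
    example_map = {
        "model.embed_tokens.weight": "shared.safetensors",
        "model.layers.0.mlp.weight": "part-a.safetensors",
        "model.layers.1.mlp.weight": "part-b.safetensors",
        "model.layers.2.mlp.weight": "part-c.safetensors",
    }
    print(sorted(files_for_layers(example_map, 1, 1)))
```

The saving depends on how the checkpoint is split across files: if every file contains tensors from every layer, filtering by file name does not reduce the download.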
See: https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/
- A lot of people have asked how to get exo running on iOS
- Now that things are a lot more stable, we should have parity with iOS soon...

- A user's first experience with exo is to run it, open the web ui, then send a message
- Often users don't know that exo has to download the...
LLaVA support was added for MLX in #88. However, it wasn't implemented for tinygrad. The groundwork was already done in #88, so this should be an easy fix.
- We are going to package exo as an installable (the first step is a Nix package, see: https://github.com/exo-explore/exo/pull/210)
- We want the exo installable to be lightweight with minimal dependencies
Dependencies...

- The transformers package is ~10MB
- We only use this for loading tokenizers
- Tokenizers are actually very simple and we can write a minimal implementation in exo
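To give a sense of how small the core of a tokenizer can be, here is a sketch of the merge loop at the heart of byte-pair encoding (BPE). It is an illustration under assumptions, not exo's implementation and not a replacement for the transformers tokenizers: `bpe_encode` and the toy `merge_ranks` table are hypothetical, and a real tokenizer also needs pre-tokenization, byte-level handling, special tokens, and a mapping from tokens to ids.

```python
from typing import Dict, List, Tuple


def bpe_encode(word: str,
               merge_ranks: Dict[Tuple[str, str], int]) -> List[str]:
    """Split `word` into characters, then repeatedly apply the
    lowest-ranked (earliest learned) merge until none applies."""
    symbols = list(word)
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1])
                 for i in range(len(symbols) - 1)]
        # Pick the adjacent pair with the best (lowest) rank.
        best = min(pairs, key=lambda p: merge_ranks.get(p, float("inf")))
        if best not in merge_ranks:
            break  # no known merge applies any more
        merged: List[str] = []
        i = 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


if __name__ == "__main__":
    # Toy merge table: merge "l"+"o" first, then "lo"+"w".
    toy_ranks = {("l", "o"): 0, ("lo", "w"): 1}
    print(bpe_encode("lower", toy_ranks))
```

In practice the merge table and vocabulary would be read from the model's tokenizer files; the loading step is left out here because its format varies by model.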