exo
Smart context handling
Perhaps something like https://github.com/tinygrad/tinygrad/blob/master/examples/llama3.py, which skips prefilling the part of the prompt that has already been processed. It's super simple to implement.
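A minimal sketch of the idea, assuming the runtime remembers the token IDs it has already run through the model (and so still has their KV cache entries). The names `common_prefix_len` and `plan_prefill` are hypothetical and are not taken from the linked example or from exo:

```python
from typing import List, Tuple


def common_prefix_len(cached: List[int], new: List[int]) -> int:
    """Count how many leading tokens the two sequences share."""
    n = 0
    for a, b in zip(cached, new):
        if a != b:
            break
        n += 1
    return n


def plan_prefill(cached_tokens: List[int], prompt_tokens: List[int]) -> Tuple[int, List[int]]:
    """Return (start_pos, tokens_to_prefill) for a new prompt.

    Tokens before start_pos are assumed to be in the KV cache already,
    so only the remaining suffix needs a forward pass.
    """
    start_pos = common_prefix_len(cached_tokens, prompt_tokens)
    # Assumption: at least one token must go through the model so that
    # there are fresh logits to sample the next token from.
    if start_pos == len(prompt_tokens) and start_pos > 0:
        start_pos -= 1
    return start_pos, prompt_tokens[start_pos:]


if __name__ == "__main__":
    previous = [1, 2, 3, 4]        # tokens processed in the last request
    incoming = [1, 2, 3, 9, 10]    # new prompt sharing a 3-token prefix
    print(plan_prefill(previous, incoming))
```

In a chat setting each request usually repeats the whole earlier conversation, so the shared prefix tends to cover almost all of the prompt and only the newest message would need prefilling. Cache entries past `start_pos` would have to be discarded or overwritten when the prompt diverges from what was cached.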