
Tinygrad Inference Router

reddyn12 opened this issue Oct 12 '24 · 1 comment

Add logic to support building different model types in tinygrad inference.

reddyn12 commented Oct 12 '24

~~I can move build_llama into llama.py and rename the function to build_model. This would make the if/else and the raise unnecessary. Let me know how I should approach this.~~ I'm dumb, we'll still need the if/else to load the right build function.

reddyn12 commented Oct 12 '24
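For context, the if/else routing being discussed could look something like the sketch below. Function and model names here are illustrative placeholders, not exo's actual API; the stub builders stand in for the real tinygrad model constructors.

```python
# Hypothetical sketch of if/else model routing. The builder functions are
# stand-ins for the real tinygrad model constructors in exo.

def build_llama(shard):
    # Placeholder for the real llama builder.
    return f"llama model for {shard}"

def build_llava(shard):
    # Placeholder for a hypothetical llava builder.
    return f"llava model for {shard}"

def build_model(model_name: str, shard):
    """Route to a builder based on the model name prefix."""
    if model_name.startswith("llama"):
        return build_llama(shard)
    if model_name.startswith("llava"):
        return build_llava(shard)
    raise ValueError(f"Unsupported model type: {model_name}")
```

The drawback, as noted above, is that every new model type requires editing this central if/else.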

@AlexCheema is this proposal fine, or is there a better way to do this?

reddyn12 commented Oct 15 '24

Ideally we do this like #139 where we support models automatically without needing separate build_{model} functions.

AlexCheema commented Oct 15 '24
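One way to support models automatically, without separate build_{model} functions, is a registry that builders add themselves to. This is a hypothetical sketch of that pattern (names are illustrative, not exo's code): new model types register a builder and the central dispatch never needs editing.

```python
# Hypothetical registry-based dispatch: builders register themselves under a
# model-name prefix, so no central if/else is needed. Illustrative names only.

MODEL_BUILDERS = {}

def register_builder(prefix):
    """Decorator that registers a builder for a model-name prefix."""
    def wrap(fn):
        MODEL_BUILDERS[prefix] = fn
        return fn
    return wrap

@register_builder("llama")
def build_llama(shard):
    # Placeholder for the real llama builder.
    return f"llama model for {shard}"

@register_builder("llava")
def build_llava(shard):
    # Placeholder for a hypothetical llava builder.
    return f"llava model for {shard}"

def build_model(model_name: str, shard):
    """Find a registered builder whose prefix matches the model name."""
    for prefix, builder in MODEL_BUILDERS.items():
        if model_name.startswith(prefix):
            return builder(shard)
    raise ValueError(f"No builder registered for {model_name}")
```

Adding a new model type then means writing one decorated builder, with no changes to the dispatch logic.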

Ok, I'll close this PR, do the LLaVA tinygrad implementation PR with this kind of logic, and then put up another PR that refactors the inference logic to use the stateful_shards.

reddyn12 commented Oct 15 '24