Tinygrad Inference Router
Add logic to support building different model types in tinygrad inference.
~~I could move `build_llama` into llama.py and rename the function to `build_model`. That would make the if/else and the raise unnecessary. Let me know how I should approach this.~~
Scratch that, we'll still need an if/else to load the right build function.
@AlexCheema is this proposal fine, or is there a better way to do this?
Ideally we'd do this like #139, where models are supported automatically without needing separate `build_{model}` functions.
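For reference, one middle ground between a hard-coded if/else chain and fully automatic support is a small registry that maps model-name prefixes to builder functions. This is just a hypothetical sketch — the builder names and signatures here are placeholders, not the actual exo/tinygrad API:

```python
# Hypothetical sketch: dispatch to a builder by model-name prefix
# instead of a growing if/else chain. Adding a new model type is
# then a single registry entry. Builders below are placeholders.

def build_llama(model_name: str) -> str:
    # Stand-in for the real tinygrad LLaMA builder.
    return f"llama model: {model_name}"

def build_llava(model_name: str) -> str:
    # Stand-in for a future LLaVA builder.
    return f"llava model: {model_name}"

# Registry mapping lowercase name prefixes to builder functions.
MODEL_BUILDERS = {
    "llama": build_llama,
    "llava": build_llava,
}

def build_model(model_name: str):
    """Route to the matching builder based on the model name prefix."""
    for prefix, builder in MODEL_BUILDERS.items():
        if model_name.lower().startswith(prefix):
            return builder(model_name)
    raise ValueError(f"Unsupported model type: {model_name}")
```

The if/else (and the raise) still exists in spirit, but it lives in one generic `build_model` rather than being duplicated per call site.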
Ok, I'll close this PR, do the LLaVA tinygrad implementation PR with this kind of logic, and then put up another PR that refactors the inference logic to use the `stateful_shards`.