epic: Jan supports multiple Inference Engines
## Linked Milestone
https://github.com/janhq/jan/milestone/21
## Objective
- Jan's architecture will default to Nitro, but remain flexible enough to incorporate other Model Backends / Inference Engines
- We see very fast movement around a few ecosystems
- Intel BigDL
- Intel Extensions for Transformers
- TensorRT-LLM for Windows
- And many more to come
- Jan's Architecture needs to be able to incorporate modular Inference Engines
- Parallelize efforts to support different inference engines
- Allow us to hedge architectural risk
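To make the "modular Inference Engines" idea concrete, the sketch below shows one way a pluggable engine abstraction could look. All names here (`InferenceEngine`, `EngineRegistry`, `RemoteEngine`) are hypothetical illustrations, not Jan's actual extension API:

```typescript
// Hypothetical interface every inference engine would implement.
interface InferenceEngine {
  name: string;
  // Produce a completion for the given prompt.
  infer(prompt: string): Promise<string>;
}

// A "remote" engine backed by an OpenAI-compatible endpoint (stubbed here;
// a real implementation would issue an HTTP request).
class RemoteEngine implements InferenceEngine {
  name = "openai-remote";
  async infer(prompt: string): Promise<string> {
    return `[remote completion for: ${prompt}]`;
  }
}

// Registry that defaults to one engine (e.g. Nitro) while letting
// extensions register alternatives at runtime.
class EngineRegistry {
  private engines = new Map<string, InferenceEngine>();
  constructor(private defaultName: string) {}

  register(engine: InferenceEngine): void {
    this.engines.set(engine.name, engine);
  }

  // Resolve by name, falling back to the configured default.
  resolve(name?: string): InferenceEngine {
    const engine = this.engines.get(name ?? this.defaultName);
    if (!engine) throw new Error(`Unknown engine: ${name}`);
    return engine;
  }
}
```

With this shape, supporting a new backend (BigDL, TensorRT-LLM, etc.) means registering one more `InferenceEngine` implementation rather than touching the core app.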
## Tasklist
- [x] #913
- [x] #771
- [x] Extensions: `model` extension, `inference-engine` extension
- [x] POC using OpenAI API endpoint as a "remote" inference engine - #783
- [x] Users should be able to set `model.json` parameters
- [x] Inference engine should have defaults
- [x] Refactor Nitro into a "default" inference engine - #783
- [ ] Think through app default behaviors
- [ ] What happens when there's no internet
- [ ] What happens when users don't set `engine` for their GGUF/GPT model? (obv)
- [ ] Documentation
- [ ] Docs for Model Extension
- [ ] Docs for Inference Extension
- [ ] Docs for `inference-engine-*`
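For the `model.json` tasks above, a model file might carry an `engine` field plus overridable parameters, with the engine supplying defaults for anything the user omits. The schema below is a hedged sketch, not Jan's actual format:

```json
{
  "id": "my-gguf-model",
  "engine": "nitro",
  "parameters": {
    "temperature": 0.7,
    "max_tokens": 2048
  }
}
```

When `engine` is absent (the GGUF question above), the app would need a documented fallback, e.g. resolving to the default engine.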
## Related Epics
- #761
- #762