epic: Jan supports multiple Inference Engines
## Linked Milestone
https://github.com/janhq/jan/milestone/21
## Objective
- Jan's architecture will default to Nitro, but remain flexible enough to incorporate other Model Backends / Inference Engines
- We see very fast movement around a few ecosystems
- Intel BigDL
- Intel Extensions for Transformers
- TensorRT-LLM for Windows
- And many more to come
- Jan's Architecture needs to be able to incorporate modular Inference Engines
- Parallelize efforts to support different inference engines
- Allow us to hedge architectural risk
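To make the "modular Inference Engines" idea concrete, the sketch below shows one way a pluggable engine abstraction could look. All names here (`InferenceEngine`, `EngineRegistry`, `RemoteEngine`) are hypothetical illustrations, not Jan's actual extension API:

```typescript
// Hypothetical interface every inference engine would implement.
interface InferenceEngine {
  name: string;
  // Produce a completion for the given prompt.
  infer(prompt: string): Promise<string>;
}

// A "remote" engine backed by an OpenAI-compatible endpoint (stubbed here;
// a real implementation would issue an HTTP request).
class RemoteEngine implements InferenceEngine {
  name = "openai-remote";
  async infer(prompt: string): Promise<string> {
    return `[remote completion for: ${prompt}]`;
  }
}

// Registry that defaults to one engine (e.g. Nitro) while letting
// extensions register alternatives at runtime.
class EngineRegistry {
  private engines = new Map<string, InferenceEngine>();
  constructor(private defaultName: string) {}

  register(engine: InferenceEngine): void {
    this.engines.set(engine.name, engine);
  }

  // Resolve by name, falling back to the configured default.
  resolve(name?: string): InferenceEngine {
    const engine = this.engines.get(name ?? this.defaultName);
    if (!engine) throw new Error(`Unknown engine: ${name}`);
    return engine;
  }
}
```

With this shape, supporting a new backend (BigDL, TensorRT-LLM, etc.) means registering one more `InferenceEngine` implementation rather than touching the core app.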
## Tasklist
- [x] #913
- [x] #771
- [x] Extensions: `model` extension, `inference-engine` extension
- [x] POC using OpenAI API endpoint as a "remote" inference engine - #783
- [x] Users should be able to set `model.json` parameters
- [x] Inference engine should have defaults
- [x] Refactor Nitro into a "default" inference engine - #783
- [ ] Think through app default behaviors
- [ ] What happens when there's no internet
- [ ] What happens when users don't set `engine` for their GGUF/GPT model? (obv)
- [ ] Documentation
- [ ] Docs for Model Extension
- [ ] Docs for Inference Extension
- [ ] Docs for `inference-engine-*`
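For the `model.json` tasks above, a model file might carry an `engine` field plus overridable parameters, with the engine supplying defaults for anything the user omits. The schema below is a hedged sketch, not Jan's actual format:

```json
{
  "id": "my-gguf-model",
  "engine": "nitro",
  "parameters": {
    "temperature": 0.7,
    "max_tokens": 2048
  }
}
```

When `engine` is absent (the GGUF question above), the app would need a documented fallback, e.g. resolving to the default engine.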
## Related Epics
- #761
- #762