feat: Queue System for Inference?

Open dan-jan opened this issue 7 months ago • 2 comments

Objective

  • Do we need a simple queue system?

Motivation

Nullpointer Errors?

  • Currently, inference requests are handled FIFO
  • We are adopting an OpenAI-compatible API, which means we will receive requests across Chat, Audio, Vision, etc.
  • Given that users are on laptops with limited RAM and VRAM, we will likely have to switch models between requests
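To make the model-switching cost concrete, here is a minimal sketch of a strict FIFO queue that keeps one model loaded at a time. Everything in it is hypothetical (`InferenceRequest`, `FifoInferenceQueue`, and the model names are illustrative, not existing Jan code), and the "load" step is only a counter standing in for real model loading.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class InferenceRequest:
    request_id: int
    model: str    # hypothetical model name, e.g. "chat" or "vision"
    payload: str


class FifoInferenceQueue:
    """Serves requests strictly in arrival order with one model loaded at a time."""

    def __init__(self):
        self._pending = deque()
        self.loaded_model = None
        self.model_switches = 0

    def submit(self, request):
        self._pending.append(request)

    def run_all(self):
        """Drain the queue and return request ids in the order they were served."""
        served = []
        while self._pending:
            request = self._pending.popleft()
            if request.model != self.loaded_model:
                # Stand-in for unloading the old model and loading the new one.
                self.loaded_model = request.model
                self.model_switches += 1
            served.append(request.request_id)
        return served


queue = FifoInferenceQueue()
for i, model in enumerate(["chat", "vision", "chat", "audio"]):
    queue.submit(InferenceRequest(i, model, payload="..."))

served_order = queue.run_all()
print(served_order, queue.model_switches)
```

With interleaved request types, plain FIFO switches models on almost every request, which is the cost a smarter queue could reduce, for example by batching requests per model.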

Preparing for Cloud Native

  • Our long-term future is likely as an enterprise OpenAI-alternative, which will be multi-user and have a queue system
  • Should we bake in this abstraction now, and use a local file-based queue (which is later swapped out for a more sophisticated queue)?
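One way the abstraction could look, as a rough sketch only: a small queue interface with a file-backed implementation that stores each job as a JSON file in a directory. The names `JobQueue` and `FileJobQueue` and the job fields are assumptions made for illustration, and this version is not safe for multiple concurrent processes.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class JobQueue(ABC):
    """Hypothetical interface that a local or hosted queue could both implement."""

    @abstractmethod
    def enqueue(self, job: dict) -> None: ...

    @abstractmethod
    def dequeue(self): ...


class FileJobQueue(JobQueue):
    """Stores each job as a zero-padded, numbered JSON file; oldest file is served first."""

    def __init__(self, directory):
        self._dir = Path(directory)
        self._dir.mkdir(parents=True, exist_ok=True)

    def _job_files(self):
        # Zero-padded names make lexicographic order match enqueue order.
        return sorted(self._dir.glob("*.json"))

    def enqueue(self, job):
        files = self._job_files()
        next_seq = int(files[-1].stem) + 1 if files else 0
        (self._dir / f"{next_seq:012d}.json").write_text(json.dumps(job))

    def dequeue(self):
        files = self._job_files()
        if not files:
            return None  # queue is empty
        job = json.loads(files[0].read_text())
        files[0].unlink()
        return job


queue: JobQueue = FileJobQueue(tempfile.mkdtemp())
queue.enqueue({"model": "chat", "prompt": "hello"})
queue.enqueue({"model": "vision", "prompt": "describe this image"})

first = queue.dequeue()
second = queue.dequeue()
empty = queue.dequeue()
```

Callers depend only on `JobQueue`, so a multi-user deployment could later swap in a different backend without changing the inference code that calls `enqueue` and `dequeue`.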

dan-jan, Nov 28 '23 14:11