helix
Ideas for different scheduling strategies
1. Imagine you are under constant load from a single model type, and then a user comes in with a request for another model type. That request will never get scheduled.
2. Imagine prod, where we have a lot of machines. It's annoying that image models are constantly evicted for text models, because image models take a while to load. It would be great if we could pin models.
3. ... more?
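A rough sketch of what the first two ideas could look like, to make the discussion concrete. This is not how the scheduler works today; every name here (`Request`, `LoadedModel`, `pick_next`, `pick_eviction_victim`, `max_wait`) is made up for illustration. It assumes one possible approach for each idea: a wait-time cap so a request for a different model type eventually wins over the hot model, and a `pinned` flag that eviction skips.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Request:
    model: str          # model type the request needs
    enqueued_at: float  # time the request entered the queue, in seconds


@dataclass
class LoadedModel:
    name: str
    last_used: float    # time of the most recent request served, in seconds
    pinned: bool = False


def pick_next(queue: List[Request], loaded_model: str, now: float,
              max_wait: float = 30.0) -> Optional[Request]:
    """Prefer requests for the already-loaded model, but never let any
    request wait longer than max_wait (hypothetical starvation guard)."""
    if not queue:
        return None
    oldest = min(queue, key=lambda r: r.enqueued_at)
    if now - oldest.enqueued_at >= max_wait:
        # The oldest request has waited too long: serve it even if that
        # means swapping models.
        return oldest
    for request in queue:
        if request.model == loaded_model:
            # Cheap path: no model swap needed.
            return request
    # Nothing matches the loaded model, so fall back to the oldest request.
    return oldest


def pick_eviction_victim(loaded: List[LoadedModel]) -> Optional[LoadedModel]:
    """Least-recently-used eviction that never picks a pinned model."""
    candidates = [m for m in loaded if not m.pinned]
    if not candidates:
        return None  # everything is pinned, nothing can be evicted
    return min(candidates, key=lambda m: m.last_used)
```

With a queue holding an old `image` request and a newer `text` request while `text` is loaded, `pick_next` keeps serving `text` until the `image` request has waited `max_wait` seconds, then serves it. A pinned image model is skipped by `pick_eviction_victim` even if it is the least recently used.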
Related to #602