exo
Implement protection against reduced capacity crashing the network
Scenario: a model requires 3 nodes to meet its minimum VRAM requirement. If one of the nodes times out, the task falls back to the host, which then crashes because it does not have enough memory. With swap enabled, this may degrade the entire host. Without swap, exo may simply crash.
Ideally: have a setting that stops request execution if the available resources are not enough to run the desired model. An expert may override the setting if they know what they are doing.
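A rough sketch of what such a guard could look like, in Python. Everything here is hypothetical and not taken from exo's codebase: the `Node` type, `check_capacity`, the `allow_override` flag, and `InsufficientCapacityError` are names made up for illustration, and how exo actually tracks node memory and timeouts may be quite different.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


class InsufficientCapacityError(RuntimeError):
    """Hypothetical error: reachable nodes cannot hold the requested model."""


@dataclass(frozen=True)
class Node:
    # Hypothetical view of a cluster node; not exo's real data model.
    node_id: str
    free_memory_bytes: int
    reachable: bool = True


def check_capacity(nodes, required_bytes, allow_override=False):
    """Return True if the reachable nodes have enough memory for the model.

    If they do not, raise InsufficientCapacityError, unless allow_override
    is set, in which case return False and let the caller proceed at its
    own risk.
    """
    available = sum(n.free_memory_bytes for n in nodes if n.reachable)
    if available >= required_bytes:
        return True
    if allow_override:
        return False
    raise InsufficientCapacityError(
        f"need {required_bytes} bytes, only {available} available "
        "on reachable nodes"
    )


if __name__ == "__main__":
    # Three 8 GiB nodes, one of which has timed out; the model needs 20 GiB.
    cluster = [
        Node("a", 8 * GIB),
        Node("b", 8 * GIB),
        Node("c", 8 * GIB, reachable=False),
    ]
    try:
        check_capacity(cluster, 20 * GIB)
    except InsufficientCapacityError as err:
        print(f"request refused: {err}")
```

The idea is to run a check like this before each request is scheduled (and again after a node timeout), so the request is refused instead of falling back to a host that cannot hold the model. The override flag corresponds to the expert setting described above.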