Processes started in the Truss load function are not available for predict
The load function runs on a separate thread. Any processes created there die when the thread exits, which happens immediately after the load function finishes. Some models, such as those using vLLM, rely on running the model in a separate process. When these processes are killed after load, the model is no longer available for prediction, and predict calls fail.
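A minimal sketch of how this could be checked, assuming a model class with `load` and `predict` methods. The worker command, `worker_alive`, `close`, and `load_on_thread` are hypothetical names used only for illustration, and the sleeping child is a stand-in for a real inference worker. Run as plain Python, the child normally outlives the thread that started it, so the check passes; under the behaviour described above, `predict` would raise instead.

```python
import subprocess
import sys
import threading


class Model:
    """Hypothetical model that keeps its worker in a child process."""

    def __init__(self, **kwargs):
        self._worker = None

    def load(self):
        # Stand-in for a real worker process (assumption: a sleeping child).
        self._worker = subprocess.Popen(
            [sys.executable, "-c", "import time; time.sleep(60)"]
        )

    def worker_alive(self):
        # poll() returns None while the child is still running.
        return self._worker is not None and self._worker.poll() is None

    def predict(self, model_input):
        if not self.worker_alive():
            raise RuntimeError(
                "worker process started in load() is no longer running"
            )
        return {"worker_pid": self._worker.pid}

    def close(self):
        # Stop the child and reap it so it does not linger.
        if self._worker is not None:
            self._worker.kill()
            self._worker.wait()


def load_on_thread(model):
    # Mimics the setup described above: load() runs on its own thread,
    # and that thread exits as soon as load() returns.
    thread = threading.Thread(target=model.load)
    thread.start()
    thread.join()
```

Calling `load_on_thread(model)` followed by `model.predict({})` shows whether the child survived the load thread in a given environment.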