Foundry-Local icon indicating copy to clipboard operation
Foundry-Local copied to clipboard

Out of memory is not well handled

Open a1exwang opened this issue 4 months ago • 0 comments

Foundry local version 0.6.87+e69a6c3d2b

Repro steps:

  1. Load model A
  2. Call /v1/chat/completions API with model A
  3. Load model B
  4. Call /v1/chat/completions API with model B

The server returns http 500 error without an error message or error code. So the client is unable to know what happened. By checking the log, I can see

E:\_work\1\s\onnxruntime\core\providers\cuda\cuda_call.cc:129 onnxruntime::CudaCall E:\_work\1\s\onnxruntime\core\providers\cuda\cuda_call.cc:121 onnxruntime::CudaCall CUDA failure 2: out of memory ; GPU=0 ; hostname=ALEX-P14S ; file=E:\_work\1\s\onnxruntime\core\providers\cuda\cuda_execution_provider.cc ; line=287 ; expr=cudaDeviceSynchronize();

Expected:

  1. The 500 error body should contain error code or message indicating out of memory error.
  2. The model could be automatically unloaded if not used for sometime like Ollama.

AB#74041

a1exwang avatar Aug 06 '25 06:08 a1exwang