bug: cortex pull of some HF models does nothing
Describe the bug
When pulling the BAAI/bge-reranker-v2-m3 reranker model, the progress bar stays at 0% forever.
To Reproduce
cortex pull BAAI/* (any model)
Expected behavior
I expect it to download the model and make it available locally.
Desktop (please complete the following information):
Both macOS (M2) and Fedora Linux.
Just saw this:
cortex pull BAAI/bge-m3
✔ Dependencies loaded in 274ms
✔ API server is online
Downloading model...
✔ Model downloaded
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% | ETA: 0s | 0/100
TypeError: terminated
at Fetch.onAborted (node:internal/deps/undici/undici:10815:53)
at Fetch.emit (node:events:520:28)
at Fetch.terminate (node:internal/deps/undici/undici:9973:14)
at Object.onError (node:internal/deps/undici/undici:10927:38)
at Request.onError (node:internal/deps/undici/undici:2079:31)
at Object.errorRequest (node:internal/deps/undici/undici:1576:17)
at Socket.<anonymous> (node:internal/deps/undici/undici:6045:16)
at Socket.emit (node:events:532:35)
at TCP.<anonymous> (node:net:337:12)
at TCP.callbackTrampoline (node:internal/async_hooks:130:17) {
[cause]: BodyTimeoutError: Body Timeout Error
at Timeout.onParserTimeout [as callback] (node:internal/deps/undici/undici:5979:32)
at Timeout.onTimeout [as _onTimeout] (node:internal/deps/undici/undici:2356:17)
at listOnTimeout (node:internal/timers:581:17)
at process.processTimers (node:internal/timers:519:7) {
code: 'UND_ERR_BODY_TIMEOUT'
}
}
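For context on the trace above: undici (the HTTP client behind Node's built-in fetch) enforces a `bodyTimeout` between received body chunks and aborts with `UND_ERR_BODY_TIMEOUT` when that gap is exceeded. A minimal sketch of the failure mode, assuming nothing about cortex's internals; the stalling server and the short timeout value here are purely illustrative:

```ts
import { createServer } from 'node:http';
import { fetch, Agent } from 'undici';

// Illustrative server: sends headers and a few bytes, then stalls,
// mimicking a model download that stops making progress.
const server = createServer((_req, res) => {
  res.writeHead(200, { 'content-length': '100' });
  res.write('partial'); // never send the remaining bytes
});

server.listen(0, async () => {
  const { port } = server.address() as { port: number };
  try {
    const res = await fetch(`http://127.0.0.1:${port}/`, {
      // Short bodyTimeout so this sketch fails in ~1s instead of minutes.
      dispatcher: new Agent({ bodyTimeout: 1_000 }),
    });
    await res.text(); // rejects once no body data arrives within bodyTimeout
  } catch (err) {
    console.error(err); // TypeError: terminated ... cause: BodyTimeoutError
  } finally {
    server.close();
  }
});
```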
I missed the point about GGUF, but it seems to be an issue with some GGUF models as well:
cortex pull pervll/bge-reranker-v2-gemma-Q4_K_M-GGUF
✔ Dependencies loaded in 438ms
✔ API server is online
Downloading model...
✔ Model downloaded
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% | ETA: 0s | 0/100
The model I was pulling is https://huggingface.co/cortexso/codestral, and I encountered a similar problem: it errors out around the 15-minute mark. I notice UND_ERR_BODY_TIMEOUT only happens with .gguf files that are huge; cortexso/codestral is around 13 GB. Perhaps a longer timeout in @nestjs/axios?
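If the timeout hypothesis holds, here is a sketch of a fix, assuming the download path goes through undici's fetch; the URL, file name, and timeout values are illustrative, not cortex's actual code:

```ts
import { createWriteStream } from 'node:fs';
import { Readable } from 'node:stream';
import { pipeline } from 'node:stream/promises';
import { fetch, Agent } from 'undici';

// Illustrative URL; the real code would resolve the file from the HF repo.
const url = 'https://huggingface.co/cortexso/codestral/resolve/main/model.gguf';

const res = await fetch(url, {
  dispatcher: new Agent({
    headersTimeout: 30_000, // still fail fast if the server never answers
    bodyTimeout: 0,         // 0 disables the between-chunk body timeout
  }),
});
if (!res.ok || !res.body) throw new Error(`download failed: ${res.status}`);

// Stream straight to disk so a ~13 GB .gguf never has to fit in memory.
await pipeline(Readable.fromWeb(res.body), createWriteStream('model.gguf'));
```

(For completeness: @nestjs/axios does accept the standard axios `timeout` option via `HttpModule.register({ timeout: ... })`, but the stack trace above points at undici, i.e. Node's fetch, rather than axios, so the knob that matters here is likely `bodyTimeout`.)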
Related: #3521 #3519
@namchuai I am scheduling this in Sprint 20 - I know this is a Cortex Platform issue, but we should make sure that this behavior does not occur for cortex.cpp.
Once verified, we can close this.
Linking this to the main issue #1077, and queuing for Sprint 20.
Related #1288
I am closing this, as it's covered by #1288.