bug: cortex pull of some HF models does nothing
Describe the bug
When pulling the BAAI/bge-reranker-v2-m3 reranker model, the progress bar stays at 0% forever.
To Reproduce
cortex pull BAAI/* (any model)
Expected behavior
I expect it to download the model and make it available locally.
Desktop (please complete the following information):
Both macOS (M2) and Fedora Linux.
Just saw this:
cortex pull BAAI/bge-m3
✔ Dependencies loaded in 274ms
✔ API server is online
Downloading model...
✔ Model downloaded
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% | ETA: 0s | 0/100
TypeError: terminated
at Fetch.onAborted (node:internal/deps/undici/undici:10815:53)
at Fetch.emit (node:events:520:28)
at Fetch.terminate (node:internal/deps/undici/undici:9973:14)
at Object.onError (node:internal/deps/undici/undici:10927:38)
at Request.onError (node:internal/deps/undici/undici:2079:31)
at Object.errorRequest (node:internal/deps/undici/undici:1576:17)
at Socket.<anonymous> (node:internal/deps/undici/undici:6045:16)
at Socket.emit (node:events:532:35)
at TCP.<anonymous> (node:net:337:12)
at TCP.callbackTrampoline (node:internal/async_hooks:130:17) {
[cause]: BodyTimeoutError: Body Timeout Error
at Timeout.onParserTimeout [as callback] (node:internal/deps/undici/undici:5979:32)
at Timeout.onTimeout [as _onTimeout] (node:internal/deps/undici/undici:2356:17)
at listOnTimeout (node:internal/timers:581:17)
at process.processTimers (node:internal/timers:519:7) {
code: 'UND_ERR_BODY_TIMEOUT'
}
}
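For context on the trace above: undici (the HTTP client behind Node's built-in fetch) enforces a `bodyTimeout` between received body chunks and aborts with `UND_ERR_BODY_TIMEOUT` when that gap is exceeded. A minimal sketch of the failure mode, assuming nothing about cortex's internals; the stalling server and the short timeout value here are purely illustrative:

```ts
import { createServer } from 'node:http';
import { fetch, Agent } from 'undici';

// Illustrative server: sends headers and a few bytes, then stalls,
// mimicking a model download that stops making progress.
const server = createServer((_req, res) => {
  res.writeHead(200, { 'content-length': '100' });
  res.write('partial'); // never send the remaining bytes
});

server.listen(0, async () => {
  const { port } = server.address() as { port: number };
  try {
    const res = await fetch(`http://127.0.0.1:${port}/`, {
      // Short bodyTimeout so this sketch fails in ~1s instead of minutes.
      dispatcher: new Agent({ bodyTimeout: 1_000 }),
    });
    await res.text(); // rejects once no body data arrives within bodyTimeout
  } catch (err) {
    console.error(err); // TypeError: terminated ... cause: BodyTimeoutError
  } finally {
    server.close();
  }
});
```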
I missed the point about GGUF, but it seems to be an issue with some GGUF models as well:
cortex pull pervll/bge-reranker-v2-gemma-Q4_K_M-GGUF
✔ Dependencies loaded in 438ms
✔ API server is online
Downloading model...
✔ Model downloaded
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% | ETA: 0s | 0/100
The model I was pulling is https://huggingface.co/cortexso/codestral, and I encountered a similar problem: it errors out around the 15-minute mark. I notice UND_ERR_BODY_TIMEOUT only happens with .gguf files that are huge; cortexso/codestral is around 13 GB. Perhaps a longer timeout in @nestjs/axios?
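If the timeout hypothesis holds, here is a sketch of a fix, assuming the download path goes through undici's fetch; the URL, file name, and timeout values are illustrative, not cortex's actual code:

```ts
import { createWriteStream } from 'node:fs';
import { Readable } from 'node:stream';
import { pipeline } from 'node:stream/promises';
import { fetch, Agent } from 'undici';

// Illustrative URL; the real code would resolve the file from the HF repo.
const url = 'https://huggingface.co/cortexso/codestral/resolve/main/model.gguf';

const res = await fetch(url, {
  dispatcher: new Agent({
    headersTimeout: 30_000, // still fail fast if the server never answers
    bodyTimeout: 0,         // 0 disables the between-chunk body timeout
  }),
});
if (!res.ok || !res.body) throw new Error(`download failed: ${res.status}`);

// Stream straight to disk so a ~13 GB .gguf never has to fit in memory.
await pipeline(Readable.fromWeb(res.body), createWriteStream('model.gguf'));
```

(For completeness: @nestjs/axios does accept the standard axios `timeout` option via `HttpModule.register({ timeout: ... })`, but the stack trace above points at undici, i.e. Node's fetch, rather than axios, so the knob that matters here is likely `bodyTimeout`.)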
Related: #3521 #3519
@namchuai I am scheduling this in Sprint 20 - I know this is a Cortex Platform issue, but we should make sure that this behavior does not occur for cortex.cpp.
Once verified, we can close this.
Linking this to the main issue #1077, and queuing for Sprint 20.
Related #1288
I am closing this, as it's covered by #1288.