
fast-multilingual-e5-large.tar.gz Access denied

Open sorokinvj opened this issue 1 year ago • 3 comments

Hello, I want to use fast-multilingual-e5-large, but when the library tries to download it, I get:

An error occurred: Error: TAR_BAD_ARCHIVE: Unrecognized archive format
    at Unpack.warn (/Users/vladislavsorokin/Projects/tax-chatbot/node_modules/tar/lib/warn-mixin.js:21:40)
    at Unpack.warn (/Users/vladislavsorokin/Projects/tax-chatbot/node_modules/tar/lib/unpack.js:236:18)
    at Unpack.<anonymous> (/Users/vladislavsorokin/Projects/tax-chatbot/node_modules/tar/lib/parse.js:83:14)
    at Unpack.emit (node:events:526:35)
    at [emit] (/Users/vladislavsorokin/Projects/tax-chatbot/node_modules/tar/lib/parse.js:313:12)
    at [maybeEnd] (/Users/vladislavsorokin/Projects/tax-chatbot/node_modules/tar/lib/parse.js:468:17)
    at [consumeChunk] (/Users/vladislavsorokin/Projects/tax-chatbot/node_modules/tar/lib/parse.js:500:21)
    at Unpack.write (/Users/vladislavsorokin/Projects/tax-chatbot/node_modules/tar/lib/parse.js:427:25)
    at Unpack.end (/Users/vladislavsorokin/Projects/tax-chatbot/node_modules/tar/lib/parse.js:548:14)
    at Pipe.end (/Users/vladislavsorokin/Projects/tax-chatbot/node_modules/minipass/index.js:75:17) {
  recoverable: false,
  file: 'local_cache/fast-multilingual-e5-large.tar.gz',
  code: 'TAR_BAD_ARCHIVE',
  tarCode: 'TAR_BAD_ARCHIVE'
}

Then, when I open fast-multilingual-e5-large.tar.gz, I see that the file contains:

<?xml version='1.0' encoding='UTF-8'?><Error><Code>AccessDenied</Code><Message>Access denied.</Message><Details>Anonymous caller does not have storage.objects.get access to the Google Cloud Storage object. Permission 'storage.objects.get' denied on resource (or it may not exist).</Details></Error>

sorokinvj avatar Jun 09 '24 13:06 sorokinvj
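The XML body quoted above suggests why tar reports TAR_BAD_ARCHIVE: the file saved as .tar.gz appears to be the storage server's error response, not a gzip archive. A minimal sketch of how one might check this is shown below. It relies only on the fact that gzip data starts with the bytes 0x1f 0x8b; the helper names `looksLikeGzip` and `describeDownload` are hypothetical and are not part of fastembed-js.

```typescript
// Gzip streams start with the two magic bytes 0x1f 0x8b (RFC 1952).
const GZIP_MAGIC = [0x1f, 0x8b];

/** Returns true when the buffer starts with the gzip magic bytes. */
function looksLikeGzip(bytes: Uint8Array): boolean {
  return (
    bytes.length >= 2 &&
    bytes[0] === GZIP_MAGIC[0] &&
    bytes[1] === GZIP_MAGIC[1]
  );
}

/** Hypothetical helper: summarizes what a downloaded "archive" really contains. */
function describeDownload(bytes: Uint8Array): string {
  if (looksLikeGzip(bytes)) {
    return "gzip archive";
  }
  // Decode only the first bytes; an error body like the one above is short text.
  const head = new TextDecoder().decode(bytes.slice(0, 200));
  if (head.includes("<Error>")) {
    return "XML error response, not an archive";
  }
  return "unknown content";
}
```

To apply it to the cached file, one could read local_cache/fast-multilingual-e5-large.tar.gz with Node's `fs.readFileSync` and pass the resulting buffer to `describeDownload`.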

I downloaded the model from Hugging Face, but I am missing model_optimized.onnx.

sorokinvj avatar Jun 09 '24 13:06 sorokinvj

I think this model has a bad Google Cloud Storage source.

We definitely need to move to Hugging Face, like FastEmbed-py and FastEmbed-rs.

Anush008 avatar Jun 09 '24 13:06 Anush008

Source: https://github.com/Anush008/fastembed-js/blob/main/src/fastembed.ts#L214

The model is getting a different link: https://storage.googleapis.com/qdrant-fastembed/intfloat-multilingual-e5-large.tar.gz

The model link should be: https://storage.googleapis.com/qdrant-fastembed/fast-multilingual-e5-large.tar.gz

You can download the file from that link and put it in "local_cache" manually.

j-o-r avatar Aug 14 '24 09:08 j-o-r
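The manual workaround suggested in the comment above could be scripted roughly as follows. This is only a sketch under stated assumptions: the bucket prefix comes from the links quoted in this thread, `archiveUrl` and `downloadToCache` are hypothetical names, the global `fetch` requires Node 18 or later, and whether fastembed-js then uses the archive as-is or expects it to be extracted is not confirmed here.

```typescript
import { mkdir, writeFile } from "node:fs/promises";
import { join } from "node:path";

// Bucket prefix taken from the links quoted in this thread.
const BUCKET = "https://storage.googleapis.com/qdrant-fastembed";

/** Builds the archive URL for a model name such as "fast-multilingual-e5-large". */
function archiveUrl(modelName: string): string {
  return `${BUCKET}/${modelName}.tar.gz`;
}

/**
 * Hypothetical helper: downloads the archive into cacheDir and returns its path.
 * It refuses to save anything that is not gzip data, so an "Access denied"
 * XML body never ends up on disk under a .tar.gz name.
 */
async function downloadToCache(
  modelName: string,
  cacheDir: string = "local_cache"
): Promise<string> {
  const response = await fetch(archiveUrl(modelName));
  if (!response.ok) {
    throw new Error(`Download failed: HTTP ${response.status}`);
  }

  // Note: this buffers the whole archive in memory, which may be large.
  const bytes = new Uint8Array(await response.arrayBuffer());

  // Gzip data starts with the magic bytes 0x1f 0x8b.
  if (bytes.length < 2 || bytes[0] !== 0x1f || bytes[1] !== 0x8b) {
    throw new Error("Response is not a gzip archive");
  }

  await mkdir(cacheDir, { recursive: true });
  const target = join(cacheDir, `${modelName}.tar.gz`);
  await writeFile(target, bytes);
  return target;
}
```

Calling `await downloadToCache("fast-multilingual-e5-large")` would then either write local_cache/fast-multilingual-e5-large.tar.gz or throw with the HTTP status, which makes a broken source link visible before tar is ever invoked.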

:tada: This issue has been resolved in version 1.14.2 :tada:

The release is available on:

Your semantic-release bot :package::rocket:

github-actions[bot] avatar Apr 03 '25 04:04 github-actions[bot]

The fast-all-MiniLM-L6-v2 also has the same problem.

PsySecCorp avatar Apr 11 '25 03:04 PsySecCorp