transformers.js
jinaai/jina-clip-v1: support for model names with prefixes
Model description
Prerequisites
- [X] The model is supported in Transformers (i.e., listed here)
- [X] The model can be exported to ONNX with Optimum (i.e., listed here)
Additional information
You just added the onnx files to their HF repo, that's great! 🥳
Now that model files are getting more complex and have a prefix like text_ or vision_ (or even audio_ in the future), transformers.js needs an update: if I see it correctly, it doesn't support loading files other than model.onnx or model_quantized.onnx. At the moment, with 2.17.2, you'll get this kind of error because it cannot locate the files with the above prefixes:
Uncaught (in promise) Error: Could not locate file: "https://huggingface.co/jinaai/jina-clip-v1/resolve/main/onnx/model_quantized.onnx".
at handleError (webpack://semanticfinder/./node_modules/@xenova/transformers/src/utils/hub.js?:248:11)
at getModelFile (webpack://semanticfinder/./node_modules/@xenova/transformers/src/utils/hub.js?:481:24)
at async constructSession (webpack://semanticfinder/./node_modules/@xenova/transformers/src/models.js?:451:18)
at async Promise.all (index 1)
at async PreTrainedModel.from_pretrained (webpack://semanticfinder/./node_modules/@xenova/transformers/src/models.js?:1121:20)
at async AutoModel.from_pretrained (webpack://semanticfinder/./node_modules/@xenova/transformers/src/models.js?:5852:20)
at async Promise.all (index 1)
at async loadItems (webpack://semanticfinder/./node_modules/@xenova/transformers/src/pipelines.js?:3269:5)
at async pipeline (webpack://semanticfinder/./node_modules/@xenova/transformers/src/pipelines.js?:3209:21)
at async self.onmessage (webpack://semanticfinder/./src/js/worker.js?:420:24)
You're probably already working on this, but I still thought it might be useful to have it documented here for anyone else looking for support.
Or is there already another way to specify the name?
Your contribution
I can gladly test!
You can specify model_file_name as one of the options in .from_pretrained(model_id, { model_file_name: 'model' }) :)
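For example, to point each tower at its prefixed weights (a minimal sketch; the names text_model and vision_model are assumptions based on the prefixes described above):

import { CLIPTextModelWithProjection, CLIPVisionModelWithProjection } from '@xenova/transformers';

// Resolves onnx/text_model(_quantized).onnx instead of onnx/model(_quantized).onnx
const text_model = await CLIPTextModelWithProjection.from_pretrained('jinaai/jina-clip-v1', {
    model_file_name: 'text_model',
});

// Resolves onnx/vision_model(_quantized).onnx
const vision_model = await CLIPVisionModelWithProjection.from_pretrained('jinaai/jina-clip-v1', {
    model_file_name: 'vision_model',
});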
Although, do note that the weights I uploaded only work for Transformers.js v3 (unless you manually override the onnxruntime-web/node version to >= 1.16.0).
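If you do override it, a minimal sketch using npm's overrides field in package.json (1.17.0 is an assumed example; any version >= 1.16.0 should satisfy the requirement):

// package.json (excerpt)
{
  "overrides": {
    "onnxruntime-web": "1.17.0",
    "onnxruntime-node": "1.17.0"
  }
}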
See the README for example Transformers.js code:
import { AutoTokenizer, CLIPTextModelWithProjection, AutoProcessor, CLIPVisionModelWithProjection, RawImage, cos_sim } from '@xenova/transformers';
// Load tokenizer and text model
const tokenizer = await AutoTokenizer.from_pretrained('jinaai/jina-clip-v1');
const text_model = await CLIPTextModelWithProjection.from_pretrained('jinaai/jina-clip-v1');
// Load processor and vision model
const processor = await AutoProcessor.from_pretrained('Xenova/clip-vit-base-patch32');
const vision_model = await CLIPVisionModelWithProjection.from_pretrained('jinaai/jina-clip-v1');
// Run tokenization
const texts = ['A blue cat', 'A red cat'];
const text_inputs = tokenizer(texts, { padding: true, truncation: true });
// Compute text embeddings
const { text_embeds } = await text_model(text_inputs);
// Read images and run processor
const urls = [
'https://i.pinimg.com/600x315/21/48/7e/21487e8e0970dd366dafaed6ab25d8d8.jpg',
'https://i.pinimg.com/736x/c9/f2/3e/c9f23e212529f13f19bad5602d84b78b.jpg'
];
const images = await Promise.all(urls.map(url => RawImage.read(url)));
const image_inputs = await processor(images);
// Compute vision embeddings
const { image_embeds } = await vision_model(image_inputs);
// Compute similarities
console.log(cos_sim(text_embeds[0].data, text_embeds[1].data)); // text embedding similarity
console.log(cos_sim(text_embeds[0].data, image_embeds[0].data)); // text-image cross-modal similarity
console.log(cos_sim(text_embeds[0].data, image_embeds[1].data)); // text-image cross-modal similarity
console.log(cos_sim(text_embeds[1].data, image_embeds[0].data)); // text-image cross-modal similarity
console.log(cos_sim(text_embeds[1].data, image_embeds[1].data)); // text-image cross-modal similarity
Really frustrated with the v3 situation: there is no v3 release for Node.js yet, and the new ONNX weights only work with v3.
The code doesn't work at all. When I try to use optimum-cli to build the ONNX model myself, Optimum doesn't support the nomic-bert model type (nomic-embed-text-v1.5 can be built, but nomic-embed-vision-v1.5 fails), so there is no way to run the demo code even on the stable version of Transformers.js. If v3 isn't ready, please don't release ONNX weights that only work with v3.
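For reference, the failing export attempt would look something like this (a sketch; the nomic-ai/nomic-embed-vision-v1.5 repo path and output directory are assumptions based on the model names above):

# fails because Optimum has no ONNX export config for the nomic-bert model type (per the report above)
optimum-cli export onnx --model nomic-ai/nomic-embed-vision-v1.5 nomic-embed-vision-onnx/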