
Question about supporting Float16Array

Open xmcp opened this issue 5 months ago • 7 comments

Question

I am trying transformers.js with WebGPU. The performance is great, but I found that transformers.js returns a Float32Array even though the model is quantized to fp16:

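// Assumes the v3 package name, which is the release line with WebGPU support.
import { pipeline } from "@huggingface/transformers";

const extractor = await pipeline(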
    "feature-extraction",
    "bge-small-zh-v1.5",
    {
        device: "webgpu",
        dtype: "fp16",
        local_files_only: true,
    },
);
// ...
const embeddings = await extractor(texts, {pooling: "mean", normalize: true});
console.log(embeddings.data);
// -> Float32Array(5120000) [...]

Since the model itself has only 16-bit precision, returning a Float32Array (instead of a Float16Array, which is supported in the latest browsers) seems to waste memory and bandwidth. Is my understanding correct, and are there plans to support Float16Array for better performance? Thanks!
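For illustration, here is a minimal sketch of the kind of narrowing I have in mind, done on the caller's side. `toHalfIfSupported` is a hypothetical helper, not part of transformers.js, and `sample` is a stand-in for `embeddings.data`. It only halves the storage of the values after they are returned; it does not avoid whatever 32-bit conversion happens inside the library.

```javascript
// Hypothetical helper (not part of transformers.js): narrow a Float32Array
// to a Float16Array when the runtime provides one, otherwise return the input.
function toHalfIfSupported(values) {
  if (typeof Float16Array === "undefined") {
    return values; // older runtimes: keep the 32-bit storage
  }
  // TypedArray constructors copy element-wise, rounding each value to fp16.
  return new Float16Array(values);
}

// Stand-in for `embeddings.data`; real values would come from the extractor.
const sample = new Float32Array([0.5, -0.25, 0.125, 1]);
const compact = toHalfIfSupported(sample);

console.log(
  compact.constructor.name,
  compact.byteLength,
  "bytes, down from",
  sample.byteLength,
);
```

For the 5,120,000 values in the output above, that would be roughly 10.24 MB at 2 bytes per value instead of 20.48 MB at 4 bytes per value.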

xmcp, Jun 11 '25 07:06