
`onnxruntime-node` uncompressed too large for NextJS 15 API routes

Open raymondhechen opened this issue 11 months ago • 11 comments

Question

Hello! I'm trying to deploy xenova/bge-small-en-v1.5 locally to embed text in a Next.js 15 API route, but I'm hitting an error about the route's unzipped size exceeding the 250 MB limit. Wanted to check whether there's some error on my side; it doesn't seem like onnxruntime-node should be ~720 MB uncompressed by itself? Thanks!

[Image: screenshot of the Vercel deployment error]

generateEmbeddingV2() below is called within the API route.

import {
  FeatureExtractionPipeline,
  layer_norm,
  pipeline,
  PreTrainedTokenizer,
  env,
} from '@huggingface/transformers'

const MAX_TOKENS = 512
const MATRYOSHKA_DIM = 768

let cachedExtractor: FeatureExtractionPipeline | null = null
const getExtractor = async () => {
  if (!cachedExtractor) {
    cachedExtractor = await pipeline(
      'feature-extraction',
      'xenova/bge-small-en-v1.5',
      { dtype: 'fp16' }
    )
  }
  return cachedExtractor
}

const chunkText = (text: string, tokenizer: PreTrainedTokenizer) => {
  const tokens = tokenizer.encode(text)

  const chunks = []
  for (let i = 0; i < tokens.length; i += MAX_TOKENS) {
    const chunk = tokens.slice(i, i + MAX_TOKENS)
    chunks.push(chunk)
  }

  return chunks.map((chunk) => tokenizer.decode(chunk))
}

export const generateEmbeddingV2 = async (value: string) => {
  const extractor = await getExtractor()

  const chunks = chunkText(value, extractor.tokenizer)

  let embedding = await extractor(chunks[0], { pooling: 'mean' })
  embedding = layer_norm(embedding, [embedding.dims[1]])
    .slice(null, [0, MATRYOSHKA_DIM])
    .normalize(2, -1)

  return embedding.tolist()[0]
}
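As an aside, the window loop in chunkText can be sanity-checked in isolation. This is a plain-JavaScript sketch (no tokenizer dependency; plain numbers stand in for token IDs):

```javascript
// Split an array of token IDs into fixed-size windows, mirroring the
// loop in chunkText above. MAX_TOKENS is the window size.
const MAX_TOKENS = 512;

function chunkTokens(tokens, size = MAX_TOKENS) {
  const chunks = [];
  for (let i = 0; i < tokens.length; i += size) {
    chunks.push(tokens.slice(i, i + size));
  }
  return chunks;
}

// A 1300-token input yields windows of 512, 512, and 276 tokens.
const windows = chunkTokens(Array.from({ length: 1300 }, (_, i) => i));
console.log(windows.map((w) => w.length)); // [ 512, 512, 276 ]
```

Note that generateEmbeddingV2 above only embeds the first window; the remaining chunks are computed but unused.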

I also tried downloading the model file locally, but that didn't work in deployment either.

raymondhechen avatar Jan 23 '25 03:01 raymondhechen

have the same issue

wassgha avatar Jan 29 '25 03:01 wassgha

running into the same problem when using transformers.js on the server side.

next-server

to reproduce this, i took the next-server and updated it to use the latest nextjs and transformers: https://github.com/TimPietrusky/next-server-transformers.js

when you deploy this to vercel, then you get an output like this:

○  (Static)   prerendered as static content
ƒ  (Dynamic)  server-rendered on demand
Traced Next.js server files in: 89.439ms
Warning: Max serverless function size of 250 MB uncompressed reached
Serverless Function's page: classify.js
Large Dependencies                             Uncompressed size
node_modules/onnxruntime-node/bin                      727.79 MB
node_modules/@img/sharp-libvips-linuxmusl-x64           15.75 MB
node_modules/@img/sharp-libvips-linux-x64               15.48 MB
node_modules/next/dist                                   4.22 MB
node_modules/@huggingface/transformers                   2.64 MB
node_modules/react-dom/cjs                             544.24 KB
All dependencies                                       768.12 MB
Max serverless function size was exceeded for 1 function
Created all serverless functions in: 4.599s
Collected static files (public/, static/, .next/static): 15.856ms
Build Completed in /vercel/output [49s]
Deploying outputs...
Error: A Serverless Function has exceeded the unzipped maximum size of 250 MB. : https://vercel.link/serverless-function-size

this makes using transformers.js impossible on Vercel right now if you want to use it on the server.

what i don't understand: why does it think that this is 727 MB? the unpacked size of onnxruntime-node on npm is around 180 MB.

@xenova do you have any idea on how we can work around this limitation?

next-client

i also tested this example and it is working fine on vercel with the latest versions of nextjs / transformersjs: https://github.com/TimPietrusky/next-client-transformers.js

TimPietrusky avatar Mar 18 '25 12:03 TimPietrusky

I faced the same issue. Two things helped:

  1. Use the app dir
  2. Add the packages below to serverExternalPackages in next.config.js, i.e.
/** @type {import('next').NextConfig} */
const nextConfig = {
  serverExternalPackages: ['sharp', 'onnxruntime-node'],
}

module.exports = nextConfig

datduyng avatar Jun 11 '25 00:06 datduyng

Anyone else still experiencing this issue?

I've added the serverExternalPackages to the config but no luck ;(

Zefty avatar Sep 13 '25 07:09 Zefty

Did some debugging and tried different versions of the transformers package. If we use @xenova/[email protected] everything seems to work fine, but as soon as we try @huggingface/[email protected] it breaks and I'm getting those errors.

I noticed that in @huggingface/[email protected] the "onnxruntime-node": "1.18.0" is now part of dependencies and not an optional dependency as it was in @xenova/[email protected]. Could this be the problem we are having?

@xenova any ideas what changed between v2 and v3?

Zefty avatar Sep 13 '25 10:09 Zefty

Had the same issue with @huggingface/[email protected] and [email protected]. After a record-breaking 36 failed deployments it's ~~solved~~ worked around.

It seems that there's some big file generated in @huggingface/transformers/node_modules/onnxruntime-node/bin/napi-v3/linux/x64/ causing the "A Serverless Function has exceeded..." error.

By excluding things one by one, it looks like there is a huge *.so file in there, but no idea what it is exactly. I excluded all files except libonnxruntime.so.1 and onnxruntime_binding.node from that dir with outputFileTracingExcludes in next.config.js:

/** @type {import('next').NextConfig} */
const nextConfig = {
  // see https://vercel.com/guides/troubleshooting-function-250mb-limit
  // and https://nextjs.org/docs/app/api-reference/config/next-config-js/output#caveats
  outputFileTracingExcludes: {
      '/': [
        'node_modules/@huggingface/transformers/node_modules/onnxruntime-node/bin/napi-v3/linux/x64/!(libonnxruntime.so.1|onnxruntime_binding.node)',
      ],
  },
};

module.exports = nextConfig;

After that, creating the cache dir failed on Vercel, which is solved with:

import { pipeline, env } from "@huggingface/transformers";

env.cacheDir = "/tmp/.cache"

My fork of next-server-transformers.js with the working version and 3 hours of painful commit history is here: https://github.com/geronimi73/next-server-transformers.js

geronimi73 avatar Sep 17 '25 07:09 geronimi73

Had the same issue with @huggingface/[email protected] and [email protected]

One more - probably crucial - thing to add since my answer appears to be confusing everyone. This is the error I got when deploying on vercel:

Traced Next.js server files in: 81.685ms
Warning: Max serverless function size of 250 MB uncompressed reached
Serverless Function's page: classify.js
Large Dependencies                             Uncompressed size
node_modules/@huggingface/transformers                 730.57 MB
node_modules/@img/sharp-libvips-linuxmusl-x64           15.75 MB
node_modules/@img/sharp-libvips-linux-x64               15.48 MB
node_modules/next/dist                                   4.22 MB
node_modules/react-dom/cjs                             544.24 KB
All dependencies                                       768.12 MB

node_modules/@huggingface/transformers was too large in my case, which is why I started poking around in that dir.

geronimi73 avatar Sep 18 '25 04:09 geronimi73

I got some time to try your hacks @geronimi73. No idea how you figured out what you did lol.

I also managed to get it working and deployed, even with the latest version of the library, @huggingface/[email protected]. One thing I also had to do was make sure I had the latest version "onnxruntime-node": "^1.22.0-rev".
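For reference, one way to pin the transitive onnxruntime-node version from your own package.json (an assumption on my part; the exact mechanism wasn't stated above) is an npm overrides entry:

```json
{
  "overrides": {
    "onnxruntime-node": "^1.22.0-rev"
  }
}
```

Re-run npm install afterwards so the override is applied to the lockfile.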

Zefty avatar Sep 18 '25 07:09 Zefty

@Zefty so did you get it to work only with the hacks from @geronimi73? or was the latest version "just" working? i also tried a couple of things when i was running into this problem, but couldn't figure it out.

i think the best path forward would be to have a node and a browser version, so that people can decide on their use case what they want to use. i do understand that it makes it harder if you just want to install one thing, but this comes at the cost of problems like this.

TimPietrusky avatar Sep 22 '25 13:09 TimPietrusky

from what i understand: there are native onnxruntime-node binaries for multiple operating systems bundled with transformers.js, which makes it impossible to run this on platforms that have a size limitation.

TimPietrusky avatar Sep 22 '25 13:09 TimPietrusky

@Zefty so did you get it to work only with the hacks from @geronimi73? or was the latest version "just" working? i also tried a couple of things when i was running into this problem, but couldn't figure it out.

i think the best path forward would be to have a node and a browser version, so that people can decide on their use case what they want to use. i do understand that it makes it harder if you just want to install one thing, but this comes at the cost of problems like this.

I had to do everything that @geronimi73 did to get things working

Zefty avatar Sep 22 '25 20:09 Zefty