NodeJS Local Files Only - Headers Not Defined & Incorrect Path Splitters
System Info
Windows 10 - 10.0.19045 Build 19045
Alienware m17 R3
CPU - Intel i7-10750H
Node version:
v16.14.2
main.mjs:
import { pipeline } from '@xenova/transformers';
let pipe = await pipeline('feature-extraction','gte-small',{local_files_only:true});
let out = await pipe('Hey model! Respond to me!');
Package.json:
{
"name": "js-hf",
"version": "1.0.0",
"description": "",
"main": "main.mjs",
"scripts": {
"test": "echo \"Error: no test specified\" && exit 1"
},
"author": "",
"license": "ISC",
"dependencies": {
"@xenova/transformers": "^2.14.0"
}
}
Clone this repository directly into root of project:
https://huggingface.co/Supabase/gte-small
Your final project layout will look like this:
--${YOUR_PROJ_NAME}
----- gte-small
----- node_modules
----- main.mjs
----- package-lock.json
----- package.json
Environment/Platform
- [ ] Website/web-app
- [ ] Browser extension
- [X] Server-side (e.g., Node.js, Deno, Bun)
- [ ] Desktop app (e.g., Electron)
- [ ] Other (e.g., VSCode extension)
Description
When importing models locally, HTTP requests still appear to be fired off. The expected behavior is that when local_files_only is true, only local files are used.
Secondly, the paths used to load assets look incorrect on a Windows computer: / is used instead of \ for transformer assets. It also doesn't seem to respect relative vs. absolute paths... perhaps that needs to be changed as well?
Error output:
Axrati@DESKTOP-H8KG7FT MINGW64 ~/Desktop/Code/js-hf
$ node main.mjs
Unable to load from local path "C:\Users\Axrati\Desktop\Code\js-hf\node_modules\@xenova\transformers\models\/gte-small/tokenizer.json": "ReferenceError: Headers is not defined"
Unable to load from local path "C:\Users\Axrati\Desktop\Code\js-hf\node_modules\@xenova\transformers\models\/gte-small/tokenizer_config.json": "ReferenceError: Headers is not defined"
Unable to load from local path "C:\Users\Axrati\Desktop\Code\js-hf\node_modules\@xenova\transformers\models\/gte-small/config.json": "ReferenceError: Headers is not defined"
file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/utils/hub.js:462
throw Error(`\`local_files_only=true\` or \`env.allowRemoteModels=false\` and file was not found locally at "${localPath}".`);
^
Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "C:\Users\Axrati\Desktop\Code\js-hf\node_modules\@xenova\transformers\models\/gte-small/tokenizer.j
at getModelFile (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/utils/hub.js:462:27)
at async getModelJSON (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/utils/hub.js:575:18)
at async Promise.all (index 0)
at async loadTokenizer (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/tokenizers.js:61:18)
at async Function.from_pretrained (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/tokenizers.js:4296:50)
at async Promise.all (index 0)
at async loadItems (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/pipelines.js:3115:5)
at async pipeline (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/pipelines.js:3055:21)
at async file:///C:/Users/Axrati/Desktop/Code/js-hf/main.mjs:5:12
As you can see, the "Unable to load from local path" message references node_modules\@xenova\transformers\models\/gte-small/tokenizer.json, which mixes \ and / and wouldn't be a valid Windows path... That said, it also doesn't appear to respect the relative path (as the System Info section shows, the model is a directory in the root of the project, but this is searching inside the library in node_modules).
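To illustrate how a mixed-separator path like the one in the error can arise, here is a minimal sketch. It assumes the library joins path segments with `/` and only trims `/` at segment boundaries; `joinWithSlash` is a hypothetical helper written for this illustration, not the actual code in hub.js.

```javascript
// Hypothetical helper: joins segments with '/', trimming only '/' at the seams.
// A trailing '\' (as produced by Windows path APIs) is left untouched.
function joinWithSlash(...parts) {
  const cleaned = parts.map((part, index) => {
    if (index > 0) part = part.replace(/^\//, '');                // drop one leading '/'
    if (index < parts.length - 1) part = part.replace(/\/$/, ''); // drop one trailing '/'
    return part;
  });
  return cleaned.join('/');
}

// A base directory ending in '\' keeps its backslash, so the result mixes separators.
const windowsStyle = joinWithSlash('C:\\proj\\models\\', 'gte-small', 'tokenizer.json');
console.log(windowsStyle); // C:\proj\models\/gte-small/tokenizer.json

// A base directory ending in '/' joins cleanly.
const posixStyle = joinWithSlash('C:/proj/models/', 'gte-small', 'tokenizer.json');
console.log(posixStyle); // C:/proj/models/gte-small/tokenizer.json
```

The mixed separators may not be the only factor here: in the error output the base directory also points inside node_modules rather than at the project root.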
If you look at the code in https://github.com/xenova/transformers.js/blob/main/src/utils/hub.js, you can see on lines 55-56 that the constructor for FileResponse instantiates Headers. This leads me to believe that even when the first two criteria in getFile are met (env.useFS && !isValidHttpUrl(urlOrPath)), it's still executing code that is unnecessary for the protocol it's trying to use.
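A quick way to check whether the runtime itself is the source of the "Headers is not defined" error is to look for the fetch-related globals. Node.js exposes `fetch`, `Headers`, `Request` and `Response` as globals by default only from v18 onward, so on v16.14.2 the check below would be expected to report them as missing. `missingFetchGlobals` is a hypothetical helper name used only for this sketch.

```javascript
// Hypothetical helper: lists fetch-related globals the current runtime lacks.
function missingFetchGlobals() {
  const required = ['fetch', 'Headers', 'Request', 'Response'];
  return required.filter((name) => typeof globalThis[name] === 'undefined');
}

const missing = missingFetchGlobals();
if (missing.length > 0) {
  // Any code path that calls `new Headers()` will throw a ReferenceError here.
  console.warn(`Missing globals on Node ${process.version}: ${missing.join(', ')}`);
} else {
  console.log('All fetch-related globals are available.');
}
```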
I am happy to help create a PR for this! Please reach out and let me know. Would be helpful to catch up with someone on the team for repo direction/etc.
Reproduction
Based on the steps in System Info / Description:
npm install
node main.mjs
Node version: v16.14.2
Hi there 👋 Transformers.js requires Node.js v18+ to function correctly. Since Node 16 has reached EOL, we will not be adding support for it in future. See here for more information.
Hello! - Thanks for the quick response! :)
I upgraded to Node v20.11.0 and reinstalled node_modules, but I am still seeing the path error. When I pass an absolute path, it still seems to be appended to the library's own models directory. Happy to help on this if you'd like!
$ node main.mjs
file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/utils/hub.js:462
throw Error(`\`local_files_only=true\` or \`env.allowRemoteModels=false\` and file was not found locally at "${localPath}".`);
^
Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "C:\Users\Axrati\Desktop\Code\js-hf\node_modules\@xenova\transformers\models\/gte-small/tokenizer.json".
at getModelFile (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/utils/hub.js:462:27)
at async getModelJSON (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/utils/hub.js:575:18)
at async Promise.all (index 0)
at async loadTokenizer (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/tokenizers.js:61:18)
at async AutoTokenizer.from_pretrained (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/tokenizers.js:4296:50)
at async Promise.all (index 0)
at async loadItems (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/pipelines.js:3115:5)
at async pipeline (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/pipelines.js:3055:21)
at async file:///C:/Users/Axrati/Desktop/Code/js-hf/main.mjs:4:12
Node.js v20.11.0
I was able to fix this with the following code:
import { pipeline, env } from '@xenova/transformers';
env.localModelPath = './';
let pipe = await pipeline('feature-extraction','gte-small',{local_files_only:true});
I think changing the semantics of this may benefit more diverse projects. I am building an Electron app that lets users point to and use models wherever they may be on their computer.
If I start setting env.localModelPath, any time I reference an arbitrary model on their computer it needs to be relative to the Electron app's path (or whatever I set there). I would much rather be able to provide both relative and absolute paths. Perhaps a config option for that (relative vs. absolute) alongside local_files_only, with a default?
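Until something like that exists, one possible workaround is to split an arbitrary model directory into a parent directory (for env.localModelPath) and a folder name (for the model id). The sketch below is an untested assumption about how that could be wired up; `splitModelDir` is a hypothetical helper, not part of the library.

```javascript
// Hypothetical helper: splits a model directory into its parent directory
// and folder name, accepting both '/' and '\' as separators.
function splitModelDir(modelDir) {
  const trimmed = modelDir.replace(/[\\/]+$/, ''); // drop trailing separators
  const cut = Math.max(trimmed.lastIndexOf('/'), trimmed.lastIndexOf('\\'));
  if (cut < 0) {
    // Bare folder name: treat it as relative to the current directory.
    return { localModelPath: './', modelId: trimmed };
  }
  return {
    localModelPath: trimmed.slice(0, cut + 1), // parent directory, separator kept
    modelId: trimmed.slice(cut + 1),           // final folder name
  };
}

const { localModelPath, modelId } = splitModelDir('C:\\Users\\me\\models\\gte-small');
console.log(localModelPath); // C:\Users\me\models\
console.log(modelId);        // gte-small

// Assumed usage with the library (not verified here):
//   env.localModelPath = localModelPath;
//   const pipe = await pipeline('feature-extraction', modelId, { local_files_only: true });
```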
Happy to make the changes myself, please let me know your thoughts!
@xenova - bump! Let me know and I will start a PR for this
I have the same issue. My model is there, but the library cannot find it at that path:
env.localModelPath = new URL('../../../../../', import.meta.url).pathname;
This does not work either:
env.localModelPath = '../../../../../';
Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "/C:/Users/hiepx/small-cosmos/colbert-ir/colbertv2.0/tokenizer.json".
at getModelFile (file:///C:/Users/hiepx/small-cosmos/losa/node_modules/.pnpm/@[email protected]/node_modules/@xenova/transformers/src/utils/hub.js:462:27)
But the file is actually there. @axrati I think you should create a PR for this.
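The leading `/` in the reported path ("/C:/Users/...") is what `URL.pathname` returns for a file URL on a Windows drive, so the string is a URL path rather than a filesystem path. The sketch below reproduces that with a made-up module location; Node's `fileURLToPath` from `node:url` is the usual way to convert a file URL into a native path, though whether that alone resolves the lookup here is an assumption.

```javascript
// Hypothetical module location on a Windows drive, standing in for import.meta.url.
const moduleUrl = 'file:///C:/Users/me/project/src/main.mjs';

// Resolve the parent directory relative to the module, as in the snippet above.
const parentDir = new URL('../', moduleUrl);

// .pathname is a URL path: it keeps the leading '/' before the drive letter.
const rawPathname = parentDir.pathname;
console.log(rawPathname); // /C:/Users/me/project/

// In an ES module on Windows, the conversion would typically be (untested here):
//   import { fileURLToPath } from 'node:url';
//   env.localModelPath = fileURLToPath(new URL('../', import.meta.url));
```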
@hiepxanh Can you provide sample code and your directory structure so I can use it as a test case alongside mine?
@xenova @hiepxanh - I have found a quick patch, but I am unable to push a branch:
$ git push origin explicit-path
remote: Permission to xenova/transformers.js.git denied to axrati.
fatal: unable to access 'https://github.com/xenova/transformers.js.git/': The requested URL returned error: 403
@xenova - the change here isn't major, and it doesn't add full vs. relative path support. It's an issue with how localPath & requestURL are derived. Please let me open a branch to submit a PR!
@xenova Bump!
Hi there 👋 Feel free to fork the repository, then submit a pull request. In that way, I can review your changes.
Same issue for me!
@xenova , @hiepxanh, @lsb , @TrumanDu
Sorry for the delay on this, have been working on other projects. Opened a forked PR here for review!: https://github.com/xenova/transformers.js/pull/602
@axrati thanks for your contribution, we really need your PR.
@TrumanDu @hiepxanh @lsb @xenova
No problem TrumanDu :) ... xenova - can you please check the PR? Small but effective change! https://github.com/xenova/transformers.js/pull/602
@xenova - reminder for this PR! if you can approve the checks to run it'd help ~
same issue, any update now?
@xenova @wujohns
I have a PR open to address this! Need it to be checked:
https://github.com/xenova/transformers.js/pull/602