
There is no way to stop / abort / cancel a pipeline()

Open sroussey opened this issue 11 months ago • 6 comments

Feature request

Provide a way to abort a long running operation.

Motivation

Many things can take a long time (downloading a multi-gig model on 3G) or are not operating normally (infinite output) and the user may change their mind.

Your contribution

I can look into how to use AbortSignals for the download issue. Not sure what ONNX provides.

https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal

Any API suggestion?

sroussey avatar Feb 04 '25 21:02 sroussey

Good idea! I'm also open to API suggestions since this functionality isn't available in the python library. Following the examples in the docs, perhaps passing a signal parameter to calls to from_pretrained or pipeline could work. Similarly for calls to generate.
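
For illustration, a rough sketch of what that could look like (a hypothetical signal option; none of this exists in the library today):

import { pipeline } from "@huggingface/transformers";

const controller = new AbortController();

// Hypothetical: abort the model download / initialization.
const translator = await pipeline("translation", "Xenova/opus-mt-en-de", {
  signal: controller.signal,
});

// Hypothetical: abort a long-running call the same way.
const output = await translator("Hello world", { signal: controller.signal });

// Elsewhere, e.g. wired to a cancel button:
controller.abort();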

xenova avatar Feb 08 '25 11:02 xenova

What is currently blocking this feature?

natanfudge avatar Jun 06 '25 15:06 natanfudge

No idea, but you can test my version by using @sroussey/transformers instead.

Differences in fork (README version)

Differences in fork (code version)

sroussey avatar Jun 06 '25 22:06 sroussey

@sroussey Thanks, that will do. Hopefully your version will be adopted soon!

natanfudge avatar Jun 06 '25 22:06 natanfudge

@sroussey I see there's a way to abort loading a model, but is there a way to abort text generation? I don't see something like that for GenerationConfig.

natanfudge avatar Jun 07 '25 10:06 natanfudge

Is there a new status on adding support for aborting a pipeline generation?

supermoos avatar Jul 23 '25 12:07 supermoos

Hi @xenova! Will the model download abort be added soon to transformers.js?

jdp8 avatar Nov 26 '25 20:11 jdp8

@xenova If there is interest, I can rebase my work.

@natanfudge and @jdp8 -- there is a way to abort generation, since there is a break between words. There is an example project on Hugging Face, but the API is not intuitive. I was thinking of adding an abort controller to that as well, to have a unified JS way of doing this across operations. But not everything in transformers.js is text generation, and I haven't looked at the other tasks.
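
A sketch of that idea: the interruptible stopping criteria used in the example projects (class and option names from memory, so double-check them), plus the AbortController bridge, which is the part that would be new:

import {
  AutoTokenizer,
  AutoModelForCausalLM,
  InterruptableStoppingCriteria,
} from "@huggingface/transformers";

// Any causal LM with ONNX weights works here; this model id is just an example.
const model_id = "onnx-community/Qwen2.5-0.5B-Instruct";
const tokenizer = await AutoTokenizer.from_pretrained(model_id);
const model = await AutoModelForCausalLM.from_pretrained(model_id);

// Existing mechanism: a stopping criterion checked between generated tokens.
const stopping_criteria = new InterruptableStoppingCriteria();

// Proposed unification: drive it from a standard AbortSignal.
const controller = new AbortController();
controller.signal.addEventListener("abort", () => stopping_criteria.interrupt());

const inputs = await tokenizer("Write a haiku about the sea.");
const output = await model.generate({
  ...inputs,
  max_new_tokens: 128,
  stopping_criteria,
});

// Somewhere in the UI: controller.abort() stops generation at the next token.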

I had a moment where I was rewriting the whole lib in TypeScript, but I got sidetracked. If I ever do it, I will know the library well enough to do more.

sroussey avatar Nov 27 '25 01:11 sroussey

@sroussey sorry for the late response. I know the issue is more focused on cancelling pipeline generation, but in my case I'm more interested in cancelling the model download. I would appreciate it if your work on model download cancellation (with AbortSignals) could be submitted as a PR, and hopefully @xenova or somebody else merges it into transformers.js. Thank you for your work!

jdp8 avatar Dec 02 '25 22:12 jdp8

I submitted one a long time ago. And the use case is the same for me with model downloads. I have published a fork on npm you can try. Check the README for changes. The ONNX version is different, for example.

My PR means adding the signal to the options object, and many API functions do not have an options object at all, so that means a large surface area on the API front.

I'll make another PR to v4 and we will see. @xenova has a different proposal for just allowing a custom fetch object.

That's not enough for me, since I have multiple downloads and want to let the user cancel one model (really a related set of URLs), but it could work for you if you just want to cancel everything (almost everything: it would not cover cache requests or streams for files).

sroussey avatar Dec 02 '25 22:12 sroussey

@jdp8 Forgot to @ you so you may have missed the above.

sroussey avatar Dec 03 '25 01:12 sroussey

@sroussey thank you so much! I'll look into your fork

jdp8 avatar Dec 03 '25 11:12 jdp8

@sroussey In my opinion, a custom fetch would be the simpler solution. In your PR, the same signal has to be passed through many places. With a custom fetch, you would only have to make a fairly small change and would have almost the same advantages.

With your solution:

import { pipeline } from "@huggingface/transformers";

const controllerOne = new AbortController();
const controllerTwo = new AbortController();

const segmenter = await pipeline("background-removal", "Xenova/modnet", {
  device: "webgpu",
  signal: controllerOne.signal,
});

const translator = await pipeline("translation", "Xenova/opus-mt-en-de", {
  device: "webgpu",
  signal: controllerTwo.signal,
});

function abortSegmenter() {
  // this will only abort the segmenter download, the translator will not be affected
  controllerOne.abort();
}

@xenova's approach:

import { env, pipeline } from "@huggingface/transformers";

const controller = new AbortController();

env.fetch = (url, options) =>
  fetch(url, { ...options, signal: controller.signal });

const segmenter = await pipeline("background-removal", "Xenova/modnet", {
  device: "webgpu",
});

const translator = await pipeline("translation", "Xenova/opus-mt-en-de", {
  device: "webgpu",
});

function abort() {
  // this will abort all requests
  controller.abort();
}

I also lean towards @xenova's idea. It is easier to implement (the abort signal does not have to be passed through various layers) and it is more flexible because you are then completely free to implement the fetch mechanism yourself for other use cases.

@sroussey if I understood your comment correctly, having one abort controller for all requests is not enough in your use case, right? @jdp8 would that solve your problem?

nico-martin avatar Dec 04 '25 07:12 nico-martin

Correct. If I was just doing some server side stuff, aborting all would be fine. Likely I'd just terminate the process instead.

Even in the browser, if you have transformers.js in a worker, you can terminate that. So having a custom fetch is kinda meh. Nice to have though, as there may be uses beyond aborting.
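
For the worker route, a minimal sketch (worker.js is a hypothetical module that imports @huggingface/transformers and runs the pipeline):

const worker = new Worker(new URL("./worker.js", import.meta.url), { type: "module" });
worker.postMessage({ task: "translation", text: "Hello world" });

function abortEverything() {
  // Kills all in-flight downloads and inference in that worker.
  worker.terminate();
}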

But on the client side, making a nice UI, requirements change.

Your example is very server oriented.

I'll work on mine this weekend to show an example. I want a UI that lets people download models they might use, and if some turn out to be bigger than expected, stop those downloads and let the others continue.

Here is Liquid's app:

[image]

Your suggestion would mean changing the text to "cancel this download and all other downloads".

sroussey avatar Dec 05 '25 16:12 sroussey

Maybe I am thinking of this all wrong, and I should be using a custom cache and prefilling it. Not sure I will get to it today, but I should be able to do an implementation early in the week.
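
A rough sketch of the custom-cache idea (transformers.js has env.useCustomCache / env.customCache for plugging in an object with Cache-API-style match/put, if I remember correctly; the key format and the prefill wiring here are assumptions):

import { env, pipeline } from "@huggingface/transformers";

// In-memory store keyed by URL; prefill it with downloads we control ourselves.
const store = new Map();

env.useCustomCache = true;
env.customCache = {
  async match(key) {
    const url = typeof key === "string" ? key : key.url;
    const buffer = store.get(url);
    return buffer && new Response(buffer);
  },
  async put(key, response) {
    const url = typeof key === "string" ? key : key.url;
    store.set(url, await response.arrayBuffer());
  },
};

// Because the prefill fetches are ours, they can carry an AbortSignal per model.
const controller = new AbortController();
async function prefill(url) {
  const response = await fetch(url, { signal: controller.signal });
  store.set(url, await response.arrayBuffer());
}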

sroussey avatar Dec 07 '25 20:12 sroussey

Your view on the server vs client way of handling env variables is pretty spot on. Having global variables that overwrite the behaviour of every implementation is something very server-ish. In the client I would expect a more instance-based approach.

One problem we have in Transformers.js is that a lot of the onnxruntime code uses this env pattern, and we also want to stay very close to the transformers Python library, which of course focuses more on patterns that work well on the server side.

Maybe we should rethink some env variables and move to a more instance-based architecture. @xenova, what do you think?

nico-martin avatar Dec 07 '25 20:12 nico-martin

For reference, you can see an example here: https://workglow-web.netlify.app/. It lets the user cancel the pipeline, which cancels the download. Right now, models load when you try to run a pipeline, and the UI for that is messy (the progress is dominated by the download progress, not the progress of the pipeline), which is why I have this "Download" task.

Anyhow, being able to stop this workflow from continuing was my original reason for making the PR and the fork.

[image]

I'll have a more full-featured app by the end of next week that lets you drag and drop the transformers.js pipeline stuff around. I have not decided yet whether I should keep the "Download" task or pre-flight the run and ensure everything is downloaded in advance.

sroussey avatar Dec 08 '25 00:12 sroussey

@sroussey, before you invest too much time into this, I opened a new issue with a request for comments: https://github.com/huggingface/transformers.js/issues/1479

If we go forward with this, I would definitely suggest having fetch as one of the session options that you could pass per session, so that you could add the signal there.
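
Something like this, as a sketch (the per-pipeline fetch option is the proposal, not current API):

import { pipeline } from "@huggingface/transformers";

const controller = new AbortController();

// Hypothetical per-session option: this pipeline's downloads go through the
// supplied fetch, which carries its own AbortSignal.
const segmenter = await pipeline("background-removal", "Xenova/modnet", {
  device: "webgpu",
  fetch: (url, options) => fetch(url, { ...options, signal: controller.signal }),
});

// Aborts only this pipeline's requests; other pipelines are unaffected.
controller.abort();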

nico-martin avatar Dec 09 '25 11:12 nico-martin