
[FEAT] Model is always loaded in VRAM

Open ecker00 opened this issue 11 months ago • 5 comments

Is this a new feature request?

  • [x] I have searched the existing issues

Wanted change

Save GPU VRAM when the model is not in use. VRAM is a valuable resource, and it should be possible to configure a keep_alive value. For example, Ollama configures it like this:

  • keep_alive=-1 keeps model in memory indefinitely
  • keep_alive=0 unloads model after each use
  • keep_alive=60 keeps the model in memory for 1 minute after use

This could be an environment variable, defaulting to -1 so it is not a breaking change for anyone; a rough sketch of the idea is below.
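As an illustration, here is a minimal Python sketch of such a wrapper, assuming the container's serving code owns the model object. The `KEEP_ALIVE` variable name and the `LazyModel` class are hypothetical, not part of the image; only the `faster_whisper.WhisperModel` calls are the library's actual API, and freeing VRAM by dropping the reference relies on CTranslate2 releasing device memory when the model object is garbage-collected.

```python
import os
import threading
import time

from faster_whisper import WhisperModel

# Hypothetical env var, mirroring Ollama's keep_alive semantics:
# -1 keeps the model loaded forever, 0 unloads after each use,
# N > 0 unloads after N idle seconds.
KEEP_ALIVE = int(os.environ.get("KEEP_ALIVE", "-1"))


class LazyModel:
    """Loads the model on first use; unloads it after KEEP_ALIVE idle seconds."""

    def __init__(self, model_name: str):
        self._model_name = model_name
        self._model = None
        self._last_used = 0.0
        self._lock = threading.Lock()
        if KEEP_ALIVE > 0:
            threading.Thread(target=self._reaper, daemon=True).start()

    def transcribe(self, audio_path: str):
        with self._lock:
            if self._model is None:
                # First request after an idle unload pays the load latency.
                self._model = WhisperModel(
                    self._model_name, device="cuda", compute_type="float16"
                )
            self._last_used = time.monotonic()
            segments, info = self._model.transcribe(audio_path)
            # transcribe() returns a lazy generator, so consume it while
            # the model is still loaded.
            result = list(segments)
            if KEEP_ALIVE == 0:
                self._model = None  # unload immediately after each use
            return result, info

    def _reaper(self):
        # Background thread that drops the model after KEEP_ALIVE idle
        # seconds; the VRAM is freed once the object is garbage-collected.
        while True:
            time.sleep(5)
            with self._lock:
                idle = time.monotonic() - self._last_used
                if self._model is not None and idle > KEEP_ALIVE:
                    self._model = None
```

With keep_alive=60, requests that arrive within a minute of each other reuse the loaded model, and only the first request after a quiet period pays the reload cost.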

Reason for change

Right now, the model is loaded into memory as soon as the container starts and stays there even when it is not in use.

Proposed code change

No response

ecker00 avatar Jan 22 '25 19:01 ecker00

Thanks for opening your first issue here! Be sure to follow the relevant issue templates, or risk having this issue marked as invalid.

github-actions[bot] avatar Jan 22 '25 19:01 github-actions[bot]

This issue has been automatically marked as stale because it has not had recent activity. This might be due to missing feedback from OP. It will be closed if no further activity occurs. Thank you for your contributions.

LinuxServer-CI avatar Feb 22 '25 10:02 LinuxServer-CI

@ecker00 What I do for my infra is to stop the container to free the VRAM; a service like sablier can help you do that automatically.

mg-dev25 avatar Feb 27 '25 16:02 mg-dev25

This is my Home Assistant voice setup, so I kind of need it available at all times, but I don't mind waiting a few seconds for the model to load on the first wake-up after it has been inactive.

ecker00 avatar Feb 27 '25 21:02 ecker00

This issue has been automatically marked as stale because it has not had recent activity. This might be due to missing feedback from OP. It will be closed if no further activity occurs. Thank you for your contributions.

LinuxServer-CI avatar Mar 31 '25 10:03 LinuxServer-CI

This issue is locked due to inactivity

LinuxServer-CI avatar Jul 06 '25 11:07 LinuxServer-CI