Automatically sync model folder
Is your feature request related to a problem? Please describe. When creating federated networks, the nodes currently need to have the same models installed, or rely on the fact that LocalAI automatically installs models available in the gallery on the first request
Describe the solution you'd like A way to sync the model folders between federated LocalAI instances
Describe alternatives you've considered N/A
Additional context
Adding myself to this issue to watch it
One idea that comes to mind: we should generate a gallery file on the host and automatically share that out to each worker.
That way, we can leverage existing downloader support, and present it as a "suggested models" prompt when connecting a worker. If the worker is non-interactive, we could have flags or commands to download one or all of the models in the gallery.
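For illustration, here's a rough sketch of what the host-side piece could look like: walk the models folder, emit a gallery-style index, and serve it plus the raw files over HTTP so workers can point a downloader at it. The index format, paths, and port below are all hypothetical, not LocalAI's actual gallery schema:

```go
// Hypothetical sketch: walk the host's models folder and expose a
// gallery-style index plus the raw files over HTTP, so workers can
// point their existing downloader at it. The index format below is
// illustrative, not LocalAI's actual gallery schema.
package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
	"path/filepath"
	"strings"
)

// buildIndex lists every .yaml model config found under dir.
func buildIndex(dir string) (string, error) {
	var b strings.Builder
	err := filepath.Walk(dir, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		if !info.IsDir() && strings.HasSuffix(path, ".yaml") {
			rel, _ := filepath.Rel(dir, path)
			fmt.Fprintf(&b, "- name: %s\n  url: /models/%s\n",
				strings.TrimSuffix(rel, ".yaml"), rel)
		}
		return nil
	})
	return b.String(), err
}

func main() {
	modelsDir := "./models" // hypothetical host model folder

	http.HandleFunc("/gallery.yaml", func(w http.ResponseWriter, r *http.Request) {
		idx, err := buildIndex(modelsDir)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		fmt.Fprint(w, idx)
	})
	http.Handle("/models/", http.StripPrefix("/models/",
		http.FileServer(http.Dir(modelsDir))))
	log.Fatal(http.ListenAndServe(":9090", nil))
}
```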
I'd like to advocate for a single/elected background downloader service (if one isn't already present; from the startup behavior I've seen so far, it doesn't seem to be).
Separating the downloading part from LocalAI would (see the sketch after this list):
- Decouple startup from downloading models
- Allow you to run one instance/service for downloads, and as many inference nodes as needed.
- Prevent LocalAI from failing if the downloader crashes, runs out of disk space, etc. [I don't know if this happens right now, but probably not]
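As a concrete example of the decoupling, here's a minimal sketch of a standalone downloader, assuming a shared models directory that inference nodes read from. The URL and paths are placeholders; the one design point it demonstrates is downloading to a temporary name and renaming atomically, so inference nodes never observe a partially written file:

```go
// Minimal sketch of a standalone downloader, assuming a shared models
// directory. The URL and paths are hypothetical. Files are written
// under a temporary name and renamed on completion, so readers see
// either the whole file or nothing.
package main

import (
	"io"
	"log"
	"net/http"
	"os"
	"path/filepath"
)

func download(url, dest string) error {
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	tmp := dest + ".partial"
	f, err := os.Create(tmp)
	if err != nil {
		return err
	}
	if _, err := io.Copy(f, resp.Body); err != nil {
		f.Close()
		os.Remove(tmp)
		return err
	}
	if err := f.Close(); err != nil {
		return err
	}
	// Atomic within the same filesystem.
	return os.Rename(tmp, dest)
}

func main() {
	shared := "/mnt/models" // hypothetical shared folder
	// Hypothetical work item; in practice this would come from a queue or API.
	if err := download("https://example.com/model.gguf",
		filepath.Join(shared, "model.gguf")); err != nil {
		log.Fatal(err)
	}
}
```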
For completeness, here's some thinking on the alternatives part of the bug report / feature request:
Describe alternatives you've considered
In order of complexity:
Local file copies, network proxies for downloads
- Don't worry about the actual filesystem part; just support caching network proxies (like Squid) for the downloads. Results in multiple file copies and increased storage
- Simple and reliable, and might already be supported: e.g., set up the network proxy, and then maybe configure each node's http_proxy, https_proxy, etc. env vars?
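Since LocalAI is written in Go, this may indeed already work: Go's default HTTP transport resolves proxies via http.ProxyFromEnvironment, which reads the standard HTTP_PROXY/HTTPS_PROXY/NO_PROXY variables. A small demonstration (assuming, and this is an assumption, that the downloader uses the default transport):

```go
// Shows that Go's default HTTP transport honors the standard proxy
// environment variables, which is what a Squid-style caching setup
// relies on. Whether LocalAI's downloader uses the default transport
// is an assumption here; the model URL is a placeholder.
package main

import (
	"fmt"
	"net/http"
)

func main() {
	// Build the request we would use for a model download.
	req, err := http.NewRequest("GET", "https://huggingface.co/some/model.gguf", nil)
	if err != nil {
		panic(err)
	}

	// http.ProxyFromEnvironment is what the default transport consults,
	// so HTTPS_PROXY=http://squid:3128 would route this download through
	// the caching proxy without any code changes.
	proxyURL, err := http.ProxyFromEnvironment(req)
	if err != nil {
		panic(err)
	}
	if proxyURL == nil {
		fmt.Println("no proxy configured; set HTTPS_PROXY, e.g. http://squid:3128")
		return
	}
	fmt.Println("downloads would go through:", proxyURL)
}
```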
Third-party sync tools (like syncthing)
- Might be relatively simple to get going with
- Users can configure for themselves
- Could possibly be integrated with LocalAI as a background service with managed config files, etc. Initial node connection might need side-channel comms (a cluster/node control protocol) to authorize/connect anyway
- Probably wouldn't handle large files well
- Probably not fast
- Unsure if it even scales to model files
- Still probably requires distributed file locking or a single/elected downloader.
Network filesystem with some form of distributed locking
- Simpler network filesystems, as in CIFS/NFS/S3, etc.
- Distributed locking via a custom protocol, or something like PostgreSQL locks/etcd/etc. (a PostgreSQL sketch follows this list)
- NOTE: A separate downloader service would still work in this scenario
- Likely much trickier and more duplicate work / time-wasting than it sounds.
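To make the locking idea concrete, here's a sketch of the PostgreSQL variant, assuming every node can reach a shared database; the connection string and model name are placeholders. Note that advisory locks are session-scoped, so the code pins a single pooled connection for the lock's lifetime:

```go
// Sketch of coordinating "who downloads model X" with PostgreSQL
// advisory locks, assuming all nodes share one database. The DSN and
// model name are placeholders. Advisory locks are session-scoped, so
// we pin one pooled connection for the lock's lifetime.
package main

import (
	"context"
	"database/sql"
	"fmt"
	"hash/fnv"
	"log"

	_ "github.com/lib/pq" // Postgres driver
)

// lockKey maps a model name onto the int64 key advisory locks expect.
func lockKey(model string) int64 {
	h := fnv.New64a()
	h.Write([]byte(model))
	return int64(h.Sum64())
}

func main() {
	ctx := context.Background()
	db, err := sql.Open("postgres", "postgres://localai@db/localai?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	conn, err := db.Conn(ctx) // pin one connection: locks live per session
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	model := "mistral-7b.gguf" // hypothetical model file
	key := lockKey(model)

	// pg_try_advisory_lock returns immediately: true means this node
	// should download; false means another node already holds the lock.
	var got bool
	if err := conn.QueryRowContext(ctx,
		"SELECT pg_try_advisory_lock($1)", key).Scan(&got); err != nil {
		log.Fatal(err)
	}
	if !got {
		fmt.Println("another node is downloading", model)
		return
	}
	defer conn.ExecContext(ctx, "SELECT pg_advisory_unlock($1)", key)

	fmt.Println("lock acquired; download", model, "here")
}
```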
Proper network filesystems, k8s, etc.
- Filesystems/block storage like Longhorn can replicate/mirror without intervention.
- Shared write-many filesystems like CephFS and GlusterFS could help here
- Still probably need distributed locking for downloading models that span multiple weight files (or even just the weights + metadata files), so again, likely much trickier and more duplicate work / time-wasting than it sounds.
- NOTE: A separate downloader service would still work in this scenario
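In the Kubernetes case, the "single/elected" part could lean on client-go's leader election rather than hand-rolled locking: the elected pod runs the downloader, the rest just serve inference. A sketch, with the namespace, lease name, and timings as placeholder values:

```go
// Sketch: elect one pod as the downloader using a Kubernetes Lease,
// via client-go's leaderelection helper. Namespace, lease name, and
// the downloader body are hypothetical.
package main

import (
	"context"
	"log"
	"os"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	lock := &resourcelock.LeaseLock{
		LeaseMeta:  metav1.ObjectMeta{Name: "localai-downloader", Namespace: "localai"},
		Client:     client.CoordinationV1(),
		LockConfig: resourcelock.ResourceLockConfig{Identity: os.Getenv("POD_NAME")},
	}

	leaderelection.RunOrDie(context.Background(), leaderelection.LeaderElectionConfig{
		Lock:            lock,
		ReleaseOnCancel: true,
		LeaseDuration:   15 * time.Second,
		RenewDeadline:   10 * time.Second,
		RetryPeriod:     2 * time.Second,
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: func(ctx context.Context) {
				log.Println("elected: this pod runs the downloader")
				// runDownloader(ctx) // hypothetical downloader loop
				<-ctx.Done()
			},
			OnStoppedLeading: func() {
				log.Println("lost leadership; stop downloading")
			},
		},
	})
}
```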
> I'd like to advocate for a single/elected background downloader service (if one isn't already present; from the startup behavior I've seen so far, it doesn't seem to be).
There is one already - it's the one in charge of downloading/applying models at runtime via the API endpoint.
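For reference, that runtime endpoint is the gallery apply API (POST /models/apply). A minimal client call, assuming a node on localhost:8080 and a placeholder gallery model id (payload fields may differ between versions):

```go
// Minimal client call to the runtime model-apply endpoint referred to
// above (POST /models/apply in LocalAI's gallery API). Host, port, and
// model id are placeholders; payload fields may vary by version.
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

func main() {
	body := bytes.NewBufferString(`{"id": "model-gallery@bert-embeddings"}`)
	resp, err := http.Post("http://localhost:8080/models/apply",
		"application/json", body)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(resp.Status, string(out))
}
```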
> One idea that comes to mind: we should generate a gallery file on the host and automatically share that out to each worker.
Good point, we generate these files already (they are hidden, prefixed with a `.`) - however, we would need to make sure to support hot-reload of the configurations.
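For the hot-reload part, one common approach in Go is a filesystem watcher such as fsnotify (github.com/fsnotify/fsnotify); the sketch below watches the models folder and reacts when one of the hidden, dot-prefixed gallery files changes. The directory path and the reload hook are hypothetical:

```go
// Sketch of hot-reloading configuration when the hidden gallery files
// change, using fsnotify. The watched path and the reload hook are
// hypothetical, not LocalAI's actual wiring.
package main

import (
	"log"
	"path/filepath"
	"strings"

	"github.com/fsnotify/fsnotify"
)

func main() {
	watcher, err := fsnotify.NewWatcher()
	if err != nil {
		log.Fatal(err)
	}
	defer watcher.Close()

	if err := watcher.Add("./models"); err != nil {
		log.Fatal(err)
	}

	for {
		select {
		case ev, ok := <-watcher.Events:
			if !ok {
				return
			}
			// Only react to writes/creates of the hidden (dot-prefixed) files.
			name := filepath.Base(ev.Name)
			if strings.HasPrefix(name, ".") &&
				(ev.Op&fsnotify.Write != 0 || ev.Op&fsnotify.Create != 0) {
				log.Println("gallery file changed, reloading:", ev.Name)
				// reloadConfig(ev.Name) // hypothetical hook into LocalAI
			}
		case err, ok := <-watcher.Errors:
			if !ok {
				return
			}
			log.Println("watch error:", err)
		}
	}
}
```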