Aaron Pham
`--adapter-id` allows you to serve and merge multiple LoRA weights into the base model. We decided internally that having a simplified fine-tune API is probably not the best UX for...
This is already supported: you can pass it in via `--adapter-id` during startup, and then specify the `adapter_name` per request.
But I haven't announced it yet, since I'm still testing it. I think the more important use case is for users to provide a remote adapter; what would that...
> Hey @aarnphm, are you asking for a way to load in an adapter per query for a given base model? For e.g., say you have facebook/opt-350m running and you're...
> Right, I gave that a try and it looks like that works.
>
> I guess what I was wondering is if it's possible to do the following:
> ...
The purpose of `adapter_name` per request here is that after you train a new LoRA layer, you can pass a remote URL to it, and then it will just load...
I don't think a local path would make sense, since there is no way for the server to resolve it or import it into memory.
You can provide the `adapter_name` via the Swagger UI, at the same level as the `prompt` key.
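For illustration, a request body as it might look in the Swagger UI; the adapter name here is a placeholder, the point is that `adapter_name` sits at the top level next to `prompt`:

```json
{
  "prompt": "What is the meaning of life?",
  "adapter_name": "my-lora-adapter"
}
```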
In terms of supporting paths, maybe we can add support for S3, but initially we will probably only support loading from the Hugging Face Hub.
Oh, I need to update how the LoRA layers are loaded. The endpoint `/v1/adapters` doesn't make sense because of a broadcasting issue with multiple runners. Can you try passing `adapter_name`...