Aaron Pham

Results 433 comments of Aaron Pham

`--adapter-id` allows you to serve and merge multiple LoRA weights into the base model. We decided internally that having a simplified fine-tune API is probably not the best UX for...

This is already supported: you can pass it in via `--adapter-id` during startup, and then specify the `adapter_name` per request.

But I haven't announced it yet, since I'm still testing with this. I think the more important use case is for users to provide a remote adapter; what would that...

> Hey @aarnphm, are you asking for a way to load in an adapter per query for a given base model? For e.g., say you have facebook/opt-350m running and you're...

> Right, I gave that a try and it looks like that works.
>
> I guess what I was wondering is if it's possible to do the following:
>...

The purpose of `adapter_name` per request here is that after you train a new LoRA layer, you can pass a remote URL to it and then it will just load...

I don't think a local path would make sense, since there is no way for the server to resolve it or import it into memory.

You can provide the `adapter_name` via the Swagger UI, at the same level as the `prompt` key.
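
For illustration, here is a minimal sketch of what such a request body might look like. The only detail taken from the comment above is that `adapter_name` sits at the same level as `prompt`; the helper `build_generate_payload` and the adapter name `my-lora-adapter` are hypothetical, and the exact set of other fields the server accepts is not shown here.

```python
import json
from typing import Optional


def build_generate_payload(prompt: str, adapter_name: Optional[str] = None) -> dict:
    """Build a request body with `adapter_name` as a sibling of `prompt`.

    Hypothetical helper: it only illustrates the key layout described above.
    """
    payload = {"prompt": prompt}
    # Add the key only when an adapter is requested, so requests without
    # an adapter keep the plain `prompt`-only shape.
    if adapter_name is not None:
        payload["adapter_name"] = adapter_name
    return payload


# "my-lora-adapter" is a placeholder, not a real adapter ID.
body = build_generate_payload("What is LoRA?", adapter_name="my-lora-adapter")
print(json.dumps(body, indent=2))
```

The same JSON object can be pasted into the request body field of the Swagger UI, assuming the adapter was registered at startup with `--adapter-id`.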

In terms of supporting paths, maybe we can add support for S3. But initially we will probably only support loading from the Hugging Face Hub.

Oh, I need to update how the LoRA layers are loaded. The endpoint `/v1/adapters` doesn't make sense because of a broadcasting issue with multiple runners. Can you try passing `adapter_name`...