llama-stack Extract provider data properly (attempt 2)

Open ashwinb opened this issue 1 year ago • 0 comments

In the previous design, the server endpoint at the top-most level extracted the headers from the request and set provider data (e.g., private keys) that the implementations could retrieve using get_request_provider_data().

However, as Yogish has shown in #138, this is not sufficient. Consider the /agents API -- Together uses the standard "meta-reference" implementation for it but the inference provider is set to be Together. When an incoming request arrives, no "provider data validator" is registered for Agents because Agents isn't using the Together provider at all. However, when the agent calls inference as its dependency, the inference implementation does need the Together API key.

The solution is straightforward:

- The top-level server does not have the right context to validate the headers. All it should do is extract them, JSON-decode them, and stash the resulting dict in a thread-local.
- When a provider implementation needs the data, it queries for it using an instance method on the implementation. The instance method is supplied by a utility mixin (NeedsRequestProviderData).
- The mixin queries the underlying provider spec, gets the required validator class from it, and parses the correct keys from there.

This design is more general and also allows for multiple providers needing multiple private keys to co-exist peacefully with each other.
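
The flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual llama-stack code: the header name, the `ProviderSpec` and `TogetherProviderData` shapes, the `provider_spec` class attribute, and the two toy implementation classes are all hypothetical, and a plain dataclass stands in for whatever validator class the real provider spec points to. Only `NeedsRequestProviderData` and the idea of a thread-local stash come from the description.

```python
import json
import threading
from dataclasses import dataclass, fields
from typing import Any, Mapping, Optional

# Thread-local stash for the raw, unvalidated provider data of the current request.
_THREAD_LOCAL = threading.local()


def set_request_provider_data(headers: Mapping[str, str]) -> None:
    """Server side: extract and JSON-decode the header, but do not validate it."""
    raw = headers.get("X-LlamaStack-ProviderData")  # header name is an assumption
    _THREAD_LOCAL.provider_data = json.loads(raw) if raw else None


@dataclass
class TogetherProviderData:
    """Hypothetical validator: the keys the Together provider needs."""
    together_api_key: str


@dataclass
class ProviderSpec:
    """Hypothetical, stripped-down provider spec."""
    provider_data_validator: Optional[type] = None


class NeedsRequestProviderData:
    """Mixin: validate the stashed dict against this provider's own validator."""
    provider_spec: ProviderSpec  # assumed to be set on each implementation

    def get_request_provider_data(self) -> Any:
        validator = self.provider_spec.provider_data_validator
        data = getattr(_THREAD_LOCAL, "provider_data", None)
        if validator is None or not isinstance(data, dict):
            return None
        # Keep only the keys this provider declares; ignore other providers' keys.
        wanted = {f.name for f in fields(validator)}
        try:
            return validator(**{k: v for k, v in data.items() if k in wanted})
        except TypeError:  # a required key is missing
            return None


class TogetherInference(NeedsRequestProviderData):
    provider_spec = ProviderSpec(provider_data_validator=TogetherProviderData)

    def chat_completion(self, prompt: str) -> str:
        data = self.get_request_provider_data()
        if data is None:
            raise ValueError("together_api_key missing from provider data")
        return f"together: {prompt}"  # a real call would use data.together_api_key


class MetaReferenceAgents:
    """Agents registers no validator; it just calls inference as a dependency."""

    def __init__(self, inference: TogetherInference) -> None:
        self.inference = inference

    def create_turn(self, prompt: str) -> str:
        return self.inference.chat_completion(prompt)


# One request carrying keys for two providers; each provider parses only its own.
set_request_provider_data({
    "X-LlamaStack-ProviderData": json.dumps(
        {"together_api_key": "secret", "other_provider_key": "unused-here"}
    )
})
result = MetaReferenceAgents(TogetherInference()).create_turn("hi")
```

In this sketch the Agents implementation never touches the header, yet the inference call it makes still finds the Together key, because validation happens at the point of use rather than at the server entry point.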

Sep 29 '24 05:09 ashwinb
