List all available models
I work on the Aspire team, which builds a framework and tools for creating distributed applications. We have already integrated most of the Microsoft AI stack, including Foundry Local, and could use some direction to make it even better.
Here is an example of how you can create a system using an Azure AI Foundry model that works locally, or online once published:
https://github.com/dotnet/aspire/blob/main/playground/AzureAIFoundryEndToEnd/AzureAIFoundryEndToEnd.AppHost/Program.cs#L8-L18
If you look at the example, you can see that we use an enumeration to list the available models on Foundry. This list is taken from the Azure AI Foundry portal automatically, and obviously only some of these models are available locally. We also found other discrepancies, for instance in the logical model names (cf. Phi 3.5 instruct). Because of these discrepancies I am looking into creating a custom enumeration for Foundry Local, so users would know directly whether a model exists. I understand that not all models are compatible with the local machine, but that would already be an improvement. We have done the same for GitHub Models, by the way.
Is there an API we can query to get the same results as `foundry model ls` (online or using the Foundry CLI)? Even better if no GPU filters are applied. A documented list of models, anything, would help.
You can do this with `/foundry/list` in the API while the service is running.
Actually, `/foundry/list` also filters based on the device you are running on.
We are adding the ability to list all models regardless of device.
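For anyone trying this, here is a minimal sketch of calling `/foundry/list` against the local service. The port is an assumption: Foundry Local assigns it dynamically, so substitute whatever endpoint the CLI reports on your machine.

```shell
# Sketch only: the Foundry Local service must already be running, and the
# port below is an assumption -- it is assigned dynamically, so replace it
# with the one the CLI reports for your machine.
PORT="${FOUNDRY_PORT:-5273}"
URL="http://localhost:${PORT}/foundry/list"
echo "$URL"
# curl "$URL"    # uncomment to query the running service
```

Note that, as said above, this currently returns only the models filtered for the current device.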
Ah, my mistake, I misread your original post
@natke just posting this totally undocumented behavior that I found by accident from #259
Works perfectly for now; you need the correct User-Agent, which seems to be the main criterion. I imagine this will not be recommended for general use and probably is not a solid plan going forward.
curl --location 'https://ai.azure.com/api/eastus/ux/v1.0/entities/crossRegion' \
  --header 'User-Agent: AzureAiStudio' \
  --header 'Content-Type: application/json' \
  --data '{
    "resourceIds": [
      {
        "resourceId": "azureml",
        "entityContainerType": "Registry"
      }
    ],
    "indexEntitiesRequest": {
      "filters": [
        {
          "field": "type",
          "operator": "eq",
          "values": ["models"]
        },
        {
          "field": "kind",
          "operator": "eq",
          "values": ["Versioned"]
        },
        {
          "field": "labels",
          "operator": "eq",
          "values": ["latest"]
        },
        {
          "field": "properties/variantInfo/variantMetadata/executionProvider",
          "operator": "eq",
          "values": [
            "CPUExecutionProvider",
            "QNNExecutionProvider",
            "CUDAExecutionProvider"
          ]
        }
      ],
      "pageSize": 50,
      "skip": null,
      "continuationToken": null
    }
  }'
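The last filter in that payload is the interesting one for Foundry Local: the `executionProvider` values control which device classes come back. Here is a small sketch that builds that filter object for a single provider; the field path is copied from the request above, but everything else about this endpoint is undocumented, so treat it as a sketch.

```shell
# Build the executionProvider filter for a single provider.
# The field path is taken verbatim from the request above; the endpoint
# itself is undocumented, so this is a sketch, not a supported API.
PROVIDER="CPUExecutionProvider"
FILTER=$(printf '{"field":"properties/variantInfo/variantMetadata/executionProvider","operator":"eq","values":["%s"]}' "$PROVIDER")
echo "$FILTER"
```

Splice the printed object into the `filters` array of the `--data` payload above to narrow the catalog to one provider.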
Just leaving this comment so others can find it, because it's certainly something others and I are curious about.
@timothycarambat awesome! We were already using this endpoint but couldn't figure out how to filter the models. The portal has recently been using a different model with a more obvious filter, but it requires authentication.
glad to help 😅
I made good progress, and the other issue was interesting to read. One thing I still don't understand is why the phi-4-reasoning model is returned and is available for the generic-cpu execution provider, but is not listed on the machines I tried. I can't find a property unique to this model that would get it excluded from Foundry Local.
The logs explain why phi-4-reasoning is not displayed or made available: it fails to find a prompt. Interestingly, the gpt-oss-20 model is listed on CUDA but doesn't have a prompt either.
Another oddity is that two whisper models are in the list for CPU but are not shown.