
List all available models

Open • sebastienros opened this issue 2 months ago • 8 comments

I work on the Aspire team, which builds a framework and tools for creating distributed applications. We have already integrated most of the Microsoft AI stack, including Foundry Local, and could use some direction to make the integration even better.

Here is an example of how you can create a system using an Azure AI Foundry model that will work either locally or online when published:

https://github.com/dotnet/aspire/blob/main/playground/AzureAIFoundryEndToEnd/AzureAIFoundryEndToEnd.AppHost/Program.cs#L8-L18

If you look at the example, you can see that we use an enumeration to list the models available on Foundry. This list is generated automatically from the Azure AI Foundry portal, and obviously only some of these models are available locally. We also found other discrepancies, for instance in the logical model names (cf. Phi 3.5 instruct). Because of these discrepancies, I am looking into creating a custom enumeration for Foundry Local, so users would know directly whether a model exists. I understand that not all models are compatible with the local machine, but that would already be an improvement. We have done the same for GitHub Models, by the way.

Is there an API we can call to get the same results as foundry model ls (online or using the foundry CLI), ideally with no GPU filters applied? Or a documented list of models? Anything would help.
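
For context, this is the CLI listing we want to reproduce programmatically (a sketch; the output format varies by machine and CLI version):

# Lists the models the Foundry Local catalog offers for the current machine.
# We would like the same data from an API, ideally without the hardware filtering.
foundry model ls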

sebastienros • Sep 15 '25 16:09

You can do this with the /foundry/list endpoint of the API while the service is running.
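
For example (a sketch: the service picks its port when it starts, so replace the placeholder below with the base URL reported by foundry service status):

# The port is a placeholder; Foundry Local assigns one dynamically at startup.
# Use the local endpoint shown by `foundry service status`.
PORT=5273
curl "http://localhost:${PORT}/foundry/list"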

timothycarambat • Sep 15 '25 18:09

Actually, /foundry/list also filters based on the device you are running on.

We are adding the ability to list all models regardless of device.

natke • Sep 18 '25 00:09

Ah, my mistake, I misread your original post.

timothycarambat • Sep 18 '25 04:09

@natke Just posting this totally undocumented behavior that I found by accident in #259.

It works perfectly for now; you need the correct User-Agent, which seems to be the main criterion. I imagine this will not be recommended for general use and is probably not a solid plan going forward.

curl --location 'https://ai.azure.com/api/eastus/ux/v1.0/entities/crossRegion' \
--header 'User-Agent: AzureAiStudio' \
--header 'Content-Type: application/json' \
--data '{
    "resourceIds": [
        {
            "resourceId": "azureml",
            "entityContainerType": "Registry"
        }
    ],
    "indexEntitiesRequest": {
        "filters": [
            {
                "field": "type",
                "operator": "eq",
                "values": [
                    "models"
                ]
            },
            {
                "field": "kind",
                "operator": "eq",
                "values": [
                    "Versioned"
                ]
            },
            {
                "field": "labels",
                "operator": "eq",
                "values": [
                    "latest"
                ]
            },
            {
                "field": "properties/variantInfo/variantMetadata/executionProvider",
                "operator": "eq",
                "values": [
                    "CPUExecutionProvider",
                    "QNNExecutionProvider",
                    "CUDAExecutionProvider"
                ]
            }
        ],
        "pageSize": 50,
        "skip": null,
        "continuationToken": null
    }
}'
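
If you want the catalog without it being narrowed to particular hardware, presumably the same request works with the executionProvider filter dropped; an untested sketch of that variant:

# Untested variant of the request above: same endpoint and headers, but without the
# executionProvider filter, so results are not restricted to specific hardware.
curl --location 'https://ai.azure.com/api/eastus/ux/v1.0/entities/crossRegion' \
--header 'User-Agent: AzureAiStudio' \
--header 'Content-Type: application/json' \
--data '{
    "resourceIds": [
        { "resourceId": "azureml", "entityContainerType": "Registry" }
    ],
    "indexEntitiesRequest": {
        "filters": [
            { "field": "type",   "operator": "eq", "values": ["models"] },
            { "field": "kind",   "operator": "eq", "values": ["Versioned"] },
            { "field": "labels", "operator": "eq", "values": ["latest"] }
        ],
        "pageSize": 50,
        "skip": null,
        "continuationToken": null
    }
}'

Paging presumably works by raising pageSize or by echoing back the continuationToken from a previous response, though I haven't verified that here.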

Just leaving this comment so others can find it, because it's certainly something that others and I are curious about.

timothycarambat • Oct 08 '25 00:10

@timothycarambat awesome! We were already using this endpoint but couldn't figure out how to filter the models. The portal has recently been using a different model with a more obvious filter, but it requires authentication.

sebastienros • Oct 08 '25 02:10

glad to help 😅

filipw • Oct 08 '25 13:10

I made good progress, and the other issue was interesting to read. One thing I still don't understand is why the phi-4-reasoning model is returned and is available for the generic-cpu execution provider, yet is not listed on the machines I tried. I can't find a property unique to this model that would get it excluded from Foundry Local.

sebastienros • Oct 15 '25 00:10

The logs explain why phi-4-reasoning is not displayed and made available: it fails to find a prompt. Interestingly, the gpt-oss-20 model is listed on CUDA but doesn't have a prompt either. Another oddity is that two Whisper models are in the list for CPU but are not shown.

sebastienros • Oct 29 '25 17:10