
How to load a local QNN model using C#?

nnbw-liu opened this issue 4 months ago · 5 comments

Following the tutorial, I successfully loaded and ran the QNN model converted from VS Code on the command line.

Best practices and troubleshooting guide for Foundry Local

[screenshot]

I obtained the model lists using C# code but couldn't find the model I wanted:

[screenshot]


// Start the Foundry Local service if it is not already running
foundryLocalManager = new FoundryLocalManager();
if (!foundryLocalManager.IsServiceRunning)
{
    await foundryLocalManager.StartServiceAsync();
}

// Enumerate catalog, cached, and loaded models
var catalogModels = await foundryLocalManager.ListCatalogModelsAsync();
var cachedModels = await foundryLocalManager.ListCachedModelsAsync();
var loadedModels = await foundryLocalManager.ListLoadedModelsAsync();

// Resolve the model and point an OpenAI-compatible client at the local endpoint
modelInfo = await foundryLocalManager.GetModelInfoAsync(aliasOrModelId: modelId);
ApiKeyCredential key = new ApiKeyCredential(foundryLocalManager.ApiKey);
client = new OpenAIClient(key, new OpenAIClientOptions
{
    Endpoint = foundryLocalManager.Endpoint
});
chatClient = client.GetChatClient(modelInfo?.ModelId);

How do I load the local model?

AB#76094

nnbw-liu · Aug 14 '25 10:08

Hi @qihui-liu,

You can load the local model using the LoadModelAsync API, e.g.:

ModelInfo loaded = await manager.LoadModelAsync("aliasOrModelId");
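
A minimal end-to-end sketch, assuming the model exists in the Foundry Local catalog; the alias is a placeholder, and DownloadModelAsync is assumed to be the SDK call that fetches a catalog model into the local cache:

// Sketch only: LoadModelAsync expects the model to already be in the
// local cache, so download it first. "aliasOrModelId" is a placeholder.
var manager = new FoundryLocalManager();
if (!manager.IsServiceRunning)
{
    await manager.StartServiceAsync();
}
await manager.DownloadModelAsync("aliasOrModelId");
ModelInfo loaded = await manager.LoadModelAsync("aliasOrModelId");
Console.WriteLine($"Loaded: {loaded.ModelId}");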

natke · Aug 18 '25 21:08

@natke Sorry, this does not load the model; it cannot be found in the model lists returned by the C# API.

// The locally converted model does not appear in any of these lists
var catalogModels = await foundryLocalManager.ListCatalogModelsAsync();
var cachedModels = await foundryLocalManager.ListCachedModelsAsync();
var loadedModels = await foundryLocalManager.ListLoadedModelsAsync();
modelInfo = await foundryLocalManager.LoadModelAsync("phi-3.5-wu8au16-qnn");

[screenshot]

nnbw-liu · Aug 19 '25 01:08

@natke The following is the decompiled code shown when jumping to the definition in VS. In LoadModelAsync, the model must appear in the result of ListCachedModelsAsync, but the locally converted model cannot be obtained from that list.

public async Task<ModelInfo> LoadModelAsync(string aliasOrModelId, TimeSpan? timeout = null, CancellationToken ct = default(CancellationToken))
{
    // The model must exist in the catalog...
    ModelInfo modelInfo = (await GetModelInfoAsync(aliasOrModelId, ct)) ?? throw new InvalidOperationException("Model " + aliasOrModelId + " not found in catalog.");

    // ...and must already be present in the local cache
    List<ModelInfo> source = await ListCachedModelsAsync(ct);
    if (!source.Any(MatchAliasOrId(aliasOrModelId)))
    {
        throw new InvalidOperationException("Model " + aliasOrModelId + " not found in local models. Please download it first.");
    }

    Dictionary<string, string> dictionary = new Dictionary<string, string>
    {
        {
            "timeout",
            (timeout ?? TimeSpan.FromMinutes(10.0)).TotalSeconds.ToString(CultureInfo.InvariantCulture)
        }
    };
    if (modelInfo.Runtime.DeviceType == DeviceType.GPU)
    {
        bool flag = source.Any((ModelInfo m) => m.Runtime.ExecutionProvider == ExecutionProvider.CUDAExecutionProvider);
        dictionary["ep"] = (flag ? "CUDA" : modelInfo.Runtime.ToString());
    }

    // Loading is a GET against /openai/load/{modelId} on the local service
    UriBuilder uriBuilder = new UriBuilder(ServiceUri)
    {
        Path = "/openai/load/" + modelInfo.ModelId,
        Query = string.Join("&", dictionary.Select((KeyValuePair<string, string> kvp) => Uri.EscapeDataString(kvp.Key) + "=" + Uri.EscapeDataString(kvp.Value)))
    };
    (await _serviceClient.GetAsync(uriBuilder.Uri, ct)).EnsureSuccessStatusCode();
    return modelInfo;
}

nnbw-liu · Aug 19 '25 01:08

Hi again @qihui-liu, we have identified a bug in model loading. Please stay tuned for a fix. In the meantime, you should be able to run the cached model using the REST API. Please let us know if that works.
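
For reference, a hedged sketch of that REST workaround, based only on the load endpoint visible in the decompiled code above; the base address and model ID are placeholders (use the address reported by "foundry service status" on your machine):

// Sketch only: load a cached model by calling the service's load
// endpoint directly, mirroring the UriBuilder in the decompiled code.
// Base address and model ID are placeholders.
using System.Net.Http;

var http = new HttpClient { BaseAddress = new Uri("http://localhost:5273") };
string modelId = "phi-3.5-wu8au16-qnn";
// timeout is in seconds, matching the SDK's 10-minute default
var resp = await http.GetAsync($"/openai/load/{Uri.EscapeDataString(modelId)}?timeout=600");
resp.EnsureSuccessStatusCode();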

natke · Sep 04 '25 18:09

Duplicate of #233

natke · Sep 04 '25 18:09