Samuel Kemp
Duplicate of #115
Duplicate of #130
@anktsrkr thanks for raising the issue! Yes - it is possible to download models from HuggingFace (HF) and consume them in Foundry Local. The only caveat is that the model...
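For illustration, a minimal sketch of pulling a model repo down from HF with `huggingface_hub` (the repo id and local path are placeholders, and this assumes the repo already ships ONNX weights in a layout Foundry Local can consume):

```python
# Sketch: download a Hugging Face model repo for local use.
# The repo id and local_dir are placeholders, not recommendations.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="some-org/some-model-onnx",      # hypothetical ONNX model repo
    local_dir="./models/some-model-onnx",
)
print(f"Model files downloaded to {local_dir}")
```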
So, the `genai_config.json` is different from the `generation_config.json`. The `genai_config.json` is used by ONNX Runtime, and the repo you are using does not have it available. Probably, the easiest way...
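A quick way to check whether a given HF repo ships the file ONNX Runtime expects, before trying to load it (a small sketch; the repo id is a placeholder):

```python
# Sketch: check whether a Hugging Face repo includes a genai_config.json anywhere.
# The repo id is a placeholder for whichever model you are evaluating.
from huggingface_hub import list_repo_files

files = list_repo_files("some-org/some-model-onnx")  # hypothetical repo id
if any(f.endswith("genai_config.json") for f in files):
    print("Repo ships a genai_config.json - ONNX Runtime can consume it.")
else:
    print("No genai_config.json found - the model likely needs converting first.")
```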
@Justinius There are a couple of options we are working on to pull in models without changing the cache location:
1. Run a model outside of the cache e.g. `foundry...
ONNX stores the graph representation of the model in an intermediate representation (IR). To create the IR, Olive needs to pass some dummy data through the model at graph capture (aka export)...
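For intuition, this is what graph capture with dummy data looks like when exporting a PyTorch model to ONNX directly (a minimal sketch; Olive automates the equivalent step, and the toy model and shapes here are placeholders):

```python
# Sketch: exporting a toy PyTorch model to ONNX needs a dummy input so the
# exporter can trace the graph. Olive does the equivalent under the hood.
import torch
import torch.nn as nn

model = nn.Linear(16, 4)             # placeholder model
dummy_input = torch.randn(1, 16)     # dummy data with the expected input shape

torch.onnx.export(
    model,
    dummy_input,
    "toy_model.onnx",
    input_names=["input"],
    output_names=["output"],
)
```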
Thanks for raising an issue @jasonwtli - would you be able to upload the log files here:

```bash
foundry service diag --logs
```

This will create a zip file on...
Thanks @jasonwtli - I *think* your NPU is running out of memory. Does your NPU have ~8GB? This would explain why the 7B model (~3.7GB) works but not the 14B model...
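A rough back-of-envelope check (a sketch assuming ~4-bit quantized weights, which is consistent with the ~3.7GB figure for the 7B model; it ignores KV cache and activation overhead, which only add to the total):

```python
# Sketch: rough memory estimate for quantized model weights only.
# bytes_per_param = 0.5 assumes ~4-bit quantization; an illustrative assumption.
def weight_size_gb(params_billion: float, bytes_per_param: float = 0.5) -> float:
    return params_billion * 1e9 * bytes_per_param / 1e9

print(f"7B  model weights: ~{weight_size_gb(7):.1f} GB")    # ~3.5 GB, close to the observed 3.7 GB
print(f"14B model weights: ~{weight_size_gb(14):.1f} GB")   # ~7 GB before KV cache and runtime overhead
```

With KV cache and runtime overhead on top of ~7GB of weights, a ~8GB NPU would plausibly run out of memory on the 14B model while handling the 7B model comfortably.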
Duplicate of #112
Thanks for raising an issue @eavanvalkenburg! We'll add this to our list of features to add.