ipex-llm
Can the FlashMoE support in ipex-llm run on Windows?
For example, can the ollama-ipex-llm-2.3.0b20250429-win.zip release run the qwen3-moe model on Windows? And when running the qwen3-30b-a3b model on a single A770 or a single B580, how much memory is required? Is there any requirement for the memory frequency?
Yes, we already support the qwen3-30b-a3b MoE model; it takes ~19 GB of memory. You can set OLLAMA_SET_OT before starting the ollama server to offload only part of the MoE layers to the GPU, which reduces VRAM usage. Please see https://github.com/intel/ipex-llm/blob/main/docs/mddocs/Quickstart/ollama_portable_zip_quickstart.md for more details.
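For anyone scripting this, here is a rough sketch of setting the variable before the server starts. Only the variable name OLLAMA_SET_OT comes from the reply above. The helper names `build_ollama_env` and `start_server`, the default `ollama serve` command, and the rule string used in the usage note are assumptions for illustration; check the linked quickstart for the rule syntax and start script your release actually expects.

```python
import os
import subprocess


def build_ollama_env(ot_rule, base_env=None):
    """Return a copy of the environment with OLLAMA_SET_OT set.

    ot_rule is passed through unchanged; its syntax is defined by the
    ipex-llm quickstart, not by this sketch.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OLLAMA_SET_OT"] = ot_rule
    return env


def start_server(ot_rule, command=("ollama", "serve")):
    """Launch the server with the variable already in its environment.

    The default command is an assumption; the portable zip may ship its
    own start script, in which case pass that instead.
    """
    return subprocess.Popen(list(command), env=build_ollama_env(ot_rule))
```

A call might look like `start_server("exps=CPU")`, where `"exps=CPU"` is a placeholder rule and not a confirmed value. The variable has to be in the environment of the server process itself, which is why it is set before launch and not afterwards.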