[BUG] Could not run DirectML models on Intel laptop
Describe the bug My laptop doesn't have a discrete GPU, only an Intel integrated GPU (Meteor Lake). However, I have 32 GB of system memory, and the Intel iGPU can use up to half of it. The application does not allow downloading any DirectML models.
To Reproduce Steps to reproduce the behavior:
- Go to 'Samples'
- None of the DirectML-based models can be downloaded
Expected behavior DirectML-based models should run on integrated GPUs.
Please complete the following information: Self-built from GitHub
Additional context NA
+1 from my side
This is caused by your BIOS preallocating only ~64 MB of RAM to your iGPU; all other memory is allocated dynamically.
I have an RX 6400 with 4 GB of VRAM, yet the AI Dev Gallery detects it as 3.9 GB of VRAM, so the check fails.
For now, you can patch the check yourself by hard-coding a fake value: https://github.com/microsoft/ai-dev-gallery/blob/dce64ecbe39bb3c6115ca2e2a4c5240113ed25da/AIDevGallery/Utils/DeviceUtils.cs#L62
After patching this single file, Phi 3 Mini runs fine on my system.
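For anyone wanting to try this before a proper fix lands, the patch amounts to something like the following (a minimal sketch; `GetVram` and its body are illustrative, not the Gallery's exact source):

```csharp
// AIDevGallery/Utils/DeviceUtils.cs (illustrative sketch, not the exact code)
public static ulong GetVram()
{
    // Original behavior: query DXGI for the adapter's DedicatedVideoMemory.
    // On iGPU-only systems this reports only the BIOS preallocation (~64 MB),
    // so every DML model fails the size check.
    //
    // Workaround: return a fake value large enough to pass the checks.
    return 16UL * 1024 * 1024 * 1024; // pretend we have 16 GB (at your own risk)
}
```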
I agree we need to enable DML models on integrated GPUs. We won't have time to work on this this month but can add it next month; in the meantime, if anyone wants to pick it up, please feel free to submit a PR.
Related to https://github.com/microsoft/ai-dev-gallery/issues/47, I guess.
Having a "download anyway" button could be an easy fix
@nmetulev What do you think of the idea of allowing shared memory? Instead of checking against the maximum dedicated video memory, we would check against the maximum shared video memory (rough sketch below).
Unless the dev is using a dGPU with enough dedicated VRAM, the system has to fall back to shared memory anyway.
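Roughly, the idea looks like this (a sketch using the SharpDX.DXGI bindings for brevity; the Gallery's DeviceUtils uses its own DXGI interop, and the figures in the comments are typical values, not guarantees):

```csharp
using System;
using SharpDX.DXGI;

// Enumerate adapters and read both memory figures from DXGI_ADAPTER_DESC1.
using var factory = new Factory1();
foreach (Adapter1 adapter in factory.Adapters1)
{
    AdapterDescription1 desc = adapter.Description1;
    long dedicated = desc.DedicatedVideoMemory; // BIOS preallocation on iGPUs, often ~64 MB
    long shared = desc.SharedSystemMemory;      // typically up to ~half of system RAM
    Console.WriteLine($"{desc.Description}: dedicated={dedicated:N0} B, shared={shared:N0} B");
    adapter.Dispose();
}
```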
If this idea is fine with the team, I will create a PR.
Good point. I had decided to only check dedicated VRAM because I noticed that language models have degraded performance when they overflow to shared memory, and in some cases it would cause blue screens. This is likely a DML bug.
However, that should not be the case when using just the shared VRAM on an integrated GPU, so we should be able to check for either one: if there is no dGPU, check the shared VRAM instead (see the sketch below).
And I like the suggestion from @BobLd of having a "download anyway" option, where the user is presented with a warning and agrees to the risks.
Thoughts?
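To make that concrete, the combined check could take a shape like this (all names and parameters here are illustrative assumptions, not the project's actual code):

```csharp
using System;

internal static class MemoryCheck
{
    // Sketch: prefer dedicated VRAM on discrete GPUs; on integrated GPUs,
    // where dedicated VRAM is just the BIOS preallocation, use the shared pool.
    public static bool MeetsMemoryRequirement(
        long dedicatedVideoMemory, long sharedSystemMemory,
        bool isDiscreteGpu, long requiredBytes)
    {
        if (isDiscreteGpu)
        {
            // Overflowing a dGPU into shared memory degrades LLM performance
            // and has triggered blue screens (likely a DML bug), so only count VRAM.
            return dedicatedVideoMemory >= requiredBytes;
        }

        // Integrated GPU: the shared pool is the meaningful figure.
        return Math.Max(dedicatedVideoMemory, sharedSystemMemory) >= requiredBytes;
    }
}
```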
> when they overflow to shared memory, and in some cases it would cause blue screens. This is likely a DML bug.
I created https://github.com/microsoft/DirectML/issues/683. Feel free to add to it.
@nmetulev Is this bug already reported and tracked inside Microsoft? If not, can you please make sure it is?
Lots of models can't run on the GPU because of this issue (dedicated VRAM is often too small), and the CPU is several times slower, so having this fixed would be very welcome.