[BUG] Could not run DirectML models on Intel laptop
Describe the bug My laptop doesn't have a discrete GPU, only an Intel integrated GPU (Meteor Lake). However, I have 32 GB of system memory, and the Intel iGPU can use up to half of it. The application does not allow downloading any DirectML models.
To Reproduce Steps to reproduce the behavior:
- Go to 'Samples'
- None of the DirectML-based models can be downloaded
Expected behavior DirectML-based models should run on integrated GPUs.
Please complete the following information: Self-built from GitHub
Additional context NA
+1 from my side
This is caused by your BIOS preallocating only ~64 MB of RAM to your iGPU; all other memory is allocated dynamically.
I have an RX 6400 with 4 GB of VRAM, yet the AI Dev Gallery detects it as 3.9 GB of VRAM, so the check fails.
For now, you can patch the check yourself by hard-coding a fake value: https://github.com/microsoft/ai-dev-gallery/blob/dce64ecbe39bb3c6115ca2e2a4c5240113ed25da/AIDevGallery/Utils/DeviceUtils.cs#L62
After patching this single file, Phi 3 Mini runs fine on my system.
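For anyone wanting to try this before a proper fix lands, the patch amounts to something like the following (a minimal sketch; `GetVram` and its body are illustrative, not the Gallery's exact source):

```csharp
// AIDevGallery/Utils/DeviceUtils.cs (illustrative sketch, not the exact code)
public static ulong GetVram()
{
    // Original behavior: query DXGI for the adapter's DedicatedVideoMemory.
    // On iGPU-only systems this reports only the BIOS preallocation (~64 MB),
    // so every DML model fails the size check.
    //
    // Workaround: return a fake value large enough to pass the checks.
    return 16UL * 1024 * 1024 * 1024; // pretend we have 16 GB (at your own risk)
}
```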
I agree we need to enable DML models on integrated GPUs. We won't have time to work on this this month but can add it next month; in the meantime, if anyone wants to pick it up, please feel free to submit a PR.
Related to https://github.com/microsoft/ai-dev-gallery/issues/47, I guess.
Having a "download anyway" button could be an easy fix
@nmetulev What do you think of the idea of allowing shared memory? Instead of checking against the maximum dedicated video memory, we would check against the maximum shared video memory (rough sketch below).
Unless the dev is using a dGPU with enough dedicated VRAM, the system has to fall back to shared memory anyway.
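Roughly, the idea looks like this (a sketch using the SharpDX.DXGI bindings for brevity; the Gallery's DeviceUtils uses its own DXGI interop, and the figures in the comments are typical values, not guarantees):

```csharp
using System;
using SharpDX.DXGI;

// Enumerate adapters and read both memory figures from DXGI_ADAPTER_DESC1.
using var factory = new Factory1();
foreach (Adapter1 adapter in factory.Adapters1)
{
    AdapterDescription1 desc = adapter.Description1;
    long dedicated = desc.DedicatedVideoMemory; // BIOS preallocation on iGPUs, often ~64 MB
    long shared = desc.SharedSystemMemory;      // typically up to ~half of system RAM
    Console.WriteLine($"{desc.Description}: dedicated={dedicated:N0} B, shared={shared:N0} B");
    adapter.Dispose();
}
```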
If this idea is fine with the team, I will create a PR.
Good point. I had decided to only check dedicated VRAM because I noticed that language models have degraded performance when they overflow to shared memory, and in some cases it would cause blue screens. This is likely a DML bug.
However, that should not be the case when using just the shared VRAM on an integrated GPU, so we should be able to check for either one: if there is no dGPU, check the shared VRAM instead (see the sketch below).
And I like the suggestion from @BobLd of having a "download anyway" option, where the user is presented with a warning and agrees to the risks.
Thoughts?
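To make that concrete, the combined check could take a shape like this (all names and parameters here are illustrative assumptions, not the project's actual code):

```csharp
using System;

internal static class MemoryCheck
{
    // Sketch: prefer dedicated VRAM on discrete GPUs; on integrated GPUs,
    // where dedicated VRAM is just the BIOS preallocation, use the shared pool.
    public static bool MeetsMemoryRequirement(
        long dedicatedVideoMemory, long sharedSystemMemory,
        bool isDiscreteGpu, long requiredBytes)
    {
        if (isDiscreteGpu)
        {
            // Overflowing a dGPU into shared memory degrades LLM performance
            // and has triggered blue screens (likely a DML bug), so only count VRAM.
            return dedicatedVideoMemory >= requiredBytes;
        }

        // Integrated GPU: the shared pool is the meaningful figure.
        return Math.Max(dedicatedVideoMemory, sharedSystemMemory) >= requiredBytes;
    }
}
```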
> when they overflow to shared memory, and in some cases it would cause blue screens. This is likely a DML bug.
I created https://github.com/microsoft/DirectML/issues/683. Feel free to add to it.
@nmetulev Is this bug already reported and tracked inside Microsoft? If not, can you please make sure it is?
Lots of models can't run on the GPU because of this issue (dedicated VRAM is often too small), and the CPU is several times slower, so having this fixed would be very welcome.