
Incorrect VRAM reporting on Metal and DX12-UWP

Open Ravbug opened this issue 3 years ago • 4 comments

Describe the bug: bgfx::Stats::gpuMemoryMax and bgfx::Stats::gpuMemoryUsed are always -9223372036854775807 on Metal and DX12-UWP.

To reproduce:

  1. Initialize the Metal backend
  2. Call bgfx::frame() at least once (may not be necessary)
  3. Call bgfx::getStats() then look at gpuMemoryMax and gpuMemoryUsed

Expected behavior: gpuMemoryMax should report the total VRAM of the GPU, and gpuMemoryUsed should report the VRAM currently in use.
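The value reported above is exactly -INT64_MAX, which reads more like an "unknown" sentinel than a real measurement. Here is a minimal caller-side guard, assuming (this thread does not confirm it) that any negative value means the backend did not fill in the field. `isGpuMemoryStatKnown` and `formatGpuMemoryStat` are hypothetical helpers, not part of bgfx:

```cpp
#include <cstdint>
#include <limits>
#include <string>

// Hypothetical helper: treats any negative value as "not reported".
// Assumption: a backend that cannot query VRAM leaves a negative sentinel
// such as -INT64_MAX in the field, as observed in this issue.
inline bool isGpuMemoryStatKnown(std::int64_t value)
{
	return value >= 0;
}

// Hypothetical helper: formats a stat for display, falling back to "unknown".
inline std::string formatGpuMemoryStat(std::int64_t value)
{
	if (!isGpuMemoryStatKnown(value)) {
		return "unknown";
	}
	return std::to_string(value) + " bytes";
}

// The value from the bug report is exactly -INT64_MAX.
static_assert(-std::numeric_limits<std::int64_t>::max() == -9223372036854775807LL,
              "sentinel matches the value reported in the issue");
```

With bgfx this could be called as formatGpuMemoryStat(bgfx::getStats()->gpuMemoryUsed), so a UI overlay shows "unknown" instead of a huge negative number.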

Screenshots Metal: image

DX12-UWP: image

DX12 (non-UWP): image

Additional context: OS: macOS 11.6.2, Xcode 13.2.1, GPU: AMD Radeon M395X

OS: Windows 10 21H1, Visual Studio 2022, GPU: NVIDIA RTX 2070 Super, Driver: 456.71

Ravbug, Dec 23 '21 20:12

UWP is dead, so that part is a non-issue.

Do you have a suggestion for how to query the amount of GPU memory on Metal?

bkaradzic, Dec 24 '21 03:12

For getting the current allocation size, it looks like currentAllocatedSize in MTLDevice will work.

For the total amount of GPU memory, this note in the MoltenVK commit history seems to indicate that Metal does not expose total memory and instead exposes a "recommended maximum":

if (getHasUnifiedMemory()) { return mvkGetSystemMemorySize(); }
// There's actually no way to query the total physical VRAM on the device in Metal.
// Just default to using the recommended max working set size (i.e. the budget).
return getRecommendedMaxWorkingSetSize();
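The same fallback policy can be written as a small pure function that a Metal backend could call once the device properties have been read. This is only a sketch: `MetalMemoryInfo` and `estimateTotalGpuMemory` are hypothetical names. The fields are assumed to be filled from MTLDevice (hasUnifiedMemory, recommendedMaxWorkingSetSize) and from whatever system-RAM query the platform offers.

```cpp
#include <cstdint>

// Hypothetical plain-data view of what a Metal backend could gather up front.
struct MetalMemoryInfo
{
	bool          hasUnifiedMemory;             // assumed to come from MTLDevice
	std::uint64_t systemMemoryBytes;            // total system RAM, in bytes
	std::uint64_t recommendedMaxWorkingSetSize; // assumed to come from MTLDevice
};

// Mirrors the MoltenVK note: with unified memory, report system RAM;
// otherwise fall back to the recommended working set size (the budget),
// because Metal does not expose the physical VRAM size.
inline std::uint64_t estimateTotalGpuMemory(const MetalMemoryInfo& info)
{
	if (info.hasUnifiedMemory) {
		return info.systemMemoryBytes;
	}
	return info.recommendedMaxWorkingSetSize;
}
```

Keeping the policy separate from the Objective-C++ property reads makes it easy to test without a GPU.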

recommendedMaxWorkingSetSize in MTLDevice is described as:

An approximation of how much memory, in bytes, this device can use with good performance. Performance may be improved by keeping the total size of all resources and heaps associated with this device object less than this threshold. Going above the threshold may incur a performance penalty.
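Taken together, currentAllocatedSize and recommendedMaxWorkingSetSize give a "used versus budget" pair rather than a true "used versus physical VRAM" pair. Below is a rough sketch of how an application might consume the two numbers. The function names are hypothetical, and a budget of zero is treated as "unknown", which is an assumption of this sketch:

```cpp
#include <cstdint>

// Fraction of the recommended budget currently allocated.
// Returns 0.0 when the budget is unknown (zero) to avoid dividing by zero.
inline double workingSetUsage(std::uint64_t currentAllocatedSize,
                              std::uint64_t recommendedMaxWorkingSetSize)
{
	if (recommendedMaxWorkingSetSize == 0) {
		return 0.0;
	}
	return static_cast<double>(currentAllocatedSize)
	     / static_cast<double>(recommendedMaxWorkingSetSize);
}

// True when allocations exceed the recommended budget, which is the point
// where the Metal documentation warns of a possible performance penalty.
inline bool isOverRecommendedWorkingSet(std::uint64_t currentAllocatedSize,
                                        std::uint64_t recommendedMaxWorkingSetSize)
{
	return recommendedMaxWorkingSetSize != 0
	    && currentAllocatedSize > recommendedMaxWorkingSetSize;
}
```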

Ravbug, Dec 28 '21 19:12

This code I adapted from Maya seems to work for macOS for getting the total VRAM:

void queryVRAMandModelMac(uint64_t& vram, std::string& manufacturer, std::string& model)
{
	vram = 0;
	CGError res = CGDisplayNoErr;
	// query active displays
	CGDisplayCount dspCount = 0;
	res = CGGetActiveDisplayList(0, NULL, &dspCount);
	if (res || dspCount == 0) {
		return;
	}
	// heap-allocated display list; it must be freed on every exit path
	CGDirectDisplayID* displays = (CGDirectDisplayID*)calloc((size_t)dspCount, sizeof(CGDirectDisplayID));
	if (!displays) {
		return;
	}
	res = CGGetActiveDisplayList(dspCount, displays, &dspCount);
	if (res || dspCount == 0) {
		free(displays);
		return;
	}
	SInt64 maxVramTotal = 0;
	for (CGDisplayCount i = 0; i < dspCount; i++) {
		// get the service port for the display
		// (note: CGDisplayIOServicePort has been deprecated since macOS 10.9)
		io_service_t dspPort = CGDisplayIOServicePort(displays[i]);
		// ask IOKit for the VRAM size property
		/* HD 2600: IOFBMemorySize = 256MB. VRAM,totalsize = 256MB
		 HD 5770: IOFBMemorySize = 512MB. VRAM,totalsize = 1024MB
		 Apple's QA page is not correct. We should search for IOPCIDevice's VRAM,totalsize property.
		 CFTypeRef typeCode = IORegistryEntryCreateCFProperty(dspPort,
		 CFSTR(kIOFBMemorySizeKey),
		 kCFAllocatorDefault,
		 kNilOptions);
		 */
		SInt64 vramScale = 1;
		CFTypeRef typeCode = IORegistryEntrySearchCFProperty(dspPort,
															 kIOServicePlane,
															 CFSTR("VRAM,totalsize"),
															 kCFAllocatorDefault,
															 kIORegistryIterateRecursively | kIORegistryIterateParents);
		if (!typeCode) {
			// On the new Mac Pro, we have VRAM,totalMB instead.
			typeCode = IORegistryEntrySearchCFProperty(dspPort,
													   kIOServicePlane,
													   CFSTR("VRAM,totalMB"),
													   kCFAllocatorDefault,
													   kIORegistryIterateRecursively | kIORegistryIterateParents);
			if (typeCode) {
				vramScale = 1024 * 1024;
			}
		}
		// ensure we have valid data from IOKit
		if (typeCode) {
			SInt64 vramTotal = 0;
			if (CFGetTypeID(typeCode) == CFNumberGetTypeID()) {
				// AMD, VRAM,totalsize is CFNumber
				CFNumberGetValue((const __CFNumber*)typeCode, kCFNumberSInt64Type, &vramTotal);
			}
			else if (CFGetTypeID(typeCode) == CFDataGetTypeID()) {
				// NVIDIA, VRAM,totalsize is CFData
				CFIndex      length = CFDataGetLength((const __CFData*)typeCode);
				const UInt8* data   = CFDataGetBytePtr((const __CFData*)typeCode);
				if (length == 4) {
					vramTotal = *(const unsigned int*)data;
				}
				else if (length == 8) {
					vramTotal = *(const SInt64*)data;
				}
			}
			vramTotal *= vramScale;
			CFRelease(typeCode);
			
			if (vramTotal > maxVramTotal) {
				maxVramTotal = vramTotal;
				typeCode = IORegistryEntrySearchCFProperty(dspPort,
														   kIOServicePlane,
														   CFSTR("NVDA,Features"),
														   kCFAllocatorDefault,
														   kIORegistryIterateRecursively | kIORegistryIterateParents);
				if (typeCode) {
					manufacturer = "NVIDIA";
					CFRelease(typeCode);
				}
				typeCode = IORegistryEntrySearchCFProperty(dspPort,
														   kIOServicePlane,
														   CFSTR("ATY,Copyright"),
														   kCFAllocatorDefault,
														   kIORegistryIterateRecursively | kIORegistryIterateParents);
				if (typeCode) {
					manufacturer = "Advanced Micro Devices, Inc.";
					CFRelease(typeCode);
				}
				// GPU model
				typeCode = IORegistryEntrySearchCFProperty(dspPort,
														   kIOServicePlane,
														   CFSTR("model"),
														   kCFAllocatorDefault,
														   kIORegistryIterateRecursively | kIORegistryIterateParents);
				if (typeCode) {
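					if (CFGetTypeID(typeCode) == CFDataGetTypeID()) {
						// the payload is not guaranteed to be NUL-terminated, so bound the copy by its length
						const char* bytes  = (const char*)CFDataGetBytePtr((const __CFData*)typeCode);
						CFIndex     length = CFDataGetLength((const __CFData*)typeCode);
						model = std::string(bytes, (size_t)length).c_str();
					}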
					CFRelease(typeCode);
				}
			}
		}
	}
	free(displays);
	vram = maxVramTotal;
}
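A side note on the CFData branch in the function above: it reads the payload through a cast pointer, which assumes the bytes are suitably aligned. Copying with memcpy avoids that assumption. `decodeVramPayload` is a hypothetical helper written for illustration, and it keeps the original's host-byte-order interpretation:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper mirroring the CFData branch: 4-byte payloads are read
// as unsigned 32-bit, 8-byte payloads as signed 64-bit, anything else as 0.
// Values are interpreted in host byte order, like the original casts.
inline std::int64_t decodeVramPayload(const unsigned char* data, std::size_t length)
{
	if (data == nullptr) {
		return 0;
	}
	if (length == 4) {
		std::uint32_t value = 0;
		std::memcpy(&value, data, sizeof(value)); // safe for unaligned input
		return static_cast<std::int64_t>(value);
	}
	if (length == 8) {
		std::int64_t value = 0;
		std::memcpy(&value, data, sizeof(value)); // safe for unaligned input
		return value;
	}
	return 0;
}
```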

When run on my iMac I get the following, which is correct:

std::string manufacturer, model;
uint64_t vram;
queryVRAMandModelMac(vram, manufacturer, model);

// manufacturer = Advanced Micro Devices, Inc.
// model = AMD Radeon R9 M395X
// vram = 4294967296 (which is 4096 MB)
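The arithmetic behind the "4096 MB" figure, and behind the vramScale factor used for the VRAM,totalMB property, is a multiply or divide by 1024 * 1024. A small sketch with hypothetical helper names:

```cpp
#include <cstdint>

constexpr std::uint64_t kBytesPerMiB = 1024ull * 1024ull;

// Converts a byte count to whole MiB (truncating any remainder).
constexpr std::uint64_t bytesToMiB(std::uint64_t bytes)
{
	return bytes / kBytesPerMiB;
}

// Converts a value expressed in MiB (as with "VRAM,totalMB") to bytes.
constexpr std::uint64_t mibToBytes(std::uint64_t mib)
{
	return mib * kBytesPerMiB;
}

// 4294967296 bytes is the 4096 MB reported for the iMac above.
static_assert(bytesToMiB(4294967296ull) == 4096, "4 GiB is 4096 MiB");
```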

Don't know what this will do on an Apple Silicon device, I don't have one so I can't test it.

Ravbug, Dec 28 '21 20:12

> Don't know what this will do on an Apple Silicon device, I don't have one so I can't test it.

Cool, thanks for the research!

I can test this.

bkaradzic, Dec 29 '21 01:12