Vulkan.jl
Vulkan.jl copied to clipboard
Presence of garbage data when retrieving physical device memory types
Hi, I'm getting what seems to be an overflowing buffer reads when trying to enumerate the available physical device memory types.
Reduced to the simplest reproducer I know the problem looks like this:
using Vulkan
println.(get_physical_device_memory_properties(unwrap(enumerate_physical_devices(Instance([],[])))[1]).memory_types);
The command lists a few valid memory types, but these are followed by garbage that looks like random data picked from uninitialized memory, such as:
MemoryType(MemoryPropertyFlag(), 0x00000001)
MemoryType(MemoryPropertyFlag(), 0x00000001)
MemoryType(MemoryPropertyFlag(), 0x00000001)
MemoryType(MemoryPropertyFlag(), 0x00000001)
MemoryType(MemoryPropertyFlag(), 0x00000001)
MemoryType(MemoryPropertyFlag(), 0x00000001)
MemoryType(MemoryPropertyFlag(), 0x00000001)
MemoryType(MemoryPropertyFlag(MEMORY_PROPERTY_DEVICE_LOCAL_BIT), 0x00000000)
MemoryType(MemoryPropertyFlag(MEMORY_PROPERTY_HOST_VISIBLE_BIT | MEMORY_PROPERTY_HOST_COHERENT_BIT), 0x00000001)
MemoryType(MemoryPropertyFlag(MEMORY_PROPERTY_HOST_VISIBLE_BIT | MEMORY_PROPERTY_HOST_COHERENT_BIT | MEMORY_PROPERTY_HOST_CACHED_BIT), 0x00000001)
MemoryType(MemoryPropertyFlag(MEMORY_PROPERTY_DEVICE_LOCAL_BIT | MEMORY_PROPERTY_HOST_VISIBLE_BIT | MEMORY_PROPERTY_HOST_COHERENT_BIT), 0x00000002)
MemoryType(MemoryPropertyFlag(), 0x00000004) #this is the first invalid one, there's only 3 memory hea entries
MemoryType(MemoryPropertyFlag(), 0x00004000)
MemoryType(MemoryPropertyFlag(), 0x0000001f)
MemoryType(MemoryPropertyFlag(MEMORY_PROPERTY_DEVICE_LOCAL_BIT | MEMORY_PROPERTY_HOST_VISIBLE_BIT | MEMORY_PROPERTY_HOST_COHERENT_BIT | MEMORY_PROPERTY_HOST_CACHED_BIT | MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT), 0x0000001f)
MemoryType(MemoryPropertyFlag(MEMORY_PROPERTY_DEVICE_LOCAL_BIT | MEMORY_PROPERTY_HOST_VISIBLE_BIT | MEMORY_PROPERTY_HOST_COHERENT_BIT | MEMORY_PROPERTY_HOST_CACHED_BIT | MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT), 0x00000008)
MemoryType(MemoryPropertyFlag(MEMORY_PROPERTY_DEVICE_LOCAL_BIT | MEMORY_PROPERTY_HOST_VISIBLE_BIT | MEMORY_PROPERTY_HOST_COHERENT_BIT | MEMORY_PROPERTY_HOST_CACHED_BIT | MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT), 0x0000001f)
MemoryType(MemoryPropertyFlag(MEMORY_PROPERTY_DEVICE_LOCAL_BIT | MEMORY_PROPERTY_HOST_VISIBLE_BIT | MEMORY_PROPERTY_HOST_COHERENT_BIT | MEMORY_PROPERTY_HOST_CACHED_BIT | MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT), 0x0000001f)
MemoryType(MemoryPropertyFlag(MEMORY_PROPERTY_DEVICE_LOCAL_BIT), 0x00000001)
MemoryType(MemoryPropertyFlag(MEMORY_PROPERTY_DEVICE_LOCAL_BIT), 0x42a6aaab)
MemoryType(MemoryPropertyFlag(MEMORY_PROPERTY_HOST_CACHED_BIT), 0x00000008)
MemoryType(MemoryPropertyFlag(MEMORY_PROPERTY_HOST_CACHED_BIT), 0x00000002)
MemoryType(MemoryPropertyFlag(), 0x437fe000)
MemoryType(MemoryPropertyFlag(), 0x40ffc000)
MemoryType(MemoryPropertyFlag(), 0x7d9d8d00)
MemoryType(MemoryPropertyFlag(MEMORY_PROPERTY_DEVICE_LOCAL_BIT | MEMORY_PROPERTY_HOST_VISIBLE_BIT | MEMORY_PROPERTY_HOST_COHERENT_BIT | MEMORY_PROPERTY_HOST_CACHED_BIT | MEMORY_PROPERTY_RESERVED_8_BIT_NV), 0x00000001)
MemoryType(MemoryPropertyFlag(), 0xffffff80)
MemoryType(MemoryPropertyFlag(MEMORY_PROPERTY_DEVICE_LOCAL_BIT | MEMORY_PROPERTY_HOST_VISIBLE_BIT | MEMORY_PROPERTY_HOST_COHERENT_BIT | MEMORY_PROPERTY_HOST_CACHED_BIT | MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT | MEMORY_PROPERTY_PROTECTED_BIT | MEMORY_PROPERTY_DEVICE_COHERENT_BIT_AMD | MEMORY_PROPERTY_DEVICE_UNCACHED_BIT_AMD | MEMORY_PROPERTY_RESERVED_8_BIT_NV), 0x00000002)
MemoryType(MemoryPropertyFlag(), 0x0223f5f0)
MemoryType(MemoryPropertyFlag(), 0x00000000)
MemoryType(MemoryPropertyFlag(), 0x00000002)
MemoryType(MemoryPropertyFlag(), 0x03a16470)
The beginning of the listing checks with the vulkaninfo
, the rest seems to be garbage.
I'll try to debug what's wrong and report back. Is there any guess on whether this is a possible "premature GC" issue?
Thanks! -mk
Relevant vulkaninfo section:
VkPhysicalDeviceMemoryProperties:
=================================
memoryHeaps: count = 3
memoryHeaps[0]:
size = 4294967296 (0x100000000) (4.00 GiB)
budget = 4099145728 (0xf4540000) (3.82 GiB)
usage = 0 (0x00000000) (0.00 B)
flags: count = 1
MEMORY_HEAP_DEVICE_LOCAL_BIT
memoryHeaps[1]:
size = 25050470400 (0x5d51fd800) (23.33 GiB)
budget = 25050470400 (0x5d51fd800) (23.33 GiB)
usage = 0 (0x00000000) (0.00 B)
flags:
None
memoryHeaps[2]:
size = 257949696 (0x0f600000) (246.00 MiB)
budget = 254935040 (0x0f320000) (243.12 MiB)
usage = 3014656 (0x002e0000) (2.88 MiB)
flags: count = 1
MEMORY_HEAP_DEVICE_LOCAL_BIT
memoryTypes: count = 11
memoryTypes[0]:
heapIndex = 1
propertyFlags = 0x0000:
None
usable for:
IMAGE_TILING_OPTIMAL:
None
IMAGE_TILING_LINEAR:
color images
(non-sparse, non-transient)
memoryTypes[1]:
heapIndex = 1
propertyFlags = 0x0000:
None
usable for:
IMAGE_TILING_OPTIMAL:
color images
(non-sparse)
IMAGE_TILING_LINEAR:
None
memoryTypes[2]:
heapIndex = 1
propertyFlags = 0x0000:
None
usable for:
IMAGE_TILING_OPTIMAL:
FORMAT_D16_UNORM
(non-sparse, non-transient)
IMAGE_TILING_LINEAR:
None
memoryTypes[3]:
heapIndex = 1
propertyFlags = 0x0000:
None
usable for:
IMAGE_TILING_OPTIMAL:
FORMAT_X8_D24_UNORM_PACK32
FORMAT_D24_UNORM_S8_UINT
(non-sparse, non-transient)
IMAGE_TILING_LINEAR:
None
memoryTypes[4]:
heapIndex = 1
propertyFlags = 0x0000:
None
usable for:
IMAGE_TILING_OPTIMAL:
FORMAT_D32_SFLOAT
(non-sparse, non-transient)
IMAGE_TILING_LINEAR:
None
memoryTypes[5]:
heapIndex = 1
propertyFlags = 0x0000:
None
usable for:
IMAGE_TILING_OPTIMAL:
FORMAT_D32_SFLOAT_S8_UINT
(non-sparse, non-transient)
IMAGE_TILING_LINEAR:
None
memoryTypes[6]:
heapIndex = 1
propertyFlags = 0x0000:
None
usable for:
IMAGE_TILING_OPTIMAL:
FORMAT_S8_UINT
(non-sparse, non-transient)
IMAGE_TILING_LINEAR:
None
memoryTypes[7]:
heapIndex = 0
propertyFlags = 0x0001: count = 1
MEMORY_PROPERTY_DEVICE_LOCAL_BIT
usable for:
IMAGE_TILING_OPTIMAL:
color images
FORMAT_D16_UNORM
FORMAT_X8_D24_UNORM_PACK32
FORMAT_D32_SFLOAT
FORMAT_S8_UINT
FORMAT_D24_UNORM_S8_UINT
FORMAT_D32_SFLOAT_S8_UINT
IMAGE_TILING_LINEAR:
color images
(non-sparse, non-transient)
memoryTypes[8]:
heapIndex = 1
propertyFlags = 0x0006: count = 2
MEMORY_PROPERTY_HOST_VISIBLE_BIT
MEMORY_PROPERTY_HOST_COHERENT_BIT
usable for:
IMAGE_TILING_OPTIMAL:
None
IMAGE_TILING_LINEAR:
color images
(non-sparse, non-transient)
memoryTypes[9]:
heapIndex = 1
propertyFlags = 0x000e: count = 3
MEMORY_PROPERTY_HOST_VISIBLE_BIT
MEMORY_PROPERTY_HOST_COHERENT_BIT
MEMORY_PROPERTY_HOST_CACHED_BIT
usable for:
IMAGE_TILING_OPTIMAL:
None
IMAGE_TILING_LINEAR:
color images
(non-sparse, non-transient)
memoryTypes[10]:
heapIndex = 2
propertyFlags = 0x0007: count = 3
MEMORY_PROPERTY_DEVICE_LOCAL_BIT
MEMORY_PROPERTY_HOST_VISIBLE_BIT
MEMORY_PROPERTY_HOST_COHERENT_BIT
usable for:
IMAGE_TILING_OPTIMAL:
None
IMAGE_TILING_LINEAR:
color images
(non-sparse, non-transient)
Hi, thanks for raising the issue.
The correct usage is to restrict the memory_types
field according to the memory_type_count
field of the PhysicalDeviceMemoryProperties
. That is because internally the specification uses a tuple of fixed size into which it writes elements, along with count values (both memory_type_count
and memory_heap_count
for heaps). This checks out with the output from vulkaninfo
.
While the workaround is clear, it would make for less confusion to just special-case this struct/API call to have variable-size Vector
s for heaps and memory types that are indexed with the right count from the start (such that you can rely on the command you pasted without having to check for the counts manually).
hi! thanks for quick reply. I kindof figured I should simply not read more items than what's specified in memory_type/heap_count
, otoh it is certainly a bit surprising that there's actual vector with uninitialized memory getting returned. Perhaps a small warning in documentation would suffice.
(Btw how is this different from other array-returning functions (such as the queries for queue families) that seem to behave just right?)
(Btw how is this different from other array-returning functions (such as the queries for queue families) that seem to behave just right?)
The difference is in the API. vkGetPhysicalDeviceMemoryProperties
does not allow you to query a count, initialize the array, and query for data; it just retrieves the data in one go. I don't know why they adopted this particular design, but it's confusing for sure.
As of 1.3.207, it seems that the issue with bounded arrays containing garbage concerns the following structures:
-
VkDeviceGroupPresentCapabilitiesKHR
-
VkPhysicalDeviceGroupProperties
-
VkQueueFamilyGlobalPriorityPropertiesKHR
-
VkPhysicalDeviceMemoryBudgetPropertiesEXT
-
VkPhysicalDeviceMemoryProperties
Ideally, instead of having docs about this (since there are lots of things described in the docs already) we should replace these fixed-size tuples with Vector
s that contain only valid elements.
Hi, thanks for the detailed explanation, actually when compared to vulkan API this makes much more sense now. I guess the methods in question aren't particularly performance-center-ish so a bit of extra processing for making the vectors look nice won't be a problem right? (Perhaps returning a correct "view" into the vector should be sufficient?)
That's what I think as well, as these structures look like they'll be used mostly in application setup/initialization. Returning a view assumes that the original vector exists somewhere, so I think it would be easier to remove the original fields (count + fixed-size tuple) to replace that with a vector. Though there are several possibilities, we should probably take whichever is easiest on the generator code.
:thinking: looks like the structures aren't static, at least SwiftShader is copying the data out: https://github.com/google/swiftshader/blob/master/src/Vulkan/libVulkan.cpp#L672-L677
I guess that depends what the vk::PhysicalDevice::GetMemoryProperties()
does, pMemoryProperties
is intended to be a pointer filled with new data. Or maybe I didn't understand your comment?
Ah yes I thought that the data would be laying somewhere indefinitely and we could just safely pick a view for them. Unfortunately not the case :sweat_smile: