igl
igl copied to clipboard
【Metal】The DeviceFeatureLimits::BufferAlignment is not always equals 16 byte on Metal backend
The DeviceFeatureLimits::BufferAlignment is not always equals 16 byte on Metal backend. For example , on iOS simulator is 256.
https://developer.apple.com/documentation/metal/developing_metal_apps_that_run_in_simulator When you set arguments for the render or compute command, align constant buffer offsets to 256 bytes.
And I have not test the value on MacOS. Maybe it is same with iOS simulator. I cannot verify it.
From the https://developer.apple.com/metal/Metal-Feature-Set-Tables.pdf it shows 256B on M2.
@corporateshark has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@vinsentli are you running into this issue while trying to allocate a uniform buffer or a buffer to hold the texture?
Please give an example of the usage so we can better assess next steps.
@vinsentli Could you please share some test code snippet or a usage example showing how exactly this value is being used?
@syeh1 @corporateshark Does DeviceFeatureLimits::BufferAlignment mean GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT? I fix the same problem on opengl backend in another pull request: https://github.com/facebook/igl/pull/113
https://www.khronos.org/opengl/wiki/Uniform_Buffer_Object#Data_storage If you bind a uniform buffer with glBindBufferRange, the offset field of that parameter must be a multiple of GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT (this is a global value, not a per-program or per-block one). Thus, if you want to put the data for multiple uniform blocks in a single buffer object, you must make sure that the data for each within that block matches this alignment.
This is my use scene: put the data for multiple uniform blocks in a single buffer object.
In that scene,must make sure that the data for each within that block matches this alignment.
Ok, got it. In that case, iOS is 256b, but M2 would be 32b (looking at the Minimum Constant Buffer offset Alignment row)
Ok, got it. In that case, iOS is 256b, but M2 would be 32b (looking at the Minimum Constant Buffer offset Alignment row)
On m2 is 32B,means 128byte.@syeh1
I update the value on macos. So the alignment on iOS simulator is 256byte, on macos is 128 byte, on iOS phones is 16 byte.
@corporateshark has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
I'm trying to review this, but things are very unclear. Let's start by establishing some things:
- If we want to keep the OpenGL and Metal PRs separate, let's not mix them in the conversation. It just adds noise.
- The Metal tables document uses uppercase
B
for bytes, not bits. IGL's values are also in bytes. There should be no conversion between the two. - GPU families got weird after M1 macs. Starting with
Apple7
, mobile and mac GPUs are in the same family. The limits in themac2
column represent Intel GPUs, but note that arm macs will also report support for themac2
family. All of this is in the document linked above.- Note that IGL does not distinguish between arm and Intel macs, it just looks at the platform.
Now, onto the speculative:
- If I understand the motivation, the type of alignment that matters here is for "constant buffers".
-
DeviceFeatureLimits::BufferAlignment
doesn't necessarily map to constant buffers only, but buffers in general. We might have to add a new enum for it, but we can also just go with the more restrictive of all use cases -- and I say let's go that way until someone comes up with actual code exposing some issue/limitation, as otherwise it's just guessing. - According to the Metal table, iOS physical device alignment should be 16, and mac should be 32. And according to the simulator guide, alignment should be 256. I think these are the more correct numbers to use.
- HOWEVER, if we also want to account for the "Buffer alignment for copying an existing texture to a buffer" row given the lack of granularity in IGL's enums, we should use 256 as the alignment for mac as well.
IMO, given the lack of real examples, we should go with this simpler/safer alternative:
#if IGL_PLATFORM_MACOS || IGL_PLATFORM_IOS_SIMULATOR
result = 256;
#else
result = 16;
#endif
Thoughts?
There is a easy way to verify the alignment value on MacOS。
In the ColorSession demo, modify the bufferOffset value from 0 to 1, the Xcode will report an error.
I verify the value on MacOS with arm GPU(Apple M1 Pro), the alignment value is 16 bytes. 【the offset into the buffer color that is bound at buffer index 0 must be a multiple of 16】
It aligns with the instructions provided in the documentation.
I verify the value on MacOS with Intel GPU(Apple M1 Pro), the alignment value is 32 bytes.
【the offset into the buffer color that is bound at buffer index 0 must be a multiple of 32】
Indeed, it has been confusing me.
Yes , use 256 bytes on MacOS more safer. @tgoulart @corporateshark
#if IGL_PLATFORM_MACOS || IGL_PLATFORM_IOS_SIMULATOR result = 256; #else result = 16; #endif
@corporateshark merged this pull request in facebook/igl@0df3ce0d78eeb0d362a704a707dd7d02421fd525.