MoltenVK icon indicating copy to clipboard operation
MoltenVK copied to clipboard

WIP: Initial implementation of Metal argument buffers for Vulkan descriptor sets.

Open billhollings opened this issue 4 years ago • 13 comments

I've created a new argument-buffers branch to support Metal argument buffers for use in descriptor sets, and have pushed the initial release of this implementation to that branch. Some notes on current status:

  1. The use of Metal argument buffers dramatically increases the available quantities of shader resources in most modern (Tier 2) GPU's, properly supporting the goals of the VK_EXT_descriptor_indexing extension.
  2. The MVK_CONFIG_USE_METAL_ARGUMENT_BUFFERS environment variable can be used to enable or disable the use of Metal argument buffers. It is enabled by default in this branch.
  3. The Metal argument buffer implementation passes the same 20K+ CTS descriptor set tests that MoltenVK passes when not using Metal argument buffers.
  4. However, the current implementation does cause issues and crashes with some applications, and that is being tested and addressed. Please add comments to this issue to identify any problems encountered while testing.
  5. In order to streamline descriptor set allocations, this implementation allocates a single large MTLBuffer to use for all descriptors managed by a descriptor pool (as opposed to a MTLBuffer per descriptor set stage). However, because Metal doesn't seem to allow arbitrary indexed access into Metal argument buffers, each descriptor layout stage requires a separate argument encoder. The result is that, in theory, each pipeline stage could use all descriptors, meaning that at pool allocation time, the size of the single large MTLBuffer is grossly over-allocated. For example, a graphics pipeline of 4 stages will have 4 separate argument encoders, and 4 sets of resources in the argument buffer. An alternative for this could include allocating a separate MTLBuffer per descriptor set when the descriptor set is allocated, but in CTS testing, this causes significant performance degradation (50x) during descriptor set allocation and freeing, relative to not using Metal argument buffers, because MTLBuffers are performance-expensive to create and destroy. Another alternative might be to manage a pool of smaller MTLBuffers.
  6. I will continue to do further improvements, optimizations, and testing on the argument-buffers branch before rolling it into master.

@cdavis5e I wouldn't mind getting your thoughts on item 5 above.

billhollings avatar Dec 26 '20 22:12 billhollings

I tried this branch with a couple of games, and there's a problem - some objects jump around each frame, so to speak. I made a short recording that illustrates it:

https://user-images.githubusercontent.com/4086836/103773236-0608fc00-502b-11eb-92b0-a9a2e4c7a39b.mov

I tried to pin down the problem but so far no luck. It looks like maybe some buffers get mixed up? Could it be a threading issue?

js6i avatar Jan 06 '21 13:01 js6i

@js6i Thanks for testing and providing the feedback.

billhollings avatar Jan 06 '21 14:01 billhollings

curious if argument buffers will be "in" for next Vulkan SDK.. sadly may be not judging from present issues shared here..

oscarbg avatar Feb 13 '21 22:02 oscarbg

curious if argument buffers will be "in" for next Vulkan SDK.. sadly may be not judging from present issues shared here..

Unfortunately, argument buffers won't be in this month's SDK release. But expect it in the SDK release following that.

billhollings avatar Feb 14 '21 22:02 billhollings

Just wanted to ask how argument buffers are doing. Is there some time to expect it in the master branch?

DaeBasduhr avatar Mar 09 '21 16:03 DaeBasduhr

Is there some time to expect it in the master branch

The aim is to have this implemented within the next few weeks.

billhollings avatar Mar 09 '21 18:03 billhollings

If I read the code correct, it looks like shaders like this: https://github.com/KhronosGroup/SPIRV-Cross/blob/HEAD/reference/opt/shaders-msl/comp/array-length.msl2.argument.discrete.comp#L33 are not supported by the current argument buffer impl because it requires multiple intrinsic "spvBufferSizeConstants" buffers to be bound by MoltenVK.

Is this a known issue?

VZout avatar Aug 09 '21 14:08 VZout

Is this a known issue?

Affirmative. In MoltenVK, the additional implicit buffers such as those containing shader texture swizzles and buffer sizes have not yet been moved to be bound to argument buffers, instead of discrete buffers.

Do you have a sample app that requires this and can be used to validate development in this area?

billhollings avatar Aug 11 '21 17:08 billhollings

I'm having success using VK_EXT_descriptor_indexing via the MVK_CONFIG_USE_METAL_ARGUMENT_BUFFERS env var on MacOS, but I run into errors when deploying this to iOS. The original release notes for this feature imply it was only supported on MacOS. Are there any plans to bring this to iOS, or perhaps there's already a way to enable it?

Here are the errors I'm getting:

On device (iPhone 8):

[mvk-error] VK_ERROR_INITIALIZATION_FAILED: Shader library compile failed (Error code 3):
program_source:44:226: error: cannot reserve 'sampler' resource locations at index 0
fragment main0_out main0(main0_in in [[stage_in]], constant constants& PushConstants [[buffer(0)]], constant UniformBlock& uniformBlock [[buffer(1)]], array<texture2d<float>, 26> texSampler [[texture(0)]], array<sampler, 26> texSamplerSmplr [[sampler(0)]])

On simulator:

[mvk-error] VK_ERROR_FEATURE_NOT_PRESENT: Device Apple iOS simulator GPU does not support arrays of samplers.

Thanks

asivitz avatar Aug 14 '22 03:08 asivitz

The original release notes for this feature imply it was only supported on MacOS. Are there any plans to bring this to iOS

Yes. VK_EXT_descriptor_indexing is currently only available on macOS. Support for it on iOS is dependent on Metal 3, which will be released by Apple in iOS 16 later this fall. We do plan to add support for iOS soon after that.

billhollings avatar Aug 14 '22 19:08 billhollings

That's great news; thank you so much.

asivitz avatar Aug 14 '22 20:08 asivitz

You should only need A13 support for descriptor indexing on the gpu. That's when you can reference into an array of argument buffers randomly, and I think do nested argument buffers. That might open up more platforms. But Metal3 did change the way argument buffers are built to be much simpler.

alecazam avatar Mar 26 '23 22:03 alecazam

EDIT: I was actually using Vulkan SDK version 1.3.250, so this specific crash appears to already be fixed. It looks like this specific issue is resolved, now that I've updated to 1.3.261.1.

firstly, thank you so much for all the hard work on MVK @billhollings! I am currently trying to get my Vulkan abstraction working on MacOS, and I've noticed an issue with bindless support. Of course, it looks like it's not complete yet, so I wanted to share my findings here.

I'm using a 2023 M2 Mac Mini, which I believe has full support of Metal 3 and Tier 2 argument buffers

I have the following super simple compute shader, compiled to SPIR-V via glslang:

layout(binding = 0, set = 0, rgba8) uniform image2D my_image;
layout(binding = 1, set = 0) uniform texture2D my_texture;

layout(local_size_x = 1, local_size_y = 1, local_size_z = 1) in;
void main() {
}

This compiles fine, however since our abstraction uses bindless, I believe we need to enable the use of argument buffers in MVK. When I enable the MVK_CONFIG_USE_METAL_ARGUMENT_BUFFERS environment variable (either with setenv or globally in my zshrc), I get a crash during my call to vkCreateComputePipelines. I'm not really sure why this would happen, with the limited amount of information I can gather, but is this a known issue? Here's the call stack: image

I'd be willing to look into it myself, though I'm not sure how I could get debug symbols for moltenvk, especially since MacOS is very new to me. Any insight or assistance would be very greatly appreciated!

GabeRundlett avatar Oct 05 '23 18:10 GabeRundlett