MoltenVK WIP: Implementing Acceleration Structures

This PR provides an implementation of the VK_KHR_acceleration_structure extension, which is the gateway to ray queries and ray tracing pipelines. It is still very much a WIP and nowhere close to done. I'm opening it this early to allow for more concrete discussion of the acceleration structure implementation, and to keep people up to date on progress.

This PR is related to: #427, #1953 (not directly related, but may have some slight discussion on acceleration structures), #1956

Jul 06 '23 15:07 AntarticCoder

Acceleration structures, and ray tracing in general, do not seem to be supported before macOS 11, so building with Xcode 11.7 will always fail.

Jul 06 '23 22:07 AntarticCoder

Acceleration structures, and ray tracing in general, do not seem to be supported before macOS 11, so building with Xcode 11.7 will always fail.

Then the parts of MoltenVK that deal with Acceleration Structures need to be inside MVK_XCODE_12 blocks.

Jul 06 '23 22:07 cdavis5e

Acceleration structures, and ray tracing in general, do not seem to be supported before macOS 11, so building with Xcode 11.7 will always fail.

Then the parts of MoltenVK that deal with Acceleration Structures need to be inside MVK_XCODE_12 blocks.

Agreed. But before forcing that, we should discuss whether that makes sense at this point. Xcode 11 is now 4 years old, and at some point, we must give up support for it, from a practicality perspective (like this one). Retaining support for Xcode 11 was added a couple of years ago because some devs required it for their internal processes.

A few months ago, I reached out to the community about this exact question, and received no responses. Unless we can determine a good reason for maintaining Xcode 11, maybe now is the time to drop support for it.

Jul 07 '23 16:07 billhollings

@billhollings macOS 11 seems to support devices from 2013 onward, so this is a matter of dropping support for pre-2013 devices. Also, some people stay on macOS 10 for 32-bit application support and other reasons. This is just something to take into consideration.

Jul 07 '23 16:07 AntarticCoder

A few months ago, I reached out to the community about this exact question, and received no responses. Unless we can determine a good reason for maintaining Xcode 11, maybe now is the time to drop support for it.

I have added a ping post to that feedback request thread.

@AntarticCoder Hold off wrapping your code in any MVK_XCODE_12 guards while this PR remains a WIP. When this PR is ready to go, based on any feedback we receive to my query ping, we can decide whether we need to actually implement those guard wraps, or abandon Xcode 11.

Jul 07 '23 16:07 billhollings

@billhollings Alright, I'll hold off on the MVK_XCODE_12 guards. Thanks

Jul 07 '23 16:07 AntarticCoder

@billhollings macOS 11 seems to support devices from 2013 onward, so this is a matter of dropping support for pre-2013 devices. Also, some people stay on macOS 10 for 32-bit application support and other reasons. This is just something to take into consideration.

The MVK_XCODE_12 guard is strictly for API compilation during MoltenVK builds (i.e., will it build with the Metal API supported by Xcode 11). Support for older OS runtimes is handled independently, through things like respondsToSelector:.

Jul 07 '23 16:07 billhollings
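
The distinction drawn above, between a compile-time guard and a runtime capability check, can be illustrated with a small C++ sketch. This is only an illustration under stated assumptions: `MVK_XCODE_12` is defaulted to 1 here so the snippet compiles on its own, and `RuntimeCaps`, `BuildResult`, and `buildAccelerationStructure` are hypothetical names. The boolean in `RuntimeCaps` stands in for a runtime probe such as respondsToSelector:, which plain C++ cannot express.

```cpp
#include <cassert>

// Assumption for this sketch: default the guard to 1 so the snippet is
// self-contained. In a real build the value would come from the SDK in use.
#ifndef MVK_XCODE_12
#define MVK_XCODE_12 1
#endif

// Hypothetical stand-in for a runtime probe (e.g. respondsToSelector:).
struct RuntimeCaps {
    bool hasAccelerationStructures;
};

enum class BuildResult { Built, UnsupportedAtRuntime, NotCompiledIn };

// The #if decides whether the code exists in the binary at all;
// the runtime flag decides whether it may be used on this OS/device.
BuildResult buildAccelerationStructure(const RuntimeCaps& caps) {
#if MVK_XCODE_12
    return caps.hasAccelerationStructures ? BuildResult::Built
                                          : BuildResult::UnsupportedAtRuntime;
#else
    (void)caps;  // API not available at compile time; nothing to call.
    return BuildResult::NotCompiledIn;
#endif
}
```

With the guard enabled, a binary built against a newer SDK can still run on an older OS and simply report the feature as unsupported there.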

@billhollings Ah, yes my mistake. Also, just a thought but only about 120 people actually watch this repository, so I'm not sure how many people will see your message.

Jul 07 '23 16:07 AntarticCoder

@billhollings @cdavis5e An issue I've run into during this PR is accessing the provided scratch buffer via the provided device address. On this, I got a reply from @K0bin in issue #1956, which is as follows.

@AntarticCoder @rcaridade145 The contents function will just give you a CPU pointer to the data of a shared buffer. That's not useful here unless you want to copy all the data around on the CPU every time. (which would also involve a GPU sync)

What you have to do is basically maintain a map that maps BDA VAs to their original buffer objects. Keep in mind that this VA map has to be extremely fast and should minimize locking as much as possible. An example for that can be found in vkd3d-Proton: https://github.com/HansKristian-Work/vkd3d-proton/blob/master/libs/vkd3d/va_map.c

The idea is basically to create a map from scratch that is fast and thread safe, and when you call vkGetBufferDeviceAddress, we could push the address along with the buffer. I just wanted to ask whether this is a good idea, and what you would change about it.

Jul 10 '23 13:07 AntarticCoder

and when you call vkGetBufferDeviceAddress, we could push the address along with buffer

It's probably better to do that at buffer creation time and keep vkGetBufferDeviceAddress fast.

Jul 10 '23 15:07 K0bin

@K0bin But not every created buffer will be used via the device address. So if you pushed it on vkGetBufferDeviceAddress, you would effectively be keeping unneeded buffers out of the map.

Jul 10 '23 15:07 AntarticCoder

Base it off of VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT.

Jul 10 '23 15:07 K0bin

Okay then, I'll get started on the implementation. Thanks @K0bin

Jul 10 '23 15:07 AntarticCoder

Base it off of VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT.

Search for VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT in existing MoltenVK code. There is already an MVKSmallVector containing a list of these in MVKDevice::_gpuAddressableBuffers. Perhaps this could be modified to use a std::unordered_map, to be used to serve both purposes?

Jul 10 '23 21:07 billhollings

@billhollings That seems like a good idea, I'll go ahead and use that for now, and we can change it in the future if it's not getting the job done.

Jul 10 '23 22:07 AntarticCoder
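
The approach discussed above (registering buffers created with VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT at creation time, then resolving a device address back to its buffer) might look roughly like the following C++ sketch. `Buffer` and `AddressMap` are hypothetical names for illustration, not MoltenVK types. The sketch assumes an ordered `std::map` rather than a `std::unordered_map`, because a scratch address may point into the middle of a buffer and a hash map can only match exact base addresses. It also uses a plain `std::mutex` for simplicity, where a real implementation would likely want something cheaper.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>

// Hypothetical stand-in for a buffer object; not MoltenVK's buffer class.
struct Buffer {
    uint64_t baseAddress;  // the value vkGetBufferDeviceAddress would return
    uint64_t size;         // size in bytes
};

// Maps device-address ranges back to the buffers that own them.
class AddressMap {
public:
    // Intended to be called at buffer creation, only for buffers created
    // with VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT.
    void add(Buffer* buf) {
        assert(buf != nullptr);
        std::lock_guard<std::mutex> lock(_mutex);
        _buffers[buf->baseAddress] = buf;
    }

    // Intended to be called at buffer destruction.
    void remove(const Buffer* buf) {
        assert(buf != nullptr);
        std::lock_guard<std::mutex> lock(_mutex);
        _buffers.erase(buf->baseAddress);
    }

    // Returns the buffer whose range contains addr, or nullptr if none does.
    // If offset is non-null and a buffer is found, writes addr's byte offset.
    Buffer* find(uint64_t addr, uint64_t* offset = nullptr) const {
        std::lock_guard<std::mutex> lock(_mutex);
        auto it = _buffers.upper_bound(addr);        // first base > addr
        if (it == _buffers.begin()) return nullptr;  // addr below every base
        --it;                                        // last base <= addr
        Buffer* buf = it->second;
        uint64_t delta = addr - buf->baseAddress;
        if (delta >= buf->size) return nullptr;      // falls in a gap
        if (offset) *offset = delta;
        return buf;
    }

private:
    mutable std::mutex _mutex;
    std::map<uint64_t, Buffer*> _buffers;  // keyed by base address
};
```

Keeping registration in the creation path means vkGetBufferDeviceAddress itself never touches the map, which matches the suggestion to keep that call fast.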

6/12 commands have been implemented, so I'm halfway there. 🎉

Jul 11 '23 14:07 AntarticCoder

Could anyone give me some feedback on the memory address system for acceleration structures? I'd just like a second pair of eyes on the implementation.

Thanks

Jul 14 '23 18:07 AntarticCoder

@AntarticCoder @cdavis5e

This PR seems to have stalled. I've received a request from a game studio using MoltenVK who would like to see VK_KHR_acceleration_structure completed, and is willing to fund that work.

If either of you have spare time, and are interested in receiving compensation to work on finishing VK_KHR_acceleration_structure, let me know and I'll put you in touch with them. Their schedule is not rushed, so this is something that could start anytime in the next month or two, and be something that could be fit in part-time.

This sponsor is actually interested in seeing general ray tracing added, so if you (or anyone else out there) are interested in working on a funded project to see the following completed (in order of priority, and with similar schedule behavior as above), please let me know:

VK_KHR_acceleration_structure
VK_KHR_ray_tracing_pipeline
VK_KHR_ray_query
VK_KHR_pipeline_library

Nov 03 '23 16:11 billhollings

@billhollings

I’m sorry, I’ve been busy since I began school and never got around to finishing this PR. I am interested in receiving compensation for finishing up VK_KHR_acceleration_structure. I would also be interested in working on general ray tracing. Could you put me in contact with the game studio?

Thanks

Nov 04 '23 14:11 AntarticCoder

I'm interested in this. I've talked with Holochip, and they're also interested.

Nov 04 '23 18:11 cdavis5e

@billhollings

I’m sorry, I’ve been busy since I began school and never got around to finishing this PR. I am interested in receiving compensation for finishing up VK_KHR_acceleration_structure. I would also be interested in working on general ray tracing. Could you put me in contact with the game studio?

Thanks

@AntarticCoder

I think it definitely makes sense to have you working on completing this PR. Can you shoot me an email at [email protected], and we'll sort things out? In your email, can you quote the hourly rate you'd like, how much time you have available, and where you are located (for how best to actually get you paid), please?

Nov 06 '23 22:11 billhollings

@billhollings

Just sent you an email.

Nov 08 '23 00:11 AntarticCoder

Is there any news regarding this PR? I now own an M3 Max MacBook and could test this if that is of interest.

Dec 26 '23 19:12 zmarlon

Our game studio is interested in cross-platform ray tracing with Vulkan, wondering whether there's been any progress here.

May 27 '24 04:05 kanerogers

How do you intend to work around the fact that Metal needs a list of all bottom level acceleration structures to build the TLAS while Vulkan only needs a GPU buffer address that contains that data?

You'll probably have to maintain a list that has every single BLAS and use that when creating the Metal TLAS. Then in vkCmdBuildAccelerationStructuresKHR you prepare some kind of hashmap on the CPU for BLAS VkDeviceAddress -> uint32_t index. Then you run a compute shader that prepares the actual MTLAccelerationStructureInstanceDescriptors by doing a hashmap lookup for each instance to get the index. Not great, maybe you can come up with a simpler solution.

May 30 '24 04:05 K0bin
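
The address-to-index translation described above can be sketched in C++ as follows. This is a CPU-only illustration of the mapping, not the compute-shader version the comment proposes, since the instance data would normally live in a GPU buffer. `VulkanInstance`, `MetalInstance`, `IndexTable`, `buildIndexTable`, and `resolveInstances` are hypothetical names. The two struct fields merely play the roles of `accelerationStructureReference` in VkAccelerationStructureInstanceKHR and `accelerationStructureIndex` in MTLAccelerationStructureInstanceDescriptor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified stand-ins; not the real Vulkan or Metal structs.
struct VulkanInstance {
    uint64_t blasAddress;  // plays the role of accelerationStructureReference
};
struct MetalInstance {
    uint32_t blasIndex;    // plays the role of accelerationStructureIndex
};

constexpr uint32_t kInvalidIndex = UINT32_MAX;

using IndexTable = std::unordered_map<uint64_t, uint32_t>;

// Builds the BLAS address -> index table. blasAddresses must be in the same
// order as the BLAS list that would be handed to the Metal TLAS descriptor.
IndexTable buildIndexTable(const std::vector<uint64_t>& blasAddresses) {
    assert(blasAddresses.size() < kInvalidIndex);
    IndexTable table;
    table.reserve(blasAddresses.size());
    for (std::size_t i = 0; i < blasAddresses.size(); ++i) {
        table.emplace(blasAddresses[i], static_cast<uint32_t>(i));
    }
    return table;
}

// Translates each instance's BLAS address into an index into the BLAS list.
// Unknown addresses are mapped to kInvalidIndex rather than dropped, so the
// output stays aligned one-to-one with the input.
std::vector<MetalInstance> resolveInstances(
        const std::vector<VulkanInstance>& instances,
        const IndexTable& table) {
    std::vector<MetalInstance> out;
    out.reserve(instances.size());
    for (const VulkanInstance& inst : instances) {
        auto it = table.find(inst.blasAddress);
        out.push_back({it != table.end() ? it->second : kInvalidIndex});
    }
    return out;
}
```

A GPU version would need the table in a form a shader can search, such as a flat open-addressed array, which is part of why the comment calls the approach "not great".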
