Vulkan-Samples
Refactor getting/adding extension features to create a VkDevice with.
Description
As described in #938, requesting and enabling extension features is currently handled incorrectly. With this PR, it is done similarly to how the plain `PhysicalDeviceFeatures` are handled: one function to get the supported bits (`get_extension_features`) and one to actually set them in the structure chain (`add_extension_features`).
The perfect handling would look like this:
```cpp
if (gpu.get_extension_features<VkExtensionFeaturesStruct>(VK_STRUCTURE_TYPE_EXTENSION_FEATURES_STRUCT).featureBit)
{
	gpu.add_extension_features<VkExtensionFeaturesStruct>(VK_STRUCTURE_TYPE_EXTENSION_FEATURES_STRUCT).featureBit = VK_TRUE;
}
else
{
	// some fallback or error handling!
}
```
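For concreteness, with a real feature struct such as the mesh shader features (the struct name, sType and member come from the Vulkan headers; the `gpu` calls are the ones introduced by this PR), the same pattern would read:

```cpp
// Only enable the mesh shader feature if the implementation actually supports it.
if (gpu.get_extension_features<VkPhysicalDeviceMeshShaderFeaturesEXT>(VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MESH_SHADER_FEATURES_EXT).meshShader)
{
	gpu.add_extension_features<VkPhysicalDeviceMeshShaderFeaturesEXT>(VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MESH_SHADER_FEATURES_EXT).meshShader = VK_TRUE;
}
else
{
	// some fallback or error handling!
}
```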
To ease usage, `[HPP]PhysicalDevice` got two new template functions, `request_optional_feature` and `request_required_feature`, which log a message or throw an exception, respectively, if the requested feature is not supported.
Requesting a required feature would now look like:

```cpp
gpu.request_required_feature<VkExtensionFeaturesStruct>(VK_STRUCTURE_TYPE_EXTENSION_FEATURES_STRUCT,
                                                        &VkExtensionFeaturesStruct::featureBit,
                                                        "VkExtensionFeaturesStruct",
                                                        "featureBit");
```
The last two arguments to that function are just for error reporting and could maybe be omitted.
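To picture what such a helper does, here is a minimal sketch, written as a free function over a generic `GPU` type rather than the PR's actual `[HPP]PhysicalDevice` member template, so the name and signature are illustrative only:

```cpp
#include <stdexcept>
#include <string>

#include <vulkan/vulkan.h>

// Hypothetical sketch only; the real request_required_feature in this PR is a
// member template of [HPP]PhysicalDevice and its exact signature may differ.
template <typename StructureType, typename GPU>
void request_required_feature_sketch(GPU                     &gpu,
                                     VkStructureType          s_type,
                                     VkBool32 StructureType::*feature,
                                     const char              *struct_name,
                                     const char              *feature_name)
{
	if (gpu.template get_extension_features<StructureType>(s_type).*feature)
	{
		// Supported: request it in the structure chain used for device creation.
		gpu.template add_extension_features<StructureType>(s_type).*feature = VK_TRUE;
	}
	else
	{
		// Not supported: a required feature is missing, so fail loudly.
		throw std::runtime_error(std::string("Required feature ") + struct_name + "::" + feature_name + " is not supported");
	}
}
```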
To ease usage even more, some macros have been added, so that requesting a required feature would now look like:

```cpp
REQUEST_REQUIRED_FEATURE(gpu, VkExtensionFeaturesStruct, VK_STRUCTURE_TYPE_EXTENSION_FEATURES_STRUCT, featureBit);
```
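The macro can derive the two reporting strings itself via preprocessor stringification; a plausible definition (hypothetical, not necessarily the PR's exact one) would be:

```cpp
// Hypothetical definition; the PR's actual macro may differ in details.
#define REQUEST_REQUIRED_FEATURE(gpu, StructType, sType, feature) \
	(gpu).request_required_feature<StructType>(sType, &StructType::feature, #StructType, #feature)
```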
Build tested on Win10 with VS2022. Run tested on Win10 with NVidia GPU.
Fixes #938
General Checklist:
Please ensure the following points are checked:
- [x] My code follows the coding style
- [x] I have reviewed file licenses
- [x] I have commented any added functions (in line with Doxygen)
- [x] I have commented any code that could be hard to understand
- [x] My changes do not add any new compiler warnings
- [x] My changes do not add any new validation layer errors or warnings
- [x] I have used existing framework/helper functions where possible
- [x] My changes do not add any regressions
- [x] I have tested every sample to ensure everything runs correctly
- [x] This PR describes the scope and expected impact of the changes I am making
Note: The Samples CI runs a number of checks including:
- [x] I have updated the header Copyright to reflect the current year (CI build will fail if Copyright is out of date)
- [ ] My changes build on Windows, Linux, macOS and Android. Otherwise I have documented any exceptions
Sample Checklist
If your PR contains a new or modified sample, these further checks must be carried out in addition to the General Checklist:
- [x] I have tested the sample on at least one compliant Vulkan implementation
- [x] If the sample is vendor-specific, I have tagged it appropriately
- [x] I have stated on what implementation the sample has been tested so that others can test on different implementations and platforms
- [x] Any dependent assets have been merged and published in downstream modules
- [ ] For new samples, I have added a paragraph with a summary to the appropriate chapter in the readme of the folder that the sample belongs to e.g. api samples readme
- [ ] For new samples, I have added a tutorial README.md file to guide users through what they need to know to implement code using this feature. For example, see conditional_rendering
- [ ] For new samples, I have added a link to the Antora navigation so that the sample will be listed at the Vulkan documentation site
I don't think we can just assert in all these cases, can we?
You're kind of right. As I stated in the Description above, that should be replaced by some meaningful error or fall-back handling. What would you suggest to do instead until then?
> What would you suggest to do instead until then?

Well, throwing a `std::runtime_error` would be better, as it doesn't terminate batch mode, just the current sample, I guess.
I'm also not really sure what was wrong with just having a single call to request_extension_features? I can't see why unrolling that code everywhere is better. Maybe I'm just missing the issue that's being solved here.
> I'm also not really sure what was wrong with just having a single call to `request_extension_features`?

The problem, as described in #938, is that currently you get the supported extension feature bits, possibly set some of them to true regardless of whether they are supported at all, and store that structure in the structure chain for device creation. That is, there are two issues:
- All supported flags are requested, no matter whether you actually need/want all of them.
- A feature that is not supported at all may be requested.
With this PR, you only set the flags you're interested in, and only if they are actually supported.
> Well, throwing a `std::runtime_error` would be better, as it doesn't terminate batch mode, just the current sample, I guess.

Sounds good. I can change that accordingly.
I think you'll need to rebase this over https://github.com/KhronosGroup/Vulkan-Samples/pull/1031. A lot of the samples don't run without that, so testing this isn't really possible without it.
Tbh. I'm kinda torn over the syntax that now needs to be used. Will that actually work with e.g. code completion/intellisense? Or will it fail at compile time when I e.g. request a member of a feature that isn't present or that I did misspell?
I'm referring to this:
```cpp
REQUEST_REQUIRED_FEATURE(gpu,
                         VkPhysicalDeviceGraphicsPipelineLibraryFeaturesEXT,
                         VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_GRAPHICS_PIPELINE_LIBRARY_FEATURES_EXT,
                         graphicsPipelineLibrary);
```
The old syntax has you access that member (graphicsPipelineLibrary) directly, which works fine with intellisense/code completion and is easier to understand/follow:
```cpp
auto &requested_vertex_input_features = gpu.request_extension_features<VkPhysicalDeviceMeshShaderFeaturesEXT>(VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MESH_SHADER_FEATURES_EXT);
requested_vertex_input_features.meshShader = VK_TRUE;
```
(though I don't really like that syntax either)
I fear that this makes working on samples harder, hurts readability and foremost makes it harder to follow our samples.
Is there a different way to handle this that looks more similar to the old syntax?
> Will that actually work with e.g. code completion/intellisense?

As with all macros, you probably don't get any support from code completion or intellisense when you use `[HPP_]REQUEST_[OPTIONAL|REQUIRED]_FEATURE`. Any error will only show up at compile time.
When you use the helper functions `gpu.request_[optional|required]_feature` or the actual worker functions `gpu.[add|get]_extension_features`, you of course have all the help from code completion and intellisense available.
> The old syntax has you access that member (graphicsPipelineLibrary) directly, which works fine with intellisense/code completion and is easier to understand/follow:

But as described in #938 it's just wrong!
> Is there a different way to handle this that looks more similar to the old syntax?

I don't know of any.
Actually, I provided three different levels of complexity to handle that correctly. If you have an idea for an additional approach, I'd be happy to add that as well.
As discussed on the call, here is the way I do it in my samples:
- I declare the extension struct(s) in the sample
- I set up the whole extension struct(s) in the sample constructor
- I add the struct(s) to a pNext pointer of my device / instance
An example:
```cpp
class VulkanExample : public VulkanExampleBase
{
	VkPhysicalDeviceTimelineSemaphoreFeaturesKHR enabledTimelineSemaphoreFeaturesKHR{};

	VulkanExample() : VulkanExampleBase()
	{
		...
		enabledTimelineSemaphoreFeaturesKHR.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_TIMELINE_SEMAPHORE_FEATURES_KHR;
		enabledTimelineSemaphoreFeaturesKHR.timelineSemaphore = VK_TRUE;
		pNextDevice = &enabledTimelineSemaphoreFeaturesKHR;
	}
};
```
It might not be the best approach from a C++ language point of view (esp. how pNextDevice is implemented), but it makes it very easy for people to follow my samples and you can copy that 1:1 into your own application without having to follow any templates or macros.
It would probably be a more substantial change, so before we take this further we should agree on what way we want to follow.
@SaschaWillems I assume that `pNextDevice` is the `pNext` of some `VkPhysicalDeviceFeatures2` that you hold somewhere, and that you're using a pointer to that `VkPhysicalDeviceFeatures2` to create a `VkDevice` with `vkCreateDevice()`.
That is, it seems that setting the requested features and actually using them is also spread across (at least) two separate code sequences.
But, what's more important, I don't see that you first check whether the requested feature is actually supported. You would need a second structure chain to first query support using `vkGetPhysicalDeviceFeatures2()`.
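For context, such a support query in plain Vulkan, sticking with the timeline semaphore feature from the snippet above and assuming `physicalDevice` is the sample's `VkPhysicalDevice` handle, could look roughly like this:

```cpp
// Chain the feature struct into VkPhysicalDeviceFeatures2 and ask the implementation
// what it actually supports before enabling anything.
VkPhysicalDeviceTimelineSemaphoreFeaturesKHR supportedTimelineSemaphoreFeatures{};
supportedTimelineSemaphoreFeatures.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_TIMELINE_SEMAPHORE_FEATURES_KHR;

VkPhysicalDeviceFeatures2 supportedFeatures2{};
supportedFeatures2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2;
supportedFeatures2.pNext = &supportedTimelineSemaphoreFeatures;

vkGetPhysicalDeviceFeatures2(physicalDevice, &supportedFeatures2);

if (supportedTimelineSemaphoreFeatures.timelineSemaphore)
{
	// Supported: enable it via the pNext chain used for device creation
	// (enabledTimelineSemaphoreFeaturesKHR and pNextDevice as in the snippet above).
	enabledTimelineSemaphoreFeaturesKHR.timelineSemaphore = VK_TRUE;
	pNextDevice = &enabledTimelineSemaphoreFeaturesKHR;
}
else
{
	// Not supported: fall back or report the missing feature by name.
}
```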
From a sample point of view, I think it's important that the reader of the sample can easily see which features are requested. But I would not consider the details of how that is handled to be important for a sample, except of course for a sample that actually demonstrates how feature requesting is done.
> But, what's more important, I don't see that you first check whether the requested feature is actually supported. You would need a second structure chain to first query support using `vkGetPhysicalDeviceFeatures2()`.

I actually don't do that. If one requests a feature that isn't supported, instance or device creation will fail with an error code hinting at a feature not supported. I want to deliberately keep samples as simple as possible.
> If one requests a feature that isn't supported, instance or device creation will fail with an error code hinting at a feature not supported.

Interesting approach! That is, you would get some `VK_ERROR_FEATURE_NOT_PRESENT` from `vkCreateDevice` and bail out at that point. But you can't tell then which feature is missing. And how would you handle an optionally used feature?
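To illustrate the point, the all-or-nothing approach essentially comes down to checking the return value of `vkCreateDevice`; a sketch, with `physicalDevice` and `deviceCreateInfo` standing in for the sample's actual handle and populated `VkDeviceCreateInfo`:

```cpp
VkDevice device = VK_NULL_HANDLE;
VkResult result = vkCreateDevice(physicalDevice, &deviceCreateInfo, nullptr, &device);
if (result == VK_ERROR_FEATURE_NOT_PRESENT)
{
	// Some requested feature is not supported, but the error code cannot tell us which one,
	// and at this point there is no clean way to fall back for an optional feature.
}
```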
I would consider first getting the supported features and then reacting to their absence as needed to be the "cleaner" and also more descriptive way to handle that. And I don't think that code is that much harder to read and understand.
This PR also addresses issue #1144. It fixes the crashes I am seeing due to missing feature support on my AMD 6600XT GPU during `vkCreateDevice()` initialization on Windows 10 with SDK 1.3.290. Thanks @asuessenbach for this work.
However, I would like to request a couple of small things:
- Please update the `mutableComparisonSamplers` portability feature activation in the samples async_compute and multithreading_render_passes.
- If possible, please add line type optionality to the dynamic_line_rasterization sample. Some GPUs support only a subset of the line types and it would be good to have partial support vs. all-or-nothing. The implementation could be something like what I tried in the now-retracted PR #1145.
@SRSaunders I adjusted the two samples async_compute and multithreading_render_passes. But the changes in dynamic_line_rasterization would actually change that sample, which I'd like to avoid in this already pretty big and long-standing PR. It would be nice if you could make that change after this PR has been merged.
> I adjusted the two samples async_compute and multithreading_render_passes

Thanks.
> But the changes in dynamic_line_rasterization would actually change that sample, which I'd like to avoid in this already pretty big and long-standing PR. It would be nice if you could make that change after this PR has been merged.

No problem. I'd be happy to come back to that once this PR is merged.
I don't want to interfere with the approvals already granted, but two more samples have now been added using the old `gpu.request_extension_features<>()` method: dynamic_rendering_local_read and host_image_copy. Applying this PR without fixing these will result in project build failures. I ran into this today when trying to stack up a few pending changes for testing.
Is there any way to speed up approvals for these kinds of framework updates, or at least put in safeguards to restrict certain kinds of changes (e.g. new samples) until they are merged? As an outside user, it appears to me that there is a process weakness here. This one was obvious, but I could see more subtle issues arising in the future.
We usually meet twice a month to discuss PRs. That's the process. We'll discuss this on our next call and make sure we don't break anything.
@asuessenbach : Can you merge main into this so we can see if there are any CI failures as mentioned?