wgpu
Extended Presentation API Investigation
Context
I'm working on frame pacing and we need some help from the api. The difficulty in designing this api is that each WSI has different pieces of information and gives them to us in different ways.
Supersedes #682 Supersedes #2650
Investigation
We have the following major WSIs to think about:
- IDXGISwapchain (Windows 7+ - D3D)
- IPresentationManager (Windows 11+ - D3D)
- CAMetalLayer (Mac - Metal)
- VK_GOOGLE_display_timing (Vulkan - Android)
- VK_KHR_present_wait (Vulkan - Nvidia)
- VK_KHR_incremental_present (Mainly Mesa/Android)
- VK_KHR_swapchain (All Vulkan)
And we have the following primitives:
- Get Present Start/End Time
- Wait for Present Finish
- Present with Damage
- Schedule Present Time
- Primary Monitor Frequency
| WSI | Present Time | Wait for Present | Present with Damage | Scheduled Present | Monitor Frequency |
| IDXGISwapchain | π (1a) | π (2) | β (3) | π (4) | β |
| IPresentationManager | β (1b) | β | β | β | β |
| CAMetalLayer | β (1c) | β | β | β | β (5) |
| VK_GOOGLE_display_timing | β | β | β | β | β |
| VK_KHR_present_wait | β | β | β | β | β |
| VK_KHR_incremental_present | β | β | β | β | β |
| VK_KHR_swapchain | β | β | β | β | β |
Notes:
- 1a. Presentation times must be queried actively; they are not pushed to us.
- 1b. Presentation times are delivered through an event queue.
- 1c. Presentation times are delivered through callbacks.
- 2. Can only wait for 1-3 frames ago, not a particular frame.
- 3. Requires Windows 8+ or the Windows 7 Platform Update.
- 4. You can schedule presentation for N vblanks from now.
- 5. Via NSScreen; we still need to figure out how to get the NSScreen from the Metal layer.
Because of the diversity of the platforms, I think this will inherently be a leaky abstraction - this is okay - we shouldn't try to hide platform differences, just make it as easy to use as possible.
As such I have put together the following api.
Api Suggestion
Feature
First is to add a single Feature.
const EXTENDED_PRESENTATION_FEATURES = ...;
Presentation Features
Add an extended presentation capabilities bitflag that is queryable from the surface. I am separating this from regular features because they are more useful as default-on. Having the single feature means that users have to consciously enable it, but without needing to individually modulate them.
fn Surface::get_extended_presentation_features(&self, &Adapter) -> ExtendedPresentationFeatures;
bitflags! {
// Names bikeshedable
struct ExtendedPresentationFeatures {
const PRESENT_STATISTICS = 1 << 0;
const MONITOR_STATISTICS = 1 << 1;
const WAIT_FOR_PRESENTATION = 1 << 2;
const PRESENT_DAMAGE_REGION = 1 << 3;
const PRESENT_DAMAGE_SCROLL = 1 << 4;
const PRESENT_TIME = 1 << 5;
const PRESENT_VBLANK_COUNT = 1 << 6;
}
}
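As a rough illustration of how a frame pacer might consume these flags, here is a minimal sketch. It hand-rolls the flag type with plain std so it compiles without the `bitflags` crate; `SchedulingStrategy` and `pick_scheduling` are hypothetical names that are not part of the proposal, and the preference order (absolute time first, then vblank count) is an assumption.

```rust
/// Hypothetical stand-in for the proposed `ExtendedPresentationFeatures`,
/// written without the `bitflags` crate so the sketch is self-contained.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ExtendedPresentationFeatures(u32);

impl ExtendedPresentationFeatures {
    pub const PRESENT_STATISTICS: Self = Self(1 << 0);
    pub const MONITOR_STATISTICS: Self = Self(1 << 1);
    pub const WAIT_FOR_PRESENTATION: Self = Self(1 << 2);
    pub const PRESENT_DAMAGE_REGION: Self = Self(1 << 3);
    pub const PRESENT_DAMAGE_SCROLL: Self = Self(1 << 4);
    pub const PRESENT_TIME: Self = Self(1 << 5);
    pub const PRESENT_VBLANK_COUNT: Self = Self(1 << 6);

    pub const fn empty() -> Self {
        Self(0)
    }

    /// Combine two flag sets.
    pub const fn union(self, other: Self) -> Self {
        Self(self.0 | other.0)
    }

    /// True if every bit in `other` is also set in `self`.
    pub const fn contains(self, other: Self) -> bool {
        (self.0 & other.0) == other.0
    }
}

/// Hypothetical: which scheduling mechanism a pacer would use.
#[derive(Debug, PartialEq, Eq)]
pub enum SchedulingStrategy {
    AbsoluteTime,
    VblankCount,
    Unscheduled,
}

/// Prefer absolute-time scheduling, fall back to vblank counts,
/// and otherwise present without scheduling (assumed preference order).
pub fn pick_scheduling(features: ExtendedPresentationFeatures) -> SchedulingStrategy {
    if features.contains(ExtendedPresentationFeatures::PRESENT_TIME) {
        SchedulingStrategy::AbsoluteTime
    } else if features.contains(ExtendedPresentationFeatures::PRESENT_VBLANK_COUNT) {
        SchedulingStrategy::VblankCount
    } else {
        SchedulingStrategy::Unscheduled
    }
}
```

The point of the sketch is that callers branch on the reported capabilities rather than on the backend, which keeps the platform differences visible but manageable.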
Presentation Signature
The presentation signature will be changed to the following.
fn Surface::present(desc: PresentationDescriptor<'a>);
#[derive(Default)] // Normal presentations will be PresentationDescriptor::default()
struct PresentationDescriptor<'a> {
// Must be empty unless PRESENT_DAMAGE_REGION is supported
rects: &'a [Rect],
// Must be None unless PRESENT_DAMAGE_SCROLL is supported
scroll: Option<PresentationScroll>,
// Must be NoDelay unless PRESENT_TIME or PRESENT_VBLANK_COUNT is supported
presentation_delay: PresentationDelay,
}
struct PresentationScroll {
source_rect: Rect,
offset: Vec2,
}
struct Rect {
offset: Vec2,
size: Vec2,
}
enum PresentationDelay {
// Queue the frame immediately.
NoDelay,
// Queue the frame for N vblanks from now (must be between 1 and 4). Needs PRESENT_VBLANK_COUNT.
ScheduleVblank(u8),
// Queue the frame for presentation at the given time. Needs PRESENT_TIME.
ScheduleTime(PresentationTime),
}
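The "must be ... unless ..." rules in the comments above could be enforced with a validation step before presenting. The following is only a sketch under assumptions: `Supported`, `PresentError`, and `validate` are hypothetical, `Vec2` is given integer fields for illustration, and `ScheduleTime` carries a bare `u64` as a stand-in for the proposal's timestamp type.

```rust
/// Hypothetical summary of the capability flags that `present` depends on.
#[derive(Clone, Copy, Default)]
pub struct Supported {
    pub damage_region: bool, // PRESENT_DAMAGE_REGION
    pub damage_scroll: bool, // PRESENT_DAMAGE_SCROLL
    pub present_time: bool,  // PRESENT_TIME
    pub vblank_count: bool,  // PRESENT_VBLANK_COUNT
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Vec2 { pub x: i32, pub y: i32 } // field types are an assumption

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Rect { pub offset: Vec2, pub size: Vec2 }

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct PresentationScroll { pub source_rect: Rect, pub offset: Vec2 }

#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub enum PresentationDelay {
    #[default]
    NoDelay,
    ScheduleVblank(u8),
    ScheduleTime(u64), // stand-in for the proposal's timestamp type
}

#[derive(Default)]
pub struct PresentationDescriptor<'a> {
    pub rects: &'a [Rect],
    pub scroll: Option<PresentationScroll>,
    pub presentation_delay: PresentationDelay,
}

/// Hypothetical error type; the proposal does not say how misuse is reported.
#[derive(Debug, PartialEq)]
pub enum PresentError {
    DamageRegionUnsupported,
    DamageScrollUnsupported,
    VblankCountUnsupported,
    VblankCountOutOfRange(u8),
    PresentTimeUnsupported,
}

/// Check a descriptor against what the surface reports as supported.
pub fn validate(desc: &PresentationDescriptor<'_>, s: Supported) -> Result<(), PresentError> {
    if !desc.rects.is_empty() && !s.damage_region {
        return Err(PresentError::DamageRegionUnsupported);
    }
    if desc.scroll.is_some() && !s.damage_scroll {
        return Err(PresentError::DamageScrollUnsupported);
    }
    match desc.presentation_delay {
        PresentationDelay::NoDelay => Ok(()),
        PresentationDelay::ScheduleVblank(_) if !s.vblank_count => {
            Err(PresentError::VblankCountUnsupported)
        }
        // The proposal restricts the vblank count to 1..=4.
        PresentationDelay::ScheduleVblank(n) if !(1..=4).contains(&n) => {
            Err(PresentError::VblankCountOutOfRange(n))
        }
        PresentationDelay::ScheduleVblank(_) => Ok(()),
        PresentationDelay::ScheduleTime(_) if !s.present_time => {
            Err(PresentError::PresentTimeUnsupported)
        }
        PresentationDelay::ScheduleTime(_) => Ok(()),
    }
}
```

A default descriptor always validates, which matches the intent that normal presentations use `PresentationDescriptor::default()`.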
Presentation Timestamp
Because different apis use different timestamps - we need a way of correlating these timestamps with various other clocks. The clocks used are as follows on each WSI:
| WSI | Clock |
| IDXGISwapchain | QueryPerformanceCounter |
| IPresentationManager | QueryInterruptTimePrecise |
| CAMetalLayer | mach_absolute_time |
| VK_GOOGLE_display_timing | clock_gettime(CLOCK_MONOTONIC) |
Add the following function to the surface.
fn Surface::correlate_presentation_timestamp<F, T>(&self, &Adapter, F) -> (PresentationTimestamp, T) where F: FnOnce() -> T;
// Unit: nanoseconds
struct PresentationTimestamp(pub u64);
Which will let people write the following code to correlate instants and presentation timestamps. We need this because Instants need to be treated as completely opaque as the clock they use can change at any time. In most cases these are actually the same clock, but this is what we get.
let (present_timestamp, now) = surface.correlate_presentation_timestamp(&adapter, Instant::now);
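Once such a correlated pair is available, any other presentation timestamp can be mapped onto the `Instant` timeline by offsetting from the pair. The sketch below assumes both clocks tick at the same rate over the interval in question; `Correlation` and `to_instant` are hypothetical helpers, not part of the proposal.

```rust
use std::time::{Duration, Instant};

/// Presentation timestamp in nanoseconds, as in the proposal.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PresentationTimestamp(pub u64);

/// Hypothetical holder for one correlated sample: a presentation timestamp
/// and an `Instant` that were captured at (approximately) the same moment.
#[derive(Clone, Copy)]
pub struct Correlation {
    pub timestamp: PresentationTimestamp,
    pub instant: Instant,
}

impl Correlation {
    /// Map a presentation timestamp onto the `Instant` timeline by applying
    /// its signed offset from the correlated sample.
    /// Returns `None` if the result is not representable as an `Instant`.
    pub fn to_instant(&self, ts: PresentationTimestamp) -> Option<Instant> {
        if ts.0 >= self.timestamp.0 {
            let ahead = Duration::from_nanos(ts.0 - self.timestamp.0);
            self.instant.checked_add(ahead)
        } else {
            let behind = Duration::from_nanos(self.timestamp.0 - ts.0);
            self.instant.checked_sub(behind)
        }
    }
}
```

Because the two clocks may drift relative to each other, a pacer would presumably re-correlate periodically rather than reuse one sample indefinitely.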
Presentation Statistics
Because of the difference in how all the apis query stats, we need to abstract this carefully. We use a query-based "presentation statistics queue".
- CAMetalLayer: Callbacks will save the time into a queue, which is emptied every time it is queried.
- IPresentationManager: Calling the query function drains the statistics queue.
- IDXGISwapchain: Query calls GetFrameStatistics and returns a single value.
- VK_GOOGLE_display_timing: Calls vkGetPastPresentationTimingGOOGLE, which drains the queue.
fn Surface::query_presentation_statistics(&self, &Device) -> Vec<PresentationStatistics>;
struct PresentationStatistics {
presentation_start: PresentationTimestamp,
// Only available on IPresentationManager
presentation_end: Option<PresentationTimestamp>,
// Only available on VK_GOOGLE_display_timing
earliest_present_time: Option<PresentationTimestamp>,
// Only available on VK_GOOGLE_display_timing
presentation_margin: Option<PresentationTimestamp>,
composition_type: CompositionType,
}
enum CompositionType {
// CAMetalLayer is always Composed
Composed,
Independent,
// Vulkan, DXGI is always unknown
Unknown,
}
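For callback-driven WSIs such as CAMetalLayer, the "callbacks fill a queue, queries drain it" model could look roughly like the following. This is a sketch only: `StatisticsQueue`, `record`, and `drain` are hypothetical names, and the queue stores bare timestamps instead of the full `PresentationStatistics` struct to keep the example short.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

/// Presentation timestamp in nanoseconds, as in the proposal.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PresentationTimestamp(pub u64);

/// Hypothetical shared queue: the WSI callback pushes into it from whatever
/// thread it runs on, and the query function empties it.
#[derive(Clone, Default)]
pub struct StatisticsQueue {
    inner: Arc<Mutex<VecDeque<PresentationTimestamp>>>,
}

impl StatisticsQueue {
    /// Called from the presented-handler callback with the reported time.
    pub fn record(&self, presented_at: PresentationTimestamp) {
        self.inner.lock().unwrap().push_back(presented_at);
    }

    /// Returns everything recorded since the previous call, oldest first,
    /// and leaves the queue empty.
    pub fn drain(&self) -> Vec<PresentationTimestamp> {
        self.inner.lock().unwrap().drain(..).collect()
    }
}
```

Draining on query gives the callback-based backends the same observable behavior as backends whose native query already empties a queue.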
Presentation Wait
First add the following member to SurfaceConfiguration:
// Requires WAIT_FOR_PRESENTATION and must be between 1 and 2.
maximum_latency: Option<u8>
Depending on the backend, this either sets the swapchain frame count to value + 1, calls SetMaximumFrameLatency with the given value, or uses a wait-for-present in the acquire method to throttle rendering so that the surface behaves as if it had value + 1 swapchain frames.
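The "value + 1" mapping for the swapchain-frame-count strategy can be written down as a small helper. This is a sketch under assumptions: `swapchain_frame_count`, `InvalidLatency`, and the `backend_default` parameter are hypothetical, and what happens when `maximum_latency` is `None` is not specified by the proposal.

```rust
/// Hypothetical error for a `maximum_latency` outside the allowed 1..=2 range.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidLatency(pub u8);

/// Number of swapchain frames to request for a given `maximum_latency`.
/// `None` keeps whatever the backend would use by default (assumed behavior).
pub fn swapchain_frame_count(
    maximum_latency: Option<u8>,
    backend_default: u32,
) -> Result<u32, InvalidLatency> {
    match maximum_latency {
        None => Ok(backend_default),
        // One frame on screen plus `n` frames in flight.
        Some(n @ 1..=2) => Ok(u32::from(n) + 1),
        Some(n) => Err(InvalidLatency(n)),
    }
}
```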
Monitor Information
Getting exact frequencies of monitors is important for pacing - they can be derived from presentation stats, but an explicit api is more precise if it is available.
fn Surface::query_monitor_statistics(&self, &Device) -> MonitorStatistics;
struct MonitorStatistics {
// In nanoseconds
min_refresh_interval: u64,
max_refresh_interval: u64,
// Only available on CAMetalLayer
display_update_granularity: u64,
}
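Where no explicit monitor query exists, the refresh interval can be approximated from presentation statistics, as mentioned above. One possible approach (an assumption, not something the proposal prescribes) is to take the median gap between consecutive `presentation_start` values, which tolerates an occasional dropped frame. `estimate_refresh_interval_ns` is a hypothetical helper.

```rust
/// Estimate the refresh interval in nanoseconds from a series of
/// presentation start times (nanoseconds, oldest first).
/// Uses the median gap so that a few skipped vblanks do not skew the result.
/// Returns `None` if there are fewer than two usable samples.
pub fn estimate_refresh_interval_ns(presentation_starts: &[u64]) -> Option<u64> {
    let mut gaps: Vec<u64> = presentation_starts
        .windows(2)
        // Ignore out-of-order or duplicate samples.
        .filter_map(|pair| pair[1].checked_sub(pair[0]))
        .filter(|&gap| gap > 0)
        .collect();
    if gaps.is_empty() {
        return None;
    }
    gaps.sort_unstable();
    // Upper median for even-length inputs.
    Some(gaps[gaps.len() / 2])
}
```

Note that this measures the cadence at which frames were actually presented: if the application consistently misses every other vblank, the estimate will be a multiple of the true refresh interval, which is one reason an explicit api is more precise.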
Conclusion
This is obviously one hell of an api change, and it doesn't have to happen all at once, but this investigation should give us a place to discuss the changes and make sure the api provides the information needed.
For EGL, the WSI can do present with damage if the EGL_KHR_swap_buffers_with_damage extension is supported.
Random thoughts incoming:
- Does WSI mean windowing system integration? It's unfortunately not the most searchable abbreviation.
- For PRESENT_STATISTICS I feel that the PRESENT is needed to differentiate from MONITOR_STATISTICS. However, I feel it should be at least PRESENTATION_STATISTICS.
- MONITOR_STATISTICS reads as if 'monitor' means 'keep track of' as in verb noun. I first thought DISPLAY_STATISTICS but that has the same problem. SCREEN_STATISTICS is maybe a bit less problematic even though grammatically it could go either way. I feel like I want DISPLAY_REFRESH_STATISTICS to be the outcome. :)
- There are scoll typos in places. Where does the 'scroll' term come from? Scroll-lock comes to mind but that's an old cathode ray tube (CRT) feature. From the structs it looks like it's for requesting which part of the screen would be actually updated with what portion of the provided frame. Is that correct?
- It's naïvely surprising to me that Vulkan does not generally support presentation times / scheduling presentation when the others do.
- It feels like there must be a way on Windows to get information about display refresh rates and timings...
- What is the difference between presentation_start and earliest_present_time?
- What does presentation_margin mean?
Taking a step back and thinking about how one would want to use this - the presentation and display refresh statistics provide information that can be used to make some kind of estimation/prediction for frame pacing, and the presentation descriptor then allows making an attempt at controlling presentation of a given frame. I'd have to think through it more thoroughly to be able to figure out whether it's sufficient and ergonomic.
It feels like there must be a way on Windows to get information about display refresh rates and timings...
@superdump There is, it's a bit complicated, but should be implementable within winit. You can get very specific timing info about all your monitors. Now that winit exposes the refresh rate in millihertz it should be usable for pacing. We just also need to expose the precision of the Hz measurement.
I'm not sure what the status here is, but I'd love to implement the VK_GOOGLE_display_timing version, if you can give some guidance.
@badicsalex Sorry this totally got lost in the information firehose. None of this (outside of getting cpu-side presentation timestamps) is implemented yet and we'd love help! Come on our matrix and chat, that'd probably be the easiest way to sync up.
@cwfitzgerald thanks for the answer. We've investigated the issue in detail since then, and it seems that VK_GOOGLE_display_timing wouldn't give us much over simply measuring the acquire times of a simple FIFO mode, so we didn't pursue that angle any further.