Performance overhead of indirect() placement

Open kvark opened this issue 4 years ago • 3 comments

Here is the latest Firefox profile running the stack of: JS WebGPU -> wgpu -> gfx-backend-vulkan -> inplace_it

https://share.firefox.dev/2RmndRr

What I found peculiar is that inplace_or_alloc_from_iter accounts for only half of the time attributed to indirect (screenshot: inplace-it-overhead).

What else is indirect doing? Can we reduce this overhead?

kvark, Apr 09 '21 14:04

Hi! indirect is a necessary evil: it exists to prevent function inlining. See https://github.com/NotIntMan/inplace_it/issues/4 for a description of the problem.
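For readers unfamiliar with the technique, an inlining barrier of this kind can be sketched roughly as follows. Only the name `indirect` comes from this thread; the body shown here, the helper `sum_with_scratch`, and the 4096-byte buffer are assumptions made for illustration and are not copied from inplace_it. One common motivation for such a barrier is to keep a large stack buffer out of the caller's frame; whether that is exactly the problem described in the linked issue is also an assumption.

```rust
/// Sketch of a non-inlinable trampoline (assumed shape, not inplace_it's source).
/// `#[inline(never)]` asks the compiler to keep this call as a real call,
/// so anything placed on the stack inside it stays out of the caller's frame.
#[inline(never)]
fn indirect<R>(fun: impl FnOnce() -> R) -> R {
    fun()
}

/// Hypothetical caller: the scratch buffer is declared inside the closure,
/// so it is intended to be reserved only for the duration of the `indirect` call.
fn sum_with_scratch(input: &[u8]) -> u32 {
    indirect(|| {
        let mut scratch = [0u8; 4096];
        let n = input.len().min(scratch.len());
        // Copy the input into the stack buffer, then consume it.
        scratch[..n].copy_from_slice(&input[..n]);
        scratch[..n].iter().map(|&b| u32::from(b)).sum()
    })
}
```

The cost of the barrier itself is one extra call that cannot be optimized away; any copying into the buffer is paid on top of that.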

Please provide more details about the current problem. I don't fully understand what it is.

NotIntMan, Apr 09 '21 16:04

I assume you mean the high time consumption in this function. If so, the problem is not in the function itself but in the library code it calls (remember that only this function is barred from inlining; the rest of the code is not restricted by that rule). In that case, a detailed investigation of the bottleneck will be required.

NotIntMan, Apr 09 '21 17:04

I haven't investigated this in depth myself yet. What I see is that a lot of time is spent in indirect() itself, outside of the payload (the library code) I'm actually running. The calling code looks like this:

        let sets_iter = sets.map(|set| set.raw);

        inplace_or_alloc_from_iter(sets_iter, |sets| {
            inplace_or_alloc_from_iter(offsets, |dynamic_offsets| unsafe {
                self.device.raw.cmd_bind_descriptor_sets(
                    self.raw,
                    bind_point,
                    layout.raw,
                    first_set as u32,
                    &sets,
                    &dynamic_offsets,
                );
            });
        });
    }

I expect the nested inplace_or_alloc_from_iter calls to be essentially free. Instead, they appear to cost as much as cmd_bind_descriptor_sets itself. Some of that could be explained by data being copied, but it's still suspicious. Can we treat this issue as a call to inspect what exactly is going on there?
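One way to start that inspection without a full browser profile is a small std-only timing harness shaped like the snippet above. Everything in this sketch is a stand-in: `with_stack_copy`, `payload`, `nested`, `time_it`, and the 32-element buffer are hypothetical names and sizes chosen for illustration, and `with_stack_copy` is not inplace_it's implementation. It only mimics "copy an iterator into a stack buffer behind a non-inlinable call".

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a stack-placement helper: copies up to 32 items
/// into a fixed-size stack array and passes the filled prefix to `consumer`.
#[inline(never)] // mimic a wrapper that must not be inlined
fn with_stack_copy<R>(
    items: impl Iterator<Item = u32>,
    consumer: impl FnOnce(&[u32]) -> R,
) -> R {
    let mut buf = [0u32; 32];
    let mut len = 0;
    for item in items.take(buf.len()) {
        buf[len] = item;
        len += 1;
    }
    consumer(&buf[..len])
}

/// Stand-in for the real payload (for example a driver call); sums both slices.
fn payload(sets: &[u32], offsets: &[u32]) -> u32 {
    sets.iter()
        .chain(offsets)
        .fold(0u32, |acc, x| acc.wrapping_add(*x))
}

/// Two nested placements around the payload, as in the snippet above.
fn nested(sets: &[u32], offsets: &[u32]) -> u32 {
    with_stack_copy(sets.iter().copied(), |s| {
        with_stack_copy(offsets.iter().copied(), |o| payload(s, o))
    })
}

/// Runs `f` `iterations` times and returns the elapsed wall-clock time.
/// `black_box` keeps the optimizer from discarding the result.
fn time_it(iterations: u32, mut f: impl FnMut() -> u32) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        black_box(f());
    }
    start.elapsed()
}
```

Comparing `time_it(n, || nested(&sets, &offsets))` against `time_it(n, || payload(&sets, &offsets))` in a release build gives a rough split between wrapper cost and payload cost. With a payload this cheap the wrapper will dominate, so the numbers say nothing about cmd_bind_descriptor_sets; swapping in the real payload is what would make the comparison meaningful.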

kvark, Apr 09 '21 21:04