Performance issues with large amounts of materials

Open sirpalee opened this issue 4 years ago • 7 comments

Description of Issue

When a stage has a large number of materials (50k+), Hydra spends a significant amount of time in Hd_PrimTypeIndex<HdSprim>::SyncPrims. In the example file (with 250k cubes and materials), the function takes 30.39 ms, while the whole render call takes 31.737 ms. See the attached image from the profiler:

[Image: SyncPrims_perf]

Steps to Reproduce

  1. Open usd from cubes_with_preview_250k.zip in usdview.
  2. Make meshes invisible.
  3. Move the camera around and observe render time.

System Information (OS, Hardware)

  • Windows 10
  • AMD Threadripper 3970X & NVIDIA RTX 3090

Package Versions

  • Latest build of dev branch.

Build Flags

  • Default flags using build_usd.py and --build-variant set to relwithdebuginfo.

sirpalee avatar Apr 05 '22 04:04 sirpalee

Filed as internal issue #USD-7313

jilliene avatar Apr 11 '22 22:04 jilliene

Just to provide some context, there are a few known issues with material Sync():

  1. Currently, it's single-threaded; it's straightforward to thread, but some render delegates have thread hazards so it's tricky to roll out...
  2. Unlike geometry, we loop through all materials checking if they're invalid; we skip the work of updating them if they're up-to-date, but at scale (as you can see) this still causes issues.

Both of these will be addressed in planned work. It looks like most of the time in your trace is taken up by GetSprimDirtyBits, which is a lookup of prim path -> u32 invalidation flags, and there's not much to address there other than #1 and #2 above.
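To make the cost in point 2 concrete, here is a minimal, self-contained sketch that contrasts a full scan over every material with a dirty-list walk. None of these names are Hydra types: `ChangeTrackerSketch`, `SyncByFullScan`, and `SyncByDirtyList` are hypothetical, and the dirty list is only one assumed way to avoid the per-frame scan, not a description of the planned work.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins; these are NOT the Hydra types.
using PrimPath = std::string;
using DirtyBits = std::uint32_t;
constexpr DirtyBits kClean = 0;

struct ChangeTrackerSketch {
    std::unordered_map<PrimPath, DirtyBits> dirtyBits;  // path -> invalidation flags
    std::vector<PrimPath> dirtyList;                    // paths dirtied since last sync

    void MarkDirty(const PrimPath& path, DirtyBits bits) {
        DirtyBits& current = dirtyBits[path];
        if (current == kClean) {
            dirtyList.push_back(path);  // record each path once per sync cycle
        }
        current |= bits;
    }
};

// Full scan: one hash lookup per prim per frame, even when nothing changed.
std::size_t SyncByFullScan(ChangeTrackerSketch& tracker,
                           const std::vector<PrimPath>& allPrims,
                           std::size_t* lookups) {
    std::size_t synced = 0;
    for (const PrimPath& path : allPrims) {
        ++*lookups;
        DirtyBits& bits = tracker.dirtyBits[path];
        if (bits != kClean) {
            bits = kClean;  // the real per-prim Sync() work would happen here
            ++synced;
        }
    }
    tracker.dirtyList.clear();
    return synced;
}

// Dirty list: only touch prims that were actually marked dirty.
std::size_t SyncByDirtyList(ChangeTrackerSketch& tracker, std::size_t* lookups) {
    std::size_t synced = 0;
    for (const PrimPath& path : tracker.dirtyList) {
        ++*lookups;
        tracker.dirtyBits[path] = kClean;  // the real per-prim Sync() work would happen here
        ++synced;
    }
    tracker.dirtyList.clear();
    return synced;
}
```

With 250k clean materials, the full scan still performs 250k lookups per frame, while the dirty-list walk performs none. That matches the shape of the profile in the report, where almost all of the time goes to the flag lookup rather than to updating materials.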

All of this said, 50k GLSL or OSL compiles seem like a nightmare as well. Can I ask what's driving the shader variation? If the materials are duplicates, it's worth deduplicating them in the scene, regardless of how we optimize Hydra. If you want to pass different parameters in per geometry, I'd recommend doing that with primvars on the geometry. We've got some current limitations around texture bindings per-geometry, and around taking full advantage of USD instances of materials; if one of these is the issue, please let us know!
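As a rough illustration of the deduplication suggestion, the sketch below groups materials by content and maps every duplicate to the first material with the same shader and parameters. It is plain C++ with no USD dependency; `MaterialDesc` and `DeduplicateMaterials` are hypothetical names, and representing parameter values as strings is a simplifying assumption.

```cpp
#include <map>
#include <string>
#include <tuple>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical description of a material: a shader id plus its parameter values.
struct MaterialDesc {
    std::string shaderId;
    std::map<std::string, std::string> params;  // ordered map, so comparison is stable

    bool operator<(const MaterialDesc& other) const {
        return std::tie(shaderId, params) < std::tie(other.shaderId, other.params);
    }
};

// Returns a map from each material path to the path of the first material
// with identical content. Bindings can then be rewritten through this map.
std::unordered_map<std::string, std::string> DeduplicateMaterials(
    const std::vector<std::pair<std::string, MaterialDesc>>& materials) {
    std::map<MaterialDesc, std::string> canonical;  // content -> first path seen
    std::unordered_map<std::string, std::string> remap;
    for (const auto& entry : materials) {
        // emplace keeps the existing entry if this content was already seen
        auto it = canonical.emplace(entry.second, entry.first).first;
        remap[entry.first] = it->second;
    }
    return remap;
}
```

If 250k materials turn out to share only a handful of distinct parameter sets, rebinding the geometry through such a map leaves that handful of materials to sync. Parameters that genuinely vary per object are the case the comment suggests moving to primvars on the geometry.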

tcauchois avatar May 12 '22 16:05 tcauchois

Also, @sirpalee, the "texture bindings per-geometry" is mostly a render-delegate-specific problem (beyond formalizing the USD/UsdShade convention for indicating primvar-substitutions for assetPaths), and if Storm performance is important enough to NVidia that it needs to be addressed soon, we'd be happy to provide guidance if NVidia wants to make a PR for the work!

spiffmon avatar May 25 '22 23:05 spiffmon

A minor update on the threadedness of material updates: we've added an API, "bool HdRenderDelegate::IsParallelSyncEnabled(TfToken primType)", which can be used to selectively turn on multithreaded sync per prim type. We've only turned it on for "extComputation" so far, but if you know your render delegate's material sync is thread-safe, you could try turning it on. Neither the hdPrman nor the hdStorm material sync function is thread-safe at the moment, but we'd happily take a PR eliminating the thread hazards in them as well, if you get to it before us.
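A minimal sketch of how a render delegate might opt in per prim type, written against mock classes rather than the real USD headers. `Token`, `RenderDelegateSketch`, and `MyRenderDelegate` are hypothetical stand-ins for `TfToken`, `HdRenderDelegate`, and a custom delegate. The sketch assumes the query is dispatched virtually, and the base-class default shown is an assumption modeled on the "extComputation" remark above.

```cpp
#include <string>

// Hypothetical stand-in for TfToken; not the real USD type.
using Token = std::string;

// Hypothetical stand-in for HdRenderDelegate; not the real USD class.
class RenderDelegateSketch {
public:
    virtual ~RenderDelegateSketch() = default;

    // Assumed virtual here so that a subclass can opt in per prim type.
    // Assumed default: parallel sync only for "extComputation".
    virtual bool IsParallelSyncEnabled(const Token& primType) const {
        return primType == "extComputation";
    }
};

class MyRenderDelegate : public RenderDelegateSketch {
public:
    bool IsParallelSyncEnabled(const Token& primType) const override {
        // Opt in only if this delegate's material Sync() is known to be thread-safe.
        if (primType == "material") {
            return true;
        }
        return RenderDelegateSketch::IsParallelSyncEnabled(primType);
    }
};
```

Deferring to the base class for every other prim type keeps the delegate in step with whatever defaults the base chooses, so the override only widens the set of prim types synced in parallel.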

tcauchois avatar Mar 09 '24 00:03 tcauchois

@tcauchois the IsParallelSyncEnabled function is not virtual, so parallel sync can't really be enabled by render delegate implementations. Is this by design or an oversight?

TheMostDiligent avatar Jun 01 '24 17:06 TheMostDiligent

Oversight :(. I'll make it virtual first thing next week.

tcauchois avatar Jun 01 '24 18:06 tcauchois

@tcauchois Did you have a chance to fix the IsParallelSyncEnabled function?

TheMostDiligent avatar Jun 22 '24 18:06 TheMostDiligent