APPLE: Add Gaussian Splats API schema

Open dgovil opened this issue 9 months ago • 15 comments

Note: For the latest update to the schema, please read https://github.com/PixarAnimationStudios/OpenUSD/pull/3716#issuecomment-3378477657

This latest update incorporates feedback from the AOUSD Emerging Geometries Interest Group. While we will of course also take feedback left on this PR, please feel free to join via aousd.org if you'd like to participate in live discussion.

Description of Change(s)

This PR adds our proposed schema for Gaussian Splats, based on feedback from the AOUSD Emerging Geometry Interest Group and from partners such as Adobe and NVIDIA. I would like to credit @ld-kerley, who wrote most of this schema, as well as the folks at Adobe and Michael B Johnson, who provided earlier ad-hoc schemas.

The schema.usda should provide most of the information on the individual schemas, but some design overviews:

  1. Our schemas apply on top of UsdGeomPoints because we felt that allowed for the most compatibility with the existing ecosystem.
  2. We put the schemas in a light field schema for lack of a better name. We figured we didn't want to overload UsdGeom but also wanted to leave the path open for other representations. We aren't married to the name, so other suggestions are welcome.
  3. We split the GaussiansAPI and SphericalHarmonicsAPI because it is possible to have them be used independently of each other. For example Gaussians without SH data, or Gaussians that use other radiance information.
  4. The Gaussians have a shapes field that is non-comprehensive. I could not find a good facility for this in the schema gen but am open to better ways to present this.
  5. We took inspiration from UsdGeomPointsInstancer and added half and float versions of attributes. We find most gaussians hold up when using Halfs just fine, and there are considerable space savings. We follow the same convention of suffixing the float version with an f, but perhaps it is better to use a variable type like XformOps do?
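
As a rough illustration of the space savings mentioned in point 5, here is a back-of-the-envelope sketch. The per-splat attribute counts are assumptions for illustration (3 scale values, 4 orientation values, 1 opacity, and three color channels of spherical harmonics coefficients), not the attribute list in schema.usda, and the function names are hypothetical.

```cpp
#include <cstddef>

// Storage estimate only. The attribute layout below is an assumption for
// illustration and is not taken from schema.usda.
constexpr std::size_t kHalfBytes = 2;   // 16-bit half-precision value
constexpr std::size_t kFloatBytes = 4;  // 32-bit single-precision value

// A spherical harmonics basis of degree d has (d + 1)^2 coefficients
// per color channel.
constexpr std::size_t ShCoeffsPerChannel(std::size_t degree) {
    return (degree + 1) * (degree + 1);
}

// Assumed scalars per splat: 3 scale + 4 orientation + 1 opacity
// + 3 color channels of SH coefficients. Positions are left out.
constexpr std::size_t ScalarsPerSplat(std::size_t shDegree) {
    return 3 + 4 + 1 + 3 * ShCoeffsPerChannel(shDegree);
}

// Total bytes for `count` splats at the given precision.
constexpr std::size_t BytesForSplats(std::size_t count,
                                     std::size_t shDegree,
                                     std::size_t bytesPerScalar) {
    return count * ScalarsPerSplat(shDegree) * bytesPerScalar;
}
```

Under these assumptions, one million splats at SH degree 3 carry 56 scalars each, so roughly 224 MB in float and 112 MB in half.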

We shipped a preliminary version of this schema in the macOS 26 Tahoe developer beta to support users working with PLY data, but are keen to replace it with something more standard. It aligns very closely with the ad-hoc data definition Adobe has in the Adobe file format plugin.

Some more notes:

  1. We do not plan to provide a gaussian renderer at this time. The gaussians will still render as points in Storm, of course, which provides some measure of viewability.
  2. I can write tests and docs once we're through any technical feedback.
  3. Everything is unmodified after usdGenSchema, so you should only need to review schema.usda.

Link to proposal

This was proposed in the AOUSD Emerging Geometry IG. A matching issue has been created on the proposals repo: https://github.com/PixarAnimationStudios/OpenUSD-proposals/issues/90

dgovil avatar Jul 09 '25 17:07 dgovil

Filed as internal issue #USD-11207

(This is an automated message. See here for more information.)

jesschimein avatar Jul 09 '25 17:07 jesschimein

I have a general question.

Light Field, in graphics and optics, refers to a function that describes the radiance at every point in space, in every direction. Radiance is a physically meaningful, directional quantity used for physically-based rendering, simulation and so on. GS and other splatting techniques don't have canonical radiance or physical units. They try to match colors, bake together emission and reflection using learned densities, and are not energy conserving; conceptually they are more like awesome billboards.

Is the idea that you are hoping to bias into plenoptic fields and the like in the future? I do realize there is a trend to write papers with titles like "Neural Light Field Representations" but here isn't the place for me to debate that accuracy ;) Gaussian Splats are more like awesome billboards and I worry about leaning into an academic trend that doesn't reflect what we actually do with splats, and an implication of further future functionality in the light field domain, where there may or may not be ambitions in that direction?

I'm totally willing to be argued out of being fussy about this term, but you can see I'm biased to something less evocative of physics, like UsdGeomGaussianSplat ;)

I was thinking that if there was a document page to go with the PR, it might help contextualize the PR against questions like this (on proposals, maybe? Right now the proposal PR loops here without a "position" or goal statement).

meshula avatar Jul 09 '25 18:07 meshula

I can put together a document for this like the new user docs and link to it.

Regarding light fields as a name, it was the only term we could think of that covered other possible representations, like NeRFs, if someone wanted to propose them in the future. Having just usdGaussians felt too narrow a name if that ever came to be, especially since people in the USD community have asked about various ML representations as well.

There wasn't really any thought behind the light field name beyond that, and 5 minutes of Lee and me trying to think of something else.

So happy to rename it to something else if someone can come up with a suitable name.

dgovil avatar Jul 09 '25 18:07 dgovil

NeRFs are density maps. I think you are onto something with that line of thought: Volumetric Density Something Something... NeRFs and GS are volume-sample technologies. Are they UsdVol related?

meshula avatar Jul 09 '25 21:07 meshula

@meshula yeah, I debated whether they are a volume. If I squint, it feels like they could be? Happy to move it there if it feels more appropriate. It's shorter to boot.

dgovil avatar Jul 09 '25 23:07 dgovil

In PLY, these three are typically stored in an awkward machine-learning format (rather than more standard CG formats):

  • opacity: stored pre-sigmoid, as a logit (graphics would prefer the linear opacity value)
  • scale: in logarithmic form (graphics would prefer linear scale)
  • orientation: WXYZ non-normalized quaternion (graphics would prefer XYZW normalized quaternion)

How will these be stored in USD? In the original machine-learning format, or in the CG format? We should at least document whatever convention it is. (PS: I would prefer the CG format :) )

BrianSharpe avatar Jul 09 '25 23:07 BrianSharpe

@BrianSharpe ah great point, I will add that to the documentation in my next push of updates.

We store it in the CG way so that it works more or less out of the box in USD (albeit as points).

We convert from the PLY encoding to the CG encoding when writing to USD (and I believe Adobe do too? I forget if this was a change we made just on our end). If you have a Mac on the 26.0 betas, you can load a PLY file up in Preview and see the representation in Storm (albeit as points) and can save out a usda with our preliminary schema to see the values.
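
To make the PLY-to-CG conversion concrete, here is a minimal sketch of the three transforms discussed above. It assumes the common 3DGS training-output convention (opacity stored as a logit, scale as a natural log, orientation as an unnormalized w-first quaternion). The struct and function names are hypothetical, and this is not the code used by Apple's or Adobe's importers.

```cpp
#include <array>
#include <cmath>

// Hypothetical types for illustration; not part of OpenUSD or any plugin.

// Raw per-splat values as assumed to be stored in a 3DGS-style PLY file.
struct PlySplat {
    float opacityLogit;             // opacity before the sigmoid activation
    std::array<float, 3> logScale;  // natural log of the per-axis scale
    std::array<float, 4> rotWXYZ;   // quaternion, w first, not unit length
};

// "CG-style" values: linear opacity, linear scale, unit quaternion, w last.
struct CgSplat {
    float opacity;
    std::array<float, 3> scale;
    std::array<float, 4> rotXYZW;
};

inline float Sigmoid(float x) { return 1.0f / (1.0f + std::exp(-x)); }

CgSplat ConvertPlyToCg(const PlySplat& in) {
    CgSplat out{};
    out.opacity = Sigmoid(in.opacityLogit);  // logit -> [0, 1]
    for (int i = 0; i < 3; ++i) {
        out.scale[i] = std::exp(in.logScale[i]);  // log -> linear
    }

    // Normalize and reorder from (w, x, y, z) to (x, y, z, w).
    const float w = in.rotWXYZ[0];
    const float x = in.rotWXYZ[1];
    const float y = in.rotWXYZ[2];
    const float z = in.rotWXYZ[3];
    const float len = std::sqrt(w * w + x * x + y * y + z * z);
    if (len > 0.0f) {
        out.rotXYZW = {x / len, y / len, z / len, w / len};
    } else {
        out.rotXYZW = {0.0f, 0.0f, 0.0f, 1.0f};  // fall back to identity
    }
    return out;
}
```

Note that GfQuatf takes the real part first in its constructor, so a real importer would map the normalized components accordingly rather than rely on a particular array order.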

dgovil avatar Jul 10 '25 00:07 dgovil

@ld-kerley Here's a way to compatibly use a normals primvar without breaking the primvar :)

#include <cmath>

#include "pxr/base/gf/quatf.h"
#include "pxr/base/gf/vec3f.h"

/// Converts a scaled axis-angle vector (axis * angle) to a unit quaternion.
/// Compatible with OpenUSD GfQuatf and GfVec3f.
///
/// If the input vector is zero, returns identity quaternion.
pxr::GfQuatf ScaledAxisAngleToQuat(const pxr::GfVec3f& scaledAxisAngle) {
    float theta = scaledAxisAngle.GetLength();
    if (theta == 0.0f) {
        return pxr::GfQuatf(1.0f, pxr::GfVec3f(0.0f)); // identity quaternion
    }

    pxr::GfVec3f axis = scaledAxisAngle / theta;
    float halfTheta = 0.5f * theta;
    float sinHalfTheta = std::sin(halfTheta);
    float cosHalfTheta = std::cos(halfTheta);

    return pxr::GfQuatf(cosHalfTheta, axis * sinHalfTheta);
}

/// Converts a unit quaternion to a scaled axis-angle vector (axis * angle).
/// Compatible with OpenUSD GfQuatf and GfVec3f.
///
/// If the quaternion is identity, returns zero vector.
pxr::GfVec3f QuatToScaledAxisAngle(const pxr::GfQuatf& quat) {
    const float w = quat.GetReal();
    const pxr::GfVec3f v = quat.GetImaginary();

    float sinHalfTheta = v.GetLength();
    float cosHalfTheta = w;

    if (sinHalfTheta == 0.0f) {
        return pxr::GfVec3f(0.0f); // no rotation
    }

    float halfTheta = std::atan2(sinHalfTheta, cosHalfTheta);
    pxr::GfVec3f axis = v / sinHalfTheta;

    return axis * (2.0f * halfTheta);
}

meshula avatar Jul 16 '25 21:07 meshula

Thanks for the great feedback, everyone. We'll do a pass on these after discussing some details and post an update soon to reflect what is hopefully a good aggregation of everyone's thoughts.

dgovil avatar Jul 17 '25 16:07 dgovil

Thanks for the comments, @spiffmon

I think some of the recent changes to OpenUSD that you listed will let us do what you suggest.

Also, thanks everyone else for your notes. They're all heard and we'll do our best to integrate whatever feedback we can. We're just waiting on some notes to be posted from a few other places.

I'll discuss with Lee and we will start working on the requested changes that are here and we've received offline. We'll probably start on that work after SIGGRAPH.

dgovil avatar Jul 31 '25 15:07 dgovil

I just pushed a pretty big update. This comes as a result of several conversations in different groups, including the AOUSD Emerging Geometries Interest Group. The AOUSD EG IG is where most of the active conversation is happening, and I would encourage anyone who is interested to attend the meetings there (the next one is Thursday, 16th October at 12 p.m. PST).

As with the previous update - the only significant file here is schema.usda - the rest are all generated by usdGenSchema.

The community has expressed a desire to support both a practical, concrete schema as well as having a system that is extensible in the future. To allow for this, we have currently settled on a concrete schema type to provide a base ParticleField. This base contains no data but is the point that a number of different appliedAPI schemas can be attached to. AppliedAPI schemas are used to define the required characteristics of a given type of ParticleField. These currently fall broadly into the following categories:

  • Base attributes - these define the particles themselves, with attributes such as position, scale, and orientation.
  • Kernel definition - these define the shape and falloff used for each particle in the ParticleField.
  • Radiance definition - these define the appearance of the ParticleField.

This is the current set of categories, but the system does not preclude new categories being added in the future as the technology evolves.

This allows extension and experimentation with different implementations of each of these. For instance, some stakeholders have expressed interest in defining the position of the particles using a multilayer perceptron (MLP). We do not propose an appliedAPI schema for this yet, but we can imagine someone defining a PositionMLPAPI appliedAPI schema that could be used in place of the currently suggested PositionAttributeAPI schema to define the locations of the particles.

This extensibility and configurability does mean it is possible to construct an "incomplete" ParticleField. We do not currently propose taking an opinion on what "incomplete" means here, as it may evolve. We do, however, offer a way to avoid the possible complications of an "incomplete" ParticleField, by providing additional concrete schemas for explicit types of ParticleFields. Specifically, in the PR, there is a ParticleField_3DGaussianSplat schema that pre-applies the necessary appliedAPI schemas to create what we consider the USD representation of classic 3DGS data. This concrete class has been defined to be complete in describing this specific "flavor" of ParticleField.
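
The composition idea above can be sketched as a toy model in plain C++. This is not OpenUSD API. The schema names KernelGaussianAPI and RadianceSphericalHarmonicsAPI are hypothetical placeholders, and the "covers all categories" check is only one assumed reading of completeness, since the proposal deliberately does not define one.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy model only; none of these types exist in OpenUSD.
enum class Category { Base, Kernel, Radiance };

struct AppliedSchema {
    std::string name;
    Category category;
};

// Stands in for the base ParticleField prim, which holds no data itself
// and only collects applied API schemas.
struct ParticleField {
    std::vector<AppliedSchema> applied;
};

// One possible (assumed) completeness rule: at least one applied schema
// from each of the three categories.
bool CoversAllCategories(const ParticleField& field) {
    std::set<Category> seen;
    for (const AppliedSchema& schema : field.applied) {
        seen.insert(schema.category);
    }
    return seen.size() == 3;
}

// Stands in for a concrete type such as ParticleField_3DGaussianSplat,
// which pre-applies one schema per category. The kernel and radiance
// schema names here are hypothetical placeholders.
ParticleField MakeGaussianSplatLike() {
    ParticleField field;
    field.applied.push_back({"PositionAttributeAPI", Category::Base});
    field.applied.push_back({"KernelGaussianAPI", Category::Kernel});
    field.applied.push_back({"RadianceSphericalHarmonicsAPI", Category::Radiance});
    return field;
}
```

In this model, swapping PositionAttributeAPI for an imagined PositionMLPAPI changes only one entry in the list, which is the kind of substitution the design is meant to allow.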

We hope this approach strikes a good balance between extensibility, robustness, and practical application.

ld-kerley avatar Oct 07 '25 19:10 ld-kerley

@spiffmon - thanks for the great feedback (and for proofreading my terrible typos). I left inline comments for the discussion points that aren't obviously resolvable, and will raise them at the next AOUSD meeting (in about 90 minutes' time :) )

ld-kerley avatar Oct 16 '25 17:10 ld-kerley

@spiffmon - we had a lively discussion in the last AOUSD meeting.

One of the topics we discussed was which USD domain this belongs in. UsdLightField was really just an initial proposal to get the conversation going, and several people at the meeting, including myself, were leaning toward moving all of these ParticleField schemas to the UsdGeom domain. The ones we got a little hung up on were the radiance-providing schemas, like the spherical harmonic ones. One idea was to present these radiance providers as analogous to the current displayColor/displayOpacity attributes, meaning an indicator of how a ParticleField might be rendered if there is no material bound to the location. We do think that future technological advances will want to bind "materials" to a ParticleField, so this feels like it would align well. I'm guessing that if things get moved to UsdGeom, all the schemas would have a ParticleField prefix.

Before I reframe the PR in UsdGeom I wanted to check in with you and see if you had any thoughts about this direction.

The other discussion point that I'd love your feedback on is whether the ParticleField base class should inherit from Gprim or Boundable. It's unclear whether concepts like doubleSided or orientation from the Gprim schema are really applicable to ParticleFields, at least in terms of current research. With the idea that radiance-providing schemas might replace things like displayColor/displayOpacity, Boundable feels like it might make more sense, but I'd love your perspective in case I'm missing something.

ld-kerley avatar Oct 18 '25 21:10 ld-kerley

@ld-kerley, I think Boundable makes a lot more sense than Gprim, actually. Here's another reason: every gsplat I've ever seen represents something that, if explicitly modeled with actual UsdGeom geometry, would consist of many gprims, many materials, etc.

The other question is going to take more time and talking to folks. My initial concerns are about UsdGeom size and complexity, and what the unifying concepts are. Two related notes:

  1. We split the single-schema UsdProcProcedural out into its own domain, and that has a similar "represents many geometries and materials/looks" flavor.
  2. It's quite likely that the number of schemas in UsdGeom will nearly double when we add support for double-precision geometry, rather than going the direction that this proposal is going, with support for multiple precisions put into multiple properties in the same schema (which does have precedent in PointInstancer, but we have concerns about scaling that pattern up).

spiffmon avatar Oct 18 '25 23:10 spiffmon

@ld-kerley, we spent time discussing unifying concepts, and the TL;DR is that we believe a LightField domain makes the most sense of the options on the table (UsdLightField, UsdGeom, UsdLux, UsdVol), particularly if we look ahead to NeRFs, which fit even less in UsdGeom than ParticleField does. Details:

  1. Other than primvars:displayColor and primvars:displayOpacity, which should be thought of as "UI hints" that a renderer can choose to use or ignore if it has no capacity to process higher-level material characteristics, @gitamohr notes that UsdGeom deals only in geometry - or put another way, as stated by @nvmkuruc to me, things that can occlude. While a PointInstancer can instance complete asset descriptions (and will be able to instance ParticleFields) that contain higher-level concepts, it does so in a blind way: it instances prim hierarchies.
  2. @meshula also notes that everything in UsdGeom has Euclidean characteristics: every element can be defined with an exact distance from every other element of a Geom primitive. Given the falloffs and density/volumetric nature of the ParticleFields being proposed here, we don't think they adhere to that property.

spiffmon avatar Oct 20 '25 20:10 spiffmon