Make `VariableCurve` into curves
Objective
Presently, the main notion of curve in `bevy_animation` is `VariableCurve`, which is essentially a way of organizing an imported glTF animation. This RFC demonstrates a reorganization of this data so that each `Transform` component and `MorphWeights` curve is actually a `Curve`, with its glTF interpolation modes reified by curve structs that implement them on the underlying data buffers.
This has several advantages:
- It makes Bevy's representations of glTF animations more portable and reusable.
- It cleans up the code internally surrounding animation modes by dispatching them based on separate types.
- It lays the foundation for more general curve-based animation by making the current animation data conform more closely with those notions.
- It makes curve importing and sampling more robust, closing avenues where illegal values could sneak in.
Solution
`VariableCurve`, along with the code that loads it and uses it, has been completely overhauled. `VariableCurve` is still an enum, but it is split up in a different way:
/// A curve for animating either a component of a [`Transform`] (translation, rotation, scale)
/// or the [`MorphWeights`] of morph targets for a mesh.
///
/// Each variant yields a [`Curve`] over the data that it parametrizes.
pub enum VariableCurve {
    /// A [`TranslationCurve`] for animating the `translation` component of a [`Transform`].
    Translation(TranslationCurve),
    /// A [`RotationCurve`] for animating the `rotation` component of a [`Transform`].
    Rotation(RotationCurve),
    /// A [`ScaleCurve`] for animating the `scale` component of a [`Transform`].
    Scale(ScaleCurve),
    /// A [`WeightsCurve`] for animating [`MorphWeights`] of a mesh.
    Weights(WeightsCurve),
}
Each of the `Transform` component curve types here is actually a `Curve`, but they are still enums, broken down by mode of interpolation; for example, here is `RotationCurve`:
/// A curve specifying the rotation component of a [`Transform`] in animation. The variants are
/// broken down by interpolation mode (with the exception of `Constant`, which never interpolates).
///
/// This type is, itself, a `Curve<Quat>`, and it internally uses the provided sampling modes; each
/// variant "knows" its own interpolation mode.
#[derive(Clone, Debug, Reflect)]
pub enum RotationCurve {
    /// A curve which takes a constant value over its domain. Notably, this is how animations with
    /// only a single keyframe are interpreted.
    Constant(ConstantCurve<Quat>),
    /// A curve which uses spherical linear interpolation between keyframes.
    SphericalLinear(UnevenSampleAutoCurve<Quat>),
    /// A curve which interpolates between keyframes in steps.
    Step(SteppedKeyframeCurve<Quat>),
    /// A curve which interpolates between keyframes by using auxiliary tangent data to join
    /// adjacent keyframes with a cubic Hermite spline. For quaternions, this means interpolating
    /// the underlying 4-vectors, sampling, and normalizing the result.
    CubicSpline(CubicKeyframeCurve<Vec4>),
}
Some of the curve representations that appear here are new, such as `SteppedKeyframeCurve` and `CubicKeyframeCurve` — these belong to `bevy_animation` and are built on the data structures from `bevy::math::curve::cores`, which handle the data storage and access patterns. Others, like `UnevenSampleAutoCurve` and `ConstantCurve`, are taken "off-the-shelf" from the Curve API itself.
Its implementation of `Curve<Quat>` essentially just matches over the variants. (The last one requires special handling to do quaternion normalization, but that's about it.)
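To make that dispatch concrete, here is a minimal, self-contained sketch of the pattern (the `Step` variant is omitted for brevity). Note that `SimpleCurve` is a simplified stand-in for the real `Curve` trait in `bevy_math` (which also carries a domain interval), and the inner types here are hypothetical placeholders rather than the actual `ConstantCurve`, `UnevenSampleAutoCurve`, and `CubicKeyframeCurve`:

```rust
use glam::{Quat, Vec4};

/// Simplified stand-in for the `Curve` trait; the real one also exposes a domain interval.
trait SimpleCurve<T> {
    fn sample(&self, t: f32) -> T;
}

/// Hypothetical placeholder curves (the real inner types are named in the enum above).
struct ConstQuat(Quat);
struct SlerpedQuats(Vec<(f32, Quat)>);
struct CubicVec4s(Vec<(f32, Vec4)>);

impl SimpleCurve<Quat> for ConstQuat {
    fn sample(&self, _t: f32) -> Quat {
        self.0
    }
}
impl SimpleCurve<Quat> for SlerpedQuats {
    fn sample(&self, _t: f32) -> Quat {
        // Elided: slerp between the keyframes bracketing `t`.
        Quat::IDENTITY
    }
}
impl SimpleCurve<Vec4> for CubicVec4s {
    fn sample(&self, _t: f32) -> Vec4 {
        // Elided: cubic Hermite interpolation of the raw 4-vectors.
        Vec4::W
    }
}

enum RotationCurveSketch {
    Constant(ConstQuat),
    SphericalLinear(SlerpedQuats),
    CubicSpline(CubicVec4s),
}

impl SimpleCurve<Quat> for RotationCurveSketch {
    fn sample(&self, t: f32) -> Quat {
        match self {
            // Most variants just delegate to the curve they own...
            Self::Constant(c) => c.sample(t),
            Self::SphericalLinear(c) => c.sample(t),
            // ...while the cubic variant samples the underlying 4-vector curve
            // and normalizes the result back into a unit quaternion.
            Self::CubicSpline(c) => Quat::from_vec4(c.sample(t)).normalize(),
        }
    }
}
```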
Allocation-free `Curve<Vec<T>>`
Since the definitions implicit in the Curve trait would require that anything that looks like a `Curve<Vec<T>>` allocates to produce owned output, this PR introduces an offshoot trait which mitigates this problem — `IterableCurve`:
/// A curve which provides samples in the form of [`Iterator`]s.
///
/// This is an abstraction that provides an interface for curves which look like `Curve<Vec<T>>`
/// but side-steps issues with allocation on sampling. This happens when the size of an output
/// array cannot be known statically.
pub trait IterableCurve<T> {
    /// The interval over which this curve is parametrized.
    fn domain(&self) -> Interval;

    /// Sample this curve at a specified time `t`, producing an iterator over sampled values.
    fn sample_iter<'a>(&self, t: f32) -> impl Iterator<Item = T>
    where
        Self: 'a;
}
This is used in concert with the core data structures from the Curve API to sample from keyframes valued in morph weights without ever allocating, all backed by a contiguous buffer of output data.
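For intuition, here is a rough, hypothetical sketch (not the actual `WeightsCurve` internals) of how an iterable-curve style implementation over a flat keyframe buffer can hand back a lazy, allocation-free iterator. The struct and field names are invented for illustration, and sorted times with at least two keyframes are assumed:

```rust
/// Hypothetical morph-weight keyframe storage: `times` is sorted, and `weights`
/// is a flat buffer holding `times.len() * weight_count` entries (one row per keyframe).
struct MorphWeightKeyframes {
    times: Vec<f32>,
    weights: Vec<f32>,
    weight_count: usize,
}

impl MorphWeightKeyframes {
    /// Yield the interpolated weights at time `t` as a lazy iterator; no `Vec` is
    /// allocated, since the output is produced by zipping two slices of the buffer.
    fn sample_iter(&self, t: f32) -> impl Iterator<Item = f32> + '_ {
        // Find the keyframe interval containing `t` (clamped to the curve's domain).
        let upper = self
            .times
            .partition_point(|&time| time <= t)
            .clamp(1, self.times.len() - 1);
        let (t0, t1) = (self.times[upper - 1], self.times[upper]);
        let s = ((t - t0) / (t1 - t0)).clamp(0.0, 1.0);
        // Slice out the two bracketing rows and lerp them element-by-element.
        let row0 = &self.weights[(upper - 1) * self.weight_count..upper * self.weight_count];
        let row1 = &self.weights[upper * self.weight_count..(upper + 1) * self.weight_count];
        row0.iter().zip(row1).map(move |(a, b)| a + (b - a) * s)
    }
}
```

A consumer can then copy these values straight into the morph weight storage without building an intermediate `Vec<f32>` per sample.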
Performance
This is probably one of the biggest concerns about substantial changes to `VariableCurve`, so let me be proactive in addressing this. First of all, `VariableCurve` has not changed in size (still 64 bytes); this is because the backing data for every curve is at most a pair of vectors — 48 bytes in total, plus 16 bytes for enum discriminants. This is important for caching reasons.
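For anyone wanting to sanity-check that arithmetic, here is a quick, stand-alone check of the component sizes on a typical 64-bit target (it does not reference the actual Bevy types):

```rust
use std::mem::size_of;

fn main() {
    // On a 64-bit target a `Vec` is three words: pointer, length, and capacity.
    assert_eq!(size_of::<Vec<f32>>(), 24);
    // So a pair of vectors (keyframe times plus sample values) occupies 48 bytes...
    assert_eq!(size_of::<(Vec<f32>, Vec<f32>)>(), 48);
    // ...leaving 16 of the quoted 64 bytes for the nested enum discriminants and padding.
    assert_eq!(64 - 48, 16);
}
```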
Secondly, proactive measures have been taken to ensure that the curves are designed with good cache locality properties, internally using a `Vec<f32>` for keyframe times paired with a contiguous buffer of sample data `Vec<T>`, which is sliced up to actually perform sampling. One nice thing is that `bevy_animation` is not doing any very fancy gymnastics here; it's mostly just using the `bevy_math::curve::core` APIs as someone would in user-space. The `IterableCurve` abstraction mentioned above allows these niceties to extend to the case of morph weights without allocation concerns.
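As a concrete picture of that layout, here is a simplified sketch (invented names, not the actual `bevy_math::curve::core` types) of a translation-style keyframe curve stored as two parallel, contiguous buffers, assuming sorted times and at least two keyframes:

```rust
use glam::Vec3;

/// Hypothetical keyframe curve layout: times and values live in two parallel,
/// contiguous buffers, which keeps sampling cache-friendly.
struct LerpKeyframes {
    times: Vec<f32>,
    values: Vec<Vec3>,
}

impl LerpKeyframes {
    fn sample(&self, t: f32) -> Vec3 {
        // Binary-search the sorted times for the interval containing `t`.
        let upper = self
            .times
            .partition_point(|&time| time <= t)
            .clamp(1, self.times.len() - 1);
        let (t0, t1) = (self.times[upper - 1], self.times[upper]);
        let s = ((t - t0) / (t1 - t0)).clamp(0.0, 1.0);
        // Linearly interpolate between the two bracketing samples.
        self.values[upper - 1].lerp(self.values[upper], s)
    }
}
```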
Finally, preliminary performance data from tracing looks fine (huge grain of salt — just my machine, on this one example, etc.). On my machine, performance on `many_foxes` appears to be basically the same, with the PR branch running within ±5% of `main` on `animate_targets`.
One thing that I really want is a broader base of examples to pull from for performance benchmarking our animation systems, to get a better idea of the potential impact under more realistic circumstances.
Future direction
I think it's unlikely that this representation will serve as the be-all and end-all for encoding character animations — to the contrary, it's likely in the future that `bevy_animation` will want to include things like compression, at which point these curve constructions will no longer (in an ideal world) see much direct use. On the other hand, I believe that the Curve API itself might provide some value in accomplishing feats like compression, and it would be especially nice if the tools for such things could actually be made reusable across domains. To me, this change helps facilitate that, in addition to just providing a nicer internal representation for glTF animations.
Hey, cool stuff. However, I think it would be important to keep baked animation data at all times. With curves it's really convenient to work on and modify animations, which I would love, but to actually play them, having to compute curve values every frame would be a loss of performance compared to baked animation. Once the curve animation is all set, it should be converted to a baked animation before being sent to the animation graph, which would only read a value and update on engine ticks. Here it feels like you have `VariableCurve`s that only hold `CubicSpline`, for instance.
This is pretty much a direct drop-in replacement for how animation currently works in Bevy (which is to say, there is no baking). It also doesn't really preclude things like baking `AnimationGraph` output in the future; it's more-or-less just making the parts of `VariableCurve` able to stand on their own as data.
I believe right now the transform values are lerped between the previous and next key, which will always be the case because of variable frame durations. The lerp function comes from the glam crate, which uses SIMD, making this operation very fast. So I think it would be a nice addition in the future to cache the baked animation and avoid as many operations as possible. I didn't quite look at your code yet, but I'll try to do that to see if I can make any useful suggestions ^^
Actually, I think that `bevy_animation` doesn't use glam's `Vec3A` for interpolation, but maybe we should switch at some point. In any case, the keyframe interpolation in this implementation works in pretty much exactly the same way, just with the interpolation modes reified to trait implementations of types (vector types just use `lerp` for this). It may be the case that in doing performance optimizations, however, we will want to collapse some of these abstractions (in favor, for example, of using SoA); in any case, it would remain true that, for example, `TranslationCurve` would remain a unified `Curve<Vec3>`.

One advantage of this is that if we do ever actually want to bake anything, the `Curve` interface makes this quite standard, since we can just call `resample` on anything that is a curve and then extract the result. For example, if we get to the point where the animation graph output is a `Curve`, then we can also bake the result pretty easily. Of course, this is probably overlooking a number of organizational and technical challenges that would intervene along the way 😄
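To illustrate what baking by resampling boils down to, here is a rough, self-contained sketch only; the real `resample` machinery on `Curve` returns a new curve type rather than a bare `Vec`, and `bake_evenly` is an invented name:

```rust
/// Sample a curve-like function at evenly spaced times over `[start, end]` and
/// collect the results, which is the essence of "baking" by resampling.
fn bake_evenly<T>(curve: impl Fn(f32) -> T, start: f32, end: f32, samples: usize) -> Vec<T> {
    assert!(samples >= 2, "need at least two samples to cover the domain");
    (0..samples)
        .map(|i| {
            let t = start + (end - start) * (i as f32) / ((samples - 1) as f32);
            curve(t)
        })
        .collect()
}

fn main() {
    // Bake a toy curve into five evenly spaced samples over [0, 1].
    let baked = bake_evenly(|t| 2.0 * t, 0.0, 1.0, 5);
    assert_eq!(baked, vec![0.0, 0.5, 1.0, 1.5, 2.0]);
}
```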
I was looking at lib.rs in `bevy_animation`: `fn apply_single_keyframe()`. I believe it's the function used for applying animations, but I'm not sure; I just guessed from the name :D But this function is indeed using `Quat::slerp` and `Vec3::lerp`, which come from glam.

And yeah, I think it's awesome to be able to store animations in curves, but I was just saying that for the actual display it would be nice to cache it and reduce the per-frame iterations.

It would be easier to tell with some use cases. Where is the interpolation done at the end to be played by the `AnimationPlayer`?
Ah; `apply_single_keyframe` is only used in the case that the loaded glTF animation consists of only a single keyframe, and the `lerp` and `slerp` there are better described as "blending" rather than interpolation (they come from weights in the animation graph); the actual interpolation logic is in `apply_tweened_keyframe`. In this proposal, both of these would be largely replaced just by calling `sample` on the associated curves (the constant curve variants playing the role of `apply_single_keyframe`).
Ok, I see; we should probably look into code optimization on this topic later. Even the tweened keyframe function is doing a few unnecessary copies:
let tangent_out_start = keyframes[step_start * 3 + 2];
let tangent_in_end = keyframes[(step_start + 1) * 3];
let value_end = keyframes[(step_start + 1) * 3 + 1];
let result = cubic_spline_interpolation(
    value_start,
    tangent_out_start,
    tangent_in_end,
    value_end,
    lerp,
    duration,
);
The keyframe update functions should be very fast compared to the data management functions, which can take their time. But yeah, lots of things to do ^^ Having curves is already a huge plus. It would be interesting to reimplement all the actual curve evaluations in SIMD as well.
Very cool! Obligatory note that we cannot merge this without profiling.
Well, I did profile it, but as noted in the description, more profiling would definitely be great.
Really cool, please accept the merge; this will make the logic behind blended mask nodes way easier to implement.