Cubic splines overhaul

# Objective

Improve the `bevy::math::cubic_splines` module by making it more flexible and adding new curve types.

Closes #10220

## Solution

Added new spline types and improved the existing ones.

## Changelog

### Added

- `CubicNurbs` cubic curve generator, which allows setting the knot vector and the weights associated with every point
- `LinearSpline` cubic curve generator, which allows generating a linear interpolation curve segment
- Ability to push additional cubic segments to `CubicCurve`
- `IntoIterator` and `Extend` implementations for `CubicCurve`

### Changed

- The `Point` trait is now implemented as a blanket trait. It is still implemented for the previous types, but other types that satisfy the trait requirements can now be used as well.
- `CubicCurve::coefficients` was moved to `CubicSegment::coefficients` because the function returns a `CubicSegment`, so it seems logical for it to be associated with `CubicSegment` instead. The method is still not public.

### Fixed

- `CubicBSpline::new` was referencing the Cardinal spline instead of the B-spline
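For illustration, a rough sketch of the new segment-pushing API in use (illustrative only, assuming `CubicCurve` iterates over its `CubicSegment`s; see the diff for the exact signatures):

```rust
use bevy_math::{cubic_splines::CubicBezier, vec2};

// Build a curve from one Bézier segment.
let mut curve = CubicBezier::new([[
    vec2(0.0, 0.0), vec2(1.0, 2.0), vec2(3.0, 2.0), vec2(4.0, 0.0),
]])
.to_curve();

// `CubicCurve` now implements `IntoIterator` over its segments and
// `Extend`, so the segments of another curve can be appended onto it.
let tail = CubicBezier::new([[
    vec2(4.0, 0.0), vec2(5.0, -2.0), vec2(7.0, -2.0), vec2(8.0, 0.0),
]])
.to_curve();
curve.extend(tail);
```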
The only thing left is documentation for the new types and methods, I think.
@aevyrie Ok, ready for review now
The failed check doesn't seem to be related to the changes in this PR but rather the main branch code.
Given that the conversion of NURBS to CubicCurve is only approximate (as far as I understand, please correct me if I'm wrong), there might be a need for the ability to sample CubicNurbs directly.
@IQuick143 The matrix form is just a simplified version of the original definition, brought into a form that is sometimes easier to use. The conversion is not approximate.
> Given that the conversion of NURBS to CubicCurve is only approximate (as far as I understand, please correct me if I'm wrong), there might be a need for the ability to sample CubicNurbs directly.

> @IQuick143 The matrix form is just a simplified version of the original definition, brought into a form that is sometimes easier to use. The conversion is not approximate.
My concern is that NURBS curves are known to be able to express curves such as conic sections, whereas polynomial curves (of any degree) are not.
Also, looking at the code again, I see that the weights are used only in the constructor, when scaling the control_points, and are then completely missing from all other calculations. I don't see how that can be right, given that NURBS curves are defined by the formula:
Curve(t) = (sum_i N_i(t) weight_i P_i) / (sum_i N_i(t) weight_i) (this ratio is why it's a Rational spline)
Where N_i(t) is a polynomial spline for the i-th control point.
Scaling the points by their weights computes the numerator sum, but completely omits the denominator expression.
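To make this concrete, here's a minimal sketch of a direct evaluation (a hypothetical free function, with scalar "points" for brevity) showing where the denominator has to enter:

```rust
/// Hypothetical direct NURBS evaluation; `basis[i]` is N_i(t) for the
/// i-th control point. Scalar "points" stand in for Vec2/Vec3 here.
fn nurbs_point(basis: &[f32], weights: &[f32], points: &[f32]) -> f32 {
    let numerator: f32 = basis
        .iter()
        .zip(weights)
        .zip(points)
        .map(|((n, w), p)| n * w * p)
        .sum();
    let denominator: f32 = basis.iter().zip(weights).map(|(n, w)| n * w).sum();
    // Pre-scaling the control points by their weights yields only
    // `numerator`; without this division the curve is not rational.
    numerator / denominator
}
```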
Well, the way you wrote it, (sum_i N_i(t) weight_i) completely cancels out from the expression it seems. I assume you have some error in that expression, so I'll read up on how weights are applied in nurbs.
Also, as far as I've read, the paper cited in the source code does not deal with rational splines at all, only non-uniform B-splines. I think NURBS is a misnomer overall, because we don't actually implement a NURBS curve, just a non-uniform B-spline with a weird weight scaling thing going on.
> Well, the way you wrote it, (sum_i N_i(t) weight_i) completely cancels out from the expression it seems. I assume you have some error in that expression, so I'll read up on how weights are applied in nurbs.
It does not cancel out, because the control points in front of each summand are different. Here's a toy worked example for two control points:
Curve(t) =
(sum_i N_i(t) weight_i P_i) / (sum_i N_i(t) weight_i) =
(N_1(t) * w_1 * P_1 + N_2(t) * w_2 * P_2) / (N_1(t) * w_1 + N_2(t) * w_2)
One can see that this fraction does not cancel down to P_1 + P_2 or anything similar.
Not to mention, if you tried to cancel it out you would end up with P_i, which is a malformed expression, because nothing specifies what the index is (as if you moved the body of a for loop out into a standalone expression and were left with an undefined index).
You're not completely wrong in principle, however: if the control points were all the same, it would work:
(sum_i A_i * X) / (sum_i A_i) = (X * sum_i A_i) / (sum_i A_i) = X * (sum_i A_i) / (sum_i A_i) = X
Here the important thing is that we factored X out of the sum, which we could do because it does not depend on the summation index.
Oh, I see, I was confused by the notation. I thought sum_i was just another variable/function, but it's an operator.
Currently I'm working through the math on wikipedia to see how I can simplify rational basis functions.
UPD: I made some progress by expanding the denominator into a polynomial; I'm just not sure what to do next. Either try to simplify the whole expression so the denominator folds into the resulting polynomial coefficients, if that's possible, or add a second polynomial by which the result would be divided.
UPD2: Changing anything in the CubicSegment, which is responsible for the actual polynomial itself, might require changes to the velocity and acceleration calculations as well. That's part of the reason I'm still trying to bring the formulas back into cubic polynomial form. No major success so far, sadly.
I agree NURBS curves have many uses, but they cannot be reduced to a CubicCurve: they are rational functions, not polynomials, so you can at best find an approximation, but that kinda defeats the purpose imo. Well... they can if you use homogeneous coordinates of a projective space, though you won't avoid the final division in the end.
Projective spaces are based on the idea of using lines as points; the motivation comes from camera projection, under which every point along a ray hits the same pixel.
To do this mathematically, you take a space like R^3 (3D real space, think Vec3) and consider two points to be equivalent if they are multiples of each other, i.e. (a,b,c) and (x,y,z) are equivalent if a = kx, b = ky, c = kz for some multiplier k.
If two vectors are equivalent, they correspond to the same point; these are the homogeneous coordinates of that point, sometimes written with colons, like (a : b : c).
(Sidetrack: the term homogeneity refers to this behaviour under scaling of the coordinates. Another (somewhat unrelated) example would be homogeneous functions of degree n, which satisfy f(ka, kb, kc, ...) = k^n * f(a, b, c, ...) and exhibit a similar symmetry under scaling; a homogeneous polynomial is a special case of this. Vector norms (the length of a vector) are yet another example of a homogeneous function (of degree 1).)
Just like 3D space projects onto a 2D space, the projective space that is created this way is one dimension less than the space you started with. You can identify a 2D space with a part of the projective space with 3 coordinates, by saying: the point (x,y) corresponds to the point (x : y : 1) (the same point could be also described with coordinates (2x : 2y : 2), but we fix the last coordinate to be 1 in order to ensure consistency).
(Sidetrack: this does not actually cover every point in the projective space, because points given by (x : y : 0) cannot be represented using (x : y : 1), but we don't need them here, we just want to represent our usual points.)
(Sidetrack: This representation is useful in graphics and transforms, because it allows us to represent translations with the following matrix:
```
[1 0 0 x]
[0 1 0 y]
[0 0 1 z]
[0 0 0 1]
```
If I'm not mistaken, projection matrices work similarly, but they do not preserve the 1 in the last spot, one then has to "normalise" the vector by dividing it by the last component (which makes the last component 1 again). This division then finishes the projection transformation... which leads us back to this subject being named projective geometry)
So how does this relate to NURBS? You can represent NURBS by a CubicCurve in a 4D space and then convert the homogeneous coordinates into 3D as described earlier. To compute a NURBS curve in 3D, you take your control points (x_i, y_i, z_i), convert them into points in projective space (x_i : y_i : z_i : 1), scale them by their weights (which changes the coordinates but not the point) to get (w_i * x_i, w_i * y_i, w_i * z_i, w_i), and calculate a CubicCurve through these points. This gives you a point described by homogeneous coordinates (a : b : c : w), which you then convert back to an ordinary point by computing (a / w, b / w, c / w).
Wikipedia has a good image to show how this works: https://en.wikipedia.org/wiki/B-spline#/media/File:RationalBezier2D.svg
This mechanism is also why NURBS curves are able to describe conic sections. All you need to do is describe a curve on the surface of a cone (a parabola does the job, and that is a quadratic curve); these points correspond to lines passing through the origin, forming the surface of the cone. The last step effectively computes the intersection of these lines with the w = 1 plane, giving the desired conic section. Again, wikipedia has a good visualisation: https://en.wikipedia.org/wiki/Non-uniform_rational_B-spline#/media/File:NURBS-circle-3D.svg
So... to sum up: a (cubic) NURBS curve cannot be converted to a CubicCurve (at least not losslessly); it is simply too expressive for that. But one can compute a NURBS<Vec2/3> with a CubicCurve<Vec3/4>, with the caveat that after computing a point, one has to divide the vector by its last component.
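In code, that recipe looks roughly like this (a minimal sketch using glam types, not part of the module's current API):

```rust
use bevy_math::{Vec3, Vec4};

/// Lift a weighted 3D control point into homogeneous coordinates:
/// (w*x : w*y : w*z : w) is the same projective point as (x : y : z : 1).
fn lift(p: Vec3, w: f32) -> Vec4 {
    (p * w).extend(w)
}

/// Project a homogeneous curve sample back down to 3D; this final
/// division is what makes the resulting curve rational.
fn project(h: Vec4) -> Vec3 {
    h.truncate() / h.w
}
```

A CubicCurve<Vec4> built through the lifted control points, with project applied to each sample, reproduces the NURBS curve exactly.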
I'm in the middle of another much more thorough review of this, given that there seems to be a little contention in this thread.
First off, I think we can assume given the topic of this PR that we are all familiar with homogeneous coordinates.
Some of the confusion here may stem from the fact that we don't directly use de Boor's algorithm to sample B-splines. Instead, we pre-compute the coefficients for a sequence of piecewise continuous cubics using a characteristic matrix derived from it. This means we don't have direct access to the B-spline basis functions relied upon by the Piegl-Tiller formulation of NURBS, so whatever we do is going to look a little different from the classic version.
Looks like we either want to add RationalCurve and RationalSegment structs representing the quotient of the corresponding cubic ones, or completely replace the latter with rational cubic curves. CubicNurbs can output precomputed piecewise rational curves (Choi et al. derive the homogeneous 4x4 matrix we need here), then we can sample those just like the rest of the API, and we can maybe get rid of that weird B-splines paper.
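To sketch the idea (names and layout hypothetical, mirroring the existing CubicSegment; Vec3 for concreteness):

```rust
use bevy_math::Vec3;

/// Hypothetical rational segment: the quotient of a vector-valued cubic
/// and a scalar cubic, both stored as precomputed polynomial coefficients.
struct RationalSegment {
    numerator: [Vec3; 4],
    denominator: [f32; 4],
}

impl RationalSegment {
    /// Evaluate both cubics (Horner's method) and divide; the division is
    /// the perspective step that makes the curve rational.
    fn position(&self, t: f32) -> Vec3 {
        let [a, b, c, d] = self.numerator;
        let [e, f, g, h] = self.denominator;
        let num = a + (b + (c + d * t) * t) * t;
        let den = e + (f + (g + h * t) * t) * t;
        num / den
    }
}
```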
Also, we really should do a second pass on all the documentation of the cubic_splines module. It gets a little technical and sparse in several places. I would move to merge this first and then do a complete review to bring everything in line.
Apologies, spontaneous math rambles are my weakness. I try to avoid unexplained jargon because most people I meet are not aware of it.
I understand that we use piecewise CubicSegments for the sampling instead of the basis functions, but the end result is the same: we are handling piecewise polynomial functions, which therefore can't express rational curves like NURBS. If I read the code correctly, we dropped the R from NURBS. I haven't read Piegl-Tiller's work, so I'm not sure how their formulation looks (I assume it's similar to what I wrote, given that you mention it).
(I'm not a spline expert, so I apologise if I'm getting anything wrong here, I just had a hunch the code is not doing what it should be.)
A Rational[Cubic]Curve and Rational[Cubic]Segment sound like a good design to me. I'm not sure about moving the existing CubicCurves over, since sampling a RationalCurve is potentially more expensive (like all performance concerns, this needs a benchmark), even in cases where it provides no benefit. I'd say, however, that a conversion method (a From impl, perhaps?) from CubicCurve to RationalCurve would be desirable.
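The conversion that way is trivial: a polynomial is just a rational function with constant denominator 1. Building on the hypothetical RationalSegment sketched above:

```rust
impl RationalSegment {
    /// Wrap an existing cubic's coefficients in a rational segment whose
    /// denominator is the constant polynomial 1; the quotient then
    /// collapses back to the original cubic, so nothing is lost.
    fn from_cubic_coefficients(coeff: [Vec3; 4]) -> Self {
        Self {
            numerator: coeff,
            denominator: [1.0, 0.0, 0.0, 0.0],
        }
    }
}
```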
I've just finished checking, and the current form of generate_matrix is absolutely 100% correct. If you go through the derivation, that matrix actually encodes the quotient of the basis functions. Choi and Qin use slightly different derivations and both end up with exactly this result, and Choi gives an explicit formula using it which matches the existing use in CubicCurve. NURBS can absolutely be formulated as a vector-valued cubic parametric curve.
I still need to take a very close look at how CubicCurve is actually sampled, and I still want to visually test this, but I am now highly confident in the core method of implementation. I don't think we will need extra types to represent rational cubic curves after all. Very nice work all in all.
Edit: Choi et al. is "Matrix representation for NURB curves and surfaces", btw; I think it's a better reference than the Qin paper for the derivation, but Qin is (apparently) more efficient. Piegl & Tiller wrote "The NURBS Book" back in the 90s, and it's what I used the only other time I had to do stuff with NURBS.
For reviewers who may be lurking: this is much closer, but still not ready for merge. @IQuick143 identified a serious flaw with sampling and did a fantastic job fixing and documenting it, but there are still a few pending issues and a lack of tests.
Yep, I'm happy to count your review here @NthTensor :) The main thing is "make sure that we have consensus and eyes across multiple Bevy community members", and this PR certainly counts.
Going to hold off on merging this until the 0.14 cycle: I want to give us time for this to bake and get tested and refined, rather than shipping it last minute.
Yep, that's a fair call. I will submit some more tests to the upstream fork next week. I'm sure we can get this rock-solid for 0.14.
Thanks Alice!
Love the addition of NURBS, this was on my list of things I wanted to add eventually. I'm glad to see the characteristic matrix design still works for this. I referenced the NURBS matrix paper during v2 of this module, so I had hoped this was the case.
Please re-run the benchmarks; performance was one of my key focuses when writing this, and some performance-sensitive code was touched. I seem to recall removing exponents, for example, but finding that the naive approach was faster. It would be good to double-check that there are no surprising perf regressions.
Ran the bezier benchmark group; performance has improved compared to current main:
```
easing_1000             time:   [21.065 µs 21.071 µs 21.075 µs]
                        change: [-8.2162% -8.1920% -8.1685%] (p = 0.00 < 0.05)
                        Performance has improved.

cubic_position_Vec2     time:   [1.4079 ns 1.4096 ns 1.4113 ns]
                        change: [-20.278% -19.988% -19.452%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

cubic_position_Vec3     time:   [2.2770 ns 2.2782 ns 2.2792 ns]
                        change: [-19.322% -19.203% -19.082%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 22 outliers among 100 measurements (22.00%)
  4 (4.00%) low severe
  1 (1.00%) low mild
  1 (1.00%) high mild
  16 (16.00%) high severe

cubic_position_Vec3A    time:   [1.1719 ns 1.1720 ns 1.1722 ns]
                        change: [-25.940% -25.825% -25.718%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
  2 (2.00%) low severe
  4 (4.00%) high mild
  9 (9.00%) high severe

build_pos_cubic_100_points
                        time:   [155.26 ns 155.64 ns 156.25 ns]
                        change: [-26.596% -26.464% -26.321%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) high mild
  4 (4.00%) high severe

build_accel_cubic_100_points
                        time:   [166.01 ns 166.18 ns 166.36 ns]
                        change: [-21.800% -21.664% -21.512%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe
```
I think we need to think about the orphan rule before making a decision about the blanket trait impl.
Left some thoughts on that comment thread (even though it is marked resolved).
I still consider this feature blocking on tests. We should not merge this until someone writes a more complete test suite for the nonlinear portion of NURBS. That can be me, if you are willing to wait a week or two.
I've also uploaded the benchmark results to my website: https://jtcf.ru/bevy-benches/
I'm seeing different results (MacOS, M3 Max), with the first bench being a wash, the next three being a ~10% regression, and the last two a ~10% improvement. Here are two runs going from main to this PR branch:
Run 1
```
easing_1000             time:   [8.0191 µs 8.0530 µs 8.0967 µs]
                        change: [-3.2295% -2.5671% -1.8296%] (p = 0.00 < 0.05)
                        Performance has improved.

cubic_position_Vec2     time:   [1.2105 ns 1.2113 ns 1.2120 ns]
                        change: [+11.656% +11.899% +12.117%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  2 (2.00%) high mild
  5 (5.00%) high severe

cubic_position_Vec3     time:   [2.1178 ns 2.1204 ns 2.1245 ns]
                        change: [+11.433% +11.705% +11.990%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  5 (5.00%) high severe

cubic_position_Vec3A    time:   [2.2291 ns 2.2304 ns 2.2319 ns]
                        change: [+9.0580% +9.3894% +9.7763%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) high mild
  7 (7.00%) high severe

build_pos_cubic_100_points
                        time:   [108.25 ns 108.35 ns 108.46 ns]
                        change: [-11.158% -10.965% -10.776%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  8 (8.00%) high mild
  3 (3.00%) high severe

build_accel_cubic_100_points
                        time:   [108.05 ns 108.13 ns 108.22 ns]
                        change: [-11.318% -11.175% -11.025%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  4 (4.00%) high mild
  6 (6.00%) high severe
```
Run 2
```
easing_1000             time:   [7.9431 µs 7.9541 µs 7.9662 µs]
                        change: [-1.2036% -0.6829% -0.1405%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 11 outliers among 100 measurements (11.00%)
  4 (4.00%) high mild
  7 (7.00%) high severe

cubic_position_Vec2     time:   [1.2122 ns 1.2130 ns 1.2139 ns]
                        change: [+11.096% +11.780% +12.551%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
  5 (5.00%) low severe
  5 (5.00%) high mild
  6 (6.00%) high severe

cubic_position_Vec3     time:   [2.1370 ns 2.1390 ns 2.1412 ns]
                        change: [+12.254% +12.580% +12.878%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) low severe
  2 (2.00%) low mild
  1 (1.00%) high mild
  3 (3.00%) high severe

cubic_position_Vec3A    time:   [2.2316 ns 2.2333 ns 2.2353 ns]
                        change: [+8.4694% +8.7661% +9.0287%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  2 (2.00%) low severe
  1 (1.00%) low mild
  3 (3.00%) high mild
  4 (4.00%) high severe

build_pos_cubic_100_points
                        time:   [108.50 ns 108.64 ns 108.81 ns]
                        change: [-11.073% -10.855% -10.665%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild

build_accel_cubic_100_points
                        time:   [109.18 ns 109.43 ns 109.71 ns]
                        change: [-12.361% -11.958% -11.573%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe
```
On a hunch: I remembered performance being very sensitive to changes in CubicSegment::position. After reverting this to using powi and retesting, that seems to account for all of the performance change. In other words, reverting that one line accounts for all the perf difference I see between this branch and main.
We'd need to look at the assembly to understand whether this is just a fluke, but it seems to be consistent. Your benchmark results suggest this is a platform-specific micro-optimization; I'm guessing between ARM and x86.
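For context, the two formulations of CubicSegment::position under comparison are along these lines (a reconstruction from the discussion, not the exact diff):

```rust
use bevy_math::Vec3;

// powi form: expresses the powers through the `powi` intrinsic.
fn position_powi(coeff: [Vec3; 4], t: f32) -> Vec3 {
    let [a, b, c, d] = coeff;
    a + b * t + c * t.powi(2) + d * t.powi(3)
}

// Explicit-multiplication ("naive") form: spells the powers out.
fn position_mul(coeff: [Vec3; 4], t: f32) -> Vec3 {
    let [a, b, c, d] = coeff;
    a + b * t + c * (t * t) + d * (t * t * t)
}
```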
I was using my workstation for the benchmarks, which has a Ryzen 5 3600 and a 4x8GB 3200MHz CL16 RAM configuration.
The change in results on your machine did surprise me, but it does make sense when you consider the vast differences between ARM and x86. Either in the discussions or in one of the related commit messages there was a link to godbolt comparing the two implementations; it can probably be used to see the difference on ARM as well.
The godbolt link turned out to be in a thread in discord (before math-dev was created): https://godbolt.org/z/M7s3zY3cx
UPD: Same but with rustc 1.76.0 and target set to aarch64-apple-darwin: https://godbolt.org/z/eKczTvd7T
The difference in assembly is minuscule, a couple of instructions fewer with the new implementation, which is weird considering your benchmark results.
I don't know what our position is on platform specific optimizations, but we could special case the use of pow if the performance difference really is that significant.
WRT the blanket implementation, I am starting to come around to agreeing with @aevyrie. It feels S-Controversial, but silly to mark the whole PR because of this.
Good point that bevy_math should be focused on glam.
bevy_math uses glam as it's the most convenient and best-fitting library. It's a primary library, not the only one. I think it's better to provide the implementations instead of making users do the whole newtype dance if they dare use some other type.
It could be possible to have a feature that switches between the blanket impl and glam-only impls. But I don't find that clean, and it feels like overcomplication due to a conflict of opinions. I don't expect anyone to like this approach, but that's just my opinion.
> I don't know what our position is on platform specific optimizations
It's probably not worth it unless we have more confidence/data suggesting they're worthwhile. I think the change in this PR is probably fine. It's a bit of a trade on ARM, but a win on x86, at least with that one sample.
RE: blanket impl:
> It's a primary library, not the only one.
Isn't it exactly the only one?
bevy_math only has 3 dependencies (now 4 with thiserror), only uses glam for math types, and exposes features for mint interoperation. Up to this point, bevy_math's M.O. has been to entirely use glam and rely on glam's mint feature to allow users to easily convert to/from glam types.
In other words, the behavior of bevy_math w.r.t. other math types has always been conversion. Without the blanket impl, you can just use the built-in mint conversions (or bring your own):
```rust
let bezier = CubicBezier::<Vec3>::new([[
    my_nalgebra_vec3_1.into(),
    my_nalgebra_vec3_2.into(),
    my_nalgebra_vec3_3.into(),
    my_nalgebra_vec3_4.into(),
]])
.to_curve();
```
Sorry, my wording was a bit messed up. I meant "not the only one allowed to be used".
Regarding the mint feature, I was not aware of its existence, and it does make the situation better.
At this point I'm unsure what to do, as the downsides of reverting the blanket impl have now been addressed.
I think that if it were to be reverted, the list of types the trait is implemented on should be revised to make sure it includes all the types that could possibly be used from glam and std.
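For reference, the blanket impl under discussion has roughly this shape (a sketch; the actual bounds in the PR may differ in detail):

```rust
use core::fmt::Debug;
use core::ops::{Add, Mul, Neg, Sub};

/// Abridged stand-in for the module's `Point` trait.
pub trait Point:
    Mul<f32, Output = Self>
    + Add<Self, Output = Self>
    + Sub<Self, Output = Self>
    + Neg<Output = Self>
    + Default
    + Debug
    + Clone
    + Copy
{
}

// The blanket impl: any type satisfying the bounds is a `Point`, including
// types from downstream crates. It also means no one can write a manual
// impl alongside it, which is the crux of the discussion above.
impl<T> Point for T where
    T: Mul<f32, Output = T>
        + Add<T, Output = T>
        + Sub<T, Output = T>
        + Neg<Output = T>
        + Default
        + Debug
        + Clone
        + Copy,
{
}
```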