Natural Cubic Spline Interpolation
Natural Cubic Spline Interpolation
This Pull Request implements the cubic spline interpolation as one of the polynomial interpolators mentioned in issue https://github.com/avhz/RustQuant/issues/5.
The implementation was benchmarked against SciPy’s CubicSpline.
Mathematical Background
We want to construct a cubic spline $S(x)$ that interpolates the points
$$\left(x_0, y_0\right), \left(x_1, y_1\right), \ldots,\left(x_n, y_n\right)$$
Since we are working with a natural cubic spline, we define
$$ z_i = \frac{d^2S(x_i)}{dx^2}, $$
the second derivative of the spline at each interpolation point and set
$$ z_0 = z_n = 0 $$
The full derivation of the cubic spline formulation is fairly involved, so some details are omitted here. For reference, the derivation can be found in the lecture notes by T. Gambill (2011) Interpolation/Splines (slides 14-27), which served as the main source for this implementation.
The equation of a natural cubic spline on the interval $x\in\left[x_i, x_{i+1}\right)$ is given by:
$$ S_i(x) = \frac{z_i}{6h_i}\left(x_{i+1} - x\right)^3 +\frac{z_{i+1}}{6h_i}\left(x - x_i\right)^3 +\left(\frac{y_{i+1}}{h_i} - \frac{h_i}{6}z_{i+1}\right)\left(x - x_i\right) +\left(\frac{y_{i}}{h_i} - \frac{h_i}{6}z_{i}\right)\left(x_{i+1} - x\right) $$
(Equation (1))
The next step is to determine suitable values for $z_i$ in order to construct the natural cubic spline.
Differentiating $S_i(x)$ yields
$$ S_i'\left(x\right) = -\frac{z_i}{2h_i}\left(x_{i+1}-x\right)^2 +\frac{z_{i+1}}{2h_i}\left(x-x_i\right)^2 +\frac{y_{i+1}}{h_i} -\frac{h_i}{6}z_{i+1} -\frac{y_i}{h_i} +\frac{h_i}{6}z_i $$
At the interpolation point $x_i$, the spline segments must join smoothly. This requires the gradients at the interpolation point to be equal on both sides. Hence, the following condition is enforced:
Substituting these expressions and rearranging gives the relation
$$ h_{i-1}z_{i-1} + 2\left(h_i + h_{i-1}\right)z_i + h_iz_{i+1} = 6\left(b_i - b_{i-1}\right) $$
where
$$ b_i=\frac{y_{i+1} - y_i}{h_i} $$
This leads to an $(n-1)\times(n-1)$ system of equations:
where
$$ u_i = 2\left(h_i + h_{i+1}\right) $$
Now, we can solve the above equation to obtain the values of $z_i$.
All diagonal and off-diagonal entries of $A$ are positive, implying that $A$ is positive definite. This property guarantees the existence of a Cholesky decomposition, allowing us to factor $A$ as:
$$ A=LL^T \Rightarrow LL^T \mathbf{z} = \mathbf{b} $$
We can solve for $\mathbf{z}$ in two steps. First, define $\mathbf{x} := L^T \mathbf{z}$, and then solve for $\mathbf{x}$ using forward substitution:
$$L \mathbf{x} = \mathbf{b}$$
Once $\mathbf{x}$ is obtained, solve for $\mathbf{z}$ using backward substitution:
$$L^T \mathbf{z} = \mathbf{x}$$
Amendments to the InterpolationIndex trait
Additional traits have been added to the type Delta:
Add:Delta+Delta=DeltaSub:Delta-Delta=DeltaIntoValue: Converts aDeltainto anInterpolationValue. This is required for cubic splines, as the implementation involves arithmetic operations betweenDeltaandInterpolationValuetypesCopy: Required when unpackingDeltavalues from vectors
Amendments to the InterpolationValue trait
MulAssign:value_1 *= value_2num::FromPrimitive: In order to take anf64value and convert it to the appropriateInterpolationValuevalue
Removing Unsigned integers from InterpolationIndex implementation
u8, u16, u32, u64, u128, and usize have been removed from the trait implementation of InterpolationIndex because cubic splines can produce negative Delta values. The trait implementation for i8, i16, i32, i64, i128, and isize should be sufficient for cases where integers are used as indices.
Hi @yfnaji :) Thanks for this one, splines have been needed for a while!
I'll take a closer look tomorrow afternoon, and hopefully merge it
Hey @avhz - just saw the Clippy warnings, I am working on them now, but still feel free to review and comment on this PR (:
@yfnaji
I did a bit of refactoring of the Curves today, which required adding Send + Sync to the interpolators. Doesn't look like it created any conflicts, so just a heads up :)
Hi @avhz
This PR has been amended to use Cholesky decomposition and elimination methods instead of LDLT, which has yielded more accurate results (more consistent with SciPy). This, however, introduces an additional trait bound for square roots, needed for the Cholesky decomposition.
I also removed date references from the Interpolators, since implementing cubic splines requires interaction between Delta types and Values; some meaningul conversion of time durations necessary. I have needed to amend curve.rs to reflect these changes (see comment).
This change may also allow us to simplify InterpolationIndex and InterpolationValue. However, doing so increases the complexity of the trait bounds when accounting for interactions between the two. Their distinction was useful when dates were present, since Delta had its own type, but that separation may now be obsolete if we continue without dates.
Note that I also commented out the from_nodes() method in curve.rs for now, since implementing PyFunctionArgument for BTreeMap<OrderedFloat<f64>, f64> is non-trivial without replacing it with a PyDict or another object. I may need some help with this part.
I’d appreciate feedback on whether these changes should remain grouped together in this PR, or if it would be clearer to split them into separate PRs. In particular, I’m unsure whether removing dates (and the possible obsolescence of InterpolationIndex vs. InterpolationValue) should be part of this PR or discussed separately.