oneDAL
Added x2c_mom reference implementation
Overview
Reference implementation of the statistics routine x2c_mom
These changes enable the em_gmm_dense_batch example with the reference backend, so the example has been removed from the exclude list. The changes were tested on AWS Graviton3 with a gcc + OpenBLAS build.
Notation
Data is a matrix $X\in\mathbb{R}^{p\times n}$. Each column is a $p$-dimensional vector sampled independently. The matrix $X$ is assumed to be stored in column-major fashion.
1. x2c_mom
The variance estimator is a $p$-dimensional vector whose $i$th component is $$v_i = \frac{1}{n-1}\sum_{j=1}^n (x_{ij} - \mu_i)^2.$$ The implementation first computes, for each row $i$, the raw sum $S^{(1)}_i := \sum_{j=1}^n x_{ij}$, the second raw sum $S^{(2)}_i := \sum_{j=1}^n x_{ij}^2$, and the mean $\mu_i = S^{(1)}_i/n$. Expanding the square in the definition of $v_i$ then gives $$v_i = \frac{S^{(2)}_i}{n-1} - \frac{n}{n-1}\mu_i^2 = \frac{S^{(2)}_i}{n-1}-\frac{(S^{(1)}_i)^2}{n(n-1)},$$ which is used to compute the variance.
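The computation described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the code added in this PR: the function name `x2c_mom_sketch`, the `Moments` struct, and the use of `std::vector<double>` are all hypothetical, and the actual routine's signature and output set may differ. The sketch assumes the $p\times n$ matrix is stored column-major, so element $(i, j)$ sits at index `i + j * p`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result container; field names are illustrative only.
struct Moments {
    std::vector<double> sum;       // S^(1)_i = sum_j x_ij
    std::vector<double> sum_sq;    // S^(2)_i = sum_j x_ij^2
    std::vector<double> mean;      // mu_i = S^(1)_i / n
    std::vector<double> variance;  // v_i, unbiased (divides by n - 1)
};

// x holds a p-by-n matrix in column-major order: x[i + j * p] = x_ij.
Moments x2c_mom_sketch(const std::vector<double>& x, std::size_t p, std::size_t n) {
    assert(n > 1);              // the n - 1 denominator needs at least two samples
    assert(x.size() == p * n);

    Moments m{std::vector<double>(p, 0.0), std::vector<double>(p, 0.0),
              std::vector<double>(p, 0.0), std::vector<double>(p, 0.0)};

    // Single pass over the data, one column (sample) at a time,
    // which walks memory contiguously for column-major storage.
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t i = 0; i < p; ++i) {
            const double v = x[i + j * p];
            m.sum[i] += v;
            m.sum_sq[i] += v * v;
        }
    }

    const double nd = static_cast<double>(n);
    for (std::size_t i = 0; i < p; ++i) {
        m.mean[i] = m.sum[i] / nd;
        // v_i = S2_i / (n - 1) - (S1_i)^2 / (n (n - 1))
        m.variance[i] = m.sum_sq[i] / (nd - 1.0)
                        - (m.sum[i] * m.sum[i]) / (nd * (nd - 1.0));
    }
    return m;
}
```

For example, with $p = 2$, $n = 3$ and columns $(1,2)$, $(2,4)$, $(3,6)$, the rows are $(1,2,3)$ and $(2,4,6)$, giving means $2$ and $4$ and variances $1$ and $4$. Note that this one-pass formula subtracts two quantities of similar size, so it can lose precision when the mean is large relative to the spread of the data.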