ENH: (outlier) robust multivariate regression

Open josef-pkt opened this issue 8 months ago • 1 comments

(I found several articles that I printed at around 2016)

  • robust, RLM is only for univariate endog
  • MultivariateLS has been added in #8919
  • a first set of robust covariance estimators is in PR #8129, but it mostly assumes mean is known

The target is to add something like a multivariate RLM equivalent, and possibly other robust and resistant estimators (which are also still missing for univariate endog case).

statsmodels.robust.multivariate: this could also be the location for constant mean models, i.e. joint estimation of mean/center and covariance.

AFAIR, I did not read much of the robust multivariate regression literature, but more on joint mean-cov estimation.

References (added to zotero between 2016-09-29 and 2016-10-02):

Agulló, Jose, Christophe Croux, and Stefan Van Aelst. “The Multivariate Least-Trimmed Squares Estimator.” Journal of Multivariate Analysis 99, no. 3 (March 1, 2008): 311–38. https://doi.org/10.1016/j.jmva.2006.06.005.

Ben, Marta García, Elena Martínez, and Víctor J. Yohai. “Robust Estimation for the Multivariate Linear Model Based on a τ-Scale.” Journal of Multivariate Analysis 97, no. 7 (August 1, 2006): 1600–1622. https://doi.org/10.1016/j.jmva.2005.08.007.

Gnanadesikan, R., and J. R. Kettenring. “Robust Estimates, Residuals, and Outlier Detection with Multiresponse Data.” Biometrics 28, no. 1 (1972): 81–124. https://doi.org/10.2307/2528963.

Hubert, Mia, Tim Verdonck, and Özlem Yorulmaz. “Fast Robust SUR with Economical and Actuarial Applications.” Statistical Analysis and Data Mining: The ASA Data Science Journal, May 1, 2016, n/a-n/a. https://doi.org/10.1002/sam.11313.

Jung, Kang-Mo. “Multivariate Least-Trimmed Squares Regression Estimator.” Computational Statistics & Data Analysis 48, no. 2 (February 1, 2005): 307–16. https://doi.org/10.1016/j.csda.2004.01.008.

Kent, John T., and David E. Tyler. “Constrained M-Estimation for Multivariate Location and Scatter.” The Annals of Statistics 24, no. 3 (June 1996): 1346–70. https://doi.org/10.1214/aos/1032526973.

Koenker, Roger, and Stephen Portnoy. “M Estimation of Multivariate Regressions.” Journal of the American Statistical Association 85, no. 412 (1990): 1060–68. https://doi.org/10.2307/2289602.

Kudraszow, Nadia L., and Ricardo A. Maronna. “Estimates of MM Type for the Multivariate Linear Model.” Journal of Multivariate Analysis 102, no. 9 (October 2011): 1280–92. https://doi.org/10.1016/j.jmva.2011.04.011.

Lopuhaä, Hendrik P. “Asymptotics of Reweighted Estimators of Multivariate Location and Scatter.” The Annals of Statistics 27, no. 5 (1999): 1638–65.

Lopuhaä, Hendrik P. “Multivariate τ-Estimators for Location and Scatter.” The Canadian Journal of Statistics / La Revue Canadienne de Statistique 19, no. 3 (1991): 307–21. https://doi.org/10.2307/3315396.

Muler, Nora, and Víctor J. Yohai. “Robust Estimation for Vector Autoregressive Models.” Computational Statistics & Data Analysis, Special issue on Robust Analysis of Complex Data, 65 (September 2013): 68–79. https://doi.org/10.1016/j.csda.2012.02.011.

Rousseeuw, Peter J., Stefan Van Aelst, Katrien Van Driessen, and Jose Agulló. “Robust Multivariate Regression.” Technometrics 46, no. 3 (2004): 293–305.

josef-pkt, Dec 16 '23 18:12

definitions again

rho(maha) versus rho(maha_squared) as objective function (for mean part)

Most theoretical literature (e.g. Tyler, Maronna et al book) uses rho(maha_squared). Implementations, e.g. for cov DetS and DetMM, including mine based on that literature, use rho(maha).

It looks like with rho(maha) we can use the current Norm classes. Otherwise, we would have to create new norm classes that work with squared maha. Current norm classes are defined for scaled residuals, not squared scaled residuals, in the univariate case.

Elliptical distributions are defined in terms of squared maha. However, for the pdf we consider the actual observation (multivariate in this case), pdf(y | mu, Sigma), instead of the univariate pdf(d2), with d2 = d**2.

In the implementation, we use rho(sqrt(d**2)), which would correspond to some negative log-likelihood, -log pdf(d**2). In the normal case d**2 is chisquare distributed, e.g. E rho(sqrt(d2)) for scale_bias is wrt d2 ~ chisquare(df=k_vars).
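A rough numerical sketch of that expectation, E rho(sqrt(d2)) with d2 ~ chisquare(df=k_vars). Assumptions: the Tukey biweight rho is hand-coded here (not the statsmodels norm class), the tuning constant 4.685 is only an illustrative value, and the integral is approximated with a plain midpoint rule. The helper names `chi2_pdf`, `rho_biweight` and `expected_rho` are made up for this example.

```python
import math


def chi2_pdf(x, df):
    """Density of the chi-square distribution with df degrees of freedom."""
    if x <= 0:
        return 0.0
    log_pdf = ((df / 2 - 1) * math.log(x) - x / 2
               - (df / 2) * math.log(2) - math.lgamma(df / 2))
    return math.exp(log_pdf)


def rho_biweight(d, c):
    """Hand-coded Tukey biweight rho as a function of the unsquared distance d."""
    if abs(d) >= c:
        return c**2 / 6
    return c**2 / 6 * (1 - (1 - (d / c) ** 2) ** 3)


def expected_rho(df, c, upper=100.0, n=100_000):
    """Midpoint-rule approximation of E[rho(sqrt(d2))] for d2 ~ chi2(df)."""
    h = upper / n
    total = 0.0
    for i in range(n):
        d2 = (i + 0.5) * h  # midpoint of the i-th cell on the d2 axis
        total += rho_biweight(math.sqrt(d2), c) * chi2_pdf(d2, df)
    return total * h


k_vars = 3
e_rho = expected_rho(k_vars, c=4.685)
print(e_rho)
```

As a sanity check, for very large c the biweight rho approaches d**2 / 2, so the expectation should approach k_vars / 2.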

This means that when comparing implementation with theoretical literature, we have to reparameterize rho, psi, weights, ...

Aside: large parts of the theoretical literature like Tyler do not consider specific robust norm functions, except maybe normal and t as example for elliptically symmetric distributions. So we don't see how for example TukeyBiweight functions are defined for maha-squared.

quick calculation (to verify)
rho2(d2) = rho(d), where d = sqrt(d2), for all d2
psi(d) = rho'(d) = rho2'(d2) * 2 * d (derivative)
weights(d) = rho'(d) / d = 2 * rho2'(d2) ....
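A small check of this reparameterization for one concrete norm. This is only a sketch: the Tukey biweight functions are hand-coded (not the statsmodels Norm classes), c = 4.685 is an illustrative tuning constant, and `rho2` / `rho2_deriv` are names made up here for the maha-squared version.

```python
import math

C = 4.685  # illustrative tuning constant (assumption)


def rho(d, c=C):
    """Tukey biweight rho in terms of the distance d."""
    if abs(d) >= c:
        return c**2 / 6
    return c**2 / 6 * (1 - (1 - (d / c) ** 2) ** 3)


def psi(d, c=C):
    """psi(d) = rho'(d)."""
    return d * (1 - (d / c) ** 2) ** 2 if abs(d) < c else 0.0


def weights(d, c=C):
    """weights(d) = psi(d) / d."""
    return (1 - (d / c) ** 2) ** 2 if abs(d) < c else 0.0


def rho2(d2, c=C):
    """Same objective written as a function of squared distance d2."""
    return rho(math.sqrt(d2), c)


def rho2_deriv(d2, c=C):
    """Analytic derivative of rho2 with respect to d2."""
    return 0.5 * (1 - d2 / c**2) ** 2 if d2 < c**2 else 0.0


for d in (0.5, 1.0, 2.0, 4.0):
    d2 = d * d
    # psi(d) = rho2'(d2) * 2 * d  and  weights(d) = 2 * rho2'(d2)
    print(d, psi(d) - 2 * d * rho2_deriv(d2), weights(d) - 2 * rho2_deriv(d2))
```

The printed differences should be zero up to floating point error, which is consistent with weights(d) = 2 * rho2'(d2).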

e.g. Kent and Tyler 1991 use weights(d2) = 2 rho2'(d2) (in my notation) on the first page and psi = d2 * weights on p. 2105, i.e. this psi is not rho2' (but p. 2105 also has rho2'(.) = 2 u(.) where u = weights; typo?)

similarly Kent and Tyler 1996, p. 1347

objective function (analogous to negative log-likelihood) in Kent and Tyler 1991 for M-estimation of location and scatter

L(mu, Sigma) = sum rho(d2) + n/2 * log det Sigma
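A minimal numpy sketch of evaluating this objective. Assumptions: rho is passed in as a function of squared maha, the function names `maha_squared` and `objective` are made up for illustration, and the Gaussian case rho(d2) = d2 / 2 is used only as a check because there the objective is the negative log-likelihood up to an additive constant.

```python
import numpy as np


def maha_squared(y, mu, sigma):
    """Squared Mahalanobis distances of the rows of y from mu."""
    resid = y - mu
    # Solve sigma @ z = resid.T instead of forming the explicit inverse.
    z = np.linalg.solve(sigma, resid.T).T
    return np.einsum("ij,ij->i", resid, z)


def objective(y, mu, sigma, rho2):
    """L(mu, Sigma) = sum rho(d2) + n/2 * log det Sigma."""
    d2 = maha_squared(y, mu, sigma)
    n = y.shape[0]
    _, logdet = np.linalg.slogdet(sigma)
    return rho2(d2).sum() + 0.5 * n * logdet


def rho2_normal(d2):
    """Gaussian case: rho(d2) = d2 / 2."""
    return 0.5 * d2


rng = np.random.default_rng(0)
y = rng.standard_normal((50, 3))
mu_hat = y.mean(axis=0)
sigma_hat = np.cov(y, rowvar=False, ddof=0)  # Gaussian MLE of the covariance

value = objective(y, mu_hat, sigma_hat, rho2_normal)
print(value)
```

At the Gaussian MLE the squared distances sum to n * k_vars, so the value reduces to n * k_vars / 2 + n/2 * log det Sigma. Swapping in a bounded rho would give the robust version, with the reparameterization caveat from above if the norm is defined in terms of maha instead of maha-squared.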

Tatsuoka and Tyler 2000 split the covariance Sigma into shape * scale, with det(shape) = 1 and a separate auxiliary scale function (it's not the shape matrix for a specific elliptical distribution).

Rocke 1996 for CovS has an M-estimator expression for the S-estimator in terms of mean and cov, without splitting cov into shape and scale, but I don't see a term that includes the Fisher consistency term b_0.

Campbell et al 1998 On the calculation ... has the same moment conditions for M-estimation as Rocke, but the iteration in the S-estimator rescales cov so that the S-estimator restriction holds, step S4.(i). So it still splits cov into a scatter and a scaling term, but without scaling the scatter matrix to be a shape matrix with det(S) = 1.

both Rocke and Campbell et al have rho in terms of maha, not maha-squared.

josef-pkt, May 09 '24 15:05