
Feature request: simultaneous lasso with l1/linf or l1/lq norm penalization

Open · tomwenseleers opened this issue 4 years ago · 1 comment

I was wondering if in the future it might also be possible to support l1/linf penalization, which is known as the "simultaneous LASSO". See:

- Turlach et al. 2005, https://www.tandfonline.com/doi/pdf/10.1198/004017005000000139
- Liu et al. 2009, https://www.cs.cmu.edu/afs/cs.cmu.edu/Web/People/fmri/papers/168-Blockwise-Coord-Descent.pdf (maybe the best and fastest algorithm; confusingly called "multi-task LASSO" there, even though it uses the l1/linf penalty rather than the l1/l2 penalty of sklearn's MultiTaskLasso)
- Quattoni et al. 2009, https://dspace.mit.edu/bitstream/handle/1721.1/59367/Collins_An%20efficient.pdf?sequence=1&isAllowed=y
- Vogt & Roth 2010, https://www.researchgate.net/profile/Volker_Roth/publication/262409253_The_group-lasso_l_1_regularization_versus_l_12_regularization/links/09e41512b178be6c04000000/The-group-lasso-l-1-regularization-versus-l-1-2-regularization.pdf

Or its generalization, l1/lq penalization: see https://arxiv.org/pdf/1009.4766

This penalty produces fits with greater sparsity and fewer false positives than l1/l2, so it could be useful for many multi-task learning applications with a shared sparsity structure. But I haven't found any open-source implementations anywhere.
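For concreteness, here is a minimal numpy sketch of what the penalty computes (illustrative code, not mutar's API; `W` denotes the n_features x n_tasks coefficient matrix and `l1_lq_penalty` is a made-up name):

```python
import numpy as np

def l1_lq_penalty(W, q):
    """l1/lq penalty: sum over features (rows of W) of the lq norm
    of that feature's coefficients across tasks.

    q = np.inf gives the "simultaneous LASSO" (l1/linf) penalty;
    q = 2 gives the l1/l2 penalty of sklearn's MultiTaskLasso.
    """
    # Vector lq norm across tasks for each feature, summed over features.
    return np.linalg.norm(W, ord=q, axis=1).sum()
```

With q = np.inf, a row's contribution depends only on its largest absolute entry, which is what drives entire rows of W to exactly zero and yields a sparsity pattern shared across tasks.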

tomwenseleers · Jan 10 '20

I am not aware of any implementation of a general l1/lq regularization either. It would be a good feature to add to mutar. Even the Dirty models approach (https://papers.nips.cc/paper/4125-a-dirty-model-for-multi-task-learning), which decouples the variables into x = x_1 + x_2 and applies an l1/lq penalty on x_1 and an l1 penalty on x_2, was proposed with q = infty but implemented in mutar with q = 2 for the simplicity of the algorithm. To update x_1 they use an l1/linf iteration, and their pseudo-code seems more comprehensive than the other papers' (see page 24 of the appendix). It should be straightforward to implement.
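If it helps, the core l1/linf proximal step has a closed form via Moreau decomposition: the prox of lam * ||.||_inf is the residual of a Euclidean projection onto the l1 ball of radius lam. A rough numpy sketch of that step (illustrative names, not the paper's pseudo-code verbatim):

```python
import numpy as np

def project_l1_ball(v, radius):
    # Euclidean projection of v onto {x : ||x||_1 <= radius},
    # via the sort-based algorithm of Duchi et al. (2008).
    # Assumes radius > 0.
    if np.abs(v).sum() <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]              # magnitudes, descending
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, v.size + 1) > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def prox_linf(v, lam):
    # Moreau decomposition: prox_{lam*||.||_inf}(v)
    #   = v - (projection of v onto the l1 ball of radius lam).
    return v - project_l1_ball(v, lam)

def prox_l1_linf(W, lam):
    # Row-wise prox for the l1/linf penalty on an
    # (n_features, n_tasks) coefficient matrix W.
    return np.apply_along_axis(prox_linf, 1, W, lam)
```

Each row's prox either zeroes the row out entirely (when its l1 norm is below lam) or clips its largest entries at a common threshold, which is where the extra row-sparsity relative to q = 2 comes from.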

"This penalty produces fits with greater sparsity and fewer false positives than l1/l2"

Is this based on empirical or theoretical evidence? It seems that the Quattoni et al. paper shows a better AUC with q = infty than with q = 2, but the other one (l1/lq penalization, https://arxiv.org/pdf/1009.4766) says the opposite.

hichamjanati · Jan 12 '20