How to specify groups for the group lasso penalty?

Open fabianp opened this issue 10 years ago • 4 comments

It is stated that the CDClassifier object supports a group lasso ("l1/l2") penalty, yet it is not clear to me how the groups in the group penalty are specified.

fabianp avatar Nov 04 '15 09:11 fabianp

It's only supported for multiclass classification and multitask classification / regression. The groups are the weights of a feature in all classes / tasks. This question comes up all the time. We need to improve the documentation.
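
To make the grouping concrete, here is a minimal NumPy sketch of the l1/l2 penalty as described above, where each feature's weights across all classes (or tasks) form one group. The weight matrix `W` and its `(n_classes, n_features)` layout are assumptions made for illustration and are not taken from the lightning source.

```python
import numpy as np

def l1_l2_penalty(W):
    """Sum over features of the l2 norm of that feature's weights across classes.

    W is assumed to have shape (n_classes, n_features), so each column is one group.
    """
    W = np.asarray(W, dtype=float)
    return float(np.sum(np.linalg.norm(W, axis=0)))

# Hypothetical weights: 2 classes, 3 features.
W = np.array([[3.0, 0.0, 1.0],
              [4.0, 0.0, 0.0]])

penalty = l1_l2_penalty(W)  # column norms are 5, 0 and 1
```

Under this definition the groups are fixed by the shape of the weight matrix rather than passed in by the user.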

mblondel avatar Nov 04 '15 11:11 mblondel

May I know whether this problem has been solved? It is not clear how to tell which coefficients belong to the same group. In other words, in the penalty lambda * sum(sqrt(p)) * ||beta||_2^2, which CDClassifier parameter defines each group size p? Thanks.

zht012323 avatar Sep 03 '17 10:09 zht012323

@zht012323 there hasn't been any progress on this

fabianp avatar Sep 03 '17 21:09 fabianp

Would defining a custom penalty and using FISTA partially "solve" this problem, basically by rolling our own group penalty?

Not sure if my "group penalty" actually makes sense in this context, but suppose we have a dataset with 40 coefficients split into two groups: the first 20 indices form one group and the second 20 indices form the other.

Would one way to do Group Lasso be to define the penalty as:

import numpy as np

class L1Penalty1(object):
    def __init__(self, group=None):
        # group: list of index arrays, one per coefficient group
        # (None is used instead of a mutable [] default)
        self.group = [] if group is None else group

    def projection(self, coef, alpha, L):
        # Per-coefficient weight: the summed absolute value of its group
        coef_group = coef.flatten().copy()
        for gp in self.group:
            gp_ = np.sum(np.abs(coef_group[gp]))
            coef_group[gp] = gp_
        # Normalise the weights so that they average to 1
        coef_group = (coef_group / np.sum(coef_group)) * coef_group.shape[0]
        coef_group = coef_group.reshape(1, -1)

        # Element-wise soft-thresholding (as for a plain L1 penalty), reweighted by group
        reg_ = np.sum(np.abs(coef))
        projl1_ = np.sign(coef) * np.maximum(np.abs(coef) - alpha / L, 0) * coef_group
        # Rescale so that the L1 norm of the result equals that of the input
        return projl1_ * (reg_ / np.sum(np.abs(projl1_)))

    def regularization(self, coef):
        return np.sum(np.abs(coef))

Would this work?
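
For comparison, the standard proximal operator of the group lasso penalty is block soft-thresholding: each group is shrunk by a common factor, and is set to zero entirely when its l2 norm falls below the threshold. A minimal sketch follows, reusing the `projection` / `regularization` method names from the snippet above. The class name `GroupLassoPenalty`, the `groups` argument, and the omission of per-group sqrt(p) weights are assumptions for illustration, and whether lightning's FISTA solver accepts such an object as-is has not been verified here.

```python
import numpy as np

class GroupLassoPenalty:
    """Hypothetical penalty: sum over groups of the l2 norm of each group."""

    def __init__(self, groups):
        # groups: list of index arrays, assumed to be non-overlapping
        self.groups = [np.asarray(g) for g in groups]

    def projection(self, coef, alpha, L):
        # Block soft-thresholding with threshold alpha / L.
        shape = np.shape(coef)
        flat = np.array(coef, dtype=float).reshape(-1)  # work on a copy
        thresh = alpha / L
        for g in self.groups:
            norm = np.linalg.norm(flat[g])
            # Shrink the whole group, or zero it if its norm is below the threshold
            scale = max(0.0, 1.0 - thresh / norm) if norm > 0 else 0.0
            flat[g] = scale * flat[g]
        return flat.reshape(shape)

    def regularization(self, coef):
        flat = np.asarray(coef, dtype=float).reshape(-1)
        return float(sum(np.linalg.norm(flat[g]) for g in self.groups))

# Two groups of 20 coefficients each, as in the example above.
groups = [np.arange(0, 20), np.arange(20, 40)]
penalty = GroupLassoPenalty(groups)

coef = np.zeros((1, 40))
coef[0, :20] = 1.0    # first group: l2 norm sqrt(20), about 4.47
coef[0, 20:] = 0.01   # second group: l2 norm about 0.045

shrunk = penalty.projection(coef, alpha=1.0, L=1.0)
```

With a threshold of 1, the second group's norm is below the threshold, so all 20 of its coefficients become zero together, while the first group is only scaled down. This all-or-nothing behaviour per group is what distinguishes the group penalty from element-wise L1 soft-thresholding.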

NoRaincheck avatar Dec 07 '17 10:12 NoRaincheck