Trouble finding sparse optima
Hi,
Describe the bug
I took a look at the tutorial on finding sparse solutions (https://github.com/cvxpy/cvxpy/blob/master/examples/notebooks/WWW/sparse_solution.ipynb) and have a question about the "l1 heuristic". The heuristic says we can obtain a sparse solution by zeroing out coefficients that are close to 0. In my experiments with multitask basis pursuit, applying the heuristic (i.e. refitting with only the retained variables) actually yields better solutions than not applying it: lower loss and better constraint satisfaction. That is, it is not just a heuristic; it gives a strictly better solution. I am therefore wondering whether there is a theoretical reason the off-the-shelf solver is unable to find sparse solutions itself, so that I can feel confident about throwing away low-magnitude coefficients.
To Reproduce
import numpy as np
from einops import rearrange
import cvxpy as cp
import matplotlib.pyplot as plt
import seaborn as sns
print('generating multitask sample data')
np.random.seed(42)
n = 1
d = 4
p = 30
sample_grads = .1 * np.random.multivariate_normal(np.zeros(d), np.identity(d), p)
X = rearrange(sample_grads, 'p d -> d p')
y = np.identity(d)
print('computing optimizer of full problem')
beta = cp.Variable((p,d))
objective = cp.Minimize(cp.sum(cp.norm(beta, axis = 1)))
constraints = [X @ beta == y]
problem = cp.Problem(objective, constraints)
result = problem.solve()
beta_optimized = beta.value
print('reconstruction constraint and loss', np.linalg.norm(y - X @ beta_optimized ), np.sum(np.linalg.norm(beta_optimized, axis = 1)))
print('many coefficients are close to 0')
plt.hist(beta_optimized)
plt.xscale('symlog')
plt.title('Coefficients')
plt.figure()
low_indices = np.abs(beta_optimized) < 1e-6  # mask of near-zero coefficients
print('sparsity is shared across tasks')
sns.heatmap(low_indices)
print('enforcing sparsity; I selected the 1e-6 threshold following https://github.com/cvxpy/cvxpy/blob/master/examples/notebooks/WWW/sparse_solution.ipynb')
beta_sparse = beta_optimized.copy()
beta_sparse[low_indices] = 0
print('sparse reconstruction constraint and loss', np.linalg.norm(y - X @ beta_sparse ), np.sum(np.linalg.norm(beta_sparse, axis = 1)))
print('refitting with only retained coefficients')
X_restricted = X[:, ~low_indices[:, 0]]  # keep only columns whose rows were retained
nonzerorows = int((~low_indices[:, 0]).sum())
beta = cp.Variable((nonzerorows,d)) # number of non-zero rows
objective = cp.Minimize(cp.sum(cp.norm(beta, axis = 1)))
constraints = [X_restricted @ beta == y]
problem = cp.Problem(objective, constraints)
result = problem.solve()
beta_restricted = beta.value
print('sparse refit reconstruction constraint and penalty', np.linalg.norm(y - X_restricted @ beta_restricted ), np.sum(np.linalg.norm(beta_restricted, axis = 1)))
print('it looks like the refit solution has better constraint satisfaction and lower loss than the original solution!')
Output
generating multitask sample data
computing optimizer of full problem
reconstruction constraint and loss 5.845838050963103e-12 17.868875241224597
many coefficients are close to 0
sparsity is shared across tasks
enforcing sparsity; I selected the 1e-6 threshold following https://github.com/cvxpy/cvxpy/blob/master/examples/notebooks/WWW/sparse_solution.ipynb
sparse reconstruction constraint and loss 1.0768415105048245e-07 17.868874641499794
refitting with only retained coefficients
sparse refit reconstruction constraint and penalty 1.3190391034620074e-12 17.868875206447616
it looks like the refit solution has better constraint satisfaction and lower loss than the original solution!
Version
- CVXPY Version: '1.1.24'
Thanks!
I suppose this is just an issue of numerical precision, and zeroing is okay?
Switching to cp.SCS and lowering eps seems to drive the "almost zero" values down to around 1e-16.
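For what it's worth, here is a minimal numpy-only sketch of the zeroing step (the coefficient values are made up for illustration): pick a threshold several orders of magnitude above the solver tolerance and drop everything below it.

```python
import numpy as np

# Hypothetical solver output: two real coefficients plus two that are
# "numerically zero" at roughly the solver tolerance (here ~1e-12).
beta_approx = np.array([1.0, 3e-13, -0.5, 7e-14])

# Zero out entries below a threshold well above the solver tolerance,
# then recover the support (indices of retained coefficients).
threshold = 1e-6
beta_sparse = np.where(np.abs(beta_approx) < threshold, 0.0, beta_approx)
support = np.flatnonzero(beta_sparse)
print(support)  # -> [0 2]
```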
Thanks for sharing this! Feel free to comment more if other questions come up.
I wanted to reopen the issue to illustrate a challenge I am currently having. I am trying to understand whether, in the following example, the number of selected coefficients should be 2. However, the solutions I get out of SCS are not that sparse: they improve the loss by an epsilon-small amount (e.g. 1e-16) but introduce comparable violations of the constraint. It sounded like the scale or eps parameters might control this, but they haven't sufficed in my hands; eps in particular seems to make no difference below about 1e-11.
Here's an example of my experiment
import numpy as np
import cvxpy as cp
def group_lasso_norm(beta):
    """Computes the group basis pursuit penalty (sum of row norms) of the matrix beta."""
    return np.linalg.norm(beta, axis=1).sum()

def group_basis_pursuit(
    matrix,
    eps=1e-12,
    threshold=1e-6,
    max_iters=2500,
):
    D, P = matrix.shape
    beta = cp.Variable((P, D))  # could initialize with lasso?
    beta.value = np.asarray([[1., 0], [0, 1.], [0, 0], [0, 0]])  # warm start, hard-coded for P=4, D=2
    objective = cp.Minimize(cp.sum(cp.norm(beta, axis=1)))
    constraints = [matrix @ beta == np.identity(D)]
    problem = cp.Problem(objective, constraints)
    scs_opts = {"eps": eps, "max_iters": max_iters}
    output = problem.solve(solver=cp.SCS, **scs_opts)
    # output = problem.solve(solver=cp.CVXOPT)
    if np.isinf(output):
        raise ValueError("No solution found")
    beta_optimized = beta.value
    beta_sparse = beta_optimized.copy()
    beta_sparse[np.abs(beta_sparse) < threshold] = 0
    return beta_sparse
np.random.seed(0)
D = 2
P = 4
data = np.random.multivariate_normal(np.zeros(P), np.identity(P), D)
data = data / np.linalg.norm(data, axis = 0)
data[:,:D] = np.identity(D)
beta = group_basis_pursuit(data, eps = 1e-16, max_iters = 100000)
selected_indices = np.where(np.linalg.norm(beta, axis = 1))[0]
print(len(selected_indices), "number of selected coefficients")
print(2 - group_lasso_norm(beta), "loss improvement over naive solution")
print(np.linalg.norm(np.identity(D) - data @ beta), "constraint violation")
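As a sanity check on the "naive solution" mentioned above (my own sketch, assuming the same construction where the first D columns of the data matrix are set to the identity): stacking the identity on top of zeros is exactly feasible, selects exactly D = 2 rows, and has group norm 2, so any true optimum must do at least as well.

```python
import numpy as np

D, P = 2, 4
data = np.random.default_rng(0).normal(size=(D, P))
data[:, :D] = np.identity(D)  # first D columns are the identity, as in the experiment

# Naive feasible point: identity rows for the first D coefficients, zeros elsewhere.
beta_naive = np.vstack([np.identity(D), np.zeros((P - D, D))])
print(np.allclose(data @ beta_naive, np.identity(D)))  # -> True (constraint holds exactly)
print(np.linalg.norm(beta_naive, axis=1).sum())        # -> 2.0 (group norm = D)
```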
Any help is appreciated. Maybe I should try a solver that isn't SCS? Thanks.
It appears this is a form of non-uniqueness in basis pursuit. More about it here https://arxiv.org/abs/2411.18502.
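To make the non-uniqueness concrete, here is a tiny hand-made l1 example (plain basis pursuit rather than the group version above, but the same mechanism): with a duplicated column, a sparse vertex and a dense convex combination are both feasible with identical l1 norm, so the solver has no reason to prefer the sparse one.

```python
import numpy as np

# minimize ||b||_1  subject to  X @ b = y, with two identical columns
X = np.array([[1.0, 1.0]])
y = np.array([1.0])

b_sparse = np.array([1.0, 0.0])  # sparse optimum
b_dense = np.array([0.5, 0.5])   # equally optimal, but dense
for b in (b_sparse, b_dense):
    print(np.abs(b).sum(), np.linalg.norm(X @ b - y))  # -> 1.0 0.0 for both
```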