
Support for sparse matrix

Open Alalalalaki opened this issue 3 years ago • 4 comments

Even when I set both prediction functions to lasso, which supports sparse matrices in sklearn, the doubleml package throws an error saying that sparse matrices are not supported. Converting the matrix to dense format would exhaust my memory. Is there any particular reason that sparse matrices cannot be used?
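To illustrate the claim about sklearn, here is a minimal sketch showing that `Lasso` can be fitted directly on a SciPy sparse matrix. The data, the sparsity level, and the names `X_sparse`, `beta`, and `preds` are made up for this example and do not come from the issue.

```python
import numpy as np
from scipy.sparse import csc_matrix
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Hypothetical data: 200 observations, 150 features, roughly 5% non-zero entries
mask = rng.random((200, 150)) < 0.05
X_sparse = csc_matrix(rng.standard_normal((200, 150)) * mask)

# Hypothetical outcome that depends on the first five features only
beta = np.zeros(150)
beta[:5] = 1.0
y = X_sparse @ beta + 0.1 * rng.standard_normal(200)

# Lasso accepts the sparse matrix as-is; no conversion to dense is needed
model = Lasso(alpha=0.01).fit(X_sparse, y)
preds = model.predict(X_sparse)
```

This only shows that the learner itself handles sparse input; it says nothing about how doubleml validates the data before it reaches the learner.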

Alalalalaki avatar Oct 04 '21 11:10 Alalalalaki

Could you please provide the error message and a minimal working example to reproduce the problem?

MalteKurz avatar Oct 04 '21 11:10 MalteKurz

@MalteKurz

image

image

Alalalalaki avatar Oct 04 '21 11:10 Alalalalaki

It would have been nice to add the code separately from the screenshot. This makes it easier for everyone to work on this feature request, and GitHub then also adds these nice copy-paste buttons :wink:.

from doubleml.datasets import make_plr_CCDDHNR2018
from doubleml import DoubleMLData
from scipy.sparse import csc_matrix
from sklearn.base import clone
from sklearn.linear_model import Lasso

n_obs = 200
n_vars = 150
alpha = 0.5

# Simulate the data as dense arrays
(x, y, d) = make_plr_CCDDHNR2018(alpha=alpha, n_obs=n_obs, dim_x=n_vars, return_type='array')

# Convert the covariates to a sparse (CSC) matrix
x = csc_matrix(x)

# Lasso learners for both nuisance functions
learner = Lasso()
ml_m = clone(learner)
ml_g = clone(learner)

# Build the data object from the sparse covariates
obj_dml_data = DoubleMLData.from_arrays(x, y, d)
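For context, errors of this kind are typical of sklearn-style input validation. The sketch below uses `sklearn.utils.check_array`, which rejects sparse input unless `accept_sparse` is set. Whether doubleml relies on this validator internally is an assumption here and is not confirmed in this thread; the variable names are illustrative only.

```python
import numpy as np
from scipy.sparse import csc_matrix, issparse
from sklearn.utils import check_array

x_sparse = csc_matrix(np.eye(3))

# With the default accept_sparse=False, sparse input raises a TypeError
try:
    check_array(x_sparse)
    rejected = False
except TypeError as err:
    rejected = True
    message = str(err)

# Opting in to sparse input lets the same matrix pass validation
x_checked = check_array(x_sparse, accept_sparse=True)
```

If the package does validate its inputs this way, supporting sparse covariates would mean opting in at the validation step and making sure every downstream operation can handle sparse data.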

MalteKurz avatar Oct 04 '21 11:10 MalteKurz

@MalteKurz Thanks. Yes, I shouldn't have used a screenshot for the code.

Alalalalaki avatar Oct 04 '21 11:10 Alalalalaki