doubleml-for-py
Support for sparse matrix
Even when I set both prediction functions to Lasso, which supports sparse matrices in sklearn, the doubleml package throws an error saying that sparse matrices are not supported. Converting the matrix to dense format would exhaust my memory. Is there a particular reason that sparse matrices cannot be used?
Could you please provide the error message and a minimal working example to reproduce the problem?
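To make the memory concern concrete, here is a rough back-of-the-envelope sketch comparing dense and CSR storage. The matrix size and density are hypothetical (they are not taken from the issue), and the helper names `dense_bytes` and `csr_bytes` are made up for illustration. It assumes 8-byte floats and 4-byte indices.

```python
def dense_bytes(n_rows, n_cols, itemsize=8):
    """Bytes needed to store every entry of a dense float64 matrix."""
    return n_rows * n_cols * itemsize


def csr_bytes(n_rows, nnz, itemsize=8, index_size=4):
    """Approximate bytes for CSR storage: values, column indices, row pointers."""
    return nnz * (itemsize + index_size) + (n_rows + 1) * index_size


# Hypothetical problem size: 1,000,000 rows, 10,000 columns, 0.1% non-zeros.
n_rows, n_cols = 1_000_000, 10_000
nnz = n_rows * n_cols // 1000

dense_total = dense_bytes(n_rows, n_cols)
sparse_total = csr_bytes(n_rows, nnz)

print(f"dense:  {dense_total / 1e9:.1f} GB")
print(f"sparse: {sparse_total / 1e6:.1f} MB")
```

Under these assumptions the dense copy needs about 80 GB while the sparse representation needs about 124 MB, which is why densifying is not an option for data of this shape.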
@MalteKurz
It would have been nice to add the code separately from the screenshot. This makes it easier for everyone to work on this feature request, and GitHub then also adds these nice copy-paste buttons :wink:.
# Minimal example: passing a sparse feature matrix to DoubleMLData
from scipy.sparse import csc_matrix
from sklearn.base import clone
from sklearn.linear_model import Lasso

from doubleml import DoubleMLData
from doubleml.datasets import make_plr_CCDDHNR2018

n_obs = 200
n_vars = 150
alpha = 0.5
(x, y, d) = make_plr_CCDDHNR2018(alpha=alpha, n_obs=n_obs, dim_x=n_vars, return_type='array')

# Convert the dense feature matrix to sparse (CSC) format
x = csc_matrix(x)

# Lasso learners for both nuisance functions
learner = Lasso()
ml_m = clone(learner)
ml_g = clone(learner)

# The sparse matrix is handed to doubleml here
obj_dml_data = DoubleMLData.from_arrays(x, y, d)
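For comparison, the following minimal sketch fits sklearn's `Lasso` directly on a sparse matrix, without doubleml. It only illustrates that the learner itself accepts sparse input. The synthetic data, the 90% sparsity level, and the `alpha=0.1` penalty are assumptions chosen for illustration, not values from the issue.

```python
import numpy as np
from scipy.sparse import csc_matrix
from sklearn.linear_model import Lasso

# Synthetic data (assumed for illustration): roughly 90% of entries are zero.
rng = np.random.default_rng(0)
n_obs, n_vars = 200, 150
x_dense = rng.normal(size=(n_obs, n_vars))
x_dense[rng.random(size=x_dense.shape) < 0.9] = 0.0
y = 2.0 * x_dense[:, 0] + rng.normal(scale=0.1, size=n_obs)

# Fit and predict directly on the sparse matrix, with no densification.
x_sparse = csc_matrix(x_dense)
model = Lasso(alpha=0.1).fit(x_sparse, y)
preds = model.predict(x_sparse)

print(preds.shape, model.coef_.shape)
```

If this runs while the doubleml call does not, the restriction likely sits in doubleml's input handling rather than in the learner.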
@MalteKurz Thanks. Yes, I shouldn't have used a screenshot for the code.