
sparse matrix input doesn't work

Open BNDS-Robin23 opened this issue 1 year ago • 0 comments

I tried to fit a MondrianForestClassifier on a sparse matrix. I set accept_sparse=True in skgarden/mondrian/ensemble/forest.py, but it still did not work. My classifier is initialized as follows:

rf_clf = MondrianForestClassifier(n_estimators=100, random_state=42, n_jobs=-1, verbose=2, max_depth=3, bootstrap=True)

I then called fit on rf_clf with X_train and y_train, where X_train is a sparse matrix, and got this traceback:

Traceback (most recent call last):
  ...
  File "mf.py", line 350, in train_random_forest
    rf_clf.fit(X_train[:batch_size], y_train[:batch_size])
  File "/xxx/anaconda/envs/xx/lib/python3.7/site-packages/skgarden/mondrian/ensemble/forest.py", line 364, in fit
    return super(MondrianForestClassifier, self).fit(X, y)
  File "/xxx/anaconda/envs/xx/lib/python3.7/site-packages/sklearn/ensemble/_forest.py", line 383, in fit
    for i, t in enumerate(trees))
  File "/xxx/anaconda/envs/xx/lib/python3.7/site-packages/joblib/parallel.py", line 1061, in __call__
    self.retrieve()
  File "/xxx/anaconda/envs/xx/lib/python3.7/site-packages/joblib/parallel.py", line 938, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/xxx/anaconda/envs/xx/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
  File "/xxx/anaconda/envs/xx/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/xxx/anaconda/envs/xx/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 595, in __call__
    return self.func(*args, **kwargs)
  File "/xxx/anaconda/envs/xx/lib/python3.7/site-packages/joblib/parallel.py", line 264, in __call__
    for func, args, kwargs in self.items]
  File "/xxx/anaconda/envs/xx/lib/python3.7/site-packages/joblib/parallel.py", line 264, in <listcomp>
    for func, args, kwargs in self.items]
  File "/xxx/anaconda/envs/xx/lib/python3.7/site-packages/sklearn/ensemble/_forest.py", line 165, in parallel_build_trees
    tree.fit(X, y, sample_weight=curr_sample_weight, check_input=False)
  File "/xxx/anaconda/envs/xx/lib/python3.7/site-packages/skgarden/mondrian/tree/tree.py", line 207, in fit
    builder.build(self.tree, X, y, sample_weight, X_idx_sorted)
  File "skgarden/mondrian/tree/_tree.pyx", line 184, in skgarden.mondrian.tree._tree.DepthFirstTreeBuilder.build
  File "skgarden/mondrian/tree/_tree.pyx", line 210, in skgarden.mondrian.tree._tree.DepthFirstTreeBuilder.build
  File "skgarden/mondrian/tree/_splitter.pyx", line 250, in skgarden.mondrian.tree._splitter.BaseDenseSplitter.init
TypeError: Cannot convert csr_matrix to numpy.ndarray
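The last frame fails inside BaseDenseSplitter, which suggests the compiled splitter only accepts a numpy.ndarray, so changing accept_sparse on the Python side may not be enough on its own. A possible workaround, sketched here under assumptions and not a fix for sparse support, is to densify each batch before calling fit. The helper names densify and fit_dense are hypothetical and not part of scikit-garden; the only library behavior relied on is that scipy.sparse matrices expose toarray().

```python
def densify(X):
    """Return a dense array for scipy.sparse input; pass other input through.

    Hypothetical helper: scipy.sparse matrices expose .toarray(),
    while numpy arrays and plain lists do not.
    """
    return X.toarray() if hasattr(X, "toarray") else X


def fit_dense(estimator, X, y, batch_size=None):
    """Fit `estimator` on a dense copy of the first `batch_size` rows.

    Hypothetical helper: slicing happens before densifying, so only the
    selected batch is ever materialized as a dense array.
    """
    if batch_size is not None:
        X, y = X[:batch_size], y[:batch_size]
    return estimator.fit(densify(X), y)
```

With the names from this report, the call would look like fit_dense(rf_clf, X_train, y_train, batch_size=batch_size). The dense batch needs roughly rows × features × itemsize bytes, so this is only practical when a batch fits in memory.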

Here is my environment information:

python 3.7.12
numpy 1.21.5
scikit-garden 0.1.3
scikit-learn 0.22
scipy 1.7.3

BNDS-Robin23, May 27 '24 10:05