
ValueError with random_forest_error() on multi-dimensional output

Open · mattrossman opened this issue 8 years ago · 1 comment

I'm getting a ValueError when using a random forest estimator trained on multi-dimensional output:

fci.random_forest_error(est, X_train, X_test)
Traceback (most recent call last):

  File "<ipython-input-46-91c6f1ac565a>", line 1, in <module>
    fci.random_forest_error(est, X_train, X_test)

  File "/home/matt/anaconda3/lib/python3.6/site-packages/forestci/forestci.py", line 163, in random_forest_error
    memory_constrained, memory_limit)

  File "/home/matt/anaconda3/lib/python3.6/site-packages/forestci/forestci.py", line 64, in _core_computation
    return np.sum((np.dot(inbag-1,pred_centered.T)/n_trees)**2,0)

ValueError: shapes (135,10) and (10,34,96) not aligned: 10 (dim 1) != 34 (dim 1)

For reference,

X_train.shape
>>> (135, 1252)

y_test.shape
>>> (34, 96)

I would expect the returned variance array to be of shape (34,96).

I am on version 0.2 of forestci.
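The shape mismatch can be reproduced without forestci. The sketch below uses random stand-in arrays: the `inbag` shape (135, 10) comes from the error message, and the `pred_centered` shape (96, 34, 10) is an inference, obtained by reversing the (10, 34, 96) that the message reports for `pred_centered.T`. The variable names `n_outputs`, `n_test`, and `error_message` are illustrative only.

```python
import numpy as np

# Random stand-ins; only the shapes matter here, and they are read off the traceback.
n_train, n_trees, n_test, n_outputs = 135, 10, 34, 96
rng = np.random.default_rng(0)
inbag = rng.integers(0, 3, size=(n_train, n_trees)).astype(float)
pred_centered = rng.normal(size=(n_outputs, n_test, n_trees))  # inferred shape

# For an N-D array, .T reverses *all* axes, so the tree axis ends up first.
print(pred_centered.T.shape)  # (10, 34, 96)

# np.dot contracts the last axis of the first argument (10 trees) with the
# second-to-last axis of the second argument (34 test rows), so it fails.
try:
    np.dot(inbag - 1, pred_centered.T)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

With a single output, `pred_centered` would be 2D and `.T` would place the tree axis second-to-last, which is why the same line works in that case.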

mattrossman, Aug 09 '17 17:08

I think the way to fix this is to replace the transpose with swapaxes: .T reverses every axis of the array, which is the same as swapping the last two axes only when the array is 2D.

The line should be changed to something like this: return np.sum((np.dot(inbag-1,np.swapaxes(pred_centered,-2,-1))/n_trees)**2,0)
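A quick shape check of the suggested line, again on random stand-in arrays rather than a fitted forest. It assumes `pred_centered` has shape (96, 34, 10), which is inferred from the error message and not confirmed against the forestci source. Under that assumption the `swapaxes` version runs, but the result comes out as (96, 34), so a final transpose would still be needed to get the (34, 96) array the reporter expects. The `einsum` line is only an independent cross-check, not something proposed in the thread.

```python
import numpy as np

# Random stand-ins with the shapes from the traceback (pred_centered is inferred).
n_train, n_trees, n_test, n_outputs = 135, 10, 34, 96
rng = np.random.default_rng(0)
inbag = rng.integers(0, 3, size=(n_train, n_trees)).astype(float)
pred_centered = rng.normal(size=(n_outputs, n_test, n_trees))

# Suggested fix: swap only the last two axes so the tree axis is second-to-last.
swapped = np.swapaxes(pred_centered, -2, -1)
print(swapped.shape)  # (96, 10, 34)

# np.dot now contracts over the 10 trees and returns (135, 96, 34);
# summing over axis 0 (training samples) leaves (96, 34).
result = np.sum((np.dot(inbag - 1, swapped) / n_trees) ** 2, 0)
print(result.shape)  # (96, 34)

# Transposing gives the (n_test, n_outputs) layout of y_test.
variance_like = result.T
print(variance_like.shape)  # (34, 96)

# Cross-check with explicit index names: i=train sample, t=tree, o=output, j=test row.
check = np.sum((np.einsum("it,ojt->ijo", inbag - 1, pred_centered) / n_trees) ** 2, axis=0)
print(np.allclose(check, variance_like))
```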

charlesxjyang, Jun 24 '19 20:06