yellowbrick The PredictionError can't be visualized due to the dim error

Describe the bug

The PredictionError visualizer can't be drawn because of a dimension error.

To Reproduce

I use the following code:

  visualizer = PredictionError(model)
  self.y_test = self.y_test.squeeze()
  visualizer.fit(self.x_train, self.y_train)
  visualizer.score(self.x_test, self.y_test)
  visualizer.show()

I think the error originates in the score method in yellowbrick/regressor/prediction_error.py:

    def score(self, X, y, **kwargs):
        # super will set score_ on the visualizer
        super(PredictionError, self).score(X, y, **kwargs)

        y_pred = self.predict(X)
        self.draw(y, y_pred)

        return self.score_

y_pred has 2 dimensions here, but draw_best_fit raises an error whenever the y it receives has ndim > 1:

    # Verify that y is a (n,) dimensional array
    if y.ndim > 1:
        raise YellowbrickValueError(
            "y must be a (1,) dimensional array not {}".format(y.shape)
        )
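
One plausible way to end up with a 2-D y_pred, assuming the training target was left with shape (n, 1) while only y_test was squeezed: a least-squares fit on a column-vector target returns column-vector predictions. The sketch below uses plain NumPy (np.linalg.lstsq) as a stand-in for the model. The data and variable names are made up for illustration and are not taken from the reporter's script.

```python
import numpy as np

# Synthetic data: 102 samples to mirror the shape in the traceback.
rng = np.random.default_rng(0)
X = rng.normal(size=(102, 3))
y_col = X @ np.array([[1.0], [2.0], [3.0]])  # column-vector target, shape (102, 1)

# Fitting against a 2-D target yields 2-D coefficients and 2-D predictions.
coef_2d, *_ = np.linalg.lstsq(X, y_col, rcond=None)
pred_2d = X @ coef_2d

# Flattening the target first yields 1-D predictions.
coef_1d, *_ = np.linalg.lstsq(X, y_col.ravel(), rcond=None)
pred_1d = X @ coef_1d

print(pred_2d.shape, pred_1d.shape)
```

If this is what is happening, squeezing self.y_train before calling fit (not just self.y_test) may be enough to make the predictions 1-D.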

Traceback

Traceback (most recent call last):
  File "/home/PJLAB/liangyiwen/anaconda3/envs/torch181/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/PJLAB/liangyiwen/anaconda3/envs/torch181/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/PJLAB/liangyiwen/.vscode/extensions/ms-python.python-2022.16.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
    cli.main()
  File "/home/PJLAB/liangyiwen/.vscode/extensions/ms-python.python-2022.16.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
    run()
  File "/home/PJLAB/liangyiwen/.vscode/extensions/ms-python.python-2022.16.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/home/PJLAB/liangyiwen/.vscode/extensions/ms-python.python-2022.16.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 322, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/home/PJLAB/liangyiwen/.vscode/extensions/ms-python.python-2022.16.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 136, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/home/PJLAB/liangyiwen/.vscode/extensions/ms-python.python-2022.16.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "/home/PJLAB/liangyiwen/Even/code/OpenBaseLab-Edu/demo/boston_reg_demo.py", line 53, in <module>
    boston_reg(algorithm='LinearRegression')
  File "/home/PJLAB/liangyiwen/Even/code/OpenBaseLab-Edu/demo/boston_reg_demo.py", line 32, in boston_reg
    mp.plot()
  File "/home/PJLAB/liangyiwen/Even/code/OpenBaseLab-Edu/BaseML/BaseMetricVisual.py", line 46, in plot
    self.reg_pred_error_plot()
  File "/home/PJLAB/liangyiwen/Even/code/OpenBaseLab-Edu/BaseML/BaseMetricVisual.py", line 70, in reg_pred_error_plot
    visualizer.score(self.x_test, self.y_test)
  File "/home/PJLAB/liangyiwen/anaconda3/envs/torch181/lib/python3.7/site-packages/yellowbrick/regressor/prediction_error.py", line 168, in score
    self.draw(y, y_pred)
  File "/home/PJLAB/liangyiwen/anaconda3/envs/torch181/lib/python3.7/site-packages/yellowbrick/regressor/prediction_error.py", line 218, in draw
    label="best fit",
  File "/home/PJLAB/liangyiwen/anaconda3/envs/torch181/lib/python3.7/site-packages/yellowbrick/bestfit.py", line 142, in draw_best_fit
    "y must be a (1,) dimensional array not {}".format(y.shape)
yellowbrick.exceptions.YellowbrickValueError: y must be a (1,) dimensional array not (102, 1)

Feb 03 '23 08:02 Even-ok

I am using yellowbrick with Keras deep learning models via the scikit-learn wrapper and can't plot the prediction error because of this. It would be great to get this one fixed.

Feb 11 '23 02:02 sdk451

Could you try the following to see whether the problem is only with the best fit line?

  visualizer = PredictionError(model, bestfit=False)
  self.y_test = self.y_test.squeeze()
  visualizer.fit(self.x_train, self.y_train)
  visualizer.score(self.x_test, self.y_test)
  visualizer.show()

If so, it may be tricky to figure out how to incorporate the best fit line, but at least you will be able to get a prediction error plot for your models. If not, we'll have to discuss how models that output a 2D array make sense in a prediction error context, which is intended to plot y against y_hat. If the second dimension in Keras is just the batches, we could potentially find a way to flatten the output with an argument.
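
As a rough illustration of the flattening idea, here is a minimal sketch of a helper that accepts an (n,) or (n, 1) array and returns an (n,) array, and refuses genuinely multi-output predictions. The name as_1d is hypothetical and is not part of yellowbrick. Whether (n, 1) is the only shape such wrappers produce is an assumption.

```python
import numpy as np

def as_1d(y):
    """Hypothetical helper: return y as an (n,) array if it has one output column."""
    y = np.asarray(y)
    if y.ndim == 1:
        return y
    if y.ndim == 2 and y.shape[1] == 1:
        # A single output column can be flattened without losing information.
        return y.ravel()
    # Anything else is truly multi-output and has no single y_hat to plot.
    raise ValueError(
        "expected a single-output array, got shape {}".format(y.shape)
    )

y_pred = np.zeros((102, 1))  # same shape as reported in the traceback
flat = as_1d(y_pred)
print(flat.shape)
```

Raising on wider arrays rather than flattening them keeps the plot from silently mixing several outputs into one y_hat.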

Let me know how the above goes and we'll move on from there.

Feb 25 '23 18:02 bbengfort
