hyperparameter_hunter

How to do predict_proba in catboost classifier?

Open caowencai opened this issue 5 years ago • 3 comments

I'm confused because the example does not clearly show how to use predict_proba. Also, how do I get the prediction results?

caowencai, Aug 08 '19 06:08

Thanks for raising this! predict_proba definitely needs to be better documented and described in more examples.

You can control whether predict_proba is invoked through the do_predict_proba kwarg of Environment. do_predict_proba can be a boolean (default=False) or an int. The int form specifies the column index of the class probabilities you want to use. So if you're doing binary classification and want the probabilities for the "1" class, you would use do_predict_proba=1.

At the moment, this is the only example that uses do_predict_proba (line 215).
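To make the boolean-or-int behavior described above concrete, here is a minimal pure-Python sketch. It does not use hyperparameter_hunter: select_predictions is a hypothetical helper written only for illustration, the toy lists stand in for a binary classifier's predict and predict_proba outputs, and treating True as "use column 0" is an assumption of this sketch rather than something stated in this thread.

```python
from typing import List, Sequence, Union


def select_predictions(
    labels: Sequence[int],
    probabilities: Sequence[Sequence[float]],
    do_predict_proba: Union[bool, int] = False,
) -> List[float]:
    """Hypothetical helper: return the values a bool-or-int flag would keep."""
    # Compare against False by identity, because `bool` is a subclass of `int`
    # and an int 0 should still mean "probability column 0"
    if do_predict_proba is False:
        return list(labels)
    # Assumption of this sketch: True means "use the first probability column"
    column = 0 if do_predict_proba is True else do_predict_proba
    return [row[column] for row in probabilities]


# Toy stand-ins for the outputs of `predict` and `predict_proba`
labels = [0, 1, 1]
probabilities = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]]

# Probabilities for the "1" class, as with do_predict_proba=1
print(select_predictions(labels, probabilities, do_predict_proba=1))
```

With do_predict_proba=1 the sketch keeps the second column, which in binary classification holds the probabilities for the "1" class.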

Sorry this was so hard to find. Please let me know if this is what you're looking for, and if you have any suggestions for making this easier!

HunterMcGushion, Aug 08 '19 08:08

It's still not clear. In the source, it's hard to find where the prediction results are returned. The examples are all about training only. An example of making predictions on a test dataset (without labels) would be much appreciated.

caowencai, Aug 09 '19 09:08

Ah I could definitely document that better. Thanks for bringing that up. Would it help to add an "Attributes" section to the docstring of experiments.BaseExperiment in which I list the different dataset attributes?

In your case, you probably want your Experiment's data_test.predictions attribute. This contains several other attributes for different views of the data at different time divisions. The internals of data_test and the other BaseExperiment.data... attributes are documented in the module docstring of data.data_core. So I'd recommend reading that to better understand what you can access.

However, the easiest way to get the full test predictions at the end of an Experiment would be through the "HyperparameterHunterAssets" directory built at your Environment's results_path. All of your Experiments' test predictions can be found in csv files inside "HyperparameterHunterAssets/Experiments/PredictionsTest", unless you've blacklisted them.
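As a rough sketch of reading those saved files back, the snippet below collects every csv in the PredictionsTest directory using only the standard library. load_test_predictions is a hypothetical helper, not part of hyperparameter_hunter, and it assumes nothing about the csv file names or column names: it simply returns each file's rows as dictionaries keyed by whatever header the file has.

```python
import csv
from pathlib import Path
from typing import Dict, List


def load_test_predictions(assets_dir: str) -> Dict[str, List[dict]]:
    """Hypothetical helper: map each csv file name to its rows (as dicts)."""
    # Directory layout taken from the path mentioned above
    predictions_dir = Path(assets_dir) / "Experiments" / "PredictionsTest"
    if not predictions_dir.is_dir():
        return {}  # Nothing saved (or a wrong path): return an empty result

    loaded = {}
    for csv_path in sorted(predictions_dir.glob("*.csv")):
        with csv_path.open(newline="") as f:
            # DictReader uses the first row as column names; values stay strings
            loaded[csv_path.name] = list(csv.DictReader(f))
    return loaded


# Example call, assuming the assets directory sits in the working directory
all_predictions = load_test_predictions("HyperparameterHunterAssets")
print(f"Found {len(all_predictions)} test prediction file(s)")
```

Values come back as strings, so convert them (for example with float) before using them as probabilities.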

Are your Experiments' results not being automatically saved? If you do think that something is misbehaving, would you mind sharing some minimal code so I can reproduce the problem?

HunterMcGushion, Aug 10 '19 23:08