
'data' is numpy array of floating point numerical type, it means no categorical features, but 'cat_features' parameter specifies nonzero number of categorical features

Open iyliamjd opened this issue 5 years ago • 1 comments

Hi everyone, it's me again. I ran this code and got the error below: pool = Pool(data, label, cat_features=cat_cols)

The error: _catboost.CatBoostError: 'data' is numpy array of floating point numerical type, it means no categorical features, but 'cat_features' parameter specifies nonzero number of categorical features

Does anyone know what is happening? I didn't change any of the code, but I got this error, maybe because of my train and test files. I don't know what structure the train and test files should have.

iyliamjd, Nov 27 '19 07:11

If I'm not mistaken, issues should be filed at https://github.com/catboost/catboost/issues instead. I would recommend creating new ones there, because we check that place all the time.

About this issue: the problem you're facing is that you are passing floating point numbers in categorical columns, which is not allowed. Here's an explanation of why it is forbidden: https://catboost.ai/docs/concepts/faq.html#why-float-and-nan-values-are-forbidden-for-cat-features

We are planning to allow it for Python, though. It's one of the open problems for new contributors: https://github.com/catboost/catboost/blob/master/open_problems/open_problems.md

So what you need to do is convert those columns to integers or to strings.
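
A minimal sketch of that conversion, assuming the data can be held in a pandas DataFrame (a plain float numpy array cannot mix strings and floats). The helper name prepare_cat_columns, the column names, and the sample values are hypothetical and not from the original code. It also assumes the category codes are whole numbers stored as floats, and that replacing missing values with the string "nan" is acceptable.

```python
import pandas as pd


def prepare_cat_columns(df, cat_cols):
    """Return a copy of df whose categorical columns hold strings, not floats.

    Hypothetical helper: assumes float category codes are whole numbers
    (1.0, 2.0, ...), so they can be cast to int before becoming strings.
    """
    out = df.copy()
    for col in cat_cols:
        if pd.api.types.is_float_dtype(out[col]):
            # 2.0 -> "2"; missing values become the string "nan" (an assumption)
            out[col] = out[col].map(
                lambda v: "nan" if pd.isna(v) else str(int(v))
            )
        else:
            out[col] = out[col].astype(str)
    return out


# Hypothetical sample data: 'city_id' is categorical but stored as floats.
data = pd.DataFrame(
    {"city_id": [1.0, 2.0, 1.0, 3.0], "price": [10.5, 20.1, 11.3, 30.0]}
)
label = [0, 1, 0, 1]
cat_cols = ["city_id"]

clean = prepare_cat_columns(data, cat_cols)

try:
    from catboost import Pool

    # With a DataFrame, cat_features may be given as column names.
    pool = Pool(clean, label, cat_features=cat_cols)
except ImportError:
    pool = None  # catboost is not installed; the conversion above still applies
```

The numerical column 'price' is left untouched, so only the columns listed in cat_cols change type.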

annaveronika, Nov 27 '19 08:11