
Training error when column to predict has more than 100 variants

Open joelchen opened this issue 3 years ago • 2 comments

When the column to predict has more than 100 variants for multiclass classification, the following error occurs during training:

✅ Inferring train table columns. 6s
✅ Loading train table. 6s
✅ Shuffling. 0s 846ms
✅ Computing train stats. 10s
✅ Computing test stats. 2s
✅ Finalizing stats. 11s
error: invalid target column type

joelchen • Apr 17 '22 14:04

Hi @joelchen, by default the CLI assumes that a column with more than 100 unique non-numeric values is a text column, not an enum column. You can force the CLI to treat your target column as an enum column using a config file.
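To illustrate the default described above, here is a minimal Python sketch of that kind of inference rule. It is not the CLI's actual code: the function names, the `ENUM_MAX_VARIANTS` constant, and the handling of numeric and empty values are assumptions made for illustration. Only the threshold of 100 comes from this thread.

```python
# Hypothetical sketch of the column-type heuristic; not the real implementation.
ENUM_MAX_VARIANTS = 100  # threshold mentioned in this thread


def is_number(value: str) -> bool:
    """Return True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def infer_column_type(values):
    """Guess "number", "enum", or "text" for a column of string values."""
    # Assumption: empty strings are treated as missing and ignored.
    non_empty = [v for v in values if v != ""]
    # Assumption: a column whose values all parse as numbers is numeric.
    if non_empty and all(is_number(v) for v in non_empty):
        return "number"
    # More than 100 unique non-numeric values is treated as free text,
    # which is why such a column is rejected as a classification target.
    if len(set(non_empty)) > ENUM_MAX_VARIANTS:
        return "text"
    return "enum"


if __name__ == "__main__":
    labels = [f"class_{i}" for i in range(101)]
    print(infer_column_type(labels))
```

Under this rule, a target column with 101 distinct labels is inferred as text, which matches the "invalid target column type" error reported above.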

nitsky • Apr 17 '22 18:04

@nitsky Alright. The accuracy with 100 variants is low, and I have not retrained with the target column set to enum in a config file. Other users may run into this issue, though, so I will leave it to your team to decide whether there is room for improvement.

joelchen • Apr 19 '22 02:04