healthcareai-py
Warn users for category levels/factors with infrequent usage
Example R Output
Warning messages:
1: In private$loadData() :
Each of the following categorical variables has levels that occur 3 times or fewer:
- MaritalStatusDSC : 3 levels
- ReligionDSC : 23 levels
- LanguageDSC : 37 levels
- RaceGroupNM : 1 levels
Consider grouping these together with other levels.
You can view the levels of a column using the "table" command.
2: In private$loadData() :
The following categorical variable levels were not used in training the model:
- ReligionDSC : c("BU", "SD", "SOUTHERN BAPTIST")
- LanguageDSC : c("CATALAN", "CROATIAN", "IGBO", "SERBIAN", "SERBO-CROATIAN", "UNKN")
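For discussion, here is a rough sketch of how the two R warnings above might be reproduced in Python with pandas. It is not existing healthcareai-py code: the function names (`find_rare_levels`, `find_unseen_levels`, `warn_on_rare_levels`), the `columns` argument, and the default `threshold=3` are all assumptions made for illustration, and "not used in training" is read here as "present in the new data but absent from the training data".

```python
import warnings

import pandas as pd


def find_rare_levels(df, columns, threshold=3):
    """Return {column: number of levels occurring `threshold` times or fewer}.

    Hypothetical helper; `columns` lists the categorical columns to check.
    """
    rare = {}
    for col in columns:
        counts = df[col].value_counts()
        # Ignore declared-but-unused categories (count 0) on categorical dtypes.
        counts = counts[counts > 0]
        n_rare = int((counts <= threshold).sum())
        if n_rare > 0:
            rare[col] = n_rare
    return rare


def find_unseen_levels(train_df, new_df, columns):
    """Return {column: sorted levels in `new_df` that never appear in `train_df`}.

    Hypothetical helper; missing values are ignored.
    """
    unseen = {}
    for col in columns:
        train_levels = set(train_df[col].dropna().unique())
        new_levels = set(new_df[col].dropna().unique())
        missing = new_levels - train_levels
        if missing:
            unseen[col] = sorted(missing)
    return unseen


def warn_on_rare_levels(df, columns, threshold=3):
    """Emit a UserWarning modelled on the R message and return the findings."""
    rare = find_rare_levels(df, columns, threshold)
    if rare:
        lines = [f"- {col} : {n} levels" for col, n in rare.items()]
        message = (
            "Each of the following categorical variables has levels that "
            f"occur {threshold} times or fewer:\n"
            + "\n".join(lines)
            + "\nConsider grouping these together with other levels."
        )
        warnings.warn(message, UserWarning)
    return rare
```

Calling `warn_on_rare_levels(df, ["MaritalStatusDSC", "ReligionDSC"])` before training would surface the first warning. `find_unseen_levels` could be run against the prediction data to produce the second. Taking the column list explicitly avoids guessing which columns are categorical.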
This is somewhat related to #384, and I wonder if the two could share some code.
@Aylr Hi, I'm new to the project and looking for a good first issue to work on. Is this issue available?