redflag
Proper tests for `is_continuous`
The if/else stuff in this function feels pretty fragile. There is almost certainly a better way to do it.
Good first issue if you like maths and/or code testing.
Hm, this still misses the Facies column in the classic Hugoton dataset.
See this thread: https://stackoverflow.com/questions/35826912/what-is-a-good-heuristic-to-detect-if-a-column-in-a-pandas-dataframe-is-categori
This was failing too:
import numpy as np
import redflag as rf
a = np.repeat(np.arange(0, 5), 10)  # 50 values, only 5 distinct
rf.is_continuous(a)
(This array is discrete, but the function reported it as continuous because the sample it drew was too small. It now uses all samples, unless there are more than 10,000, in which case it uses 10,000 of them.)
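To make the failure case concrete, here is a minimal sketch of the kind of test that could pin this behaviour down. It does not call redflag at all: `looks_continuous` is a hypothetical stand-in heuristic (ratio of unique values to sample size), and `max_samples`, `max_unique_ratio` and `seed` are assumed names and values, not anything from the library. It uses only the standard library so it runs on its own.

```python
import random


def looks_continuous(values, max_samples=10_000, max_unique_ratio=0.5, seed=0):
    """Hypothetical heuristic, NOT redflag's implementation.

    Treat the data as continuous when more than `max_unique_ratio`
    of the inspected values are distinct.
    """
    values = list(values)
    if not values:
        raise ValueError("values must not be empty")
    # Use every sample unless there are more than `max_samples`,
    # in which case inspect a random subset of that size.
    if len(values) > max_samples:
        values = random.Random(seed).sample(values, max_samples)
    unique_ratio = len(set(values)) / len(values)
    return unique_ratio > max_unique_ratio


# Pure-Python equivalent of np.repeat(np.arange(0, 5), 10).
repeated_ints = [i for i in range(5) for _ in range(10)]

# 1,000 draws from a normal distribution: effectively all distinct.
rng = random.Random(42)
gaussian = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# A long integer-coded column with 9 classes (made-up data, standing in
# for a facies-like label column), long enough to trigger subsampling.
long_labels = [i % 9 for i in range(50_000)]


def test_repeated_integers_are_discrete():
    assert not looks_continuous(repeated_ints)


def test_gaussian_floats_are_continuous():
    assert looks_continuous(gaussian)


def test_long_label_column_is_discrete():
    assert not looks_continuous(long_labels)
```

The three cases are named as pytest-style tests, so the same shape could be pointed at `rf.is_continuous` instead. The 0.5 threshold is arbitrary; a ratio-based rule like this one is also sensitive to very small samples, which is the same weakness described above.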