Jan Motl
I tested this transformation in the past with different models, and it never led to an improvement. Since then, whenever I have cyclical features and feel the need...
Nice tests. Just nitpicking: 1. In `test_multi_hot_fit` shouldn't:
```python
self.assertEqual(enc.transform(X_t).shape[1],
                 enc.transform(X_t[X_t['extra'] != 'A']).shape[1],
                 'We have to get the same count of columns')
```
become
```python
self.assertEqual(enc.transform(X_t).shape[1],
                 enc.transform(X_t[X_t['extra'] != 'D']).shape[1], #...
```
Good work. > The transformation test was based on test_one_hot.py. That's actually a mistake of mine in test_one_hot.py. I will fix it. > Suffixes start with 1, not 0 in...
@fullflu Please check that MultiHotEncoder conforms to the recent changes in master. All of these changes concern the `handle_missing` and `handle_unknown` arguments, which all encoders should support. Note that...
Good. > I will check this if necessary. How serious is this warning? I just attempt to keep the test results free of errors and warnings - once I allow...
Awesome. Move the example. And I will merge it. Note: Just write somewhere that `|` assumes uniform distribution of the feature values. For example, when the data contain `1|2`, the...
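To illustrate what a uniform-distribution assumption for `|` could mean in practice, here is a minimal sketch in plain Python. The function name `uniform_weights` is hypothetical and is not part of MultiHotEncoder; whether the encoder normalizes the weights this way is an assumption made only for illustration.

```python
from collections import defaultdict


def uniform_weights(value, delimiter="|"):
    """Split an ambiguous value such as '1|2' into its candidate categories
    and give every candidate the same share of the total weight.

    Hypothetical helper for illustration only; not the encoder's API.
    """
    # Ignore empty fragments produced by stray delimiters, e.g. '1||2'.
    candidates = [part for part in str(value).split(delimiter) if part]
    if not candidates:
        return {}
    share = 1.0 / len(candidates)
    weights = defaultdict(float)
    for candidate in candidates:
        # A repeated candidate accumulates its shares.
        weights[candidate] += share
    return dict(weights)


ambiguous = uniform_weights("1|2")  # each candidate gets an equal share
certain = uniform_weights("3")      # a single candidate keeps all the weight
```

Under this assumption no candidate is treated as more likely than another, which is the point worth stating in the documentation: if the true values are skewed, equal shares misrepresent them.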
Nice. I like the use of `assert_frame_equal()` (I didn't know that it existed). And that you wrote the default settings in the documentation. I propose to rename the optional argument...
> implement and_delimiter How is it going to work? Is it similar to `TfidfVectorizer` or `CountVectorizer`? A potentially useful dataset for the functionality illustration: [data](sorry.vse.cz/~berka/challenge/pkdd1999/data_berka.zip), [description](https://sorry.vse.cz/~berka/challenge/PAST/index.html). In my opinion, it...
> dictionary used as prior (hyperprior is [1,1,1,...],... Nice touch with the hyperprior.
Hi, @fullflu. Is there something I can help you with?