Stephen W. Thomas comments

Results: 8 comments of Stephen W. Thomas

PolynomialWrapper retains old values from previous calls to fit

Note: a workaround is to add the following line just before line 63 in `fit` (and before the similar line in `fit_transform`):

```python
labels = labels.reindex(sorted(labels.columns), axis=1)
```

This...
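To see what the `reindex` call in the workaround does on its own, here is a minimal sketch with plain pandas. The two frames and the `sort_columns` helper are made up for illustration and are not taken from the library's source; the sketch only shows that the call puts the columns into a deterministic, sorted order regardless of the order they arrived in.

```python
import pandas as pd

# Hypothetical label frames: same column names, different column order,
# as might happen across two separate calls.
first = pd.DataFrame({"b": [0, 1], "a": [1, 0], "c": [0, 0]})
second = pd.DataFrame({"c": [1, 0], "a": [0, 1], "b": [0, 0]})


def sort_columns(labels: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `labels` with its columns in sorted order.

    This is the same call as in the workaround; the helper name is
    hypothetical.
    """
    return labels.reindex(sorted(labels.columns), axis=1)


first_sorted = sort_columns(first)
second_sorted = sort_columns(second)

# Both frames now share one column order; the values move with their columns.
print(list(first_sorted.columns))
print(list(second_sorted.columns))
```

Because the order depends only on the column names, any later step that relies on column position sees the same layout on every call.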

Updates to field indexing

@chrismattmann Thanks for your changes! I'm happy to merge those changes in; I'll add some comments on the commit you linked to. Re: not having to run LDA outside...

Help understanding ApplySortingLsh?

Thank you, that helps. Some more questions: - Why are some cohort IDs blocked? - Why is the bit width of each cohort ID different? E.g., why not make them...

Help understanding ApplySortingLsh?

Thank you, @shigeki. In the document you link to, I assume that the relevant part is this: > The inputs are turned into a cohort ID using a technique we're...

Help understanding ApplySortingLsh?

To expand: It seems like what the document is saying the algorithm should do is something like this (just making up the numbers for illustration): - Cohort 0 is assigned...

How to apply a custom preprocessor to only specified features

Thank you @mfeurer. Your suggestions about the string type and TF-IDF+Truncated SVD make a lot of sense. 👍 From there, it will be easy to add other dimensionality-reduction techniques (e.g.,...
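For readers unfamiliar with the TF-IDF + Truncated SVD combination mentioned above, the following is a rough sketch in plain scikit-learn. It is not the custom-preprocessor mechanism discussed in the thread; the example documents, the `text_reducer` name, and the choice of two components are assumptions made only for illustration.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Hypothetical values of a single string-typed feature.
docs = [
    "red apples and green apples",
    "green pears and yellow bananas",
    "fast cars and slow trucks",
    "slow trains and fast planes",
]

# TF-IDF turns each string into a sparse term-weight vector;
# Truncated SVD then projects that sparse matrix onto a few dense components.
text_reducer = make_pipeline(
    TfidfVectorizer(),
    TruncatedSVD(n_components=2, random_state=0),
)

reduced = text_reducer.fit_transform(docs)
print(reduced.shape)  # one row per document, one column per component
```

Truncated SVD is used here instead of PCA because it accepts the sparse matrix that TF-IDF produces without densifying it first. Swapping in another reduction step would mean replacing the second stage of the pipeline.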

How to get cross validation scores from AutoML object?

Thanks, @pplonski. In the future, it might make sense to add a feature to allow these scores to be obtained via the AutoML object directly so that one can get...

Question Regarding Pipeline

Hi Daniel, Yes, this is an ongoing challenge with sklearn pipelines with no easy, general answer. There are a few bad options: - Don't investigate feature importance - Manually figure out...