Stephen W. Thomas
Note: a workaround is to add the following line just before line 63 in `fit` (and before the similar line in `fit_transform`):
```python
labels = labels.reindex(sorted(labels.columns), axis=1)
```
This...
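For illustration, here is a minimal sketch of what that line does on its own. The `labels` DataFrame below is made up for the example (the real one comes from inside `fit`), and the sketch only assumes that `labels` is a pandas DataFrame with string column names:

```python
import pandas as pd

# Hypothetical label matrix whose columns arrive in an arbitrary order.
labels = pd.DataFrame({"c": [0, 1], "a": [1, 0], "b": [1, 1]})

# The workaround line: put the columns in sorted order.
# The values in each column are unchanged; only the column order differs.
labels = labels.reindex(sorted(labels.columns), axis=1)

print(list(labels.columns))
```

After the call, the column order no longer depends on how the DataFrame was built.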
@chrismattmann Thanks for your changes! I'm happy to merge those changes in; I'll add some comments on the commit you linked to. Re: not having to run LDA outside...
Thank you, that helps. Some more questions:
- Why are some cohort IDs blocked?
- Why is the bit width of each cohort ID different? E.g., why not make them...
Thank you, @shigeki. In the document you link to, I assume that the relevant part is this:
> The inputs are turned into a cohort ID using a technique we're...
To expand: It seems like what the document is saying the algorithm should do is something like this (just making up the numbers for illustration):
- Cohort 0 is assigned...
Thank you @mfeurer. Your suggestions about the string type and TF-IDF+Truncated SVD make a lot of sense. 👍 From there, it will be easy to add other dimensionality-reduction techniques (e.g.,...
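As a rough sketch of the TF-IDF + Truncated SVD idea mentioned above, using scikit-learn's `TfidfVectorizer` and `TruncatedSVD`: the toy documents, the choice of `n_components=2`, and the use of `make_pipeline` are assumptions for illustration only, not how any particular AutoML library wires this up.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Toy string column (made up for this example).
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stock prices rose sharply today",
    "markets fell as prices dropped",
]

# TF-IDF turns each string into a sparse vector; Truncated SVD then
# projects those sparse vectors onto a small number of dense components.
text_to_dense = make_pipeline(
    TfidfVectorizer(),
    TruncatedSVD(n_components=2, random_state=0),
)

reduced = text_to_dense.fit_transform(docs)
print(reduced.shape)  # one row per document, one column per component
```

Because the SVD step is an ordinary pipeline stage, a different dimensionality-reduction transformer could be swapped in at the same position.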
Thanks, @pplonski. In the future, it might make sense to add a feature to allow these scores to be obtained via the AutoML object directly so that one can get...
Hi Daniel,
Yes, this is an ongoing challenge with sklearn pipelines, with no easy, general answer. There are a few bad options:
- Don't investigate feature importance
- Manually figure out...