Anthony Blaom, PhD comments

Results: 815 comments of Anthony Blaom, PhD

Suggestion to simplify implementation of scitype

Of course, this comment does not address the other "fly in the ointment" which is tables, requiring trait dispatch.

Suggestion to simplify implementation of scitype

Well, no, I'm not suggesting we change the *definition* of scitype for arrays (see [Property 3](https://github.com/JuliaAI/ScientificTypesBase.jl#what-is-provided-here)) - only the implementation. According to the definition, we will need to look inside...

Is GridSearch using the update! method?

Sorry, I guess this one fell under the radar. Just skimmed your comment but here's a quick reply, which hopefully addresses your point: In general, because one is resampling to...

Is GridSearch using the update! method?

As explained above, the best we can expect here is for user-specified holdout train/test pairs to work in addition to the `Holdout` resampling strategy. [This PR](https://github.com/alan-turing-institute/MLJBase.jl/pull/559) resolves this (also in the...

Universal table transformer combining univariate transformations dispatched on schema

I'm inclined to go with option 2, which is more user-friendly. The other issue ought to be solved on the tables interface side, in my opinion.

Enhance treatment of missing value in one-hot encoder

`all-zero` looks like the simplest. One question for `category` is how to handle `missing` values that appear for a feature that did not have `missing` values in training (`fit`). Here's a...

Enhance treatment of missing value in one-hot encoder

No, rather it's the same as the current behaviour, except instead of `missing`s, use zeros. You don't need to spawn an extra column in this case: ``` julia> X =...

Enhance treatment of missing value in one-hot encoder

Yes, great catch, that's a bug: https://github.com/JuliaAI/MLJModels.jl/issues/467 Are you willing and able to make a PR with a test?

Enhance treatment of missing value in one-hot encoder

Done. You have an invitation to accept.

Scikitlearn clustering methods cleanup

@tlienart You are right. DBSCAN is not like KMeans clustering. I stand corrected. However, I do wonder if the sk-learn way of conceptualising this class of clustering problems is the...