Peter Hausamann

25 comments by Peter Hausamann

Hi Noah, I've looked into it and I think this would be a very valuable contribution; however, we'd have to approach it at a higher level. Here's an example that...

> First, there are two approaches to wrapping an sklearn pipeline. The first is to wrap each individual transformer with xarray methods, which is what it seems you are trying...

[Here's an example from the docs](https://phausamann.github.io/sklearn-xarray/content/transformers.html#transformers-changing-the-number-of-samples). Basically, the `Sanitizer` removes samples from the dataset, which would not work in a normal pipeline because X and y would have an inconsistent...
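To see why a sample-dropping transformer breaks a plain sklearn pipeline, here is a toy sketch (the `DropNaNRows` transformer is made up for illustration, it is not the actual `Sanitizer`): only X flows through the pipeline's transform steps, so y keeps its original length and the downstream fit fails.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline

class DropNaNRows(BaseEstimator, TransformerMixin):
    """Toy transformer that removes samples containing NaN."""
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        return X[~np.isnan(X).any(axis=1)]

X = np.array([[1.0], [np.nan], [3.0], [4.0]])
y = np.array([1.0, 2.0, 3.0, 4.0])

pipe = Pipeline([("drop", DropNaNRows()), ("reg", LinearRegression())])
# Raises: X has 3 samples after dropping, but y still has 4 --
# a plain Pipeline never adjusts y to match.
pipe.fit(X, y)
```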

Yeah, totally! There is btw also the possibility to use wrapped estimators in a pipeline with plain numpy arrays; the wrapped estimator determines its input type at `fit` time... so...
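A minimal sketch of that behavior, assuming the package's top-level `wrap` function (as in the project docs); the numpy pass-through is as described in the comment above:

```python
import numpy as np
import xarray as xr
from sklearn.preprocessing import StandardScaler

from sklearn_xarray import wrap  # top-level wrapper, per the project docs

# Fit with a DataArray: dims and coords survive the transform.
X_xr = xr.DataArray(np.random.rand(10, 3), dims=("sample", "feature"))
print(type(wrap(StandardScaler()).fit_transform(X_xr)))  # xarray DataArray

# The same wrapper also accepts a plain numpy array; the input type is
# only determined once fit is called.
X_np = np.random.rand(10, 3)
print(type(wrap(StandardScaler()).fit_transform(X_np)))  # plain ndarray
```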

Thanks for the explanation, I see your point now. I think it would be very useful to have a mechanism to parallelize part of the pipeline on a per-variable basis....
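One way to picture that fan-out, sketched with joblib (the helper name and the joblib approach are illustrative, not the proposed API):

```python
import numpy as np
import xarray as xr
from joblib import Parallel, delayed
from sklearn.base import clone
from sklearn.preprocessing import StandardScaler

def _fit_transform_var(estimator, da):
    # Fit an independent clone per variable so no state is shared.
    return clone(estimator).fit_transform(da.values)

def fit_transform_per_variable(ds, estimator, n_jobs=2):
    """Hypothetical helper: apply a fresh clone of `estimator` to each
    data variable of `ds` in parallel and rebuild the Dataset."""
    names = list(ds.data_vars)
    results = Parallel(n_jobs=n_jobs)(
        delayed(_fit_transform_var)(estimator, ds[name]) for name in names
    )
    return xr.Dataset(
        {name: (ds[name].dims, out) for name, out in zip(names, results)}
    )

ds = xr.Dataset({
    "temperature": (("sample", "feature"), np.random.rand(10, 3)),
    "pressure": (("sample", "feature"), np.random.rand(10, 3)),
})
print(fit_transform_per_variable(ds, StandardScaler()))
```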

I feel like it should be possible to combine the two ideas into one single estimator that both (a) applies a transformer or pipeline to each variable in the dataset...
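Something along these lines, perhaps (a hypothetical sketch; the class name and the dict handling are illustrative, not settled API, and the transformer is assumed to preserve each variable's shape):

```python
import xarray as xr
from sklearn.base import BaseEstimator, TransformerMixin, clone

class PerVariableTransformer(BaseEstimator, TransformerMixin):
    """Sketch of the combined idea: apply a transformer to each variable
    of a Dataset, with an optional dict for per-variable estimators."""

    def __init__(self, transformer):
        # Either a single estimator or a dict keyed by variable name.
        self.transformer = transformer

    def _get(self, name):
        if isinstance(self.transformer, dict):
            return self.transformer[name]
        return self.transformer

    def fit(self, X, y=None):
        # One independent, cloned estimator per data variable.
        self.estimators_ = {
            name: clone(self._get(name)).fit(X[name].values, y)
            for name in X.data_vars
        }
        return self

    def transform(self, X):
        return xr.Dataset({
            name: (X[name].dims, est.transform(X[name].values))
            for name, est in self.estimators_.items()
        })
```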

> I definitely think the dictionary input idea is good, but I think it is better to provide it as a function, sort of like how sklearn has `make_union` and...
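For comparison, sklearn's `make_union` is just a thin factory over `FeatureUnion`, and a dictionary-input analog could be equally thin (the factory name is invented here and it builds on the `PerVariableTransformer` sketch above):

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import make_union
from sklearn.preprocessing import StandardScaler

# sklearn's existing pattern: positional estimators, auto-named steps.
union = make_union(StandardScaler(), PCA(n_components=2))

# A hypothetical analog: variable names as keyword arguments.
def make_per_variable(**transformers):
    return PerVariableTransformer(dict(transformers))  # class sketched above

est = make_per_variable(temperature=StandardScaler(), pressure=PCA(n_components=2))
```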

Anyway, these things are just technicalities, I think you can start working on a PR and we'll continue the discussion there. I've put some information together in the [wiki](https://github.com/phausamann/sklearn-xarray/wiki).

On second thought, class decorators seem like a bad idea, mostly because the resulting object is not pickleable. It makes more sense for each estimator to subclass the corresponding wrapper,...
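The pickling problem is easy to reproduce: a class created inside a decorator body has a qualified name that pickle cannot resolve at module level (a toy demonstration, not the actual decorator that was considered):

```python
import pickle

def add_xarray_support(cls):
    # Returns a *new* class defined inside the decorator body.
    class Wrapper(cls):
        pass
    return Wrapper

@add_xarray_support
class MyEstimator:
    pass

try:
    pickle.dumps(MyEstimator())
except (pickle.PicklingError, AttributeError) as err:
    # pickle looks the class up by __qualname__, which here is
    # "add_xarray_support.<locals>.Wrapper" -- not importable.
    print(err)

# Subclassing the wrapper at module level instead keeps the class
# importable by name, so instances pickle normally.
```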

The benefit of this approach would be that each estimator could inherit the methods it needs from the corresponding mixin, e.g. `class PCA(_CommonEstimatorWrapper, _ImplementsTransformMixin, _ImplementsScoreMixin)`. In some cases, the class...
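A rough sketch of how those mixins could fit together (only the class names come from the discussion; the method bodies are my guesses at the delegation):

```python
from sklearn import decomposition

class _CommonEstimatorWrapper:
    """Shared wrapping logic; a real version would restore dims/coords."""
    _estimator_cls = None

    def __init__(self, **params):
        self.estimator_ = self._estimator_cls(**params)

    def fit(self, X, y=None):
        # Accept either an xarray object or a plain array.
        self.estimator_.fit(getattr(X, "values", X), y)
        return self

class _ImplementsTransformMixin:
    def transform(self, X):
        return self.estimator_.transform(getattr(X, "values", X))

class _ImplementsScoreMixin:
    def score(self, X, y=None):
        return self.estimator_.score(getattr(X, "values", X), y)

# Each wrapped estimator inherits only the methods its underlying
# sklearn estimator actually implements:
class PCA(_CommonEstimatorWrapper, _ImplementsTransformMixin, _ImplementsScoreMixin):
    _estimator_cls = decomposition.PCA
```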