tpot
Row filtering operation in pipeline
I wonder if there is an operator available to filter out rows from the training data based on tunable parameters. I am only aware of the Selector-Transformer-Regressor steps, but I would like to build my own operator that, for example:
- is able to remove (instead of replacing) outliers from the dataset, with a tunable threshold parameter
- or detects drift in my time series training dataset and then discards old data
Is there currently a way or workaround to include such an operator in a pipeline?
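To make the outlier case concrete, here is a rough sketch of the kind of row filter I have in mind. It is plain NumPy and not a TPOT operator; the function name `drop_outlier_rows` and the `z_threshold` parameter are made up for illustration, and the z-score rule is just one possible outlier criterion.

```python
import numpy as np


def drop_outlier_rows(X, y, z_threshold=3.0):
    """Hypothetical helper: drop rows where any feature's |z-score| exceeds z_threshold."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant columns should never flag a row

    z_scores = np.abs((X - mean) / std)
    keep = (z_scores <= z_threshold).all(axis=1)  # boolean mask over rows

    # X and y are filtered with the same mask so they stay aligned.
    return X[keep], y[keep]


# Usage: z_threshold is the parameter I would like the optimizer to tune.
X_demo = np.array([[1.0], [2.0], [3.0], [2.0], [1.0], [2.0], [3.0], [2.0], [100.0]])
y_demo = np.arange(len(X_demo))
X_kept, y_kept = drop_outlier_rows(X_demo, y_demo, z_threshold=2.0)
print(X_kept.shape, y_kept.shape)
```

The difficulty is that this changes the number of rows in both X and y. As far as I understand, a regular scikit-learn transformer only returns a transformed X from `transform`, so a step like this does not fit the usual pipeline interface.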
In general, I think the whole topic of drift detection and automatic retraining when needed is not adequately addressed in most AutoML frameworks. What do you think?