FEDOT
Investigate new data operations for feature engineering and ensembling
- Add simple Ensembling methods, such as TopModel, WeightedEnsemble and AverageEnsemble
- Identify best practices for feature engineering (FE) in classification/regression on tabular datasets
- Run experiments with expert-based feature engineering implemented as separate DataOperation blocks
- Think about special presets of models/operations for classification/regression
- Try feature generation methods, for instance featuretools
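The three ensembling methods named in the list can be sketched in a few lines. This is a minimal illustration in plain Python, not FEDOT code: the function names `average_ensemble`, `weighted_ensemble` and `top_model` are hypothetical, and each model's predictions are assumed to arrive as a list of floats of equal length.

```python
from typing import Dict, List, Sequence


def average_ensemble(predictions: Sequence[Sequence[float]]) -> List[float]:
    """Unweighted mean of per-model predictions (one inner sequence per model)."""
    n_models = len(predictions)
    return [sum(column) / n_models for column in zip(*predictions)]


def weighted_ensemble(predictions: Sequence[Sequence[float]],
                      weights: Sequence[float]) -> List[float]:
    """Weighted mean of per-model predictions; weights are normalized by their sum."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, column)) / total
        for column in zip(*predictions)
    ]


def top_model(validation_scores: Dict[str, float]) -> str:
    """Name of the model with the highest validation score (higher is better)."""
    return max(validation_scores, key=validation_scores.get)


# Hypothetical predictions of two models for two samples.
preds = [[0.25, 0.75], [0.75, 0.25]]
print(average_ensemble(preds))           # [0.5, 0.5]
print(weighted_ensemble(preds, [3, 1]))  # [0.375, 0.625]
```

In a real setup the weights would probably come from validation metrics rather than being fixed by hand.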
"Importance Cut off" feature selection effectiventess also can be analysed
Action plan:
- [x] 1. Add CatBoost, LightGBM + hyperparameters DONE
- [x] 2. Review and outline an approximate algorithm/diagram of how feature engineering works in LAMA DONE
- [x] 3. Study how the LAMA and FEDOT pipelines differ
- [x] 4. Run LAMA on the benchmarks (see https://github.com/nicl-nno/automlbenchmark/blob/master/frameworks/FEDOT/exec.py) and add the necessary differences to FEDOT
- [ ] 5. Teach the composer to build such pipelines (or better ones)
Both frameworks were trained on the same dataset. For each framework, training was run 8 times and the metrics were averaged:
FEDOT (AUC)
- 5 minutes: train 0.7995216483735391, test 0.7141597316576087
- 10 minutes: train 0.8050606503972741, test 0.7121297554347826
- 20 minutes: train 0.7735015904571719, test 0.723378269361413

LAMA (AUC)
- ~1 minute [40, 50, ]: train 0.6866954923298692, test 0.7107557744565218
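Averaging over repeated runs can be sketched as follows. The helper name `summarize_runs` is hypothetical and the sample values are placeholders, not results from this experiment. Reporting the standard deviation next to the mean may help here, since the test AUC values above differ only in the second decimal place.

```python
from statistics import mean, stdev
from typing import Dict, List


def summarize_runs(runs: List[Dict[str, float]]) -> Dict[str, Dict[str, float]]:
    """Mean and sample standard deviation of each metric across repeated runs."""
    summary = {}
    for key in runs[0]:
        values = [run[key] for run in runs]
        summary[key] = {
            "mean": mean(values),
            # stdev needs at least two values; fall back to 0.0 for a single run.
            "std": stdev(values) if len(values) > 1 else 0.0,
        }
    return summary


# Placeholder metrics for two runs (illustrative only).
runs = [{"train": 0.5, "test": 0.25}, {"train": 1.0, "test": 0.75}]
summary = summarize_runs(runs)
print(summary["train"]["mean"], summary["test"]["mean"])  # 0.75 0.5
```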
For FEDOT, the date and categorical features had to be preprocessed manually beforehand; more details .
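The exact preprocessing used in the experiment is not shown in this thread. A minimal sketch of the general idea, assuming ISO-formatted date strings and string-valued categories, could look like this (`expand_date` and `ordinal_encode` are hypothetical helper names):

```python
from datetime import date
from typing import Dict, List


def expand_date(value: str) -> Dict[str, int]:
    """Split an ISO date string (YYYY-MM-DD) into numeric parts."""
    parsed = date.fromisoformat(value)
    return {
        "year": parsed.year,
        "month": parsed.month,
        "day": parsed.day,
        "weekday": parsed.weekday(),  # Monday is 0
    }


def ordinal_encode(values: List[str]) -> List[int]:
    """Map each category to an integer code in order of first appearance."""
    codes: Dict[str, int] = {}
    return [codes.setdefault(value, len(codes)) for value in values]


print(expand_date("2021-03-15"))
# {'year': 2021, 'month': 3, 'day': 15, 'weekday': 0}
print(ordinal_encode(["red", "blue", "red"]))  # [0, 1, 0]
```

Ordinal codes are the simplest option; one-hot or target encoding may suit linear models better.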
Will any automatic genetic feature engineering across multiple features be added, such as feature1*lag(feature2,10)? I ask because FEDOT already has a genetic algorithm. ATOM (https://github.com/tvdboom/ATOM) provides such a process via gplearn; however, its operator set is very small.
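For concreteness, an expression of the form feature1*lag(feature2,10) can be sketched in plain Python. The helper names `lag` and `multiply_with_lag` are hypothetical, and positions without enough history are filled with `None`; this is not how gplearn or FEDOT represent such features.

```python
from typing import List, Optional


def lag(series: List[float], steps: int) -> List[Optional[float]]:
    """Shift a series forward by `steps` (steps >= 0); leading positions become None."""
    return ([None] * steps + list(series))[:len(series)]


def multiply_with_lag(feature1: List[float], feature2: List[float],
                      steps: int) -> List[Optional[float]]:
    """Compute feature1[t] * feature2[t - steps] wherever the lagged value exists."""
    lagged = lag(feature2, steps)
    return [None if b is None else a * b for a, b in zip(feature1, lagged)]


feature1 = [1.0, 2.0, 3.0, 4.0]
feature2 = [10.0, 20.0, 30.0, 40.0]
print(multiply_with_lag(feature1, feature2, steps=2))  # [None, None, 30.0, 80.0]
```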
@graceyangfan
We did not plan to design features with the GA itself. However, we use existing feature generators such as poly_features and adjust their hyperparameters during both evolution and tuning.
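To illustrate what a tunable polynomial feature generator does, here is a standalone sketch. It is not FEDOT's poly_features implementation: the function name `polynomial_expand` is hypothetical, and `degree` is assumed to be a hyperparameter of the kind a tuner would vary.

```python
from itertools import combinations_with_replacement
from math import prod
from typing import List


def polynomial_expand(row: List[float], degree: int = 2) -> List[float]:
    """All monomials of the inputs from degree 1 up to `degree` (no bias term)."""
    features = []
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(row, d):
            features.append(prod(combo))
    return features


# degree=1 returns the inputs unchanged; degree=2 adds squares and pairwise products.
print(polynomial_expand([2.0, 3.0], degree=1))  # [2.0, 3.0]
print(polynomial_expand([2.0, 3.0], degree=2))  # [2.0, 3.0, 4.0, 6.0, 9.0]
```

The number of generated features grows quickly with `degree`, which is one reason to leave its choice to the optimizer rather than fix it by hand.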