FEDOT
Improvement of automatic text detection
Automatic text detection is now more effective, accurate, and robust.
- Every column that may contain text is now checked against the size of its tf-idf vocabulary. If the vocabulary size exceeds a threshold, the column is treated as containing useful text information.
- Columns with links are removed automatically, since they carry no useful information and sometimes cause FEDOT to fail.
- Additional unit tests and extended tests (based on the AutoML benchmark) will be added as well.
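
The two checks described above can be sketched roughly as follows. This is a minimal standard-library illustration, not the code from this PR: the function names, both threshold values, and the link regex are assumptions made for the example. The vocabulary is approximated by counting distinct lowercase tokens of two or more characters, which is roughly what a tf-idf vectorizer would collect with default tokenization.

```python
import re

# Hypothetical thresholds chosen for illustration; not taken from FEDOT.
VOCAB_SIZE_THRESHOLD = 20
LINK_FRACTION_THRESHOLD = 0.5

# Tokens of two or more word characters (an approximation of a
# tf-idf vectorizer's default tokenization).
TOKEN_PATTERN = re.compile(r"\b\w\w+\b")
# A value counts as a link if it consists of a single URL-like string.
LINK_PATTERN = re.compile(r"^\s*(https?://|www\.)\S+\s*$", re.IGNORECASE)


def vocabulary_size(values):
    """Count distinct lowercase tokens across all values of a column."""
    vocabulary = set()
    for value in values:
        vocabulary.update(TOKEN_PATTERN.findall(str(value).lower()))
    return len(vocabulary)


def is_link_column(values):
    """Return True if most values in the column look like links."""
    values = [str(v) for v in values]
    if not values:
        return False
    links = sum(1 for v in values if LINK_PATTERN.match(v))
    return links / len(values) > LINK_FRACTION_THRESHOLD


def classify_column(values, vocab_threshold=VOCAB_SIZE_THRESHOLD):
    """Label a column as 'link', 'text', or 'other'."""
    if is_link_column(values):
        return "link"  # candidate for automatic removal
    if vocabulary_size(values) > vocab_threshold:
        return "text"  # vocabulary is rich enough to be useful text
    return "other"
```

Checking for links before measuring the vocabulary keeps URL columns from being mistaken for text, since URLs can produce many distinct tokens on their own.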
Codecov Report
Merging #903 (86ae96f) into master (c5f050d) will decrease coverage by 0.08%. The diff coverage is 86.95%.
@@ Coverage Diff @@
## master #903 +/- ##
==========================================
- Coverage 87.86% 87.77% -0.09%
==========================================
Files 206 206
Lines 13786 13726 -60
==========================================
- Hits 12113 12048 -65
- Misses 1673 1678 +5
Impacted Files | Coverage Δ | |
---|---|---|
fedot/core/pipelines/tuning/search_space.py | 100.00% <ø> (ø) | |
...implementations/data_operations/text_pretrained.py | 56.14% <42.85%> (-4.73%) | :arrow_down: |
fedot/core/data/data_detection.py | 96.70% <97.22%> (+1.31%) | :arrow_up: |
fedot/core/composer/metrics.py | 97.22% <100.00%> (-0.04%) | :arrow_down: |
fedot/core/constants.py | 100.00% <100.00%> (ø) | |
fedot/core/data/data.py | 86.77% <100.00%> (-0.11%) | :arrow_down: |
fedot/core/data/multi_modal.py | 87.62% <100.00%> (+2.05%) | :arrow_up: |
fedot/preprocessing/data_types.py | 94.25% <100.00%> (+0.03%) | :arrow_up: |
...edot/core/repository/graph_operation_repository.py | 66.66% <0.00%> (-8.34%) | :arrow_down: |
fedot/explainability/explainer_template.py | 75.00% <0.00%> (-5.00%) | :arrow_down: |
... and 59 more | |