FEDOT

Improving preprocessing

Open aPovidlo opened this issue 6 months ago • 22 comments

This is a 🔨 code refactoring.

Summary

Significant Updates in Data Storage and Preprocessing

Major Updates:

  • Enhanced logging: Added more detailed logs in DEBUG mode during preprocessing.
  • New functionality: You can now mark categorical features in data when using InputData.from_numpy(...), InputData.from_dataframe(...), and InputData.from_csv(...) methods.
  • New class: Introduced OptimizedFeatures, which stores data with optimal dtypes for improved efficiency.
  • Preprocessing improvement: Added a new stage called reduce_memory_size to optimize memory usage.
  • API enhancements: Updated PredefinedModel to allow copying parameters from DataPreprocessor.
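
To illustrate the idea behind storing data with optimal dtypes and the reduce_memory_size stage, here is a minimal sketch of dtype downcasting with NumPy. It is not FEDOT's implementation: the function name downcast_column and the exact rules (smallest signed integer type that fits, float32 only when the round trip is lossless) are assumptions made for illustration.

```python
import numpy as np


def downcast_column(col: np.ndarray) -> np.ndarray:
    """Hypothetical helper: cast a column to a smaller dtype when no values are lost."""
    if col.size == 0:
        return col

    if np.issubdtype(col.dtype, np.integer):
        lo, hi = int(col.min()), int(col.max())
        # Try signed integer types from narrowest to widest.
        for dtype in (np.int8, np.int16, np.int32, np.int64):
            info = np.iinfo(dtype)
            if info.min <= lo and hi <= info.max:
                return col.astype(dtype)
    elif np.issubdtype(col.dtype, np.floating):
        # Values too large for float32 become inf; silence that warning
        # because the round-trip check below rejects such columns anyway.
        with np.errstate(over="ignore"):
            as_f32 = col.astype(np.float32)
        # Keep float32 only if converting back reproduces the original values.
        if np.array_equal(as_f32.astype(col.dtype), col, equal_nan=True):
            return as_f32

    # Leave everything else (strings, objects, unrepresentable values) untouched.
    return col


ids = np.array([1, 2, 300], dtype=np.int64)
small_ids = downcast_column(ids)          # fits in int16
exact = downcast_column(np.array([0.5, 1.25]))   # exactly representable in float32
inexact = downcast_column(np.array([0.5, 0.1]))  # 0.1 would lose precision, stays float64
```

In this sketch the integer column shrinks from 24 bytes to 6, while a float column is narrowed only when every value survives the conversion unchanged, so the stage never alters the data it stores.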

Minor Updates:

  • Improved logic for detecting categorical data.
  • Updated encoders and imputers to align with the new changes.
  • Revised tests to incorporate the new features.

Context

closes #1337
closes #1329

aPovidlo · Aug 13 '24 15:08