Machine-Learning-From-Scratch

add normalization step

Open • Akchiche-Mohamed-Aymen opened this issue 8 months ago • 0 comments

๐Ÿ› ๏ธ Description of Contribution: Added Feature Normalization Step

🔄 What I Did:

I added a feature normalization step to the dataset preprocessing pipeline for the Linear Regression and Logistic Regression implementations. The original repository did not include this step, and its absence can significantly affect the performance and convergence behavior of many ML models.

📌 Summary of Changes:

  • Introduced a normalization function that either standardizes input features (mean = 0, std = 1) or rescales them to the range [0, 1], depending on the model requirement.
  • Updated the training pipeline to apply normalization before model fitting.
  • Ensured that the same transformation is applied during inference/prediction.
  • I did not extract normalization into a shared module. This keeps each model's behavior clear and under local control.
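
To make the changes above concrete, here is a minimal sketch of the two scaling options and of reusing the training statistics at prediction time. It is not the code from this contribution: the function names (`fit_standardizer`, `standardize`, `fit_min_max`, `min_max_scale`) and the sample data are hypothetical, and it uses only the Python standard library.

```python
from statistics import fmean, pstdev


def fit_standardizer(rows):
    """Compute per-feature mean and std from the training rows only."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # A constant feature has std 0; fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def standardize(rows, means, stds):
    """Apply (x - mean) / std using previously fitted statistics."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def fit_min_max(rows):
    """Compute per-feature minimum and range from the training rows only."""
    columns = list(zip(*rows))
    mins = [min(col) for col in columns]
    ranges = [(max(col) - min(col)) or 1.0 for col in columns]
    return mins, ranges


def min_max_scale(rows, mins, ranges):
    """Apply (x - min) / range; training data lands in [0, 1]."""
    return [[(x - lo) / r for x, lo, r in zip(row, mins, ranges)] for row in rows]


# Hypothetical data: columns are age (years) and income (dollars).
X_train = [[20.0, 30000.0], [40.0, 50000.0], [60.0, 70000.0]]
X_new = [[30.0, 40000.0]]

# Fit on the training set, then reuse the same statistics for new samples.
means, stds = fit_standardizer(X_train)
X_train_std = standardize(X_train, means, stds)
X_new_std = standardize(X_new, means, stds)

mins, ranges = fit_min_max(X_train)
X_train_mm = min_max_scale(X_train, mins, ranges)
X_new_mm = min_max_scale(X_new, mins, ranges)
```

The key design point is that the statistics are computed once from the training data and passed explicitly to the transform, so inference cannot silently re-fit on new samples.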

📈 Why Normalization Matters:

Feature normalization is a crucial step in machine learning, especially when:

  • Features have different units or scales (e.g., age in years vs. income in dollars).
  • Models are trained with gradient-based optimization (e.g., Linear Regression, Logistic Regression, Neural Networks), where unscaled features can lead to slow convergence or diverging updates.
  • We want fair contributions from each feature without one dominating due to larger magnitude.

Without normalization:

  • The cost function may not converge efficiently.
  • The model might be biased toward high-magnitude features.
  • Training can become unstable or yield suboptimal results.
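
The effect described above can be illustrated with a small experiment. The sketch below is an assumption-laden toy example, not part of the repository: it runs plain batch gradient descent for linear regression on a made-up age/income dataset, once on the raw features and once on standardized features, using the same learning rate. The data, learning rate, and step count are all chosen for illustration.

```python
import math
from statistics import fmean, pstdev


def mse(X, y, w, b):
    """Mean squared error of the linear model w.x + b."""
    total = 0.0
    for row, target in zip(X, y):
        err = sum(wj * xj for wj, xj in zip(w, row)) + b - target
        total += err * err
    return total / len(X)


def gradient_descent(X, y, lr, steps):
    """Batch gradient descent on MSE; stops early if the loss blows up."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, target in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, row)) + b - target
            for j in range(d):
                grad_w[j] += 2.0 * err * row[j] / n
            grad_b += 2.0 * err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
        if not math.isfinite(mse(X, y, w, b)):
            break  # the updates have diverged
    return w, b, mse(X, y, w, b)


def standardize_columns(X):
    """Scale each column to mean 0 and std 1 (population std)."""
    cols = list(zip(*X))
    means = [fmean(c) for c in cols]
    stds = [pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in X]


# Hypothetical data: age in years, income in dollars, arbitrary targets.
X_raw = [[25.0, 30000.0], [35.0, 80000.0], [45.0, 40000.0], [55.0, 90000.0]]
y = [1.0, 3.0, 2.0, 4.0]

_, _, loss_raw = gradient_descent(X_raw, y, lr=0.1, steps=200)
_, _, loss_scaled = gradient_descent(standardize_columns(X_raw), y, lr=0.1, steps=200)

print("final loss on raw features:   ", loss_raw)
print("final loss on scaled features:", loss_scaled)
```

With these settings the income column dominates the gradient on the raw data, so a step size that works well after standardization is far too large before it. A much smaller learning rate would avoid the blow-up on raw features, but convergence along the age direction would then be very slow.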

Akchiche-Mohamed-Aymen, Jul 22 '25 13:07