Machine-Learning-From-Scratch
add normalization step
Description of Contribution: Added Feature Normalization Step
What I Did:
I added a feature normalization step to the dataset preprocessing pipeline for the Linear Regression and Logistic Regression implementations. The original repository did not include this step, which can significantly affect the performance and convergence behavior of many ML models.
Summary of Changes:
- Introduced a normalization function that scales input features either to a standardized distribution (mean = 0, std = 1) or to the range [0, 1], depending on the model's requirements.
- Updated the training pipeline to apply normalization before model fitting.
- Ensured that the same transformation is applied during inference/prediction.
- Kept normalization local to each model rather than extracting it into a shared module, which preserves clarity and per-model control over preprocessing behavior.
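The changes above can be sketched as follows. This is a minimal illustration, not the repository's actual code; the function names (`fit_standardizer`, `standardize`, `min_max_scale`) are hypothetical. The key point from the summary is that statistics are computed from the training data and the same transformation is reused at inference time.

```python
import numpy as np

def fit_standardizer(X):
    """Compute per-feature mean and std from the training data only."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against constant features
    return mean, std

def standardize(X, mean, std):
    """Scale features to mean 0, std 1 using stored training statistics."""
    return (X - mean) / std

def min_max_scale(X, x_min, x_max):
    """Scale features to [0, 1] using stored training min/max."""
    rng = np.where(x_max - x_min == 0, 1.0, x_max - x_min)
    return (X - x_min) / rng

# Training: fit the transform on the training set (age, income).
X_train = np.array([[25.0, 40000.0], [35.0, 60000.0], [45.0, 80000.0]])
mean, std = fit_standardizer(X_train)
X_train_scaled = standardize(X_train, mean, std)

# Alternative [0, 1] scaling for models that require it.
x_min, x_max = X_train.min(axis=0), X_train.max(axis=0)
X_train_01 = min_max_scale(X_train, x_min, x_max)

# Inference: reuse the SAME training statistics on new data.
X_new = np.array([[30.0, 50000.0]])
X_new_scaled = standardize(X_new, mean, std)
```

Reusing the stored `mean`/`std` (rather than recomputing them on new data) is what keeps training and prediction consistent.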
Why Normalization Matters:
Feature normalization is a crucial step in machine learning, especially when:
- Features have different units or scales (e.g., age in years vs. income in dollars).
- Models rely on gradient-based optimization (e.g., Linear Regression, Logistic Regression, Neural Networks), where unscaled features can lead to slow convergence or divergent gradients.
- Each feature should contribute fairly to the model, without one dominating simply because of its larger magnitude.
Without normalization:
- The cost function may not converge efficiently.
- The model might be biased toward high-magnitude features.
- Training can become unstable or yield suboptimal results.
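The convergence effect described above can be demonstrated with a small synthetic experiment (illustrative only; the data and learning rates are made up, not from the repository). With an unscaled large-magnitude feature like income, gradient descent needs a tiny learning rate to stay stable, which starves the small-scale feature; after standardization, a normal learning rate converges quickly.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
age = rng.uniform(20, 60, n)        # small-scale feature (years)
income = rng.uniform(2e4, 1e5, n)   # large-scale feature (dollars)
X = np.column_stack([age, income])
y = 0.5 * age + 1e-4 * income + rng.normal(0, 0.1, n)

def gd_mse(X, y, lr, iters=500):
    """Plain batch gradient descent on mean squared error; returns final loss."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return np.mean((X @ w - y) ** 2)

# Unscaled: the large income values force a tiny learning rate,
# so the age coefficient barely moves and the fit stays poor.
loss_raw = gd_mse(X, y, lr=1e-10)

# Standardized features (and centered target): an ordinary
# learning rate converges to a good fit in the same 500 steps.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
ys = y - y.mean()
loss_scaled = gd_mse(Xs, ys, lr=0.1)
```

Here `loss_scaled` ends up far below `loss_raw` for the same number of iterations, which is exactly the slow-convergence and feature-domination problem the list above describes.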