Machine-Learning-From-Scratch
add normalization step
Description of Contribution: Added Feature Normalization Step
What I Did:
I added a feature normalization step to the dataset preprocessing pipeline for the Linear Regression and Logistic Regression implementations. The original repository did not include this step, which can significantly affect the performance and convergence behavior of many ML models.
Summary of Changes:
- Introduced a normalization function that scales input features either to a standardized distribution (mean = 0, std = 1) or to the range [0, 1], depending on the model's requirements.
- Updated the training pipeline to apply normalization before model fitting.
- Ensured that the same transformation is applied during inference/prediction.
- Kept normalization local to each model rather than extracting it into a shared module, which preserves clarity and per-model control over preprocessing behavior.
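The changes above can be sketched as follows. This is a minimal illustration, not the repository's actual code; the function names (`fit_standardizer`, `standardize`, `min_max_scale`) are hypothetical. The key point from the summary is that statistics are computed from the training data and the same transformation is reused at inference time.

```python
import numpy as np

def fit_standardizer(X):
    """Compute per-feature mean and std from the training data only."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against constant features
    return mean, std

def standardize(X, mean, std):
    """Scale features to mean 0, std 1 using stored training statistics."""
    return (X - mean) / std

def min_max_scale(X, x_min, x_max):
    """Scale features to [0, 1] using stored training min/max."""
    rng = np.where(x_max - x_min == 0, 1.0, x_max - x_min)
    return (X - x_min) / rng

# Training: fit the transform on the training set (age, income).
X_train = np.array([[25.0, 40000.0], [35.0, 60000.0], [45.0, 80000.0]])
mean, std = fit_standardizer(X_train)
X_train_scaled = standardize(X_train, mean, std)

# Alternative [0, 1] scaling for models that require it.
x_min, x_max = X_train.min(axis=0), X_train.max(axis=0)
X_train_01 = min_max_scale(X_train, x_min, x_max)

# Inference: reuse the SAME training statistics on new data.
X_new = np.array([[30.0, 50000.0]])
X_new_scaled = standardize(X_new, mean, std)
```

Reusing the stored `mean`/`std` (rather than recomputing them on new data) is what keeps training and prediction consistent.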
Why Normalization Matters:
Feature normalization is a crucial step in machine learning, especially when:
- Features have different units or scales (e.g., age in years vs. income in dollars).
- Models rely on gradient-based optimization (e.g., Linear Regression, Logistic Regression, Neural Networks), where unscaled features can lead to slow convergence or divergent gradients.
- Each feature should contribute fairly to the model, without one dominating simply because of its larger magnitude.
Without normalization:
- The cost function may not converge efficiently.
- The model might be biased toward high-magnitude features.
- Training can become unstable or yield suboptimal results.
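The convergence effect described above can be demonstrated with a small synthetic experiment (illustrative only; the data and learning rates are made up, not from the repository). With an unscaled large-magnitude feature like income, gradient descent needs a tiny learning rate to stay stable, which starves the small-scale feature; after standardization, a normal learning rate converges quickly.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
age = rng.uniform(20, 60, n)        # small-scale feature (years)
income = rng.uniform(2e4, 1e5, n)   # large-scale feature (dollars)
X = np.column_stack([age, income])
y = 0.5 * age + 1e-4 * income + rng.normal(0, 0.1, n)

def gd_mse(X, y, lr, iters=500):
    """Plain batch gradient descent on mean squared error; returns final loss."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return np.mean((X @ w - y) ** 2)

# Unscaled: the large income values force a tiny learning rate,
# so the age coefficient barely moves and the fit stays poor.
loss_raw = gd_mse(X, y, lr=1e-10)

# Standardized features (and centered target): an ordinary
# learning rate converges to a good fit in the same 500 steps.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
ys = y - y.mean()
loss_scaled = gd_mse(Xs, ys, lr=0.1)
```

Here `loss_scaled` ends up far below `loss_raw` for the same number of iterations, which is exactly the slow-convergence and feature-domination problem the list above describes.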