Missing values
The age column has 177 NaN values. Should we delete them, or fill them with the mean of the age column?
Drop the feature or fill the missing values? If a large share of the values is NaN, consider dropping the feature; otherwise fill the missing values with the mean or the median.
Mean or median? If the feature has outliers, fill NaN with the median, because outliers pull the mean toward them.
How to find outliers? If the feature's distribution is skewed to the right or to the left, the feature may have outliers.
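The decision steps above can be sketched with pandas roughly as follows. This is only an illustration: the small DataFrame is made-up toy data standing in for the real dataset, and the two thresholds (`DROP_THRESHOLD`, `SKEW_THRESHOLD`) are assumed cut-offs, not values from the text. Tune them to your data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the real dataset and its values are not shown here.
df = pd.DataFrame(
    {"age": [22.0, 38.0, np.nan, 35.0, 54.0, 2.0, np.nan, 80.0, 27.0, 14.0]}
)

# Assumed cut-offs for illustration only.
DROP_THRESHOLD = 0.5  # drop the feature if more than half of it is NaN
SKEW_THRESHOLD = 0.5  # treat |skew| above this as "skewed"

# How many values are missing, and what share of the column is that?
missing_count = df["age"].isna().sum()
missing_ratio = df["age"].isna().mean()

# Sample skewness of the non-missing values (NaN is skipped by default).
skewness = df["age"].skew()

if missing_ratio > DROP_THRESHOLD:
    # Too many missing values: drop the feature.
    df = df.drop(columns=["age"])
else:
    # Skewed feature (possible outliers): use the median, otherwise the mean.
    if abs(skewness) > SKEW_THRESHOLD:
        fill_value = df["age"].median()
    else:
        fill_value = df["age"].mean()
    df["age"] = df["age"].fillna(fill_value)

print(missing_count, missing_ratio, skewness)
print(df)
```

Skewness is only a rough hint; a box plot or a histogram of the column is a useful second check before settling on the median.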