mimic3-benchmarks

Add commandline args for outlier detection, rescaling

Open turambar opened this issue 7 years ago • 5 comments

Make outlier detection and input rescaling optional based on command-line args
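A minimal sketch of what such opt-in flags could look like, using Python's standard `argparse`. The flag names `--remove_outliers` and `--rescale`, and the helpers `build_parser` and `preprocess`, are hypothetical and are not taken from the repository; the filtering and min-max scaling here only illustrate how the flags might gate each step.

```python
import argparse


def build_parser():
    # Hypothetical flags; both steps are off unless explicitly requested.
    parser = argparse.ArgumentParser(description="Optional preprocessing steps (sketch)")
    parser.add_argument("--remove_outliers", action="store_true",
                        help="drop values outside the valid range")
    parser.add_argument("--rescale", action="store_true",
                        help="min-max rescale values to [0, 1] using the valid range")
    return parser


def preprocess(values, low, high, remove_outliers=False, rescale=False):
    """Apply the optional steps to a list of floats, given a valid range [low, high]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    out = list(values)
    if remove_outliers:
        # Keep only values inside the valid range.
        out = [v for v in out if low <= v <= high]
    if rescale:
        # Min-max scaling relative to the valid range.
        out = [(v - low) / (high - low) for v in out]
    return out
```

With neither flag set, `preprocess` returns the values unchanged, so the current behaviour would stay the default. A caller would pass the parsed flags through, e.g. `preprocess(values, low, high, args.remove_outliers, args.rescale)`.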

turambar avatar Jan 16 '18 23:01 turambar

I think this is done, right @Harhro94?

turambar avatar Aug 26 '18 19:08 turambar

No, it isn't.

hrayrhar avatar Aug 26 '18 20:08 hrayrhar

Before doing this, we should resolve the inconsistencies between the column names in the item_id_to_variable_map.csv and variable_ranges.csv files (reported in #28). Right now these inconsistencies don't affect the code, since we don't do rescaling or outlier detection.

hrayrhar avatar Aug 26 '18 20:08 hrayrhar

Right, I'll take a look this week.

turambar avatar Aug 26 '18 20:08 turambar

Hi @turambar @hrayrhar,

Greetings. Thank you for the work you have done to create a benchmark dataset and tasks!

Is there any update on this issue to remove outliers?

I am using the dataset generated by this repository for my research. I noticed that for some variables, e.g. weight (box plot below), the range of values is large and the box plot indicates outliers. I think these values are adversely affecting the machine learning model I am researching, so I am looking for ways to correct them.

[Screenshot: box plot of weight values, 2021-10-24]
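One possible way to handle such values, sketched under assumptions: treat readings beyond an outer "outlier" bound as missing, and clip readings that fall between the outlier bound and the valid bound. The function `clean_value` and the bounds in `WEIGHT_BOUNDS` are hypothetical and purely illustrative; they are not read from variable_ranges.csv.

```python
from typing import Optional


def clean_value(value: float, outlier_low: float, valid_low: float,
                valid_high: float, outlier_high: float) -> Optional[float]:
    """Return None for implausible values, otherwise clip to the valid range."""
    if value < outlier_low or value > outlier_high:
        return None  # treat as missing; impute later if needed
    return min(max(value, valid_low), valid_high)


# Illustrative bounds for weight in kg (assumed, not taken from the repository).
WEIGHT_BOUNDS = dict(outlier_low=0.0, valid_low=1.0, valid_high=250.0, outlier_high=300.0)

weights = [70.5, 0.2, 280.0, 1200.0]
cleaned = [clean_value(w, **WEIGHT_BOUNDS) for w in weights]
# 70.5 is kept, 0.2 and 280.0 are clipped to the valid bounds, 1200.0 becomes None
```

Whether to clip or drop borderline values is a modelling choice; dropping everything outside the valid range is the simpler alternative.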

jagandecapri avatar Oct 24 '21 03:10 jagandecapri