Anomaly-Detection
Changed to make it more general
I have created a slightly different version of your program with the following changes:
It now takes a single "Training File" as input and automatically splits it into the "tr_data", "cv_data" and "gt_data" files. This should be easier for people who don't know the difference between the three files because they haven't watched Andrew Ng's video and so don't know what each one means (and does). I have also added a function called "select_num_cols" that automatically selects the numeric columns from that data set. This gives a smaller feature set than the original one, and it works well with your Gaussian Distribution program. Since this version works with more than 2 variables, I have left out the plotting of the variables. I hope these changes are acceptable. If not, you can create a separate version.
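
A minimal sketch of what the two changes might look like, in plain Python. This is not the actual code from this version. Only the name "select_num_cols" comes from the description above; the function `split_training_file`, the `label` column, the `cv_fraction` and `seed` parameters, and the idea that "gt_data" holds the labels of the "cv_data" rows are all assumptions made for illustration.

```python
import random


def select_num_cols(rows):
    """Return the names of columns whose values all parse as floats.

    `rows` is a list of dicts, e.g. as produced by csv.DictReader.
    """
    if not rows:
        return []
    numeric = []
    for col in rows[0]:
        try:
            for row in rows:
                float(row[col])
        except (TypeError, ValueError):
            continue  # at least one non-numeric value: skip this column
        numeric.append(col)
    return numeric


def split_training_file(rows, label_col="label", cv_fraction=0.3, seed=0):
    """Split one data set into tr_data, cv_data and gt_data.

    Assumption: `label_col` (hypothetical name) marks anomalies with 0/1,
    and gt_data is the list of labels for the cv_data rows.
    """
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    n_cv = int(len(shuffled) * cv_fraction)
    cv_rows, tr_rows = shuffled[:n_cv], shuffled[n_cv:]

    # Keep only numeric feature columns; the label is not a feature.
    feature_cols = [c for c in select_num_cols(rows) if c != label_col]
    tr_data = [[float(r[c]) for c in feature_cols] for r in tr_rows]
    cv_data = [[float(r[c]) for c in feature_cols] for r in cv_rows]
    gt_data = [int(float(r[label_col])) for r in cv_rows]
    return tr_data, cv_data, gt_data, feature_cols
```

The rows could be read from the single input file with `csv.DictReader`, and each of the three returned lists could then be written to its own file.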