machine-learning

Research methods to normalize and smooth data

Open jeff1evesque opened this issue 9 years ago • 2 comments

We need to research preprocessing methodologies for normalizing and smoothing data. Specifically, we need to determine whether this preprocessing can be automated. We also need to determine whether multiple approaches can be applied simultaneously, before too much data is thrown away.
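As a starting point for that research, here is a minimal sketch of running several preprocessing approaches side by side on the same raw series, so nothing is discarded up front. It is not taken from this project: the function names, the choice of min-max scaling, z-score scaling, and a centered moving average, and the default window size are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_normalize(values):
    """Center values on the mean and scale by the population std deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def moving_average(values, window=3):
    """Smooth with a centered moving average; the window shrinks at the edges."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def preprocess(values, window=3):
    """Apply every approach to the same raw input and keep all results."""
    return {
        "raw": list(values),  # the original data is never modified
        "min_max": min_max_normalize(values),
        "z_score": z_score_normalize(values),
        "smoothed": moving_average(values, window),
    }


results = preprocess([1, 2, 3, 4, 5])
```

Because `preprocess` returns every variant alongside the untouched raw values, the candidate approaches can be compared before committing to one.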

Ultimately, we need to determine which datapoints are outliers or otherwise bad. Then, instead of removing those datapoints from the SQL database, we would flag them in a table column. This means creating an additional SQL column that marks a specific datapoint as bad, which would allow users to decide whether to include bad datapoints when generating the corresponding model.
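The flagging idea could look roughly like the sketch below. Everything specific in it is an assumption for illustration: the `observations` table, the `is_outlier` column, the use of an in-memory SQLite database, and the interquartile-range rule (Tukey fences with `k = 1.5`) as the outlier test. The project's actual schema and detection method are still to be decided.

```python
import sqlite3
from statistics import quantiles


def flag_outliers(conn, k=1.5):
    """Mark rows outside the Tukey fences instead of deleting them."""
    rows = conn.execute("SELECT id, value FROM observations").fetchall()
    values = [value for _, value in rows]
    # quantiles() needs at least two data points.
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    flags = [(int(value < low or value > high), row_id) for row_id, value in rows]
    conn.executemany("UPDATE observations SET is_outlier = ? WHERE id = ?", flags)
    conn.commit()
    return sum(flag for flag, _ in flags)  # number of rows flagged


def fetch_values(conn, include_outliers=False):
    """Let the caller decide whether flagged datapoints are included."""
    sql = "SELECT value FROM observations"
    if not include_outliers:
        sql += " WHERE is_outlier = 0"
    return [value for (value,) in conn.execute(sql + " ORDER BY id")]


# Hypothetical schema: one extra column marks a datapoint as bad.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observations ("
    "id INTEGER PRIMARY KEY, "
    "value REAL NOT NULL, "
    "is_outlier INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO observations (value) VALUES (?)",
    [(v,) for v in [10, 11, 12, 11, 10, 12, 11, 100]],
)

flagged = flag_outliers(conn)
clean = fetch_values(conn)                       # excludes flagged rows
everything = fetch_values(conn, include_outliers=True)
```

No rows are deleted, so a user generating a model can still opt in to the flagged datapoints by passing `include_outliers=True`.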

jeff1evesque, Dec 03 '15 05:12

The following is a lazy googling result that advises against using PCA for SVMs:

  • https://www.quora.com/Is-it-worth-trying-PCA-on-your-data-before-feeding-to-SVM

jeff1evesque, Dec 04 '15 03:12

We are removing this issue from milestone 0.4, for reasons similar to #2297.

jeff1evesque, Sep 15 '16 01:09