predictive-maintenance-python
Prediction not understood
I do not understand how you predict which bearing is suspected to fail. What is the logic behind labeling samples as passed or failed? In your results for testset1, bearings 5, 6, 7, 8 are suspected to fail. Do you mean that, for the period from Oct 2003 to Nov 2003 (the months for which readings are available), bearing 3 (both x and y axes) and bearing 4 (both x and y axes) are suspected to fail? Is it not possible to predict the exact time at which a bearing will fail? Oct to Nov is a long period of time.
Why are frequencies used as the features for logistic regression? Couldn't we use the values given in the datasets directly? What do those values represent, and why not use amplitudes instead? Why is the first 70% of the data labeled as passed and the remaining 30% as failed? Why is bearing4_yaxis considered failed in dataset1 and bearing1 considered failed in dataset2? Is there a logical reason?
Is my understanding correct?
You can get the timestamp from the predictions. Because individual predictions include false negatives and false positives, the logic is to count the predicted labels (1 or 0) within a fixed time window (e.g. 100 samples): if the window contains more 1s than 0s, the bearing is expected to fail soon.
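The windowed vote described above can be sketched roughly as follows. This is only an illustration, not the code from the repository: the function name `first_alarm_index`, the use of non-overlapping windows, and the strict majority threshold are all assumptions.

```python
import numpy as np

def first_alarm_index(pred_labels, window=100):
    """Return the start index of the first window in which predicted 1s
    outnumber predicted 0s, or None if no such window exists.

    Assumption: windows are non-overlapping and label 1 means "failing".
    """
    labels = np.asarray(pred_labels)
    for start in range(0, len(labels) - window + 1, window):
        chunk = labels[start:start + window]
        # Strict majority of 1s inside this window
        if chunk.sum() > window / 2:
            return start
    return None

# Hypothetical predictions: 250 healthy samples followed by 150 failing ones
preds = [0] * 250 + [1] * 150
alarm = first_alarm_index(preds, window=100)
```

If each sample keeps the timestamp of the recording it came from, the returned index can be mapped back to that timestamp to get an approximate alarm time.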
Yes, the amplitude can be a feature as well, e.g. the root mean square (RMS) of the 1 s signal. The 5 spectral amplitudes used here may not be the best features.
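As a rough sketch of such an amplitude feature, the RMS of one recording could be computed like this. The helper name `rms` and the synthetic test signal are assumptions for illustration; the sample count of 20480 is only a stand-in for one 1 s recording.

```python
import numpy as np

def rms(signal):
    """Root mean square of a 1-D vibration signal."""
    x = np.asarray(signal, dtype=float)
    return float(np.sqrt(np.mean(x ** 2)))

# Synthetic stand-in for one 1 s recording: a unit-amplitude 50 Hz sine
n = 20480                      # assumed number of samples in the recording
t = np.arange(n) / n           # time axis covering 1 s
feature = rms(np.sin(2 * np.pi * 50 * t))   # close to 1/sqrt(2)
```

One RMS value per recording and per channel could then be used alongside, or instead of, the spectral amplitudes as input to the classifier.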
The 70% and 30% values depend on the data and could be tuned using domain knowledge. The data specification file (available from the download link) states which bearing was observed to have failed at the end of each data set.
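A minimal sketch of this time-based labeling, assuming the samples of the failed bearing are ordered chronologically: the first part is labeled 0 (passed) and the rest 1 (failed). The function name `make_labels` and the parameter `healthy_fraction` are hypothetical, not taken from the repository.

```python
import numpy as np

def make_labels(n_samples, healthy_fraction=0.7):
    """Label the first `healthy_fraction` of samples 0 (passed)
    and the remaining samples 1 (failed)."""
    n_healthy = int(round(n_samples * healthy_fraction))
    labels = np.ones(n_samples, dtype=int)
    labels[:n_healthy] = 0
    return labels

labels = make_labels(10)   # 7 samples labeled 0, then 3 labeled 1
```

Changing `healthy_fraction` is the knob to adjust if domain knowledge suggests that degradation starts earlier or later in the run.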