A question regarding your publication

Open pengyuan0106 opened this issue 2 years ago • 2 comments

Hello Kyle,

I recently read your paper, "Detecting Spacecraft Anomalies Using LSTMs and Nonparametric Dynamic Thresholding." Very nice paper. I have a question about the training and test datasets, which appear to be two different datasets. Is there a reason they were set up that way? It would be great if you could clarify and help me understand. For example:

In the training dataset: there are two major categories (1 and -1) in E-5.

In the test dataset: there are three categories (1, 0, and -1) in E-5, and category 1 is a point anomaly.

Thank you very much. I hope to hear from you soon.

pengyuan0106 • Feb 10 '22 03:02

Are you referring to the model inputs from the one-hot-encoded dimensions (not the first dimension that contains the channel values)? Additional detail or code to reproduce would be helpful.

If I'm understanding your question correctly, test data in this context may contain command information that was not seen during training (the one-hot-encoded dimensions I think you are referring to), and we need a way to represent this information.
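To make the distinction above concrete, here is a minimal sketch that separates the channel values from the command dimensions and lists command columns that are active in test but never in train. It assumes each array is 2-D with the channel value in column 0 and the one-hot-encoded command dimensions in the remaining columns; the helper names `split_inputs` and `commands_unseen_in_train` are hypothetical and not part of telemanom.

```python
import numpy as np


def split_inputs(arr):
    """Split a 2-D array into (channel values, command dimensions).

    Assumed layout: column 0 is the channel value, columns 1+ are one-hot.
    """
    return arr[:, 0], arr[:, 1:]


def commands_unseen_in_train(train, test):
    """Indices (within the command block) active in test but never in train."""
    _, train_cmds = split_inputs(train)
    _, test_cmds = split_inputs(test)
    active_in_train = train_cmds.any(axis=0)  # True where a column is ever nonzero
    active_in_test = test_cmds.any(axis=0)
    return np.flatnonzero(active_in_test & ~active_in_train).tolist()
```

Add 1 to a returned index to get the column position in the original array, since column 0 holds the channel value.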

khundman • Feb 27 '22 14:02

Hello Kyle,

Thank you for your reply. I converted the npy files to CSV and used Excel to inspect the raw data. Here is an example of what I found:

In the training dataset: there are two major categories (1 and -1) in the first column of channel E-5.

In the test dataset: there are three categories (1, 0, and -1) in the first column of channel E-5, and category 1 is a point anomaly.

Therefore, the training data and the test data don't seem to come from the same dataset. It would be great if you could clarify and help me understand.
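For anyone who wants to reproduce this check without going through CSV and Excel, the comparison can be sketched directly in NumPy. This is only an illustration: the helper names are hypothetical, and the file paths in the comment are assumptions about where the npy files live, so adjust them to your checkout.

```python
import numpy as np


def first_column_values(arr):
    """Map each distinct value in column 0 to the number of rows it appears in."""
    values, counts = np.unique(arr[:, 0], return_counts=True)
    return {float(v): int(c) for v, c in zip(values, counts)}


def values_only_in_test(train, test):
    """Column-0 values that occur in test but never in train."""
    return sorted(set(first_column_values(test)) - set(first_column_values(train)))


if __name__ == "__main__":
    # Synthetic stand-ins for the real arrays. With the actual data this would be
    # something like np.load("data/train/E-5.npy") and np.load("data/test/E-5.npy")
    # (paths assumed, not confirmed in this thread).
    train = np.array([[1.0, 0.0], [-1.0, 1.0], [1.0, 0.0]])
    test = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
    print(first_column_values(train))
    print(first_column_values(test))
    print(values_only_in_test(train, test))
```

Comparing the two dictionaries shows which values appear in the first column of each split and how often.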

Thank you very much.

pengyuan0106 • Feb 27 '22 18:02