DeepLog
Procedure for DeepLog
Hi,
I want to make sure I have the procedure for implementing DeepLog correct. Here's what I'm thinking. Given train and test log data, do the following:
1. Run Spell on the train and test log data to get log keys and features.
2. Sort the outputs from Spell into different sessions or blocks, for both train and test data.
3. Take only the log keys and write them to a file, where each row represents one session or block of log outputs. Do this for both train and test data.
4. Take the sessions from the train data that do not have errors and train DeepLog on them.
5. Run the test data through the trained model to make predictions.
Can someone confirm whether this thinking is correct?
Almost correct. You should split the data into train and test only after step 4. Before step 4, you have to run everything on the whole dataset.
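One way to read this advice is sketched below: parse and group the whole dataset first, then split the sessions and keep only the error-free ones for training. The function name `split_sessions`, the `labels` mapping, the 80/20 ratio, and the toy data are all assumptions made for illustration; none of them come from the DeepLog code.

```python
import random


def split_sessions(sessions, labels, train_ratio=0.8, seed=0):
    """Split already-parsed sessions into train and test sets.

    sessions: dict of session ID -> list of log keys, built from the whole dataset
    labels:   dict of session ID -> True if the session contains errors (assumed input)
    """
    ids = sorted(sessions)
    random.Random(seed).shuffle(ids)  # seeded shuffle so the split is reproducible
    cut = int(len(ids) * train_ratio)
    train_ids, test_ids = ids[:cut], ids[cut:]
    # Only sessions without errors are kept for training.
    train = {i: sessions[i] for i in train_ids if not labels[i]}
    test = {i: sessions[i] for i in test_ids}
    return train, test


# Toy data, purely illustrative.
sessions = {
    "s1": ["k1", "k2"],
    "s2": ["k1", "k3"],
    "s3": ["k1", "k2"],
    "s4": ["k2", "k2"],
    "s5": ["k1", "k2"],
}
labels = {"s1": False, "s2": True, "s3": False, "s4": False, "s5": False}

train, test = split_sessions(sessions, labels)
print(len(train), "train sessions,", len(test), "test sessions")
```

In this sketch, sessions with errors that land in the training portion are simply dropped. Moving them into the test set instead would be an equally reasonable choice.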
@danielhanbitlee @amineebenamor Hi, about step 2: can you tell me how to "sort the outputs from Spell into different sessions or blocks"? For example, suppose we have the following data:
| EventId | EventTemplate | ParameterList |
|---|---|---|
| 6af214fd | Receiving block <*> src <*> <*> dest <*> 50010 | ['blk_-1608999687919862906', '/10.250.19.102:54106', '/10.250.19.102'] |
| 26ae4ce0 | BLOCK* NameSystem.allocateBlock <*> | ['mnt/hadoop/mapred/system/job_200811092030_0001/job.jar. blk_-1608999687919862906'] |
| 6af214fd | Receiving block <*> src <*> <*> dest <*> 50010 | ['blk_-1608999687919862906', '/10.250.10.6:40524', '/10.250.10.6'] |
| 6af214fd | Receiving block <*> src <*> <*> dest <*> 50010 | ['blk_7503483334202473044', '/10.251.215.16:55695', '/10.251.215.16'] |
| 6af214fd | Receiving block <*> src <*> <*> dest <*> 50010 | ['blk_7503483334202473044', '/10.250.19.102:34232', '/10.250.19.102'] |
| 6af214fd | Receiving block <*> src <*> <*> dest <*> 50010 | ['blk_-1608999687919862906', '/10.250.14.224:42420', '/10.250.14.224'] |
| dc2c74b7 | PacketResponder <*> for block <*> terminating | ['1', 'blk_-1608999687919862906'] |
| dc2c74b7 | PacketResponder <*> for block <*> terminating | ['2', 'blk_-1608999687919862906'] |
After sorting by blocks, what will the data look like? Thank you.
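A minimal sketch of one way the grouping could work, assuming each session is identified by the `blk_...` ID found in the ParameterList column. The `group_by_block` helper and the regular expression are illustrative assumptions, not part of Spell or DeepLog. The rows are the eight rows from the table above.

```python
import re
from collections import defaultdict

# Matches HDFS block IDs such as blk_-1608999687919862906 or blk_7503483334202473044.
BLOCK_ID = re.compile(r"blk_-?\d+")


def group_by_block(rows):
    """Group event IDs by block ID, keeping the original log order.

    rows: iterable of (event_id, parameter_list) pairs in log order.
    """
    sessions = defaultdict(list)
    for event_id, params in rows:
        # A block ID can sit inside a longer parameter string,
        # so search every parameter instead of comparing whole values.
        block_ids = {m for p in params for m in BLOCK_ID.findall(p)}
        for block_id in block_ids:
            sessions[block_id].append(event_id)
    return dict(sessions)


rows = [
    ("6af214fd", ["blk_-1608999687919862906", "/10.250.19.102:54106", "/10.250.19.102"]),
    ("26ae4ce0", ["mnt/hadoop/mapred/system/job_200811092030_0001/job.jar. blk_-1608999687919862906"]),
    ("6af214fd", ["blk_-1608999687919862906", "/10.250.10.6:40524", "/10.250.10.6"]),
    ("6af214fd", ["blk_7503483334202473044", "/10.251.215.16:55695", "/10.251.215.16"]),
    ("6af214fd", ["blk_7503483334202473044", "/10.250.19.102:34232", "/10.250.19.102"]),
    ("6af214fd", ["blk_-1608999687919862906", "/10.250.14.224:42420", "/10.250.14.224"]),
    ("dc2c74b7", ["1", "blk_-1608999687919862906"]),
    ("dc2c74b7", ["2", "blk_-1608999687919862906"]),
]

sessions = group_by_block(rows)
for block_id, keys in sessions.items():
    # One row per session: the block ID followed by its log keys in order.
    print(block_id, " ".join(keys))
```

Under these assumptions the eight rows collapse into two sessions: `blk_-1608999687919862906` with the key sequence `6af214fd 26ae4ce0 6af214fd 6af214fd dc2c74b7 dc2c74b7`, and `blk_7503483334202473044` with `6af214fd 6af214fd`. Writing one such sequence per line would give the file described in step 3.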