Procedure for DeepLog

Open danielhanbitlee opened this issue 6 years ago • 2 comments

Hi,

I want to make sure I have the procedure for implementing DeepLog right. Here's what I'm thinking. Given train log data and test log data, do the following:

  1. Run Spell on the train and test log data to get log keys and features.
  2. Sort the outputs from Spell into different sessions or blocks for both train and test data.
  3. Take only the log keys and put them into a file. Each row will represent one session or block of log outputs. Do this for both train and test data.
  4. Take the sessions from the train data that do not have errors and train DeepLog on them.
  5. Run the trained model on the test data to make predictions.
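To make steps 3 and 4 concrete, here is a minimal sketch of how a file with one session per row could be turned into training pairs, assuming (as in the DeepLog paper) that the model predicts the next log key from a fixed-length window of preceding keys. The names `parse_sessions` and `make_windows`, the space-separated row format, and the keys `k1`..`k5` are made up for illustration and are not taken from any particular DeepLog implementation.

```python
def parse_sessions(text):
    """Each non-empty line is one session: log keys separated by spaces."""
    return [line.split() for line in text.splitlines() if line.strip()]


def make_windows(session, window_size):
    """Return (history, next_key) pairs for one session.

    Sessions with window_size keys or fewer yield no pairs.
    """
    return [
        (tuple(session[i:i + window_size]), session[i + window_size])
        for i in range(len(session) - window_size)
    ]


# Hypothetical file content: two sessions, one per row.
sample = "k1 k2 k3 k4 k5\nk1 k3\n"
sessions = parse_sessions(sample)

# Windows never cross a session boundary, because each row is handled alone.
pairs = [pair for s in sessions for pair in make_windows(s, window_size=3)]
for history, next_key in pairs:
    print(history, "->", next_key)
```

The window size of 3 is only there to keep the example small; the right value depends on the data.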

Can someone confirm whether this thinking is correct?

danielhanbitlee, May 23 '19 21:05

Almost correct. You should split the data into train and test sets only after step 4. Before step 4, do everything on the whole dataset.
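One possible reading of this, offered as an assumption rather than an exact recipe: parse and sessionize the full dataset first so that every session uses the same log keys, then split at the session level, keeping only error-free sessions for training. The sketch below uses hypothetical names (`split_sessions`, `labels`, `train_fraction`) and assumes a per-session anomaly label is available.

```python
import random


def split_sessions(sessions, labels, train_fraction=0.5, seed=0):
    """Split already-built sessions into train and test sets.

    sessions: dict mapping session id -> list of log keys
    labels:   dict mapping session id -> True if the session is anomalous
    Train gets only normal sessions; test gets the remaining normal
    sessions plus every anomalous one.
    """
    normal = sorted(sid for sid in sessions if not labels[sid])
    anomalous = sorted(sid for sid in sessions if labels[sid])
    random.Random(seed).shuffle(normal)  # seeded, so the split is repeatable
    cut = int(len(normal) * train_fraction)
    train = {sid: sessions[sid] for sid in normal[:cut]}
    test = {sid: sessions[sid] for sid in normal[cut:] + anomalous}
    return train, test


# Hypothetical data: ten normal sessions and two anomalous ones.
all_sessions = {f"s{i}": ["k1", "k2", "k3"] for i in range(10)}
all_sessions.update({"bad1": ["k1", "k9"], "bad2": ["k7"]})
all_labels = {sid: sid.startswith("bad") for sid in all_sessions}

train, test = split_sessions(all_sessions, all_labels)
print(len(train), len(test))
```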

amineebenamor, May 27 '19 12:05

@danielhanbitlee @amineebenamor Hi, about step 2, can you tell me how to "sort the outputs from Spell into different sessions or blocks"? For example, say we have the following data:

EventId EventTemplate ParameterList
6af214fd Receiving block <*> src <*> <*> dest <*> 50010 ['blk_-1608999687919862906', '/10.250.19.102:54106', '/10.250.19.102']
26ae4ce0 BLOCK* NameSystem.allocateBlock <*> ['mnt/hadoop/mapred/system/job_200811092030_0001/job.jar. blk_-1608999687919862906']
6af214fd Receiving block <*> src <*> <*> dest <*> 50010 ['blk_-1608999687919862906', '/10.250.10.6:40524', '/10.250.10.6']
6af214fd Receiving block <*> src <*> <*> dest <*> 50010 ['blk_7503483334202473044', '/10.251.215.16:55695', '/10.251.215.16']
-- -- --
6af214fd Receiving block <*> src <*> <*> dest <*> 50010 ['blk_7503483334202473044', '/10.250.19.102:34232', '/10.250.19.102']
6af214fd Receiving block <*> src <*> <*> dest <*> 50010 ['blk_-1608999687919862906', '/10.250.14.224:42420', '/10.250.14.224']
dc2c74b7 PacketResponder <*> for block <*> terminating ['1', 'blk_-1608999687919862906']
dc2c74b7 PacketResponder <*> for block <*> terminating ['2', 'blk_-1608999687919862906']

After sorting by blocks, what will the data look like? Thank you.
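For reference, here is a rough sketch of what I imagine "sorting by blocks" to mean, so that someone can correct it if it is wrong: pull the `blk_...` id out of each row's ParameterList and collect the EventIds per block, keeping the original row order. The function name `group_by_block` and the regular expression are my own guesses, not part of Spell or DeepLog, and the `-- -- --` row is left out.

```python
import re
from collections import defaultdict

# The sample rows above, reduced to (EventId, ParameterList).
rows = [
    ("6af214fd", ["blk_-1608999687919862906", "/10.250.19.102:54106", "/10.250.19.102"]),
    ("26ae4ce0", ["mnt/hadoop/mapred/system/job_200811092030_0001/job.jar. blk_-1608999687919862906"]),
    ("6af214fd", ["blk_-1608999687919862906", "/10.250.10.6:40524", "/10.250.10.6"]),
    ("6af214fd", ["blk_7503483334202473044", "/10.251.215.16:55695", "/10.251.215.16"]),
    ("6af214fd", ["blk_7503483334202473044", "/10.250.19.102:34232", "/10.250.19.102"]),
    ("6af214fd", ["blk_-1608999687919862906", "/10.250.14.224:42420", "/10.250.14.224"]),
    ("dc2c74b7", ["1", "blk_-1608999687919862906"]),
    ("dc2c74b7", ["2", "blk_-1608999687919862906"]),
]

# Assumed block id shape: "blk_" followed by an optionally negative integer.
BLOCK_ID = re.compile(r"blk_-?\d+")


def group_by_block(rows):
    """Map each block id to the EventIds of the rows that mention it, in row order."""
    sessions = defaultdict(list)
    for event_id, params in rows:
        found = [m for p in params for m in BLOCK_ID.findall(p)]
        # dict.fromkeys drops repeats within a row while keeping order.
        for block_id in dict.fromkeys(found):
            sessions[block_id].append(event_id)
    return dict(sessions)


sessions = group_by_block(rows)
for block_id, keys in sessions.items():
    print(block_id, " ".join(keys))
```

If this is right, each printed line would be one row of the file described in step 3: two sessions here, one with six log keys and one with two.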

zhangch-fnst, Aug 15 '19 07:08