deep-loglizer

How to handle unlabelled data

Open neural-nerd-raj opened this issue 1 year ago • 2 comments

I noticed that after load_sessions is called, the train/test data has the form:
{'templates': ['BLOCK NameSystem.allocateBlock:<>',...], 'label': 0}
For unsupervised data there is no label key available.
How should I handle this? Do I need to assume '0' as the label for unsupervised data?

neural-nerd-raj, Feb 18 '24 11:02

Unsupervised learning should be achieved by setting the "--label_type" parameter to "nextlog", meaning that the ID of the next log in the window is used as a label for unsupervised learning. For supervised learning, the parameter should be set to "anomaly".
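
To make the next-log idea concrete, here is a minimal sketch of how a session without a 'label' key can still produce training targets. It is not the library's own code: the helper name build_next_log_samples, the window_size parameter, and the vocab mapping are assumptions made up for illustration.

```python
from typing import Dict, List, Tuple


def build_next_log_samples(
    templates: List[str], window_size: int, vocab: Dict[str, int]
) -> List[Tuple[List[int], int]]:
    """Slide a fixed-size window over one session.

    Each sample is (window of template IDs, ID of the log right after the window),
    so the target comes from the sequence itself and no anomaly label is needed.
    """
    # Map each template string to an integer ID, growing the vocabulary as needed.
    ids = [vocab.setdefault(t, len(vocab)) for t in templates]
    samples = []
    for i in range(len(ids) - window_size):
        window = ids[i : i + window_size]
        next_id = ids[i + window_size]
        samples.append((window, next_id))
    return samples


vocab: Dict[str, int] = {}
session = {"templates": ["A", "B", "C", "A", "B"]}  # hypothetical session, no 'label' key
samples = build_next_log_samples(session["templates"], window_size=2, vocab=vocab)
```

With this toy session, every window of two logs is paired with the log that follows it, so a missing 'label' key does not block training.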

CattusX, Mar 05 '24 06:03

They do it that way so you can measure metrics at the end, but in real-world applications you won't have labels (0 or 1). Regarding @new-cat's reply, that's right: the label will be the next key given a sequence.
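
A rough sketch of how this can play out, assuming a top-k next-log check in the style of DeepLog (an assumption here, not something confirmed in this thread): a session is flagged when any observed next log falls outside the model's predicted candidates, and the 0/1 labels are only needed afterwards to score the result. The names flag_session and toy_predictor are hypothetical, and toy_predictor stands in for a trained model.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[List[int], int]


def flag_session(
    samples: Sequence[Sample], predict_top_k: Callable[[List[int]], List[int]]
) -> int:
    """Return 1 if any observed next log is outside the predicted candidates, else 0."""
    return int(any(next_id not in predict_top_k(window) for window, next_id in samples))


def toy_predictor(window: List[int]) -> List[int]:
    # Hypothetical stand-in for a trained model: expects IDs to cycle 0 -> 1 -> 2 -> 0.
    return [(window[-1] + 1) % 3]


normal = [([0, 1], 2), ([1, 2], 0)]
odd = [([0, 1], 2), ([1, 2], 2)]  # second window is followed by an unexpected log
predictions = [flag_session(normal, toy_predictor), flag_session(odd, toy_predictor)]

# Anomaly labels are used only here, for evaluation, and only if they exist.
ground_truth = [0, 1]
tp = sum(p == 1 and y == 1 for p, y in zip(predictions, ground_truth))
fp = sum(p == 1 and y == 0 for p, y in zip(predictions, ground_truth))
fn = sum(p == 0 and y == 1 for p, y in zip(predictions, ground_truth))
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
```

Without ground_truth the flagging step still runs; only the precision and recall lines have to be dropped.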

alexjamesmx, Mar 06 '24 14:03