Infer Usefulness of Label Functions

Open danich1 opened this issue 7 years ago • 3 comments

When using the original Snorkel package, I was able to plot the generative model's weights and see which label functions received higher weights than the others. Ideally, it would be great to know how to get this kind of information from MeTaL's version of label estimation. If this isn't incorporated, would it be feasible to implement?

danich1 · Oct 18 '18 16:10

Hi @danich1, this is a great question (and very cool to hear that this info has been useful!)

Currently, you can get the vector of conditional accuracies for each label that an LF emits using LabelModel.get_conditional_probs(source=None), where source is just a synonym for labeling function, so you can pass in an LF index to get just that LF's table of accuracies. See the docstring here: https://github.com/HazyResearch/metal/blob/9402e5534e100c8c10509138df937097264b9fa1/metal/label_model/label_model.py#L224
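As a rough illustration of what passing an LF index selects, here is a NumPy-only sketch. It assumes the full table stacks one block of k+1 rows per LF (abstain first, then labels 1..k) with one column per true class; that layout is an assumption based on the linked docstring, and `lf_block` is a hypothetical helper, not part of MeTaL.

```python
import numpy as np

def lf_block(full_table, source, k):
    """Return the (k+1, k) block of rows belonging to LF `source`.

    Assumes (not verified against the library here) that rows are stacked
    per LF: row source*(k+1) is "abstain", followed by labels 1..k.
    """
    rows_per_lf = k + 1
    start = source * rows_per_lf
    return full_table[start:start + rows_per_lf]

# Placeholder table for m=3 LFs and k=2 classes; the values are just row
# counters so the slicing is easy to follow, not real probabilities.
m, k = 3, 2
full_table = np.arange(m * (k + 1) * k, dtype=float).reshape(m * (k + 1), k)

block = lf_block(full_table, source=1, k=k)
print(block.shape)  # one (k+1) x k table for the LF at index 1
```

With a fitted model, the equivalent would presumably be a single call with source set to the LF index, as described above.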

However, I'm glad you brought this up, because this should have a much nicer and more intuitive interface. Will get to this soon!

ajratner · Oct 18 '18 17:10

Ah, interesting. Another question: is the table laid out as [lf=(0,1,2), Y=(Y=1, Y=2)]? In other words, if I pass in source=0, should I get back a matrix whose first row holds the probabilities that the label function at index 0 abstains, given that the "true class" is positive or negative respectively?

danich1 · Oct 18 '18 17:10

Yup exactly!
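For anyone who wants a single number per LF to compare, similar to the old weight plots, one option is to collapse each table into a coverage and an accuracy. The sketch below assumes the layout confirmed above (rows: abstain, vote 1, vote 2; columns: Y=1, Y=2). The table values are made up for illustration, and `summarize_lf` and the uniform class balance are hypothetical choices, not something MeTaL provides.

```python
import numpy as np

# Made-up table for one LF with k=2 classes.
# Entry [l, y-1] is P(LF = l | Y = y); each column sums to 1.
table = np.array([
    [0.50, 0.60],  # LF abstains
    [0.40, 0.10],  # LF votes 1
    [0.10, 0.30],  # LF votes 2
])

def summarize_lf(table, class_balance=None):
    """Collapse a (k+1, k) conditional table into coverage and accuracy."""
    k = table.shape[1]
    if class_balance is None:
        # Assumption: uniform class prior when none is supplied.
        class_balance = np.full(k, 1.0 / k)
    class_balance = np.asarray(class_balance, dtype=float)
    # P(LF does not abstain) = sum_y (1 - P(LF=0 | Y=y)) * P(Y=y)
    coverage = float(np.sum((1.0 - table[0, :]) * class_balance))
    # P(LF votes the true class) = sum_y P(LF=y | Y=y) * P(Y=y)
    correct = float(np.sum(np.diag(table[1:, :]) * class_balance))
    # Accuracy conditioned on the LF having voted at all.
    accuracy = correct / coverage if coverage > 0 else float("nan")
    return {"coverage": coverage, "accuracy": accuracy}

summary = summarize_lf(table)
print(summary)
```

Sorting LFs by the resulting accuracy (or plotting it next to coverage) gives a rough ranking, with the caveat that it depends on the class balance you plug in.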

ajratner · Oct 18 '18 17:10