
How to extract probabilities of models like Roberta?

Open franz101 opened this issue 1 year ago • 1 comment

For most NER models I was able to extract the prediction probabilities (ConfidenceScores). I noticed that Roberta and Albert are missing the setIncludeAllConfidenceScores function in johnsnowlabs > 5. Is there still a way to extract the ConfidenceScores? Maybe I missed it in the documentation.

How to reproduce:

import sparknlp
from johnsnowlabs.nlp import PretrainedPipeline

spark = sparknlp.start()

pipeline = PretrainedPipeline("albert_base_token_classifier_conll03_pipeline", lang = "en")

pipeline.model.setIncludeAllConfidenceScores(True)

franz101 avatar Sep 21 '23 13:09 franz101

@franz101 the method setIncludeAllConfidenceScores is not available on every annotator. In particular, the XYZ-For-Token-Classification based annotators do not support this feature yet. Models based on the NerDL, MedicalNerDL, FinanceNerDL, or LegalNerDL architectures currently support setIncludeAllConfidenceScores.
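To see which stages of a loaded pipeline expose the setter before calling it, a small check along these lines may help. This is only a sketch: `find_stages_with_method` is a hypothetical helper, not part of johnsnowlabs or Spark NLP, and the two stand-in classes exist only so the snippet runs without a Spark session.

```python
from typing import Any, Iterable, List, Tuple


def find_stages_with_method(stages: Iterable[Any], method_name: str) -> List[Tuple[int, str]]:
    """Return (index, class name) for every stage that has a callable `method_name`."""
    return [
        (i, type(stage).__name__)
        for i, stage in enumerate(stages)
        if callable(getattr(stage, method_name, None))
    ]


# Hypothetical stand-ins for real annotators, used only for this demo.
class FakeNerDL:
    def setIncludeAllConfidenceScores(self, value):
        self.include_all = value
        return self


class FakeTokenClassifier:
    pass  # no setIncludeAllConfidenceScores, like the token classifiers above


demo_stages = [FakeTokenClassifier(), FakeNerDL()]
supported = find_stages_with_method(demo_stages, "setIncludeAllConfidenceScores")
print(supported)  # [(1, 'FakeNerDL')]
```

With a real pipeline, the assumption is that `pipeline.model` is a Spark ML `PipelineModel`, so its `stages` list could be passed in place of `demo_stages`. An empty result would mean no stage in that pipeline offers the setter on the Python side.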

I will talk with @maziyarpanahi to get it on the roadmap or find workarounds.

C-K-Loan avatar Sep 26 '23 10:09 C-K-Loan