Unable to know the confidence score of our generated text
Currently I am working on TrOCR. Model used: `VisionEncoderDecoderModel`.

Code:

```python
from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-printed")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-printed")

image = Image.open("/content/drive/MyDrive/vaibhav/AADHAAR_DATA/testdata/" + df.loc[i, "key"]).convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)
print(generated_text)  # returns the text only
```

I want a confidence score along with my generated text, which I am unable to get.
Hey @vaibhavkansallumiq, I'm also facing the same issue. Any findings or updates would help a lot, thanks.
There's a parameter called `output_scores`, but I don't know how to use it.
Hey, thanks for replying. Which function is it from?
Did you find anything useful?
`output_scores` is handled in `transformers/generation_utils.py`, but it is not clear what the returned scores actually mean.
```python
def infer(self, image):
    pixel_values = self.processor(image, return_tensors="pt").pixel_values
    pred_ids = self.model.generate(
        pixel_values,
        use_cache=True,
        output_scores=True,
        return_dict_in_generate=True,
    )
    print(pred_ids["scores"])  # tuple of logit tensors, one per generated step
    pred_ids = pred_ids["sequences"]
    preds = self.processor.batch_decode(pred_ids, skip_special_tokens=True)
    return preds
```
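One way to turn those raw `scores` into a single confidence number is to softmax each step's logits and combine the probabilities of the tokens that were actually generated. A minimal sketch, assuming greedy decoding (no beam search or length penalties); the helper name `sequence_confidence` is my own, not part of the Transformers API:

```python
import torch


def sequence_confidence(scores, sequences, start_len=1):
    """Combine generate() outputs into a per-sequence confidence.

    scores:    tuple of (batch, vocab) logit tensors, one per generated step
               (the `scores` field when return_dict_in_generate=True)
    sequences: (batch, seq_len) generated token ids, including the
               decoder start token, so we skip the first `start_len` ids
    Returns the product of the chosen tokens' softmax probabilities.
    """
    probs = torch.stack(scores, dim=1).softmax(-1)            # (batch, steps, vocab)
    gen_ids = sequences[:, start_len:start_len + probs.shape[1]]
    token_probs = probs.gather(2, gen_ids.unsqueeze(-1)).squeeze(-1)
    return token_probs.prod(dim=-1)                           # (batch,)
```

Multiplying per-token probabilities penalizes longer sequences; taking `token_probs.mean(-1)` or a geometric mean instead gives a length-normalized score if that is a problem for your data.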
I've posted a (partial) answer to the duplicate of this question for the HuggingFace TrOCR model: https://github.com/microsoft/unilm/issues/955#issuecomment-1728546355