semantic-text-similarity

Interpreting output

Open Chandrak1907 opened this issue 4 years ago • 2 comments

How should I interpret the output? I am confused by the continuous-valued output.

    from semantic_text_similarity.models import WebBertSimilarity
    from semantic_text_similarity.models import ClinicalBertSimilarity

    web_model = WebBertSimilarity(device='cpu', batch_size=10) #defaults to GPU prediction
    clinical_model = ClinicalBertSimilarity(device='cuda', batch_size=10) #defaults to GPU prediction
    web_model = WebBertSimilarity(device='cuda', batch_size=10) #defaults to GPU prediction

    web_model.predict([("She won an olympic gold medal","The women is an olympic champion")])
    array([3.0079894], dtype=float32)

    web_model.predict([("I am King","I am king")])
    array([4.5483], dtype=float32)

    web_model.predict([("I am King","I am not king")])
    array([3.6953335], dtype=float32)

    web_model.predict([("I am King","I am queen")])
    array([4.31337], dtype=float32)

Chandrak1907 • Jan 08 '20 21:01

The [0, 5] interval is the standard annotation range for STS tasks, with 0 representing complete dissimilarity and 5 representing total similarity.

AndriyMulyar • Jan 08 '20 21:01
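
For anyone who wants something more readable than the raw float, here is a minimal sketch of one way to read scores on the [0, 5] scale mentioned in the reply above. The helper names `normalize` and `describe` are hypothetical and not part of semantic_text_similarity. The label wording is paraphrased from the SemEval STS annotation guidelines as commonly cited, so treat it as an assumption and check it against the dataset documentation. Clipping is a defensive choice, on the assumption that a continuous prediction could land slightly outside the range.

```python
# Hypothetical helpers for reading raw STS scores; not part of the library.

# Paraphrased STS annotation guideline labels (assumption), keyed by integer score.
STS_LABELS = {
    0: "unrelated topics",
    1: "not equivalent, same topic",
    2: "not equivalent, share some details",
    3: "roughly equivalent, important information differs",
    4: "mostly equivalent, unimportant details differ",
    5: "completely equivalent",
}


def normalize(score, low=0.0, high=5.0):
    """Clip a raw score to [low, high] and rescale it to [0, 1]."""
    clipped = min(max(float(score), low), high)
    return (clipped - low) / (high - low)


def describe(score):
    """Return the label of the nearest integer point on the 0-5 scale."""
    clipped = min(max(float(score), 0.0), 5.0)
    return STS_LABELS[int(round(clipped))]


if __name__ == "__main__":
    # Scores copied from the predictions posted earlier in this thread.
    for raw in (3.0079894, 4.5483, 3.6953335, 4.31337):
        print(f"{raw:.2f} -> {normalize(raw):.2f} ({describe(raw)})")
```

Rounding to the nearest label is only a reading aid. The continuous value carries more information, and any cut-off for "similar enough" is something to choose on your own data.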

Could you please help us understand the scores returned for exact matches? They fall in the 4-5 range but never reach 5.

    web_model.predict([("drones","drones")])
    array([4.1196394], dtype=float32)

    web_model.predict([("I love to draw graphs","I love to draw graphs")])
    array([4.710452], dtype=float32)

    web_model.predict([("data quality","data quality")])
    array([4.539741], dtype=float32)

    web_model.predict([("motorcycle","motorcycle")])
    array([4.4404535], dtype=float32)

magneticnorth • Feb 04 '21 17:02