
Asking about the return_scores during generation

Open freyaya123 opened this issue 3 months ago • 7 comments

Hi, I'm new to CTranslate2, and I'm confused about the scores returned by the generator.generate_batch() function. How do they correspond to the scores returned by the Hugging Face generate() function?

For example,

>>> text
'Question:\nNatalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?\nAnswer reasoning:\ndef solution():    """Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?"""\n'

in hf generation:

>>> model = AutoModelForCausalLM.from_pretrained("xxx", return_dict_in_generate=True)
>>> tokenizer = AutoTokenizer.from_pretrained("xxx")
>>> input_ids = tokenizer(text, return_tensors="pt").input_ids
>>> input_ids.shape
torch.Size([1, 100])
>>> stop = [13, tokenizer.eos_token_id]
>>> outputs = model.generate(input_ids, do_sample=True, num_return_sequences=3, output_scores=True, max_length=200, eos_token_id=stop, top_k=50, top_p=1.0, temperature=1.0, pad_token_id=tokenizer.pad_token_id)
>>> outputs.sequences.shape
torch.Size([3, 111])
>>> len(outputs.scores)
11
>>> outputs.scores[0].shape
torch.Size([3, 32016])
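If I understand correctly, each tensor in outputs.scores holds the logits for one generation step, so their count should equal the number of newly generated tokens. A quick sanity check on the shapes above:

```python
# Shapes from the session above: outputs.sequences is [3, 111] and the
# prompt is 100 tokens, so 11 tokens were generated per sequence --
# one entry in outputs.scores per generated token.
prompt_len = 100   # input_ids.shape[1]
total_len = 111    # outputs.sequences.shape[1]
generated_steps = total_len - prompt_len
print(generated_steps)  # 11, matching len(outputs.scores)
```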

But if I use ctranslate2, for example:

>>> prompt_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(text))
>>> step_results = ct2_generator.generate_batch([prompt_tokens], return_scores=True, max_length=100, num_hypotheses=3, sampling_topk=50, sampling_topp=1.0, sampling_temperature=1.0, include_prompt_in_result=False, end_token=stop)
>>> step_results
[GenerationResult(sequences=[['▁▁▁', '▁cli', 'ps', '_', 's', 'old', '_', 'ap', 'ril', '▁=', '▁', '4', '8'], ['▁▁▁', '▁cli', 'ps', '_', 's', 'old', '_', 'ap', 'ril', '▁=', '▁', '4', '8'], ['▁▁▁', '▁cli', 'ps', '_', 'ap', 'ril', '▁=', '▁', '4', '8']], sequences_ids=[[1678, 9335, 567, 29918, 29879, 1025, 29918, 481, 4115, 353, 29871, 29946, 29947], [1678, 9335, 567, 29918, 29879, 1025, 29918, 481, 4115, 353, 29871, 29946, 29947], [1678, 9335, 567, 29918, 481, 4115, 353, 29871, 29946, 29947]], scores=[-0.0640094131231308, -0.0640094131231308, -0.09266181290149689])]

This gives me a list of length 3 for step_results[0].scores, one score per hypothesis.

And I also noticed that there is another function in hf:

>>> transition_scores = model.compute_transition_scores(outputs.sequences,outputs.scores,normalize_logits=True) 
>>> transition_scores
tensor([[-1.0014e-05, -2.0167e-01, -3.2067e-05, -5.0068e-06, -6.9655e-02,
         -4.8401e+00, -1.8686e-02, -3.4571e-06, -1.5497e-06, -3.3379e-06,
         -6.1989e-06],
        [-1.0014e-05, -2.0167e-01, -3.2067e-05, -5.0068e-06, -6.9655e-02,
         -7.9593e-03, -6.5205e-05, -3.5763e-06, -1.6689e-06, -3.2186e-06,
         -5.3644e-06],
        [-1.0014e-05, -2.0167e-01, -3.2067e-05, -5.0068e-06, -6.9655e-02,
         -7.9593e-03, -6.5205e-05, -3.5763e-06, -1.6689e-06, -3.2186e-06,
         -5.3644e-06]])  # shape: [3, 11]

These values are quite different from the scores in step_results.
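My guess from the docs is that the CTranslate2 score is a single length-normalized (average) token log-probability per sequence, while compute_transition_scores returns one log-prob per generated token, so the comparable quantity would be the mean over each row. A minimal sketch of that reduction (plain Python with made-up numbers, not the actual model outputs):

```python
# Hypothetical per-token log-probs for one generated sequence, standing in
# for one row of transition_scores.
token_logprobs = [-0.1, -0.2, -0.3]

# Cumulative log-prob of the whole sequence...
cumulative = sum(token_logprobs)

# ...and the length-normalized (mean) log-prob, which is what I suspect
# CTranslate2 reports as the sequence score in step_results[0].scores.
mean_logprob = cumulative / len(token_logprobs)
print(cumulative, mean_logprob)  # roughly -0.6 and -0.2
```

If that guess is right, summing each row of transition_scores and dividing by the generated length should be comparable to the CTranslate2 scores (for the same sampled sequence).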

So I have two questions here:

  1. How do outputs.scores, transition_scores, and step_results[0].scores correspond to one another?
  2. A smaller question: I think I have set the same parameters for HF and CTranslate2, but the generations clearly differ. For example, when the stop token id is 13 (\n), the HF output contains the \n but the CTranslate2 output doesn't. How can I include the stop token in the output?
# ctranslate2
>>> step_results[0].sequences_ids
[[1678, 9335, 567, 29918, 29879, 1025, 29918, 481, 4115, 353, 29871, 29946, 29947], [1678, 9335, 567, 29918, 29879, 1025, 29918, 481, 4115, 353, 29871, 29946, 29947], [1678, 9335, 567, 29918, 481, 4115, 353, 29871, 29946, 29947]]

#hf
>>> outputs.sequences[:, len(input_ids[0]):]
tensor([[ 1678,  9335,   567, 29918,   481, 29878,   353, 29871, 29946, 29947,
            13],
        [ 1678,  9335,   567, 29918,   481,  4115,   353, 29871, 29946, 29947,
            13],
        [ 1678,  9335,   567, 29918,   481,  4115,   353, 29871, 29946, 29947,
            13]])

Why is that? Are there parameters I'm not aware of?
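For now, my workaround is to append the end token back myself after generation, which only makes sense if the sequence actually stopped on the end token rather than hitting max_length. A sketch using the ids from above:

```python
# step_results[0].sequences_ids from the CTranslate2 run above, where the
# newline end token (id 13) appears to have been stripped by the library.
sequences_ids = [
    [1678, 9335, 567, 29918, 29879, 1025, 29918, 481, 4115, 353, 29871, 29946, 29947],
    [1678, 9335, 567, 29918, 29879, 1025, 29918, 481, 4115, 353, 29871, 29946, 29947],
    [1678, 9335, 567, 29918, 481, 4115, 353, 29871, 29946, 29947],
]
newline_id = 13

# Append the stop token back; only valid when generation ended on the stop
# token and not because max_length was reached.
with_stop = [ids + [newline_id] for ids in sequences_ids]
print(with_stop[0][-1])  # 13, matching the HF sequences
```

But I'd prefer a built-in option over this kind of post-processing, if one exists.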

freyaya123 · Apr 10 '24