NO LOGITS RETURNS AFTER GENERATE

Open lance-xuqihang opened this issue 1 year ago • 8 comments

After calling ctranslate2.models.Whisper.generate, the result does not include "logits". Version: 4.4.0.

lance-xuqihang avatar Sep 11 '24 09:09 lance-xuqihang

Did you set return_logits_vocab=True?

minhthuc2502 avatar Sep 12 '24 09:09 minhthuc2502

Yes, but I didn't find logits in WhisperGenerationResult.

lance-xuqihang avatar Sep 12 '24 14:09 lance-xuqihang

Can you send me the code? I will test it.

minhthuc2502 avatar Sep 13 '24 13:09 minhthuc2502

The code is modified from faster-whisper:

    result = self.model.generate(
        encoder_output,
        [prompt],
        length_penalty=options.length_penalty,
        repetition_penalty=options.repetition_penalty,
        no_repeat_ngram_size=options.no_repeat_ngram_size,
        max_length=max_length,
        return_logits_vocab=True,
        return_scores=True,
        return_no_speech_prob=True,
        suppress_blank=options.suppress_blank,
        suppress_tokens=options.suppress_tokens,
        max_initial_timestamp_index=max_initial_timestamp_index,
        **kwargs,
    )

I have also printed the attributes of the WhisperGenerationResult class. There seems to be no 'logits' in this class:

    >>> dir(ctranslate2.models.WhisperGenerationResult)
    ['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__',
     '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__',
     '__init__', '__init_subclass__', '__le__', '__lt__', '__module__',
     '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
     '__setattr__', '__sizeof__', '__str__', '__subclasshook__',
     'no_speech_prob', 'scores', 'sequences', 'sequences_ids']
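The same check can be written as a small helper instead of reading the dir() output by eye. This is a minimal sketch: missing_fields and FakeResult are hypothetical names introduced here for illustration, and FakeResult only mirrors the public attributes listed above so the snippet runs without ctranslate2 installed.

```python
def missing_fields(result_cls, expected):
    """Return the names in `expected` that `result_cls` does not expose."""
    return [name for name in expected if not hasattr(result_cls, name)]


class FakeResult:
    # Hypothetical stand-in mirroring the public attributes listed by dir().
    no_speech_prob = None
    scores = None
    sequences = None
    sequences_ids = None


# Report which of the expected result fields are absent from the class.
print(missing_fields(FakeResult, ["sequences", "scores", "logits"]))
```

With the library installed, passing ctranslate2.models.WhisperGenerationResult in place of FakeResult, and printing ctranslate2.__version__ next to the result, should show whether a given release exposes the field.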

lance-xuqihang avatar Sep 14 '24 03:09 lance-xuqihang

Seems to be missing from the Python wrapper https://github.com/OpenNMT/CTranslate2/blob/v4.4.0/python/cpp/whisper.cc#L125

bogdanteleaga avatar Sep 20 '24 09:09 bogdanteleaga

In version v4.5.0, logits is present in WhisperGenerationResult, but it contains an empty value: result.logits -> [[ [cpu:0 float32 storage viewed as ]]]

I'm wondering if this is expected or if it's possible to retrieve meaningful values for logits. Other transcription-related aspects are working fine.

Thank you.

discojune avatar Oct 30 '24 03:10 discojune

As discojune said above, I get [[ [cpu:0 float32 storage viewed as ] ]] from results.logits with return_logits_vocab set to True as a parameter. The API documentation says we should be able to read StorageView objects by converting them to NumPy arrays, but when I try to do this I get a TypeError: "float() argument must be a string or a real number, not 'ctranslate2._ext.StorageView'". I think there may be an issue with the StorageView objects in WhisperGenerationResult: the API documentation shows StorageView objects as having set dimensions, while the ones returned here are empty.
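One way to narrow this down is to inspect each leaf of the nested logits list on its own rather than converting the whole structure at once. The sketch below is only an illustration under stated assumptions: it assumes each leaf exposes a shape attribute (as the StorageView API documentation describes), summarize_leaves is a hypothetical helper, and FakeStorageView with its shapes is a made-up stand-in so the snippet runs without ctranslate2 installed.

```python
def summarize_leaves(nested, path=()):
    """Yield (index_path, type_name, shape) for every non-list leaf."""
    if isinstance(nested, (list, tuple)):
        for i, item in enumerate(nested):
            yield from summarize_leaves(item, path + (i,))
    else:
        # Assumption: leaves expose `shape`; fall back to None if they do not.
        shape = getattr(nested, "shape", None)
        yield path, type(nested).__name__, None if shape is None else list(shape)


class FakeStorageView:
    # Hypothetical stand-in for a StorageView; only the shape is modelled.
    def __init__(self, shape):
        self.shape = shape


# Made-up data: one leaf with no dimensions, one with illustrative dimensions.
logits = [[FakeStorageView([])], [FakeStorageView([2, 5])]]
for path, kind, shape in summarize_leaves(logits):
    print(path, kind, shape)
```

If every leaf of the real results.logits reports an empty shape, that would suggest the library is returning no data rather than the conversion being wrong. If the shapes are non-empty, converting each leaf separately (for example with numpy.asarray on a CPU StorageView) may be worth trying instead of converting the outer list.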

sammygrey avatar Dec 05 '24 07:12 sammygrey

The latest release includes the addition of the missing field in the WhisperGenerationResult object, but I'm always getting an empty array in the logits, leading to the "argument must be a string or a real number" error mentioned above. Is anyone able to get the actual logits from the Whisper.generate function?

miguelamavel avatar Apr 09 '25 13:04 miguelamavel