TF timestamps whisper + update generate support
What does this PR do?
This PR updates the way we generate in TF and Flax to fix the breaking changes that we had. It also adds support for timestamps in TF.
Follows #21965
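For context on what "timestamps" means here: Whisper encodes timestamps as special tokens at or above a `timestamp_begin` id, each worth `time_precision` (0.02 s) increments, and the tokenizer renders them as `<|seconds|>` markers. A minimal pure-Python sketch of that decoding idea, with made-up token ids and a hypothetical `text_lookup` table (the real logic lives in `WhisperTokenizer._decode_with_timestamps`):

```python
# Sketch of Whisper-style timestamp decoding; token ids below are hypothetical.
timestamp_begin = 50363   # assumed id of the first timestamp token, <|0.00|>
time_precision = 0.02     # each timestamp token advances 20 ms

def decode_with_timestamps(token_ids, text_lookup):
    """Render timestamp tokens as <|seconds|> markers, pass text tokens through."""
    parts = []
    for token in token_ids:
        if token >= timestamp_begin:
            parts.append(f"<|{(token - timestamp_begin) * time_precision:.2f}|>")
        else:
            parts.append(text_lookup[token])
    return "".join(parts)

print(decode_with_timestamps(
    [50363, 1, 2, 50438],
    {1: " hello", 2: " world"},
))  # <|0.00|> hello world<|1.50|>
```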
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Awesome thanks for the review 🤗
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
lmk when you want to pick this up again :P Meanwhile, shall we add the WIP label, so that the bot doesn't ping us?
yes! Hahah sorry, maybe next week or 2 weeks from now!
Okay! Thanks to @gante's recommendations, the xla generation works perfectly! The slow timestamp processing test also passes 🥳
Thanks for your review, will address all of this
@ArthurZucker I was testing whether I get timestamps from the TF model with your tf-timestamps-whisper branch on Colab, but I see this:
```
/content/transformers/src/transformers/models/whisper/tokenization_whisper.py in decode(self, token_ids, skip_special_tokens, clean_up_tokenization_spaces, output_offsets, time_precision, decode_with_timestamps, **kwargs)
    593 )
    594 if decode_with_timestamps:
--> 595     text = self._decode_with_timestamps(token_ids, time_precision=time_precision)
    596 # retrieve offsets
    597 if output_offsets:

/content/transformers/src/transformers/models/whisper/tokenization_whisper.py in _decode_with_timestamps(self, token_ids, time_precision)
    501 for token in token_ids:
    502     if token >= timestamp_begin:
--> 503         timestamp = f"<|{(token - timestamp_begin) * time_precision:.2f}|>"
    504         outputs.append(timestamp)
    505         outputs.append([])

/usr/local/lib/python3.10/dist-packages/tensorflow/python/util/traceback_utils.py in error_handler(*args, **kwargs)
    151 except Exception as e:
    152     filtered_tb = _process_traceback_frames(e.__traceback__)
--> 153     raise e.with_traceback(filtered_tb) from None
    154 finally:
    155     del filtered_tb

/usr/local/lib/python3.10/dist-packages/tensorflow/python/ops/gen_math_ops.py in mul(x, y, name)
   6574 if tld.is_eager:
   6575     try:
-> 6576         _result = pywrap_tfe.TFE_Py_FastPathExecute(
   6577             _ctx, "Mul", name, x, y)
   6578         return _result

TypeError: Cannot convert 0.02 to EagerTensor of dtype int32
```
Hey! That’s probably because I haven’t pulled from main for a while and we changed the whisper tokenizer. As you can see, the decoding process is the one failing here
@ArthurZucker Thanks for the response. I got the issue resolved with

```python
timestamp = f"<|{float(token - timestamp_begin) * time_precision:.2f}|>"
```

i.e. changing `token - timestamp_begin` to `float(token - timestamp_begin)`.
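The dtype clash behind that traceback can be reproduced in a few lines. A minimal sketch, assuming TensorFlow is installed and using made-up token values (Whisper's real `timestamp_begin` depends on the tokenizer):

```python
import tensorflow as tf

timestamp_begin = 50363  # hypothetical id of the <|0.00|> token
time_precision = 0.02    # seconds per timestamp increment

# TF generate returns int32 tensors, not Python ints
token = tf.constant(50413, dtype=tf.int32)

# Multiplying an int32 EagerTensor by a Python float raises the TypeError above
try:
    (token - timestamp_begin) * time_precision
except TypeError as e:
    print(e)  # Cannot convert 0.02 to EagerTensor of dtype int32

# Casting the difference to a Python float first sidesteps the dtype mismatch
timestamp = f"<|{float(token - timestamp_begin) * time_precision:.2f}|>"
print(timestamp)  # <|1.00|>
```

The cast works because `float()` on a scalar EagerTensor yields a plain Python float, so the multiplication happens in Python rather than in TF's eager `Mul` op.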