Kamil Akesbi
Hi @Hubert-Bonisseur, Thanks for sharing this issue! - Remark 1 was solved in PR #30325. - Regarding Remark 2: Long-form generation indeed requires timestamps to chunk the audio so this...
Hi @sproocht, Thanks for sharing this error! It will be solved with PR #29688.
I think this PR is ready to be merged! cc @amyeroberts @gante if you want to have a look ;)
Hi @mizoru, Thanks for iterating on this! Could you please open an issue with a minimal reproducer of the error you get before making these changes?
Hi @systemdevart, Thank you for this question! Here, `stride_left` indicates the overlap between the current and left chunk when already considering that `stride_right` samples are not in the left chunk...
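To make the role of the two strides concrete, here is a minimal sketch of stride-based chunking in plain Python. It is an illustration under assumptions, not the library's actual code: the function name `chunk_with_strides` and the toy sample list are hypothetical, and the only idea taken from the comment above is that each chunk overlaps its neighbours by `stride_left` samples on the left and `stride_right` samples on the right.

```python
def chunk_with_strides(samples, chunk_len, stride_left, stride_right):
    """Yield (chunk, (left, right)) pairs over a sequence of samples.

    Hypothetical helper for illustration only. `left` and `right` are the
    number of overlapping samples at each end of the chunk that a caller
    would discard when stitching the chunks back together.
    """
    # Consecutive chunks start `step` samples apart, so they overlap by
    # stride_left + stride_right samples.
    step = chunk_len - stride_left - stride_right
    if step <= 0:
        raise ValueError("chunk_len must exceed stride_left + stride_right")

    n = len(samples)
    for start in range(0, n, step):
        end = start + chunk_len
        chunk = samples[start:end]
        is_last = end >= n
        # The first chunk has no left neighbour, the last one no right neighbour.
        left = 0 if start == 0 else stride_left
        right = 0 if is_last else stride_right
        if len(chunk) > left:
            yield chunk, (left, right)
        if is_last:
            break


# Toy input: ten "samples" standing in for an audio array.
samples = list(range(10))
pieces = list(chunk_with_strides(samples, chunk_len=6, stride_left=1, stride_right=1))

# Dropping the strided ends of every chunk recovers the input without duplicates.
kept = []
for chunk, (left, right) in pieces:
    kept.extend(chunk[left:len(chunk) - right])
```

With these toy values the two chunks share two samples, and trimming one from the right of the first chunk and one from the left of the second stitches the original sequence back together.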
It was indeed solved with #30325, I'm closing for now!
Hi @udeepam, thanks for this issue and the clear reproducer! On the latest version of the main branch (`transformers 4.40.0.dev0`), I get the same results with and without `generation_config`,...
It will be solved by PR #31296 :)
Hi @zxl777, Thanks for this issue! The provided audio is longer than 30 seconds. In this case, you can choose to: - Use batched inference by chunking the input audio...
Hi @hanif-rt, this should be solved with PR #31572 :)