lhotse
Function to merge short sentences into a long sentence
Hello, is there a function in lhotse that splices multiple sentences in a CutSet that share the same speaker (or have similar ids) into one long sentence? I have already extracted the jsonl.gz manifests, and I want to merge the sentences of speakers with similar ids into one long sentence.
You can do that by composing some groupby's and append, e.g.
from lhotse.cut.set import CutSet, append_cuts
cuts = CutSet.from_file(...)
speakers = cuts.speakers
# Note: there are more efficient ways of doing this.
long_cuts_per_speaker = {
    s: append_cuts(cuts.filter(lambda c: c.supervisions[0].speaker == s))
    for s in speakers
}
Of course you can also add silence in between with cut.pad(cut.duration + pause_duration).append(another_cut), mix noises in with .mix(noise_cut), etc.
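As a rough sketch of one of the "more efficient ways" mentioned above: instead of filtering the whole CutSet once per speaker, you can bucket the cuts in a single pass and then append each bucket. The helper name group_by_key is made up for this example and is not part of lhotse; it works on any iterable, so the lhotse part only appears in the usage comments.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, TypeVar

T = TypeVar("T")


def group_by_key(items: Iterable[T], key: Callable[[T], Hashable]) -> Dict[Hashable, List[T]]:
    """Bucket items by key(item) in one pass, keeping the original order inside each bucket.

    Hypothetical helper for illustration only; it is not a lhotse function.
    """
    groups: Dict[Hashable, List[T]] = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    return dict(groups)


# Assumed usage with lhotse (same imports and calls as in the snippet above):
#   from lhotse.cut.set import CutSet, append_cuts
#   cuts = CutSet.from_file(...)
#   by_speaker = group_by_key(cuts, key=lambda c: c.supervisions[0].speaker)
#   long_cuts_per_speaker = {s: append_cuts(group) for s, group in by_speaker.items()}
```

Like the original snippet, this assumes every cut has at least one supervision; cuts without supervisions would need to be filtered out first.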
Thank you very much for your reply! May I ask whether I can control the length of each merged sentence? For example, if I need to merge several short sentences into 10-20 s segments, should I use cut.duration to decide how many to merge?
(In reply to Piotr's message of 02/29/2024 23:54 on lhotse-speech/lhotse, Issue #1293.)
Yes. You can also look at the CutConcatenate transform, which now supports a max_duration argument.
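If you prefer to do the grouping by hand with cut.duration, a greedy pass is probably enough. The sketch below is only an illustration of that idea, not how CutConcatenate is implemented; chunk_by_duration and its parameters are hypothetical names. It keeps adding items to the current chunk until the next one would push the total (including an optional pause between items) over max_duration.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def chunk_by_duration(
    items: Iterable[T],
    max_duration: float,
    gap: float = 0.0,
    duration: Callable[[T], float] = lambda x: x.duration,
) -> List[List[T]]:
    """Greedily split items into consecutive chunks of at most max_duration seconds.

    Hypothetical helper, not part of lhotse. `gap` is the pause (in seconds)
    assumed between two neighbouring items in a chunk. An item that is longer
    than max_duration on its own ends up alone in its chunk.
    """
    chunks: List[List[T]] = []
    current: List[T] = []
    total = 0.0
    for item in items:
        d = duration(item)
        extra = d if not current else gap + d
        if current and total + extra > max_duration:
            # The current chunk is full: close it and start a new one with this item.
            chunks.append(current)
            current, total = [item], d
        else:
            current.append(item)
            total += extra
    if current:
        chunks.append(current)
    return chunks


# Assumed usage with lhotse, where cuts_of_one_speaker is an iterable of cuts:
#   from lhotse.cut.set import append_cuts
#   long_cuts = [append_cuts(chunk) for chunk in chunk_by_duration(cuts_of_one_speaker, max_duration=20.0)]
```

Note that this only enforces the upper bound. The last chunk of each speaker can still be shorter than 10 s, so you may want to drop it or merge it into the previous chunk, depending on your needs.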