substitution function for cast_column in the code

Open rezapiiich opened this issue 2 years ago • 0 comments

Hi, I fine-tuned the Whisper model using your code and it worked perfectly, but there is one problem. As the common_voice DatasetDict grows, my RAM fills up and the run stops. I would like to replace the following line of your code with a snippet that does exactly the same thing, but processes one audio clip (or one batch of clips) at a time so that my RAM doesn't fill up.

common_voice = common_voice.cast_column("audio", Audio(sampling_rate=16000))
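
To make the request more concrete, here is a rough sketch of the behaviour I am after, written in plain Python rather than with the `datasets` API. Everything in it is an assumption made for illustration: `resample_linear` and `iter_resampled_batches` are hypothetical helper names, the clips are assumed to be dicts with `"array"` and `"sampling_rate"` keys, and the naive linear interpolation only stands in for a proper resampler. The point is just that only one batch is held in memory at a time.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def resample_linear(samples: List[float], src_rate: int, dst_rate: int) -> List[float]:
    """Naive linear-interpolation resampler (illustration only, no anti-alias filter)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate          # position in the source signal
        lo = min(int(pos), len(samples) - 1)   # left neighbour
        hi = min(lo + 1, len(samples) - 1)     # right neighbour (clamped at the end)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def iter_resampled_batches(
    clips: Iterable[Dict], dst_rate: int = 16000, batch_size: int = 8
) -> Iterator[List[Dict]]:
    """Yield resampled clips batch by batch instead of materialising them all."""
    it = iter(clips)
    while True:
        batch = list(islice(it, batch_size))   # pull at most batch_size clips
        if not batch:
            return
        yield [
            {
                "array": resample_linear(c["array"], c["sampling_rate"], dst_rate),
                "sampling_rate": dst_rate,
            }
            for c in batch
        ]
```

Each yielded batch could then be passed on to feature extraction and dropped before the next one is read, which is the memory behaviour I would like from a replacement for the line above.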

Nov 05 '23 11:11 rezapiiich