Substitute for cast_column in the code
Hi, I tried fine-tuning the Whisper model using your code and it worked perfectly, but there is one problem. As the common_voice DatasetDict grows, my RAM fills up and the run stops. I want to replace the following line of your code with a snippet that does exactly the same thing but processes one audio clip, or one batch of clips, at a time so that my RAM doesn't fill up.
common_voice = common_voice.cast_column("audio", Audio(sampling_rate=16000))