
CoreML: Repeating parts of text instead of transcribing - more than an hour long files

Open russell-dot-js opened this issue 1 year ago • 2 comments

See #612. The error seems to occur mainly when using a CoreML model; rebuilding without CoreML resolves the issue.

russell-dot-js, Feb 09 '24 00:02

I can confirm that with CoreML it freezes immediately on large files. Rebuilding without CoreML does not completely solve the problem, though; it just occurs later with large files.

But this helped me: https://github.com/ggerganov/whisper.cpp/issues/896#issuecomment-1569586018

lucidyan, Feb 26 '24 17:02

I tried the above but still ran into some issues. As a workaround, I'm using a script to split large audio files into smaller chunks (here, 1200 seconds, i.e. 20 minutes). The script uses ffprobe to read the duration and ffmpeg to split m4a files.
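
For a sense of how many parts a given file will produce: the count is the duration divided by the chunk length, rounded up. A minimal sketch of that arithmetic follows; `count_parts` is a hypothetical helper, not part of the script, and it uses awk where the script uses bc:

```shell
# Hypothetical helper: number of chunks for a file of DURATION seconds
# split into pieces of SEGMENT seconds (ceiling division).
count_parts() {
    awk -v d="$1" -v s="$2" 'BEGIN {
        n = int(d / s)          # whole chunks
        if (n * s < d) n++      # one more chunk for the remainder
        print n
    }'
}

count_parts 3723.4 1200   # a file of roughly 62 minutes: prints 4
```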

Usage:

chmod u+x split_audio.sh
./split_audio.sh <path to your m4a file>

After transcribing the parts with whisper, I combine the transcripts into one file by appending the later parts to the first with cat: cat file2.txt file3.txt ... >> file1.txt
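
One thing to watch when merging: a shell glob such as `file*.txt` sorts `file10.txt` before `file2.txt`, so with ten or more parts the order has to be handled explicitly. A minimal sketch, assuming the transcripts share a prefix, a running number starting at 1, and a suffix (`merge_transcripts` and the file names are illustrative, not something whisper.cpp provides):

```shell
# Hypothetical helper: concatenate PREFIX1SUFFIX, PREFIX2SUFFIX, ... into
# OUTPUT in numeric order, stopping at the first missing number.
merge_transcripts() {
    local prefix="$1" suffix="$2" output="$3" i=1
    : > "$output"    # create or truncate the output file
    while [ -f "${prefix}${i}${suffix}" ]; do
        cat "${prefix}${i}${suffix}" >> "$output"
        i=$((i + 1))
    done
}

# Example (file names are assumptions):
#   merge_transcripts file .txt full_transcript.txt
```

Writing to a separate output file, rather than appending to the first part, keeps the per-part transcripts intact in case the merge needs to be redone.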

Script:

#!/bin/bash

# Function to get the duration of the audio file in seconds
get_audio_duration() {
    duration=$(ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 "$1")
    echo "$duration"
}

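# Function to split the audio file into parts of segment_duration seconds
split_audio() {
    local file_path="$1"
    local segment_duration=1200  # seconds
    local duration file_name file_ext num_parts start_time output_file i
    duration=$(get_audio_duration "$file_path")
    # Stop early if ffprobe could not report a duration
    if [[ -z "$duration" ]]; then
        echo "Could not read the duration of $file_path" >&2
        return 1
    fi
    file_name="${file_path%.*}"
    file_ext="${file_path##*.}"
    # bc defaults to scale=0, so this is integer division; round up on a remainder
    num_parts=$(echo "$duration / $segment_duration" | bc)
    if (( $(echo "$duration % $segment_duration > 0" | bc) )); then
        num_parts=$((num_parts + 1))
    fi

    for ((i=0; i<num_parts; i++)); do
        start_time=$((i * segment_duration))
        output_file="${file_name}_part$((i + 1)).${file_ext}"
        # -c copy cuts without re-encoding
        ffmpeg -i "$file_path" -ss "$start_time" -t "$segment_duration" -c copy "$output_file"
    done
}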

# Main script execution
if [[ $# -ne 1 ]]; then
    echo "Usage: $0 <path_to_m4a_file>"
    exit 1
fi

file_path="$1"

split_audio "$file_path"
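
Each part still has to be transcribed. The stock example in whisper.cpp's README expects 16 kHz 16-bit WAV input, so the m4a parts may need converting first. A rough sketch of that step follows; the binary path `./main`, the model path, and the `transcribe_part` helper are assumptions to adapt to your own build (the binary may be named differently in other versions):

```shell
# Assumed locations; override them via environment variables.
WHISPER_BIN="${WHISPER_BIN:-./main}"
WHISPER_MODEL="${WHISPER_MODEL:-models/ggml-base.en.bin}"

# Convert one part to 16 kHz mono 16-bit WAV, then transcribe it to text.
transcribe_part() {
    local part="$1"
    local wav="${part%.*}.wav"
    ffmpeg -nostdin -y -i "$part" -ar 16000 -ac 1 -c:a pcm_s16le "$wav" &&
        "$WHISPER_BIN" -m "$WHISPER_MODEL" -f "$wav" -otxt
}

# Example (file names are assumptions):
#   for f in recording_part*.m4a; do transcribe_part "$f"; done
```

The `-otxt` flag asks whisper.cpp to write a plain-text transcript for each input, which is what the cat step above then combines.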

KNWR, May 19 '24 14:05