
[Help/Suggestion] Docker compile whisper.cpp and use with go-bindings

Open Fliskr opened this issue 2 years ago • 1 comments

I want to create a multistage build with whisper.cpp and the Go bindings in my package.

Things I've tried:

  1. COPY libwhisper.a and whisper.h into the Docker container and then run go build - I get errors linking against the C++ libraries.
  2. COPY a precompiled whisper binary and run it from Go with exec.Command("whisper") - another linking issue.
  3. Compile whisper.cpp inside Docker and then use it with docker build - the same as above.

Could someone provide a gist with a Dockerfile showing a very basic working example?

NOTES: I tried the amd64/golang:1.19.6 and golang:1.19.6 Docker images as the builder and debian:11 as the second stage.

Fliskr avatar Mar 03 '23 16:03 Fliskr

Here's a Dockerfile but without go bindings:

FROM debian:11

RUN apt -q -y update && apt -q -y upgrade
RUN apt -q -y install -q -y libsdl2-dev alsa-utils
RUN apt -q -y install -q -y g++ make wget

RUN mkdir /whisper && \
  wget -q https://github.com/masterful/whisper.cpp/tarball/master -O - | \
  tar -xz -C /whisper --strip-components 1

# git alternative
# RUN apt -q -y install git && \
#   git clone --depth 1 https://github.com/ggerganov/whisper.cpp.git /whisper

WORKDIR /whisper

RUN bash ./models/download-ggml-model.sh base.en
RUN make main stream

FROM debian:11

RUN apt -q -y update && apt -q -y upgrade
RUN apt -q -y install -q -y libsdl2-dev alsa-utils

WORKDIR /root

RUN mkdir /root/models
COPY --from=0 /whisper/models/ggml-base.en.bin /root/models/ggml-base.en.bin
COPY --from=0 /whisper/main /usr/local/bin/whisper
COPY --from=0 /whisper/stream /usr/local/bin/stream

# To use stream in a container (linux hosts only):
# docker build . -t whisper
# docker run -it --device /dev/snd:/dev/snd whisper stream

See also PR #576 for a smaller Alpine build.

mikeslattery avatar Mar 07 '23 15:03 mikeslattery

@mikeslattery Thank you, that helped a lot.

Fliskr avatar Mar 20 '23 06:03 Fliskr

@mikeslattery

> Here's a Dockerfile but without go bindings:

How do I call whisper via Docker? I'm clearly doing something wrong:

user@DS:~$ cat "/volume3/docker/whisper/test.wav" | docker run -it whisper.cpp whisper -l de --output-txt --output-srt -m /root/models/ggml-large.bin -
the input device is not a TTY

geimist avatar Mar 24 '23 17:03 geimist

> How do I need to call whisper via docker?

If you replace the -it argument with -i, that error will go away.
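The reason is that -t asks Docker to allocate a terminal, which cannot work when stdin is a pipe. A minimal sketch of choosing the flag automatically follows; docker_tty_flags is a hypothetical helper name, not something Docker provides, and the commented usage line reuses the image name from the question.

```shell
#!/bin/sh
# Print the interactive flags to pass to `docker run`, depending on
# whether this shell's stdin is a terminal.
# docker_tty_flags is a hypothetical helper, not part of Docker.
docker_tty_flags() {
  if [ -t 0 ]; then
    printf '%s\n' '-it'   # stdin is a terminal: keep it open and allocate a TTY
  else
    printf '%s\n' '-i'    # stdin is a pipe or file: keep it open, no TTY
  fi
}

# Usage (not executed here; rest of the command as in the question):
#   cat test.wav | docker run $(docker_tty_flags) whisper.cpp whisper ...
```

Whether whisper itself reads audio from stdin is a separate question that this sketch does not address.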

I've only used files. I've not tried stdout/stdin. I didn't even know that would work. I extracted the binary and use it on my host, not with docker, as I don't want any additional latency. Here is my script (shortened):

#!/bin/bash

# Transcribe microphone as text to stdout

set -euo pipefail

dir='/tmp/whisper'
log="$dir/vtt.log"
recording="$dir/vtt.wav"
mkdir -p "$dir"
cd "$dir"

arecord -d 0 -r 16000 -c 1 -f S16_LE "$recording" &>> "$log" &
trap "jobs -p | xargs -r kill" ERR INT
zenity --info --text=Recording... --ok-label=Stop
jobs -p | xargs -r kill
wait

whisper -m ~/.local/share/whisper/ggml-tiny.en.bin --output-txt "$recording" "$@" &>> "$log"
cat "$recording.txt"
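
For file-based use through Docker rather than on the host, one option is to bind-mount the directory that holds the recording. The sketch below only prints the command instead of running it. It assumes the image was tagged whisper as in the Dockerfile comments above, that /data is an arbitrary mount point, and that the path contains no spaces; whisper_docker_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# Print a `docker run` command that transcribes one WAV file through a
# bind mount. Nothing is executed; the caller can inspect or eval it.
# Assumptions: image tag "whisper", mount point /data, no spaces in paths.
whisper_docker_cmd() {
  wav=$1
  dir=$(cd "$(dirname "$wav")" && pwd)   # absolute directory of the file
  base=$(basename "$wav")                # file name inside the mount
  printf 'docker run --rm -v %s:/data whisper whisper -m /root/models/ggml-base.en.bin --output-txt -f /data/%s\n' \
    "$dir" "$base"
}

# Usage:
#   whisper_docker_cmd /tmp/whisper/vtt.wav
```

Because --output-txt writes the transcript next to the input file, the .txt file should end up in the mounted host directory.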

FYI, I use a Mimic3 Docker container for the reverse, TTS. I run it as a daemon and talk to it over HTTP, for low latency.

mikeslattery avatar Mar 24 '23 21:03 mikeslattery

FWIW, here is a gist based on the above Dockerfile by @mikeslattery: reitzig/6582edd485a5d0a8b68600dab3b0861b Thanks!

reitzig avatar Mar 31 '23 12:03 reitzig

A nice combination. 👍 Is there a reason you guys are using the old fork from https://github.com/masterful/whisper.cpp?

geimist avatar Mar 31 '23 13:03 geimist

Nope, didn't even realize. C&P-🐒, it me. 😇

reitzig avatar Mar 31 '23 17:03 reitzig

Thanks to @mikeslattery for the Dockerfile. Here is a Dockerfile that also builds the Go bindings and the Go code.

FROM debian:11 as whisper_builder

RUN apt -q -y update && apt -q -y upgrade
RUN apt -q -y install -q -y libsdl2-dev alsa-utils
RUN apt -q -y install -q -y g++ make wget

RUN mkdir /whisper && \
  wget -q https://github.com/ggerganov/whisper.cpp/tarball/master -O - | \
  tar -xz -C /whisper --strip-components 1

WORKDIR /whisper
RUN bash ./models/download-ggml-model.sh base.en
RUN make main stream

WORKDIR /whisper/bindings/go
RUN make whisper

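WORKDIR /gowhisper
# Let cgo find whisper.h and the libwhisper.a built above
ENV C_INCLUDE_PATH=/whisper/
ENV LIBRARY_PATH=/whisper/
RUN apt -q -y remove -q -y golang
# The tarball unpacks to /usr/local/bin/go, so the toolchain lives in /usr/local/bin/go/bin
RUN wget https://golang.org/dl/go1.18.linux-amd64.tar.gz && tar -zxvf go1.18.linux-amd64.tar.gz -C /usr/local/bin/
ENV PATH=${PATH}:/usr/local/bin/go/bin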
COPY . .
RUN go mod tidy
RUN go build -o gowhisper .

FROM debian:11-slim
WORKDIR /root

RUN mkdir /root/models
COPY --from=whisper_builder /whisper/models/ggml-base.en.bin /root/models/ggml-base.en.bin
COPY --from=whisper_builder /whisper/main /usr/local/bin/whisper
COPY --from=whisper_builder /whisper/stream /usr/local/bin/stream
COPY --from=whisper_builder /gowhisper/gowhisper /usr/local/bin/gowhisper
RUN mkdir /models/

VOLUME ["/models/model.bin"]
CMD ["gowhisper"]
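
One thing that may be worth checking with a slim final stage like this is whether the copied binaries still find all of their shared libraries, since the linking problems from the original question tend to show up at this point. A minimal sketch using ldd follows. missing_libs is a hypothetical helper name, and the idea that stream needs an SDL2 runtime library in the final image is an assumption based on the build stage installing libsdl2-dev.

```shell
#!/bin/sh
# List shared libraries that a binary needs but the system cannot resolve.
# Prints nothing when every dependency is found (or the binary is static).
# missing_libs is a hypothetical helper name.
missing_libs() {
  ldd "$1" 2>/dev/null | grep 'not found' || true
}

# Usage inside the final image (not executed here):
#   missing_libs /usr/local/bin/stream
#   missing_libs /usr/local/bin/gowhisper
```

Any line printed names a library that would have to be installed in, or copied into, the final stage.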

vsjoe avatar Apr 16 '23 11:04 vsjoe