
Suboptimal performance in SSE-enabled CPUs that don't have AVX

Open ttsiodras opened this issue 1 year ago • 2 comments

I have a Celeron-equipped machine, which has SSE but no AVX instructions. I just wanted to let you know that processing the JFK sample...

  • Takes 73 seconds with the binary made by the default make

  • Takes 64 seconds with a bit of Makefile hacking to use openblas

    CFLAGS += -DGGML_USE_OPENBLAS -I /usr/include/x86_64-linux-gnu/openblas64-pthread/
    LDFLAGS += /usr/lib/x86_64-linux-gnu/libopenblas64.a

(which is not detected automatically, as far as I can see)

  • and takes 50 seconds if I also patch C/CXXFLAGS to force use of SSE, ignoring accuracy issues:

    CFLAGS = -I. -O3 -std=c11 -fPIC -msse -msse2 -mfpmath=sse,387 -Ofast
    CXXFLAGS = -I. -I./examples -O3 -std=c++11 -fPIC -msse -msse2 -mfpmath=sse,387 -Ofast

The resulting binary works fine, while running in 31% less time than the binary from the default build. This can probably be improved even further with SSE-specific asm, but at the very least it seems there's some low-hanging fruit here.

ttsiodras avatar Dec 10 '22 11:12 ttsiodras

...and if I comment out the BLAS stuff, and use just this change:

CFLAGS   = -I.              -O3 -std=c11   -fPIC -Ofast -march=native
CXXFLAGS = -I. -I./examples -O3 -std=c++11 -fPIC -Ofast -march=native

...SSE gets automatically enabled (due to the "native") and the time gets down to 36 seconds.
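To put the four timings quoted so far side by side, here is a minimal Python sketch. The numbers are the ones reported in this thread; the labels and the helper names `percent_saved` and `summarize` are made up for illustration and are not part of whisper.cpp.

```python
# Wall-clock times in seconds for the JFK sample, as quoted in this thread.
# The labels are informal descriptions, not official build names.
timings = {
    "default make": 73,
    "OpenBLAS": 64,
    "OpenBLAS + forced SSE flags": 50,
    "-Ofast -march=native, no BLAS": 36,
}


def percent_saved(baseline, measured):
    """Percentage of the baseline run time that was saved."""
    return 100.0 * (baseline - measured) / baseline


def summarize(timings, baseline_key="default make"):
    """Map each build label to the percentage saved versus the baseline."""
    baseline = timings[baseline_key]
    return {name: round(percent_saved(baseline, t), 1) for name, t in timings.items()}


if __name__ == "__main__":
    for name, saved in summarize(timings).items():
        print(f"{name}: {timings[name]} s, {saved}% less time than the default build")
```

Relative to the default build, that works out to roughly 12% less time with OpenBLAS, about 31.5% with OpenBLAS plus the forced SSE flags, and about 51% with `-Ofast -march=native` alone. These are single runs on one machine, so treat them as indicative only.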

IMHO you should default to this build option - I'll open a merge request.

ttsiodras avatar Dec 10 '22 11:12 ttsiodras

Merge request: https://github.com/ggerganov/whisper.cpp/pull/252

ttsiodras avatar Dec 10 '22 11:12 ttsiodras

Decided not to add the flags: https://github.com/ggerganov/whisper.cpp/pull/252#issuecomment-1355347757

ggerganov avatar Dec 16 '22 18:12 ggerganov

-Ofast made a huge difference on my crappy old CPU (which says it supports AVX but didn't for some reason).

Anyway, I understand not adding support, but maybe we should at least document this in the README so folks know they can hack their Makefile.
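For anyone unsure what their CPU actually advertises before hacking the Makefile, here is a minimal sketch that reads the `flags` line from `/proc/cpuinfo`. It assumes Linux on x86; `parse_cpu_flags` and `simd_support` are hypothetical helper names, not anything shipped with whisper.cpp. It only shows what the kernel reports, not which instructions a given build ends up using.

```python
import os


def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def simd_support(flags, wanted=("sse", "sse2", "avx", "avx2")):
    """Map each SIMD extension name to whether it appears in the flag set."""
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    path = "/proc/cpuinfo"  # assumption: Linux on x86; other systems differ
    if os.path.exists(path):
        with open(path) as f:
            print(simd_support(parse_cpu_flags(f.read())))
    else:
        print("no /proc/cpuinfo on this system")
```

A machine like the Celeron described at the top of the thread should show `sse` and `sse2` as present and `avx` as absent.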

On Sat, 17 Dec 2022, 4:14 am Georgi Gerganov, @.***> wrote:

Closed #251 https://github.com/ggerganov/whisper.cpp/issues/251 as completed.


jaybinks avatar Dec 18 '22 06:12 jaybinks