llama.cpp Build on Debian Docker

Hello, I wanted to experiment with building llama.cpp in a Linux/Debian container, but I get the following error when I run make:

"failed in call to 'always_inline' '_mm256_cvtph_ps'" (more detailed output is below.)

A. I used the bitnami/pytorch image, which is based on Debian: https://hub.docker.com/r/bitnami/pytorch

B. I downloaded the git repository into a folder named app and issued the following command:

docker run --user root -v /host/DOCKER/images/PYTORCH/app:/app/ -it --rm bitnami/pytorch /bin/bash

C. I then updated the package lists and installed build-essential with:

apt-get update && apt-get install build-essential

D. Finally, I entered the repo folder and got the following compilation error when running make:

```
make
I llama.cpp build info:
I UNAME_S: Linux
I UNAME_P: unknown
I UNAME_M: x86_64
I CFLAGS: -I. -O3 -DNDEBUG -std=c11 -fPIC -pthread -mavx -mavx2 -msse3
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -pthread
I LDFLAGS:
I CC: cc (Debian 10.2.1-6) 10.2.1 20210110
I CXX: g++ (Debian 10.2.1-6) 10.2.1 20210110

cc -I. -O3 -DNDEBUG -std=c11 -fPIC -pthread -mavx -mavx2 -msse3 -c ggml.c -o ggml.o
In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:113,
                 from ggml.c:155:
ggml.c: In function 'ggml_vec_dot_f16':
/usr/lib/gcc/x86_64-linux-gnu/10/include/f16cintrin.h:52:1: error: inlining failed in call to 'always_inline' '_mm256_cvtph_ps': target specific option mismatch
   52 | _mm256_cvtph_ps (__m128i __A)
      | ^~~~~~~~~~~~~~~
ggml.c:911:33: note: called from here
  911 | #define GGML_F32Cx8_LOAD(x)     _mm256_cvtph_ps(_mm_loadu_si128((__m128i *)(x)))
      |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml.c:921:37: note: in expansion of macro 'GGML_F32Cx8_LOAD'
  921 | #define GGML_F16_VEC_LOAD(p, i)     GGML_F32Cx8_LOAD(p)
      |                                     ^~~~~~~~~~~~~~~~
ggml.c:1274:21: note: in expansion of macro 'GGML_F16_VEC_LOAD'
 1274 |             ay[j] = GGML_F16_VEC_LOAD(y + i + j*GGML_F16_EPR, j);
      |                     ^~~~~~~~~~~~~~~~~
In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:113,
                 from ggml.c:155:
/usr/lib/gcc/x86_64-linux-gnu/10/include/f16cintrin.h:52:1: error: inlining failed in call to 'always_inline' '_mm256_cvtph_ps': target specific option mismatch
   52 | _mm256_cvtph_ps (__m128i __A)
      | ^~~~~~~~~~~~~~~
```

I am not sure what to try next, or whether I have done the steps in the right order. Thanks for this much-anticipated work!

Mar 13 '23 23:03 meltoner

_mm256_cvtph_ps is related to AVX. At a guess, your Docker container's build environment doesn't support AVX for some reason.
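One rough way to test that guess is to look at the CPU feature flags the container actually sees. The sketch below assumes a Linux environment where /proc/cpuinfo is readable. The helper names `check_flags` and `cpu_flags` are made up for illustration. `f16c` is included in the check only because the log above shows the failing intrinsic is declared in f16cintrin.h while CFLAGS lists just -mavx -mavx2 -msse3; whether that flag is the actual culprit is an assumption.

```shell
#!/bin/sh
# check_flags FLAGS_STRING FEATURE...
# Prints "<feature>: yes" or "<feature>: no" for each feature, depending on
# whether it appears as a whole word in the space-separated FLAGS_STRING.
check_flags() {
    flags=" $1 "
    shift
    for feature in "$@"; do
        case "$flags" in
            *" $feature "*) echo "$feature: yes" ;;
            *)              echo "$feature: no" ;;
        esac
    done
}

# Print the feature list of the first CPU listed in /proc/cpuinfo (Linux only).
cpu_flags() {
    grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2
}

# Only run the live check where /proc/cpuinfo exists.
if [ -r /proc/cpuinfo ]; then
    check_flags "$(cpu_flags)" avx avx2 f16c
fi
```

Running this both inside the container and on the host should show whether the two environments report the same features.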

Are you building your Docker images on a Mac/ARM64 or native x86_64? If ARM64 then this issue seems to indicate AVX emulation on ARM64 isn't supported yet. Even if it does become supported it will likely be dog slow.

Also, I don't think you need a pytorch image, but instead a minimal image that provides a gcc/g++ v10 build env (assuming you preprocess the model files outside the image). I haven't tried it, but gcc:10.2 might get you closer to what you need.
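As a sketch of what that could look like, the helper below only prints a docker command that mounts a local checkout into a gcc:10.2 container and runs make there. The function name `build_cmd`, the /app mount point, and the example path are illustrative assumptions, and paths containing spaces are not handled.

```shell
#!/bin/sh
# build_cmd CHECKOUT_DIR
# Prints (does not run) a docker command that would build the checkout
# inside a gcc:10.2 container, with the checkout mounted at /app.
build_cmd() {
    printf 'docker run --rm -v %s:/app -w /app gcc:10.2 make\n' "$1"
}

# Show the command for the current directory.
build_cmd "$PWD"
```

To actually run the build, copy the printed line into a shell on a machine with Docker installed.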

Mar 14 '23 08:03 gjmulder

Thank you! Will give it a try!

Mar 14 '23 08:03 meltoner

I just installed Docker on my AMD x86_64 system and spun up an image of gcc:10.2. I was able to get it to compile.

However, it looks like someone has just created a pull request to add Docker support here #132

Mar 14 '23 13:03 gjmulder

That's just lovely!! Thanks!

Mar 14 '23 14:03 meltoner

Hi, just to share my findings on the matter: it seems that VirtualBox is the problem, limiting the AVX instructions available to the guest. I deduce this because compilation failed in the gcc:10.2 Docker container, then failed directly on the virtual machine as well, but worked on the bare-metal server. As a result, I'll focus on VirtualBox, and if I find a solution I will post it here for the knowledge base.
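One way to make that comparison concrete is to list the CPU features the bare-metal host reports that the guest does not. This is a minimal sketch, assuming each feature list is a space-separated string such as the `flags` line of /proc/cpuinfo on Linux. The function name `missing_in_guest` and the sample feature lists are hypothetical, not real output from either machine.

```shell
#!/bin/sh
# missing_in_guest HOST_FLAGS GUEST_FLAGS
# Prints, one per line, every feature in HOST_FLAGS that does not appear
# as a whole word in GUEST_FLAGS.
missing_in_guest() {
    guest=" $2 "
    # $1 is intentionally unquoted so it splits into one word per feature.
    for feature in $1; do
        case "$guest" in
            *" $feature "*) ;;
            *) echo "$feature" ;;
        esac
    done
}

# Hypothetical example values; replace with the real lists from each machine.
host_flags="sse sse2 avx avx2 f16c fma"
guest_flags="sse sse2 avx avx2"
missing_in_guest "$host_flags" "$guest_flags"
```

The real lists can be captured on each machine with `grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2`.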

Mar 15 '23 21:03 meltoner
