115 Illegal instruction
AnythingLLM is installed on an Ubuntu server.
In the system LLM settings, it can connect to the Ollama server and retrieve the models.
But when chatting in a workspace, the Docker container exits.
1. The info shown in the browser (screenshot):
2. And the Docker logs:
"/usr/local/bin/docker-entrypoint.sh: line 7: 115 Illegal instruction (core dumped) node /app/server/index.js"
What's the problem?
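(For reference, this is how I captured the log above; the container ID placeholder is mine:)

```bash
docker ps -a                 # find the exited AnythingLLM container
docker logs <container-id>   # shows the Illegal instruction line
```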
Docker engine - it appears related to https://github.com/Mintplex-Labs/anything-llm/issues/1290#issuecomment-2101960232
That issue occurs specifically on Mac, but it is the same on Linux/Ubuntu as well.
Hi @lishaojun616, have you managed to resolve the issue? I am experiencing the same situation as you described in #1323.
Hello everyone,
I am facing the same problem.
Same setup: Ubuntu 22.04 LTS, using Ollama as the LLM.
I have installed the newest Docker engine and built anything-llm with docker-compose-v2.
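For reference, a minimal docker-compose.yml of the kind I mean (a sketch only - service name, tag, and volume paths are my assumptions, mirroring the docker run command used elsewhere in this thread):

```yaml
services:
  anythingllm:
    image: mintplexlabs/anythingllm
    cap_add:
      - SYS_ADMIN
    ports:
      - "3001:3001"
    volumes:
      # persist workspace data and the .env outside the container
      - ./storage:/app/server/storage
      - ./storage/.env:/app/server/.env
    environment:
      - STORAGE_DIR=/app/server/storage
```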
...
[TELEMETRY SENT] {
event: 'workspace_created',
distinctId: 'c060354a-b171-4702-b83a-9da2ef0612e4',
properties: {
multiUserMode: false,
LLMSelection: 'ollama',
Embedder: 'native',
VectorDbSelection: 'lancedb',
runtime: 'docker'
}
}
[Event Logged] - workspace_created
[TELEMETRY SENT] {
event: 'onboarding_complete',
distinctId: 'c060354a-b171-4702-b83a-9da2ef0612e4',
properties: { runtime: 'docker' }
}
[NativeEmbedder] Initialized
/usr/local/bin/docker-entrypoint.sh: line 7: 119 Illegal instruction (core dumped) node /app/server/index.js
gitlab-runner@gradio:~$ docker --version
Docker version 26.1.3, build b72abbb
This issue is marked as closed. Is there a solution available?
Best regards
Joachim
Encountering the same issue. Using Ubuntu Server 22.04 with Docker, Yarn, and Node installed as recommended in HOW_TO_USE_DOCKER.md#how-to-use-dockerized-anything-llm. Ollama is on another machine, serving at 0.0.0.0 (other remote apps function correctly with this setup, even in Docker).
EDIT: Forgot to mention:
Docker version 26.1.3, build b72abbb
Ubuntu is running in a VM
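To rule out connectivity, the remote Ollama instance can be queried directly from the Docker host (the host placeholder is mine; 11434 is Ollama's default port):

```bash
# Lists the models the remote Ollama instance is serving
curl http://<ollama-host>:11434/api/tags
```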
Experiencing the identical error as posted by @joachimt-git:
[Event Logged] - update_llm_provider
[Event Logged] - update_embedding_engine
[Event Logged] - update_vector_db
[TELEMETRY SENT] {
event: 'enabled_multi_user_mode',
distinctId: '20b022d1-14cc-490c-ab86-4f941a32f7bc',
properties: { multiUserMode: true, runtime: 'docker' }
}
[Event Logged] - multi_user_mode_enabled
[TELEMETRY SENT] {
event: 'login_event',
distinctId: '20b022d1-14cc-490c-ab86-4f941a32f7bc::1',
properties: { multiUserMode: false, runtime: 'docker' }
}
[Event Logged] - login_event
[TELEMETRY SENT] {
event: 'workspace_created',
distinctId: '20b022d1-14cc-490c-ab86-4f941a32f7bc::1',
properties: {
multiUserMode: true,
LLMSelection: 'ollama',
Embedder: 'native',
VectorDbSelection: 'lancedb',
runtime: 'docker'
}
}
[Event Logged] - workspace_created
[TELEMETRY SENT] {
event: 'onboarding_complete',
distinctId: '20b022d1-14cc-490c-ab86-4f941a32f7bc',
properties: { runtime: 'docker' }
}
[NativeEmbedder] Initialized
/usr/local/bin/docker-entrypoint.sh: line 7: 117 Illegal instruction (core dumped) node /app/server/index.js
Any suggestions for resolving this?
How can I assist?
Thanks
This is certainly a configuration issue. Considering all is well until the native embedder is called, this might be arch-related - but we support both ARM and x86. Regardless, here are my exact steps that fail to repro:
- Obtain Ubuntu 22.04 LTS AWS instance - used t3.small - x86
- curl -fsSL https://get.docker.com -o get-docker.sh
- sudo sh get-docker.sh
- sudo usermod -aG docker $USER
- docker -v
  Docker version 26.1.3, build b72abbb
- docker pull mintplexlabs/anythingllm
Run:
export STORAGE_LOCATION=$HOME/anythingllm && \
mkdir -p $STORAGE_LOCATION && \
touch "$STORAGE_LOCATION/.env" && \
docker run -d -p 3001:3001 \
--cap-add SYS_ADMIN \
-v ${STORAGE_LOCATION}:/app/server/storage \
-v ${STORAGE_LOCATION}/.env:/app/server/.env \
-e STORAGE_DIR="/app/server/storage" \
mintplexlabs/anythingllm
Access via instance IP on port 3001 - I get the interface, onboard, create workspace, and upload documents.
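(A quick liveness check from a shell, if you'd rather not open a browser - the instance IP placeholder is illustrative:)

```bash
curl -I http://<instance-ip>:3001
```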
[Event Logged] - workspace_created
[TELEMETRY SENT] {
event: 'onboarding_complete',
distinctId: 'fe73dd18-d52e-4a4c-bb62-a6adba7491d1',
properties: { runtime: 'docker' }
}
-- Working readme.pdf --
-- Parsing content from pg 1 --
-- Parsing content from pg 2 --
-- Parsing content from pg 3 --
-- Parsing content from pg 4 --
-- Parsing content from pg 5 --
[SUCCESS]: readme.pdf converted & ready for embedding.
[CollectorApi] Document readme.pdf uploaded processed and successfully. It is now available in documents.
[TELEMETRY SENT] {
event: 'document_uploaded',
distinctId: 'fe73dd18-d52e-4a4c-bb62-a6adba7491d1',
properties: { runtime: 'docker' }
}
[Event Logged] - document_uploaded
Adding new vectorized document into namespace sample
[NativeEmbedder] Initialized
[RecursiveSplitter] Will split with { chunkSize: 1000, chunkOverlap: 20 }
Chunks created from document: 14
[NativeEmbedder] The native embedding model has never been run and will be downloaded right now. Subsequent runs will be faster. (~23MB)
[NativeEmbedder] Downloading Xenova/all-MiniLM-L6-v2 from https://huggingface.co/
....truncated
[NativeEmbedder - Downloading model] onnx/model_quantized.onnx 100%
[NativeEmbedder] Embedded Chunk 1 of 1
Inserting vectorized chunks into LanceDB collection.
Caching vectorized results of custom-documents/readme.pdf-d717ca8c-6ac0-4514-8d6d-94ac48760afe.json to prevent duplicated embedding.
[TELEMETRY SENT] {
event: 'documents_embedded_in_workspace',
distinctId: 'fe73dd18-d52e-4a4c-bb62-a6adba7491d1',
properties: {
LLMSelection: 'openai',
Embedder: 'native',
VectorDbSelection: 'lancedb',
runtime: 'docker'
}
}
[Event Logged] - workspace_documents_added
Considering all of this occurs right after [NativeEmbedder] Initialized, this to me would indicate a lack of resources to run the local embedder; if that is the case, you should allocate more resources to the container or use another embedder. That is the only way I could imagine a full core dump or Illegal instruction occurring. Either that, or the underlying chip arch is not found/supported by Xenova's transformers.js.
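If resources do turn out to be the constraint, raising the container's limits is just two extra flags on the same run command (the 4g/2 values below are arbitrary examples, not recommendations):

```bash
# --memory and --cpus raise the resources available to the container
docker run -d -p 3001:3001 \
  --memory=4g --cpus=2 \
  --cap-add SYS_ADMIN \
  -v ${STORAGE_LOCATION}:/app/server/storage \
  -v ${STORAGE_LOCATION}/.env:/app/server/.env \
  -e STORAGE_DIR="/app/server/storage" \
  mintplexlabs/anythingllm
```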
Hi Timothy,
I think what @xsn-cloud and I have in common is that we both use Ollama.
Might that be the cause of the failure?
Joachim
It would not, since the exception is in the AnythingLLM container; if there were an illegal instruction in the Ollama program, it would throw in that container/program. All AnythingLLM does is execute a fetch request to the Ollama instance, which would be permitted in any container.
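That request is nothing exotic - roughly the shape below (illustrative only; the host placeholder and model name are assumptions, and 11434 is Ollama's default port):

```bash
# A chat request of the kind AnythingLLM issues against the Ollama API
curl http://<ollama-host>:11434/api/chat \
  -d '{"model": "llama3", "messages": [{"role": "user", "content": "hello"}]}'
```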
@timothycarambat Thanks for addressing this issue. Please let me know if there's anything I can assist you with.
I've conducted the following experiment, considering that it might be an issue with Docker running in VMs, and to rule out resource issues:
UPDATE: Also tested it in Windows 10 (WSL, Docker for Windows, Docker version 26.1.1, build 4cf5afa): Same issue
- Clean install of Debian 12 on baremetal - Dual Xeon E5-2650 v2 @ 2.60GHz with 96GB of RAM
- Docker version 26.1.3, build b72abbb
- Followed the procedure exactly as you did in your previous comment (the same one you used on an Ubuntu 22.04 LTS AWS)
- No documents loaded
- During the onboarding, anything-llm successfully communicates with ollama, accurately retrieves the installed models, and allows the selection of the model without issues.
- Model selected: llama3 with 4K context window. Other models tested with same results.
This is the outcome after the onboarding when attempting to send a "hello" in a new chat. (docker logs -f [containerid]).
Please note that the container was restarted from scratch to ensure the clarity of the logs.
Collector hot directory and tmp storage wiped!
Document processor app listening on port 8888
Environment variables loaded from .env
Prisma schema loaded from prisma/schema.prisma
✔ Generated Prisma Client (v5.3.1) to ./node_modules/@prisma/client in 338ms
Start using Prisma Client in Node.js (See: https://pris.ly/d/client)
```
import { PrismaClient } from '@prisma/client'
const prisma = new PrismaClient()
```
or start using Prisma Client at the edge (See: https://pris.ly/d/accelerate)
```
import { PrismaClient } from '@prisma/client/edge'
const prisma = new PrismaClient()
```
See other ways of importing Prisma Client: http://pris.ly/d/importing-client
Environment variables loaded from .env
Prisma schema loaded from prisma/schema.prisma
Datasource "db": SQLite database "anythingllm.db" at "file:../storage/anythingllm.db"
20 migrations found in prisma/migrations
No pending migrations to apply.
┌─────────────────────────────────────────────────────────┐
│ Update available 5.3.1 -> 5.14.0 │
│ Run the following to update │
│ npm i --save-dev prisma@latest │
│ npm i @prisma/client@latest │
└─────────────────────────────────────────────────────────┘
[TELEMETRY ENABLED] Anonymous Telemetry enabled. Telemetry helps Mintplex Labs Inc improve AnythingLLM.
prisma:info Starting a sqlite pool with 33 connections.
fatal: not a git repository (or any of the parent directories): .git
getGitVersion Command failed: git rev-parse HEAD
fatal: not a git repository (or any of the parent directories): .git
[TELEMETRY SENT] {
event: 'server_boot',
distinctId: '4f39e3fb-ac8c-4043-9586-c21ef46b0c47',
properties: { commit: '--', runtime: 'docker' }
}
[CommunicationKey] RSA key pair generated for signed payloads within AnythingLLM services.
Primary server in HTTP mode listening on port 3001
[NativeEmbedder] Initialized
/usr/local/bin/docker-entrypoint.sh: line 7: 163 Illegal instruction (core dumped) node /app/server/index.js
One more clarification: The error occurs after sending a message in the chatbox. Until then, the last message displayed is [NativeEmbedder] Initialized, and it remains unchanged until the message is sent.
Thanks a lot for your time.
If you were not to use the native embedder, this problem would not surface. The only commonality between all of this is varying CPUs. Transformers.js, which runs the native embedder, uses the ONNX runtime, and at this point the root cause has to be coming from there, as this only occurs when using the native embedder and those are the supporting libraries that enable that functionality.
I had the same issue running Docker in Ubuntu 24.04 VM on a Proxmox host. I switched the CPU in the guest to "host," and it fixed the problem. Just wanted to share in case anyone else is having the same struggle I did. Hope this helps!
Does the CPU you swapped to support AVX2?
No, my CPU does not support AVX2; however, it supports AVX.
Does the CPU you swapped to support AVX2?
I am sorry, I do not know how to check that; I just changed to "host", which is an Intel Core i9-9900K CPU.
At this time, the working hypothesis is that since Transformers.js uses the ONNX runtime, it will fail to execute any model (including the built-in embedder) if AVX2 is not supported: https://github.com/microsoft/onnxruntime
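On Linux you can check the flag directly, e.g.:

```bash
# Prints "avx2" if the CPU (as exposed to the OS/VM) advertises it
grep -m1 -o avx2 /proc/cpuinfo || echo "no AVX2"
```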
@jorgen-k https://www.intel.com/content/www/us/en/products/sku/186605/intel-core-i99900k-processor-16m-cache-up-to-5-00-ghz/specifications.html
Instruction Set Extensions Intel® SSE4.1, Intel® SSE4.2, Intel® AVX2
Supports AVX2
I am using KVM as the hypervisor, and the virtual CPU doesn't support AVX2. When I configure a passthrough of the CPU (which supports AVX2), as @jorgen-k suggested, it works for me as well.
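For anyone else on libvirt/KVM or Proxmox, the passthrough setting is along these lines (domain/VM IDs are placeholders):

```bash
# libvirt: edit the guest definition and set the CPU model to host-passthrough
virsh edit <domain>    # set: <cpu mode='host-passthrough'/>

# Proxmox equivalent:
qm set <vmid> --cpu host
```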
I ran out of luck with my AVX CPU. It's a Xeon 2660, which only supports AVX. Had to find another machine.
I had the same issue with "/usr/local/bin/docker-entrypoint.sh: line 7: 115 Illegal instruction (core dumped) node /app/server/index.js". It seems the new Docker images of AnythingLLM have some issues, possibly on older systems. To fix the issue, I tried using older Docker versions and previous AnythingLLM images. While the older Docker versions did not resolve the issue, the older AnythingLLM images worked great. The newest working version for me was "sha256:1d994f027b5519d4bc5e1299892e7d0be1405308f10d0350ecefc8e717d3154f". You can find it here: https://github.com/Mintplex-Labs/anything-llm/pkgs/container/anything-llm/209298508
Running on Centos7 Linux with (CWP7), 2X Intel(R) Xeon(R) CPU E5-2680 v2, 2X Nvidia 2080TI GPUs
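For anyone wanting to pin that exact image, pulling by digest looks like this (registry path inferred from the package link above):

```bash
docker pull ghcr.io/mintplex-labs/anything-llm@sha256:1d994f027b5519d4bc5e1299892e7d0be1405308f10d0350ecefc8e717d3154f
```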
@Smocvin, excellent work. Okay, then that pretty much nails down commit ca63012c0f569ad775b6fd22b9b7965d61812d77 as the issue commit. In that commit we moved from lancedb 0.1.19 to 0.4.11 (which is what we use on the desktop version).
However, given how this issue seems to only be a problem with certain CPUs, we have two choices:
bump to 0.5.0 and see if that fixes it, or roll back to 0.1.19. Given that we do not leverage or dive deep into LanceDB's API much, the code change is quite minimal or none.
What I will need though is some help from the community as I do not have a single machine, VM, or instance that I can replicate this bug with. So my ask is:
- If you are getting this bug on a cloud-container service: what service and instance specs are you using, so we can provision a test instance for replication and debugging?
or
- If anyone is willing to help debug the hard way, I am going to create two new tags on Docker, :lancedb_bump and :lancedb_revert, and I would need someone suffering from this issue to pull both and see which works.
Obviously if we can bump up, that would be ideal, but I would rather not field this issue for the rest of time since lancedb should just work.
Links to images
lancedb_bump: docker pull mintplexlabs/anythingllm:lancedb_bump
https://hub.docker.com/layers/mintplexlabs/anythingllm/lancedb_bump/images/sha256-40b0b28728d1bb481f01e510e96351a1970ac3fafafe4b2641cb264f0e7f8a93?context=repo
lancedb_revert: docker pull mintplexlabs/anythingllm:lancedb_revert
https://hub.docker.com/layers/mintplexlabs/anythingllm/lancedb_revert/images/sha256-f6a8d37a305756255302a8883e445056e1ab2f9ecf301f7c542685689436685d?context=repo
Can repro with a basic cloud instance on Vultr with the following specs: Cloud Compute - Shared CPU, Ubuntu 22.04 LTS x64, Intel High Performance, 25 GB NVMe, 1 vCPU, 1 GB Ram.
Then I basically just:
export STORAGE_LOCATION=$HOME/anything-llm
docker run -d -p 3001:3001 --cap-add SYS_ADMIN -v ${STORAGE_LOCATION}:/app/server/storage -v ${STORAGE_LOCATION}/.env:/app/server/.env -e STORAGE_DIR="/app/server/storage" mintplexlabs/anythingllm
Configured with OpenAI / LanceDB. At that point, I just tried any chat, e.g. typed 'hello'; it hangs for a bit and comes up with the error message shown above, and I can see the Docker container died with the log:
/usr/local/bin/docker-entrypoint.sh: line 7: 102 Illegal instruction (core dumped) node /app/server/index.js
I'm happy to help debug here locally with the newly created image tags when available. I have two machines I can test on: one with AVX (Debian, Docker) and one with AVX2 (Windows, Docker Desktop). I get the core dump on the AVX machine with :latest, but the AVX2 machine runs the container fine, so I can provide output from both if needed.
@Dozer316 @acote88 @computersrmyfriends can any of you who have this issue on the master/latest image check and see if lancedb_bump or lancedb_revert work on the impacted machine?
Hopefully the _bump image works; otherwise we are in for some pain, but at least we can debug from there. I am friends with the LanceDB team, so I can escalate to them if the issue persists.
Hey there - revert has solved the problem on the impacted machine, bump still core dumps unfortunately.
Thanks for taking a look at this for us.
Same here. _revert works, _bump crashes. Cheers.
Results of the test:
- lancedb_bump: Crashes
- lancedb_revert: Works
Notes:
- CPU: AVX only
- Testing: Tested with local documents; works perfectly.
(edited: several typos, sorry)
Thank you @Dozer316 @acote88 @xsn-cloud for all taking the time to test both, which is very tedious. I'll contact the LanceDB team as well and see if we can roll back the Docker vectordb package in the interim.
I just closed my report #1618 because it was caused by the same thing: AVX was not a flag on the virtual CPU.
I set the virtual CPU to passthrough and that solved the issue.
Thank you @xsn-cloud
Okay, so the reason this issue occurs is that LanceDB has had a minimum target of Haswell since version ~0.20. This is because performance with AVX2 is just much better.
So right now there are two options to go around this:
- Upgrade or migrate use to a CPU that supports AVX2
- We can maintain an image that works with the older vectordb package. Truthfully, I really don't want to do that, and we have to draw the line somewhere. The reason I don't is that, while the code change is minor, this will likely become increasingly burdensome to maintain as we continue to bump Lance to later versions. Knowing where the issue lies, though, is very useful.
Either way, the root cause is the underlying CPU's required support for AVX2. Closing as wontfix for now, but the discussion is still open for any further commentary.
Thanks for following up on this Timothy. In case this can help others, I compared 2 types of instances on Vultr. One called "High Performance" and the other one called "High Frequency". The "High Frequency" one does support AVX2, while the other doesn't. You can check by running:
cat /proc/cpuinfo | grep -i avx
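If the output includes the avx2 flag, the newer LanceDB builds should run; if you only see avx (or nothing), expect the crash described above.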
You have no idea how long I had to search everywhere, and how many reconfigurations and reinstalls I did, before I found this thread. Could you MAYBE write SOMEWHERE that AnythingLLM currently requires an AVX2-capable CPU to work properly?
Hello @timothycarambat
Thank you for publishing the lancedb_revert image in the first place.
Currently, googling the error message took me to this thread, which in turn links to this one.
To resolve the issue, all I had to do was update the docker run command with the lancedb_revert tag, and otherwise "off it went".
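(That is, the same run command from earlier in the thread with only the tag swapped:)

```bash
docker run -d -p 3001:3001 --cap-add SYS_ADMIN \
  -v ${STORAGE_LOCATION}:/app/server/storage \
  -v ${STORAGE_LOCATION}/.env:/app/server/.env \
  -e STORAGE_DIR="/app/server/storage" \
  mintplexlabs/anythingllm:lancedb_revert
```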
My PC is old, but it's what I've got, and sadly upgrading just isn't on the cards any time soon - I'm grateful to have a way to try it out at all.
I appreciate that it's unreasonable to put in ongoing effort for a small subset of users running into incompatibility problems because they insist on using a relic from the before times - especially since this is going to start cropping up increasingly elsewhere as well.
Having an image in the first place is great, but it'd be nice if there were some way to "run out the clock" on updates until breaking changes inevitably come along.
Would it be possible to have an unsupported update that pins the version of LanceDB in place, dumps latest and/or dev over the top of it, and "when it breaks, that's the end of the ride... may the odds be ever in your favour"?
When it does break, ideally the Docker image would get a second "unsupported final build" release based on that point version, and that's the end of that.