exo No module named 'llvmlite'

I found that this module is missing from the requirements. It needs to be installed manually:

pip install llvmlite
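To confirm that the install landed in the interpreter that exo actually runs with, a quick check like the following can help. This is only a minimal sketch: the helper name `module_available` is made up for illustration and is not part of exo.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if `name` can be found by the running interpreter."""
    # find_spec returns None when a top-level module cannot be located.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Shows which Python is in use, so a system-vs-venv mix-up is visible.
    print("interpreter:", sys.executable)
    print("llvmlite importable:", module_available("llvmlite"))
```

If this prints `False` right after a successful `pip install llvmlite`, pip and the interpreter are most likely pointing at different environments.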

Oct 06 '24 01:10 FFAMax

Hi @FFAMax, after installing the package, are you able to load the model onto the GPU? I had the same issue, but after installing it, it still doesn't work. After investigating, I found this might be related to a compatibility problem between tinygrad and llvmlite. Have you experienced a similar issue?

Oct 13 '24 09:10 Sean-fn

Hello, @Sean-fn. I tried it on an Intel-based Mac and on a regular PC with a PCI-E GPU running Debian/Ubuntu, and so far I have not had a single successful run. My conclusions:

We need some exact successful cases so we know how to reproduce them.
It would be better to move toward Docker images so it can be deployed easily regardless of the local environment.

Oct 13 '24 22:10 FFAMax

Just FYI, I had more luck on:

Description: Ubuntu 22.04.3 LTS Release: 22.04

and

Description: Debian GNU/Linux 12 (bookworm) Release: 12 Codename: bookworm

and

Description: Ubuntu 22.10 Release: 22.10 Codename: kinetic

Oct 28 '24 02:10 FFAMax

I also tried running Llama-3.1-8B with 1 and 2 nodes; it only stopped crashing with 3 nodes, which are:

Linux Box (NVIDIA GEFORCE GTX 1080) 8GB
Linux Box (NVIDIA GEFORCE GTX 1070) 8GB
Linux Box (NVIDIA GEFORCE GTX 1080 TI) 11GB

Because it is unable to pick up multiple GPUs on the same machine, I had to deploy 3 physical hosts.

Oct 28 '24 02:10 FFAMax

I have the same issue. Installing the package does not help. I am using Arch Linux / Kicksecure laptops.

Jan 28 '25 05:01 ascendforever

Same here. It's just a few megabytes, so this should be a default requirement, also to allow automatic fallback. For Ubuntu 22.04 you'll need a few Python add-on debs to make anything work.

I tried the `pyver3.12` branch to be a bit further away from `main`. 
You can substitute that number for whatever you use.

First, if you look through the install output you will see that pip is missing, so it tries to do a user install to /home/user/.exo/venv or something similar, which can mess things up slightly.

Fix that by installing pip the "debian way".

apt install

python3.12-full (since I also needed to add -dev you might get away with 2+3)
python3.12-venv
python3.12-dev

pip install

Then, if you run a query, you'll get the `No module named 'llvmlite'` error.

In the venv created by exo's `$ source install.sh`, also run `(.venv) $ pip install llvmlite` afterwards.
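Since the earlier step warns about pip falling back to a user install, it may be worth verifying that the running Python really belongs to the venv before installing anything. A minimal sketch, assuming a standard `venv`-style environment (the function name `in_virtualenv` is hypothetical, not something exo provides):

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the venv directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("prefix:     ", sys.prefix)
    print("inside venv:", in_virtualenv())
```

Run it with the same `python` that the `(.venv)` prompt gives you; if it reports that it is not inside a venv, `pip install llvmlite` is probably going somewhere else.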

After that, you'll run into the next problem:

llvm.sh - install LLVM/CLANG in a recent version

solves this: Error: Failed to fetch completions: Error processing prompt (see logs with DEBUG>=2): dtypes.bfloat16

You need to move to a newer CLANG/LLVM version and patch something out of tinygrad: https://github.com/tinygrad/tinygrad/issues/6905#issuecomment-2742287430

(I reinstalled llvmlite after this, but I keep getting the error.)

(.venv) floh@beast-lnx:/gpu/exo-pyver312$ DEBUG=2 CLANG_VERSION=21 LLVM=1 /gpu/exo-pyver312/.venv/bin/exo

I'm going to try re-running install.sh, but if that doesn't work I'll drop the ball here; it seems only a small step is missing to make it work. None of this worked. The reason seems to be that the handling of a missing bfloat16 is being reworked in tinygrad, but that process got interrupted: https://github.com/tinygrad/tinygrad/pull/8723

I'm only here because ROCm was not being nice to me, so I'm fixing the workaround to the workaround to the problem. I think I've had it for the day.

Finally, from what I've learned so far (without really looking at the types), bfloat16 is a performance trick, and if they're casting it to float16 or float32 anyway, then there should be a way to just use fp16/fp32 models, and exo should work.

If that makes any sense to someone, could you please explain how to do it for laypeople like me?
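For what it's worth, the relationship between the formats can be shown in a few lines. bfloat16 keeps the sign bit, the 8 exponent bits and the top 7 mantissa bits of a float32, so widening it to float32 is exact, while float16 has only 5 exponent bits and a much smaller range. The sketch below only illustrates the number format using the standard library; it makes no claim about how exo or tinygrad convert weights, and the function names are made up for illustration.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Keep the upper 16 bits of the float32 encoding (truncating bfloat16)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16


def bfloat16_bits_to_float32(b: int) -> float:
    """Widen bfloat16 bits to float32 by appending 16 zero bits (lossless)."""
    (x,) = struct.unpack("<f", struct.pack("<I", b << 16))
    return x


if __name__ == "__main__":
    b = float32_to_bfloat16_bits(3.14159)
    print(hex(b))                       # 0x4049
    print(bfloat16_bits_to_float32(b))  # 3.140625
```

The precision lost going from float32 to bfloat16 is visible in the second line of output; going the other way adds nothing back but also loses nothing, which is why a float32 copy of a bfloat16 model can hold exactly the same values at twice the memory.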

Mar 28 '25 17:03 FlorianHeigl