ramalama rag fails on Apple m3 Pro
Tried ramalama rag on a MacBook Pro with an M3 Pro CPU/GPU running macOS Sequoia 15.3.2 (24D81). RamaLama was installed via the recommended curl script. Podman was installed from Homebrew and is using the GPU-enabled (libkrun) machine:
mnewsome@mnewsome-mac ~ % which podman
/opt/homebrew/bin/podman
mnewsome@mnewsome-mac ~ % podman version
Client: Podman Engine
Version: 5.4.1
API Version: 5.4.1
Go Version: go1.24.1
Built: Tue Mar 11 17:22:13 2025
Build Origin: brew
OS/Arch: darwin/arm64
Server: Podman Engine
Version: 5.4.1
API Version: 5.4.1
Go Version: go1.23.7
Git Commit: b79bc8afe796cba51dd906270a7e1056ccdfcf9e
Built: Tue Mar 11 00:00:00 2025
OS/Arch: linux/arm64
mnewsome@mnewsome-mac ~ % podman machine list
NAME                     VM TYPE  CREATED       LAST UP            CPUS  MEMORY  DISK SIZE
podman-machine-default*  libkrun  25 hours ago  Currently running  6     2GiB    100GiB
mnewsome@mnewsome-mac ~ %
Here's the actual error:
% ramalama --debug rag /tmp/LEMUR.md newimg
run_cmd: podman inspect quay.io/ramalama/ramalama:0
Working directory: None
Ignore stderr: False
Ignore all: True
run_cmd: podman run --rm -v /private/tmp/LEMUR.md:/docs//private/tmp/LEMUR.md:ro,z -v ./RamaLama_rag_jkjkmgrd/vectordb:/output:z quay.io/ramalama/ramalama-rag:latest doc2rag /output /docs/
Working directory: None
Ignore stderr: False
Ignore all: False
Error: Command '['podman', 'run', '--rm', '-v', '/private/tmp/LEMUR.md:/docs//private/tmp/LEMUR.md:ro,z', '-v', './RamaLama_rag_jkjkmgrd/vectordb:/output:z', 'quay.io/ramalama/ramalama-rag:latest', 'doc2rag', '/output', '/docs/']' returned non-zero exit status 137.
I tried an update just now, and the problem still occurs for me:
mnewsome@mnewsome-mac ~ % curl -fsSL https://raw.githubusercontent.com/containers/ramalama/s/install.sh | bash
_____ _
| __ \ | |
| |__) |__ _ _ __ ___ __ _| | __ _ _ __ ___ __ _
| _ // _` | '_ ` _ \ / _` | | / _` | '_ ` _ \ / _` |
| | \ \ (_| | | | | | | (_| | |___| (_| | | | | | | (_| |
|_| \_\__,_|_| |_| |_|\__,_|______\__,_|_| |_| |_|\__,_|
==> Auto-updating Homebrew...
Adjust how often this is run with HOMEBREW_AUTO_UPDATE_SECS or disable with
HOMEBREW_NO_AUTO_UPDATE. Hide these hints with HOMEBREW_NO_ENV_HINTS (see `man brew`).
==> Homebrew collects anonymous analytics.
Read the analytics documentation (and how to opt-out) here:
https://docs.brew.sh/Analytics
No analytics have been recorded yet (nor will be during this `brew` run).
==> Homebrew is run entirely by unpaid volunteers. Please consider donating:
https://github.com/Homebrew/brew#donations
==> Auto-updated Homebrew!
Updated 1 tap (homebrew/cask).
You have 2 outdated formulae installed.
llama.cpp 5010 is already installed but outdated (so it will be upgraded).
==> Downloading https://ghcr.io/v2/homebrew/core/llama.cpp/manifests/5030
######################################################################### 100.0%
==> Fetching llama.cpp
==> Downloading https://ghcr.io/v2/homebrew/core/llama.cpp/blobs/sha256:0ee4c2eb
######################################################################### 100.0%
==> Upgrading llama.cpp
5010 -> 5030
==> Pouring llama.cpp--5030.arm64_sequoia.bottle.tar.gz
🍺 /opt/homebrew/Cellar/llama.cpp/5030: 141 files, 70MB
==> Running `brew cleanup llama.cpp`...
Disable this behaviour by setting HOMEBREW_NO_INSTALL_CLEANUP.
Hide these hints with HOMEBREW_NO_ENV_HINTS (see `man brew`).
Removing: /opt/homebrew/Cellar/llama.cpp/5010... (141 files, 72.7MB)
Removing: /Users/mnewsome/Library/Caches/Homebrew/llama.cpp_bottle_manifest--5010.cpp... (22.3KB)
Removing: /Users/mnewsome/Library/Caches/Homebrew/llama.cpp--5010.cpp... (21.7MB)
🦙🦙🦙🦙🦙🦙🦙🦙🦙🦙🦙🦙🦙🦙🦙🦙🦙🦙🦙🦙🦙🦙🦙🦙🦙🦙🦙🦙
mnewsome@mnewsome-mac ~ % ramalama --debug rag /tmp/LEMUR.md newimg
run_cmd: podman inspect quay.io/ramalama/ramalama:0
Working directory: None
Ignore stderr: False
Ignore all: True
run_cmd: podman run --rm -v /private/tmp/LEMUR.md:/docs//private/tmp/LEMUR.md:ro,z -v ./RamaLama_rag_4u_uwx0g/vectordb:/output:z quay.io/ramalama/ramalama-rag:latest doc2rag /output /docs/
Working directory: None
Ignore stderr: False
Ignore all: False
Error: Command '['podman', 'run', '--rm', '-v', '/private/tmp/LEMUR.md:/docs//private/tmp/LEMUR.md:ro,z', '-v', './RamaLama_rag_4u_uwx0g/vectordb:/output:z', 'quay.io/ramalama/ramalama-rag:latest', 'doc2rag', '/output', '/docs/']' returned non-zero exit status 137.
mnewsome@mnewsome-mac ~ %
@mattnewsome I just made updates to the installer; can you try again?
Please install Podman from podman.io, or better, via the Podman Desktop installer, not from brew.
I updated the installer and it still failed, so I suspect Dan's right that it's the way I installed Podman. That said, I'll add that I also ran it this way via the WebUI:
% ramalama serve llama3
% podman run -it --rm --network slirp4netns:allow_host_loopback=true -e OPENAI_API_BASE_URL=http://host.containers.internal:8080 -p 3000:8080 -v open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:main
(and then visited http://127.0.0.1:3000)
This works in the browser: I can add documents like the .md file I tried to use at the command line (via `ramalama rag`).
If the answer is "no", I'll happily try moving to Podman from the podman-desktop installer; I just want to check first, since it seems odd that it still works via the WebUI.
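For what it's worth, a quick way to confirm the endpoint the WebUI is pointed at (a sketch assuming `ramalama serve`'s default port 8080 and its OpenAI-compatible API):

```
# From the host: list the models the server exposes
curl http://127.0.0.1:8080/v1/models
```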
Read https://developers.redhat.com/articles/2025/04/03/simplify-ai-data-integration-ramalama-and-rag
So what is still broken?
Hmm, @rhatdan, I'm wondering if we should have --debug bubble up those errors from running the RAG container. I believe it's due to Podman being installed via brew, like @rhatdan said. I can try to reproduce this issue on my Mac to see what's going on.
> So what is still broken?
Uninstalled podman from brew, reinstalled using Podman Desktop, reran ramalama rag, same result:
mnewsome@mnewsome-mac ~ % ramalama --debug rag /tmp/LEMUR.md newimg
run_cmd: podman inspect quay.io/ramalama/ramalama-rag:0
Working directory: None
Ignore stderr: False
Ignore all: True
Attempting to pull quay.io/ramalama/ramalama-rag:0...
run_cmd: podman pull quay.io/ramalama/ramalama-rag:0
Working directory: None
Ignore stderr: True
Ignore all: False
run_cmd: podman run --rm -v /private/tmp/LEMUR.md:/docs//private/tmp/LEMUR.md:ro,z -v ./RamaLama_rag_ixgu2nd_/vectordb:/output:z quay.io/ramalama/ramalama-rag:latest doc2rag /output /docs/
Working directory: None
Ignore stderr: False
Ignore all: False
Error: Command '['podman', 'run', '--rm', '-v', '/private/tmp/LEMUR.md:/docs//private/tmp/LEMUR.md:ro,z', '-v', './RamaLama_rag_ixgu2nd_/vectordb:/output:z', 'quay.io/ramalama/ramalama-rag:latest', 'doc2rag', '/output', '/docs/']' returned non-zero exit status 137.
mnewsome@mnewsome-mac ~ % which podman
/opt/podman/bin/podman
mnewsome@mnewsome-mac ~ % podman version
Client: Podman Engine
Version: 5.4.1
API Version: 5.4.1
Go Version: go1.24.1
Git Commit: b79bc8afe796cba51dd906270a7e1056ccdfcf9e
Built: Tue Mar 11 18:41:13 2025
Build Origin: pkginstaller
OS/Arch: darwin/arm64
Server: Podman Engine
Version: 5.4.1
API Version: 5.4.1
Go Version: go1.23.7
Git Commit: b79bc8afe796cba51dd906270a7e1056ccdfcf9e
Built: Tue Mar 11 00:00:00 2025
OS/Arch: linux/arm64
mnewsome@mnewsome-mac ~ %
Hello, what is the memory of your Podman machine? Check with `podman machine inspect`.
You can set the memory with `podman machine set -m` (only while the machine is stopped).
You should probably increase the memory to see if it changes anything.
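For reference, a minimal sketch of those commands (assuming the default machine; the 4096 value is just an example):

```
# Memory can only be changed while the machine is stopped
podman machine stop
podman machine set --memory 4096
podman machine start

# Verify the new allocation (field layout may vary by Podman version)
podman machine inspect --format '{{ .Resources.Memory }}'
```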
So actually now with Podman Desktop, ramalama won't run for me at all:
mnewsome@mnewsome-mac ~ % ramalama run llama3
Attempting to pull quay.io/ramalama/ramalama:0...
Attempting to pull quay.io/ramalama/ramalama:0...
Error: stat /dev/dri: no such file or directory
Memory was set to 2048. I doubled it to 4096. No change, but likely still broken due to looking for /dev/dri, which I understand doesn't exist on macOS.
I'm out of time to investigate further, but hope this feedback is useful.
@mattnewsome yeah, our containers solution is in a funny place: you need to manually configure podman-machine to use krunkit, then it should work. podman-machine doesn't have /dev/dri with the default hypervisor.
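For anyone else hitting this, a sketch of that reconfiguration (note it deletes the existing machine, and assumes krunkit is installed; the provider can also be set persistently under [machine] in containers.conf):

```
# Remove the machine created with the default hypervisor
podman machine stop
podman machine rm podman-machine-default

# Recreate it with the libkrun provider (which uses krunkit and exposes /dev/dri)
export CONTAINERS_MACHINE_PROVIDER=libkrun
podman machine init
podman machine start
```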
Seems we have an issue with RamaLama installs on Mac showing version 0?
`ramalama version`
This means RamaLama will try to pull the non-existent :0 tag and should then fail over to :latest, instead of pulling 0.7. I believe RamaLama is not installed correctly from a Python packaging perspective:
def version():
    try:
        return importlib.metadata.version("ramalama")
    except importlib.metadata.PackageNotFoundError:
        return "0"
@mattnewsome how did you install ramalama?
It seems you used the script; you may want to give the brew version of RamaLama a try: https://formulae.brew.sh/formula/ramalama
I can confirm the issue: a fresh installation with the script on a Mac M3 leads `ramalama version` to output "0". Running `ramalama rag ...` then tries to pull the ramalama-rag image with tag 0, and it fails.
Installing RamaLama through brew resolves the issue, so the problem seems to come from the script installation.
Now running into the same problem too. I installed ramalama with brew. Running `ramalama rag` returns non-zero exit status 137, whatever that is.
Does `ramalama version` show 0? What does `ramalama --debug rag ...` show?
> Now running into the same problem too. I installed ramalama with brew. Running `ramalama rag` returns non-zero exit status 137, whatever that is.
I get 137 also on Fedora: #1221
Googling exit code 137 shows:
> Exit Code 137: Causes & Best Practices to Prevent It
>
> Memory is to containers what water is to people: a vital resource. Just as bad things start to happen if you don't drink water for a time, your containers will start experiencing errors if you don't supply them with enough memory.
>
> Specifically, they'll probably experience exit code 137, which signals in most cases that Kubernetes killed a Pod due to an Out of Memory (OOM) error. When this happens, you need to get to the root of the problem so that you can get your Pod back up and running, and prevent OOM errors from recurring.
>
> Keep reading for guidance as we explain everything you need to know about exit code 137 on Kubernetes.
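For context: exit status 137 is 128 + 9, i.e. the process was killed with SIGKILL, which is what the Linux OOM killer sends when a container exceeds its memory limit. A sketch that reproduces it (the image and limit are just examples; any image with python3 will do):

```
# Cap the container at 64 MiB, then allocate far more than that
podman run --rm --memory 64m quay.io/ramalama/ramalama:latest \
  python3 -c 'x = bytearray(1024 * 1024 * 1024)'
echo $?   # 137 when the kernel OOM-kills the process
```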
In my setup the podman machine has 16G of RAM, so I find it fairly surprising that it still runs out of memory while running Docling. BTW, for some time I had generation of the vector DB working, but with 0.7.4 it somehow stopped.
Can you try with 0.7.5, which was just released? We are doing Docling differently now, caching more to disk, so it should not be as memory intensive.
With 0.7.5 it works on a bunch of .docx files (which worked before 0.7.4), but still gives 137 on a single .xlsx file. I wouldn't call the file too big, though: 300 rows × 8 columns.
I'll check this out. It seems the fix I did lowers RAM usage, but there could be other underlying issues on Mac with the VM!
https://github.com/containers/ramalama/pull/1247
Working on it. So far it seems it's a Docling issue; .md files seem to have the best compatibility right now.
Thanks @bmahabirbu for looking into this.
FWIW, I'm seeing this regularly with RamaLama 0.8 (installed via script) on Podman 5.4.2 (installed from the podman.io download). I managed to process a single-line markdown file, but any time I try to process HTML or PDF, I get a 137.
A friendly reminder that this issue had no activity for 30 days.
@arthurbarr @mattnewsome still an issue?
A friendly reminder that this issue had no activity for 30 days.
Since we never heard back, closing. Reopen if this is still an issue.