
Bus Error on macOS When Using Memray with XGBoost (memray::intercept::calloc recursion crash)

Open: keitabroadwater opened this issue 7 months ago • 1 comment

Is there an existing issue for this?

  • [x] I have searched the existing issues

Current Behavior

Thanks for all the great work on this tool.

When profiling an xgboost training call on macOS using memray, the process crashes with a bus error. After investigation, it appears to be caused by infinite recursion between Memray’s calloc interceptor and macOS’s thread-local variable resolution (dyld::ThreadLocalVariables).

Expected Behavior

A memray-profile.bin file is created, without crashing.

Steps To Reproduce

Environment

OS: macOS (Apple Silicon, M2)

Python: 3.12 (via uv and .venv)

Memray: 1.16.0

XGBoost: 2.0.3 (ARM64 native binary)

Allocator: pymalloc (default)

Repro Steps

Here’s a minimal example that reproduces the issue:

import xgboost as xgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y)

model = xgb.XGBClassifier(n_jobs=1)
model.fit(X_train, y_train)  # ← this line causes the crash when Memray is tracking

When run with:

memray run --output memray-profile.bin example.py

...the process crashes with:

zsh: bus error  memray run --output memray-profile.bin example.py

Memray Version

1.16.0

Python Version

3.12

Operating System

macOS

Anything else?

Diagnosis

The crash trace shows repeated recursion between:

memray::intercept::calloc → _tlv_get_addr → memray::intercept::calloc

…ending in a fatal bus error.

From crash logs:

memray::intercept::calloc(unsigned long, unsigned long)
dyld::ThreadLocalVariables::instantiateVariable(...)
libdyld.dylib::_tlv_get_addr

... (repeats)

This appears to be a recursion: Memray's calloc interceptor triggers macOS’s thread-local variable handling (reached when xgboost uses threads or libomp), which calls calloc again and so re-enters Memray's interceptor.
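
The cycle described above can be modelled with a toy Python simulation. This is only a sketch of the suspected mechanism, not Memray or dyld code: every name in it (LazyThreadLocal, real_calloc, build_recursive_hook, build_safe_hook) is hypothetical, and the "safe" variant simply assumes guard storage whose setup does not pass back through the interceptor.

```python
# Toy model only: none of these names exist in Memray or dyld.

def real_calloc(count, size):
    """Stand-in for the system calloc: returns zeroed storage."""
    return bytearray(count * size)


class LazyThreadLocal:
    """Mimics a thread-local whose storage is allocated on first access."""

    def __init__(self, allocator):
        self._allocator = allocator
        self._ready = False
        self.value = False

    def get(self):
        if not self._ready:
            # First access allocates the backing storage.
            self._allocator(1, 8)
            self._ready = True
        return self.value


def build_recursive_hook():
    """Interceptor whose thread-local setup allocates through the interceptor itself."""
    tls = LazyThreadLocal(allocator=lambda count, size: hook(count, size))

    def hook(count, size):
        tls.get()  # looking up per-thread state re-enters hook() on first use
        return real_calloc(count, size)

    return hook


def build_safe_hook():
    """Interceptor whose thread-local setup bypasses the interceptor (assumed fix shape)."""
    tls = LazyThreadLocal(allocator=real_calloc)

    def hook(count, size):
        tls.get()
        return real_calloc(count, size)

    return hook


def overflows(hook):
    """Return True if calling the hook recurses until the stack limit is hit."""
    try:
        hook(1, 16)
    except RecursionError:
        return True
    return False
```

In this model, overflows(build_recursive_hook()) is True while overflows(build_safe_hook()) is False. In the native process the equivalent unbounded recursion exhausts the stack, which would be consistent with the bus error reported here.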

Workarounds

Works fine on Linux using the same script and versions.

Avoiding memray run and using memray.Tracker() manually still crashes.

Works with scalene or memory_profiler on macOS.
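
For reference, the manual memray.Tracker() route mentioned above would look roughly like the sketch below. The helper name fit_under_tracker is hypothetical, and the output path is just the one used with memray run earlier in this report; per the report, this route hits the same crash on macOS.

```python
def fit_under_tracker(fit, output_path="memray-profile.bin"):
    """Run fit() while memray records allocations to output_path."""
    # Imported lazily so this helper can be defined even where memray is absent.
    import memray

    with memray.Tracker(output_path):
        return fit()


# Usage, with the model and data from the repro script:
#   fit_under_tracker(lambda: model.fit(X_train, y_train))
```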

keitabroadwater, May 27 '25 16:05

I think https://github.com/bloomberg/memray/pull/732 should fix this as it should be going via the pthread API. This went into v1.17.2. Can you test with that one?

pablogsal, May 27 '25 16:05
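
Before retesting, it may help to confirm which Memray release is actually installed in the virtual environment. The following is a minimal sketch using only the standard library; release_tuple and is_at_least_fixed_release are hypothetical helper names, and pre-release suffixes are ignored as a simplification.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Release named in the comment above as containing the change.
FIXED_IN = (1, 17, 2)


def release_tuple(text):
    """Turn '1.17.2' into (1, 17, 2), keeping at most three numeric parts."""
    return tuple(int(number) for number in re.findall(r"\d+", text)[:3])


def is_at_least_fixed_release(installed):
    """Compare an installed version string against FIXED_IN."""
    return release_tuple(installed) >= FIXED_IN


if __name__ == "__main__":
    try:
        installed = version("memray")
    except PackageNotFoundError:
        print("memray is not installed in this environment")
    else:
        status = "at or above" if is_at_least_fixed_release(installed) else "below"
        print(f"memray {installed} is {status} 1.17.2")
```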