LightGBM is incompatible with libomp 12 and 13 on macOS

Open SchantD opened this issue 4 years ago • 27 comments

Description

LightGBM cannot be used to fit multiple models in parallel using threads with the latest libomp. On 2014 MacBook Pro:

OMP: Error #13: Assertion failure at kmp_runtime.cpp(3689).
OMP: Hint Please submit a bug report with this message, compile and run commands used, and machine configuration info including native compiler and operating system versions. Faster response will be obtained by including all program sources. For information on submitting this issue, please see https://bugs.llvm.org/.
[1]    17358 abort      python myfile2.py

On 2019 MacBook Pro:

OMP: Error #131: Thread identifier invalid.

Setting nthreads=1 doesn't solve the problem.

Reproducible example

from lightgbm import LGBMClassifier
import numpy as np
from concurrent.futures import ThreadPoolExecutor

x = np.random.random((200, 4))
y = x.sum(axis=1) >= 2


def myfunc(a=7):
    test = LGBMClassifier().fit(x, y)
    print(test.predict(x))


with ThreadPoolExecutor(20) as tpe:
    print(list(tpe.map(myfunc, range(20))))

Environment info

LightGBM version or commit hash: 3.1.1 (with python 3.7.3) and 3.2.1 (with python 3.9.4)

brew install libomp

libomp: stable 12.0.0 (bottled)
LLVM's OpenMP runtime library
https://openmp.llvm.org/
/usr/local/Cellar/libomp/12.0.0 (9 files, 1.5MB) *
Poured from bottle on 2021-04-26 at 11:06:26
From: https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/libomp.rb

Command(s) you used to install LightGBM

pip install lightgbm

Additional Comments

The code does work with libomp version 11. I downgraded using:

wget https://raw.githubusercontent.com/Homebrew/homebrew-core/fb8323f2b170bd4ae97e1bac9bf3e2983af3fdb0/Formula/libomp.rb
brew unlink libomp
brew install libomp.rb

SchantD avatar Apr 26 '21 10:04 SchantD

All our tests are passing with libomp 12: https://github.com/microsoft/LightGBM/runs/2437586276

==> Pouring libomp--12.0.0.catalina.bottle.tar.gz

...

-- The C compiler identification is AppleClang 12.0.0.12000032
-- The CXX compiler identification is AppleClang 12.0.0.12000032
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /Applications/Xcode_12.4.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /Applications/Xcode_12.4.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found OpenMP_C: -Xclang -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -Xclang -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Performing Test MM_PREFETCH
-- Performing Test MM_PREFETCH - Success
-- Using _mm_prefetch
-- Performing Test MM_MALLOC
-- Performing Test MM_MALLOC - Success
-- Using _mm_malloc
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/runner/work/LightGBM/LightGBM/build
[  5%] Building CXX object CMakeFiles/_lightgbm.dir/src/boosting/gbdt_model_text.cpp.o
[  5%] Building CXX object CMakeFiles/_lightgbm.dir/src/boosting/boosting.cpp.o
[  8%] Building CXX object CMakeFiles/_lightgbm.dir/src/boosting/gbdt.cpp.o
[ 11%] Building CXX object CMakeFiles/_lightgbm.dir/src/boosting/gbdt_prediction.cpp.o
[ 14%] Building CXX object CMakeFiles/_lightgbm.dir/src/boosting/prediction_early_stop.cpp.o
[ 17%] Building CXX object CMakeFiles/_lightgbm.dir/src/io/bin.cpp.o
[ 20%] Building CXX object CMakeFiles/_lightgbm.dir/src/io/config.cpp.o
[ 23%] Building CXX object CMakeFiles/_lightgbm.dir/src/io/config_auto.cpp.o
[ 26%] Building CXX object CMakeFiles/_lightgbm.dir/src/io/dataset.cpp.o
[ 29%] Building CXX object CMakeFiles/_lightgbm.dir/src/io/dataset_loader.cpp.o
[ 32%] Building CXX object CMakeFiles/_lightgbm.dir/src/io/file_io.cpp.o
[ 35%] Building CXX object CMakeFiles/_lightgbm.dir/src/io/json11.cpp.o
[ 38%] Building CXX object CMakeFiles/_lightgbm.dir/src/io/metadata.cpp.o
[ 41%] Building CXX object CMakeFiles/_lightgbm.dir/src/io/parser.cpp.o
[ 44%] Building CXX object CMakeFiles/_lightgbm.dir/src/io/train_share_states.cpp.o
[ 47%] Building CXX object CMakeFiles/_lightgbm.dir/src/io/tree.cpp.o
[ 50%] Building CXX object CMakeFiles/_lightgbm.dir/src/metric/dcg_calculator.cpp.o
[ 52%] Building CXX object CMakeFiles/_lightgbm.dir/src/metric/metric.cpp.o
[ 55%] Building CXX object CMakeFiles/_lightgbm.dir/src/network/ifaddrs_patch.cpp.o
[ 58%] Building CXX object CMakeFiles/_lightgbm.dir/src/network/linker_topo.cpp.o
[ 61%] Building CXX object CMakeFiles/_lightgbm.dir/src/network/linkers_mpi.cpp.o
[ 64%] Building CXX object CMakeFiles/_lightgbm.dir/src/network/linkers_socket.cpp.o
[ 67%] Building CXX object CMakeFiles/_lightgbm.dir/src/network/network.cpp.o
[ 70%] Building CXX object CMakeFiles/_lightgbm.dir/src/objective/objective_function.cpp.o
[ 73%] Building CXX object CMakeFiles/_lightgbm.dir/src/treelearner/cuda_tree_learner.cpp.o
[ 76%] Building CXX object CMakeFiles/_lightgbm.dir/src/treelearner/data_parallel_tree_learner.cpp.o
[ 79%] Building CXX object CMakeFiles/_lightgbm.dir/src/treelearner/feature_parallel_tree_learner.cpp.o
[ 82%] Building CXX object CMakeFiles/_lightgbm.dir/src/treelearner/gpu_tree_learner.cpp.o
[ 85%] Building CXX object CMakeFiles/_lightgbm.dir/src/treelearner/linear_tree_learner.cpp.o
[ 88%] Building CXX object CMakeFiles/_lightgbm.dir/src/treelearner/serial_tree_learner.cpp.o
[ 91%] Building CXX object CMakeFiles/_lightgbm.dir/src/treelearner/tree_learner.cpp.o
[ 94%] Building CXX object CMakeFiles/_lightgbm.dir/src/treelearner/voting_parallel_tree_learner.cpp.o
[ 97%] Building CXX object CMakeFiles/_lightgbm.dir/src/c_api.cpp.o
[100%] Linking CXX shared library ../lib_lightgbm.so
[100%] Built target _lightgbm

...

====== 234 passed, 4 skipped, 2 xfailed, 79 warnings in 120.17s (0:02:00) ======

I'm not sure LightGBM was ever able to "fit multiple models in parallel using threads".

Refer to https://lightgbm.readthedocs.io/en/latest/FAQ.html#lightgbm-hangs-when-multithreading-openmp-and-using-forking-in-linux-at-the-same-time.

I think you can migrate to the bug-free Intel toolchain or compile the threadless version: https://lightgbm.readthedocs.io/en/latest/Installation-Guide.html#build-threadless-version-not-recommended.

StrikerRUS avatar Apr 26 '21 11:04 StrikerRUS

"I'm not sure LightGBM was ever able to fit multiple models in parallel using threads"

Well, we have been using a similar approach as stated above (ThreadPool + fit) successfully in production settings for quite some time, and also facing this problem now. As already commented, this issue is quickly solved by downgrading to the older libomp version, without any side effects. Maybe this has been something without any tests, which just now happens to fail?

I would also like to point out that this issue could also happen with different scikit-learn wrappers using the joblib/delayed approach. The default here is to use multiprocessing (which works), but threading (in order to save memory etc) does not.

from joblib import parallel_backend
import numpy as np
from lightgbm import LGBMClassifier
from sklearn.model_selection import cross_validate

x = np.random.random((200, 5))
y = x.sum(axis=1) > 2.5


def run(*args, **kwargs):
    estimator = LGBMClassifier()
    estimator.fit(x, y)
    return estimator.predict(x)


with parallel_backend('threading', n_jobs=5):
    print(cross_validate(LGBMClassifier(), x, y, n_jobs=5, cv=5))

Refer to https://lightgbm.readthedocs.io/en/latest/FAQ.html#lightgbm-hangs-when-multithreading-openmp-and-using-forking-in-linux-at-the-same-time.

Interestingly, it seems that this (somewhat) fixes the problem. Setting n_jobs=1 works for me, but also higher values (up to around n_jobs=5) seem to work. Maybe this is simply a question of spawning too many threads in total?

Some results:

  • 5 cv jobs, 10 n_jobs -> fail (OMP: Error #131: Thread identifier invalid.)
  • 5 * 5, 5 * 6, 5 * 7, 5 * 8 -> ok
  • 5 * 9 -> fail (OMP: Error #131: Thread identifier invalid.)
  • 6 * 7, 6 * 8 -> fail
  • 6 * 6 -> ok

For whatever reason it seems the threshold is between 40 (working) and 42 (failing).
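The reported combinations can be tabulated to check the apparent threshold. The following is a minimal sketch that only restates the observations listed above: the pairs and outcomes are copied from this comment, the function name is illustrative, and treating the product of cv jobs and n_jobs as the total thread count is an assumption, not something confirmed in this thread.

```python
# Observations copied from the list above: (cv jobs, n_jobs, succeeded).
OBSERVATIONS = [
    (5, 10, False),
    (5, 5, True),
    (5, 6, True),
    (5, 7, True),
    (5, 8, True),
    (5, 9, False),
    (6, 7, False),
    (6, 8, False),
    (6, 6, True),
]


def threshold_bounds(observations):
    """Return (largest passing product, smallest failing product)."""
    passing = [cv * n for cv, n, ok in observations if ok]
    failing = [cv * n for cv, n, ok in observations if not ok]
    return max(passing), min(failing)


if __name__ == "__main__":
    lo, hi = threshold_bounds(OBSERVATIONS)
    print(f"largest passing product: {lo}, smallest failing product: {hi}")
```

With the data above, the largest passing product is 40 and the smallest failing product is 42, which matches the stated range.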

Zahlii avatar Apr 26 '21 11:04 Zahlii

On XGBoost we are also facing issues with updated libomp. It has internal error: https://github.com/dmlc/xgboost/pull/6912/checks?check_run_id=2459890229

trivialfis avatar Apr 28 '21 17:04 trivialfis

Another example of regression in 12 version: https://github.com/facebookresearch/faiss/pull/1849.

I can't find any report of this bug... https://bugs.llvm.org/buglist.cgi?bug_status=all&no_redirect=1&order=changeddate%20DESC%2Cpriority%2Cbug_severity&product=OpenMP&query_format=specific

StrikerRUS avatar Apr 30 '21 13:04 StrikerRUS

Upstream bug report: https://bugs.llvm.org/show_bug.cgi?id=50579.

StrikerRUS avatar Jun 07 '21 11:06 StrikerRUS

Moving the import statement import lightgbm as lgb to line 1 in my file actually got rid of the error, as suggested in https://github.com/dmlc/xgboost/issues/7039#issuecomment-860910066

libomp version /usr/local/Cellar/libomp/12.0.0

Error dump when loading booster model. Putting it out here in case it is useful:

Process:               Python [6481]
Path:                  /Library/Frameworks/Python.framework/Versions/3.7/Resources/Python.app/Contents/MacOS/Python
Identifier:            Python
Version:               3.7.3 (3.7.3)
Code Type:             X86-64 (Native)
Parent Process:        zsh [511]
Responsible:           iTerm2 [403]
User ID:               501

Date/Time:             2021-07-02 10:35:36.911 +0800
OS Version:            macOS 11.4 (20F71)
Report Version:        12
Bridge OS Version:     5.4 (18P4663)

Time Awake Since Boot: 14000 seconds
Time Since Wake:       1500 seconds

System Integrity Protection: enabled

Crashed Thread:        41

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x0000000000000048
Exception Note:        EXC_CORPSE_NOTIFY

Termination Signal:    Segmentation fault: 11
Termination Reason:    Namespace SIGNAL, Code 0xb
Terminating Process:   exc handler [6481]

VM Regions Near 0x48:
--> 
    __TEXT                      10388e000-10388f000    [    4K] r-x/rwx SM=COW  /Library/Frameworks/Python.framework/Versions/3.7/Resources/Python.app/Contents/MacOS/Python

Thread 0:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib        	0x00007fff204dc206 _kernelrpc_mach_vm_protect_trap + 10
1   libsystem_kernel.dylib        	0x00007fff204df1da mach_vm_protect + 33
2   libsystem_pthread.dylib       	0x00007fff20512589 _pthread_create + 533
3   libomp.dylib                  	0x0000000183c99568 __kmp_create_worker + 264
4   libomp.dylib                  	0x0000000183c6f2a4 __kmp_allocate_thread + 954
5   libomp.dylib                  	0x0000000183c6ac21 __kmp_allocate_team + 1311
6   libomp.dylib                  	0x0000000183c6c51c __kmp_fork_call + 5365
7   libomp.dylib                  	0x0000000183c61295 __kmpc_fork_call + 293
8   lib_lightgbm.so               	0x00000001838d5036 LightGBM::ParallelPartitionRunner<int, false>::ParallelPartitionRunner(int, int) + 118
9   lib_lightgbm.so               	0x00000001838c9379 LightGBM::GBDT::GBDT() + 777
10  lib_lightgbm.so               	0x00000001838be0f1 LightGBM::Boosting::CreateBoosting(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*) + 1745
11  lib_lightgbm.so               	0x0000000183abf490 LightGBM::Booster::Booster(char const*) + 400

seahrh avatar Jul 02 '21 06:07 seahrh

Facing the exact reported issue. Subscribed for more updates.

mldeveloper01 avatar Jul 02 '21 14:07 mldeveloper01

I have the same issue and did some testing: basically libomp 12.0 works with Catalina, but results in a segfault on Big Sur. Downgrading to 11.1 worked on Big Sur (tested on an Intel MBP and an M1 MBP via Rosetta 2).

mkos avatar Jul 09 '21 12:07 mkos

Unfortunately, LLVM developers haven't fixed this bug (https://github.com/microsoft/LightGBM/issues/4229#issuecomment-855839996) in the 12.0.1 release.

StrikerRUS avatar Jul 09 '21 13:07 StrikerRUS

One workaround suggested in the upstream bug report, which avoids downgrading libomp, is to set some environment variables:

LIBOMP_USE_HIDDEN_HELPER_TASK=0
LIBOMP_NUM_HIDDEN_HELPER_THREADS=0

https://bugs.llvm.org/show_bug.cgi?id=50579#c1
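From Python, these variables can also be set in-process. This is a minimal sketch, assuming (not confirmed in this thread) that the variables have to be in the environment before libomp is first loaded, that is, before importing lightgbm or any other package linked against OpenMP. The helper name is hypothetical; the variable names are the two listed above.

```python
import os

# Variable names are taken from the upstream bug report linked above.
HIDDEN_HELPER_SETTINGS = {
    "LIBOMP_USE_HIDDEN_HELPER_TASK": "0",
    "LIBOMP_NUM_HIDDEN_HELPER_THREADS": "0",
}


def disable_hidden_helper_threads(env=os.environ):
    """Set the workaround variables in env and return it."""
    for name, value in HIDDEN_HELPER_SETTINGS.items():
        env[name] = value
    return env


if __name__ == "__main__":
    disable_hidden_helper_threads()
    # Assumption: import OpenMP-linked packages only after this point.
    # import lightgbm as lgb
```

Exporting the same two variables in the shell before starting Python should have the same effect without any ordering concerns inside the script.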

StrikerRUS avatar Sep 29 '21 23:09 StrikerRUS

A new major LLVM version, 13, was released 4 days ago: https://github.com/llvm/llvm-project/releases/tag/llvmorg-13.0.0. The latest Homebrew libomp formula now points to that version: https://github.com/Homebrew/homebrew-core/blob/4343aee9c28d28b9ed3208b5933df54c29b916fb/Formula/libomp.rb#L4.

Unfortunately, this bug (https://github.com/microsoft/LightGBM/issues/4229#issuecomment-855839996) wasn't fixed in the stable 13 release. I'm going to reflect this fact in the issue's title.

StrikerRUS avatar Oct 04 '21 21:10 StrikerRUS

LLVM has changed Bugzilla to GitHub Issues as the main issue tracker.

New replies to the original bug report contain the following:

This bug is being removed from the LLVM 13.0.1 release milestone. If you have a fix or think this bug is important enough to block the release, please explain why in a comment and add the bug back to the LLVM 13.0.1 release milestone.

I cannot reproduce the failure on macOS with trunk as well. Besides, helper thread should be disabled on macOS. I don't know if the latest HomeBrew version has already covered newer code base. Please let me know if the problem still exists.

Everyone who is subscribed to this issue and has easy access to macOS, please check the latest available libomp version from Homebrew (stable ✅ 13.0.0 at the moment of writing this comment) and report your results here: https://github.com/llvm/llvm-project/issues/49923.

StrikerRUS avatar Jan 08 '22 23:01 StrikerRUS

@StrikerRUS I saw https://github.com/llvm/llvm-project/issues/49923 is closed, is this problem solved?

guolinke avatar Mar 01 '22 15:03 guolinke

@guolinke I haven't seen this. According to the conversation in https://github.com/llvm/llvm-project/issues/49923, they closed that issue due to the inability to reproduce the issue.

Please, anyone subscribed to this issue, check whether the error occurs with the most recent libomp version 13.0.1.

StrikerRUS avatar Mar 02 '22 00:03 StrikerRUS

MacBook Air (M1, 2020) running macOS Monterey version 12.5.1.

Trying to fit LightGBMModel gives me [1] 36565 segmentation fault python test_lightgbm.py.

libomp:

brew info libomp
==> libomp: stable 14.0.6 (bottled)
LLVM's OpenMP runtime library
https://openmp.llvm.org/
/usr/local/Cellar/libomp/11.1.0 (9 files, 1.4MB)
  Poured from bottle on 2022-09-13 at 15:01:22
/usr/local/Cellar/libomp/14.0.6 (7 files, 1.6MB)
  Poured from bottle on 2022-09-13 at 15:05:49

I tried:

wget https://raw.githubusercontent.com/Homebrew/homebrew-core/fb8323f2b170bd4ae97e1bac9bf3e2983af3fdb0/Formula/libomp.rb
brew unlink libomp
brew install libomp.rb

But it gives me:

Error: Failed to load cask: libomp.rb
Cask 'libomp' is unreadable: wrong constant name #<Class:0x00007fa9e92bb340>
Warning: Treating libomp.rb as a formula.
Warning: libomp 11.1.0 is already installed, it's just not linked.
To link this version, run:
  brew link libomp

tuomijal avatar Sep 15 '22 06:09 tuomijal

Potentially useful discussion here

tuomijal avatar Sep 15 '22 06:09 tuomijal

Was able to fix by downgrading to libomp 11.1.0 with:

brew uninstall --ignore-dependencies libomp

tuomijal avatar Sep 15 '22 07:09 tuomijal

Thanks for the links, and sorry for the inconvenience!

I have been interested for a while in the idea of switching this project's wheels to cibuildwheel + scikit-build, for other reasons (#5061). It looks like xgboost was able to solve this OpenMP compatibility issue by that approach as well (specifically by just bundling an older OpenMP in their wheels): https://github.com/dmlc/xgboost/pull/7621.

jameslamb avatar Sep 15 '22 14:09 jameslamb

@jameslamb We need to be very careful with embedding libomp library in our wheels:

  • https://lightgbm.readthedocs.io/en/latest/FAQ.html#lightgbm-crashes-randomly-with-the-error-like-initializing-libiomp5-dylib-but-found-libomp-dylib-already-initialized
  • https://lightgbm.readthedocs.io/en/latest/FAQ.html#lightgbm-crashes-randomly-or-operating-system-hangs-during-or-after-running-lightgbm

StrikerRUS avatar Sep 18 '22 14:09 StrikerRUS

This worked for me in a local DataSpell notebook on M1 ARM. It looks like if you only need the tabular package, then you may be in luck:

!pip install -U pip
!pip install -U setuptools wheel
!pip install "mxnet<2.0.0"
!pip install "autogluon.tabular"

nickordoodle avatar Dec 06 '22 02:12 nickordoodle

@nickordoodle Thanks for posting this. Can you please explain how your post is related to the topic "LightGBM is incompatible with OpenMP 12 and 13 on macOS"?

jameslamb avatar Dec 06 '22 02:12 jameslamb

I found tonight that, after upgrading to the latest libomp shipped by Homebrew (v15.0.6), I was able to compile LightGBM, build the Python package, and run all of its tests without issue on my MacBook (Intel chip, macOS 12.2.1).

brew install libomp
cd ./python-package
pip install .
cd ..
pytest tests/python_package_tests

jameslamb avatar Dec 24 '22 05:12 jameslamb

Assigning this to myself... I'll prioritize this for the next release of LightGBM (after v4.2.0).

I observed a deadlock in this simple example tonight:

rm -rf ./dist
sh build-python.sh sdist
pip install ./dist/lightgbm-*.tar.gz

import lightgbm as lgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=10_000)
dtrain = lgb.Dataset(X, label=y)
dtrain.construct()

With the following:

  • OS: macOS 14.1.2 (Sonoma)
  • CPU: M2 chip
  • compiler: AppleClang 15.0.0
  • Python: 3.11.7
  • OpenMP: 17.0.6

brew info libomp
==> libomp: stable 17.0.6 (bottled) [keg-only]
LLVM's OpenMP runtime library
https://openmp.llvm.org/
/opt/homebrew/Cellar/libomp/17.0.6 (7 files, 1.7MB)
  Poured from bottle using the formulae.brew.sh API on 2023-12-19 at 22:06:33
From: https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/lib/libomp.rb
License: MIT

Installing with OpenMP turned off, I didn't experience any deadlocks or other issues.

pip install \
    --config-settings=cmake.define.USE_OPENMP=OFF \
    ./dist/lightgbm-*.tar.gz

For more details: https://github.com/microsoft/LightGBM/pull/6191#issuecomment-1863831484

jameslamb avatar Dec 20 '23 04:12 jameslamb

FYI (not sure if this is common knowledge yet): when developing on LightGBM on Apple Silicon, I never turned off OpenMP but used gcc instead of clang for compilation (for me, that was):

export CXX=g++-13 CC=gcc-13

This fixed any problems I had 😅

borchero avatar Dec 20 '23 10:12 borchero

Thanks @borchero, that's helpful!

Looking into this a bit today, I also think that some of these failures might not actually be about incompatibility with particular versions of OpenMP, but rather related to #5106.

Fixing the search paths embedded in lib_lightgbm.so on macOS might eliminate some of these cases where programs segfault because multiple versions of libomp have been loaded.

Tried the following today on my Intel Mac:

  • OS: macOS 14.1.2 (Sonoma)
  • CPU: intel chip
  • compiler: AppleClang 13.0.0
  • Python: 3.11.7
  • OpenMP: 17.0.6
  1. build lib_lightgbm

rm -rf ./build
mkdir ./build
cd ./build
cmake ..
make -j2 _lightgbm
cd ..

  2. check what it linked against

# check what it's linked to
otool -L lib_lightgbm.so
# 
../lib_lightgbm.so:
    @rpath/lib_lightgbm.so (compatibility version 0.0.0, current version 0.0.0)
    /usr/local/opt/libomp/lib/libomp.dylib (compatibility version 5.0.0, current version 5.0.0)
    /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1200.3.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.0.0)

Notice that even though I was building in an active conda environment, it found Homebrew's OpenMP, /usr/local/opt/libomp/lib/libomp.dylib.

  3. install the Python library

sh build-python.sh install --precompile

  4. run an example

This segfaults, I think because it's finding the llvm-openmp from conda:

python ./examples/python-guide/logistic_regression.py
Performance of `binary` objective with binary labels:
Segmentation fault: 11

Running with some debugging stuff set... it looks like that's exactly what's happening. 2 versions of OpenMP are being loaded.

DYLD_PRINT_LIBRARIES=1 \
python examples/python-guide/logistic_regression.py 2>&1 \
| grep libomp
dyld[32037]: <891B2F9B-F926-3D67-AA9C-D58D47668AFB> /Users/jlamb/mambaforge/envs/lgb-dev/lib/libomp.dylib
dyld[32037]: <C91365F6-6644-300A-9277-1946696E9E86> /usr/local/Cellar/libomp/17.0.4/lib/libomp.dylib
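The same check can be scripted. The following is a sketch only: it parses text in the `dyld[pid]: <UUID> path` form shown above and returns the distinct libomp.dylib paths. The function name and the regular expression are illustrative assumptions, and the exact DYLD_PRINT_LIBRARIES output format may differ across macOS versions.

```python
import re

# Matches lines like: dyld[32037]: <UUID> /path/to/libomp.dylib
_DYLD_LINE = re.compile(r"^dyld\[\d+\]: <[0-9A-Fa-f-]+> (?P<path>\S.*)$")


def loaded_libomp_paths(dyld_output):
    """Return distinct libomp.dylib paths found in DYLD_PRINT_LIBRARIES output."""
    paths = []
    for line in dyld_output.splitlines():
        match = _DYLD_LINE.match(line.strip())
        if match is None:
            continue
        path = match.group("path")
        if path.endswith("/libomp.dylib") and path not in paths:
            paths.append(path)
    return paths


# Sample input copied from the output above.
SAMPLE = """\
dyld[32037]: <891B2F9B-F926-3D67-AA9C-D58D47668AFB> /Users/jlamb/mambaforge/envs/lgb-dev/lib/libomp.dylib
dyld[32037]: <C91365F6-6644-300A-9277-1946696E9E86> /usr/local/Cellar/libomp/17.0.4/lib/libomp.dylib
"""

if __name__ == "__main__":
    found = loaded_libomp_paths(SAMPLE)
    if len(found) > 1:
        print("more than one libomp loaded:", found)
```

More than one path in the result means the process loaded several copies of the OpenMP runtime, which is the situation described here.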

Looking a bit more closely, it seems that scikit-learn comes with an sklearn/utils/_openmp_helpers.cpython-311-darwin.so which has an RPATH entry that causes conda's libomp.dylib to be loaded.

otool -L /Users/jlamb/mambaforge/envs/lgb-dev/lib/python3.11/site-packages/sklearn/utils/_openmp_helpers.cpython-311-darwin.so
/Users/jlamb/mambaforge/envs/lgb-dev/lib/python3.11/site-packages/sklearn/utils/_openmp_helpers.cpython-311-darwin.so:
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1197.1.1)
	@rpath/libomp.dylib (compatibility version 5.0.0, current version 5.0.0)

Patching out lib_lightgbm's corresponding entry so that it will end up not loading a different version, the example runs without segfaulting.

install_name_tool \
    -change /usr/local/opt/libomp/lib/libomp.dylib \
    @rpath/libomp.dylib \
    /Users/jlamb/mambaforge/envs/lgb-dev/lib/python3.11/site-packages/lightgbm/lib/lib_lightgbm.so
otool -L \
    /Users/jlamb/mambaforge/envs/lgb-dev/lib/python3.11/site-packages/lightgbm/lib/lib_lightgbm.so
/Users/jlamb/mambaforge/envs/lgb-dev/lib/python3.11/site-packages/lightgbm/lib/lib_lightgbm.so:
	@rpath/lib_lightgbm.so (compatibility version 0.0.0, current version 0.0.0)
	@rpath/libomp.dylib (compatibility version 5.0.0, current version 5.0.0)
	/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1200.3.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.0.0)
python examples/python-guide/logistic_regression.py
Performance of `binary` objective with binary labels:
{'time': 0.031093120574951172, 'correlation': 0.6012584922759894, 'logloss': 0.15545640415178236}
Performance of `xentropy` objective with binary labels:
{'time': 0.0031642913818359375, 'correlation': 0.6012584922759894, 'logloss': 0.15545640415178236}
Performance of `xentropy` objective with probability labels:
{'time': 0.006477832794189453, 'correlation': 0.884189150816587, 'logloss': 0.1551448517607808}
Best `binary` time: 0.002405881881713867
Best `xentropy` time: 0.0023250579833984375
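The patching step above can be wrapped in a small script. This is a sketch, under the assumption that the absolute libomp path should be replaced by @rpath/libomp.dylib as in the commands above. The helper names are hypothetical. The only external command involved is install_name_tool -change, whose argument list is built here but not executed.

```python
def find_libomp_entry(otool_output):
    """Return the libomp.dylib install name from `otool -L` output, or None."""
    for line in otool_output.splitlines()[1:]:  # first line is the file name
        entry = line.strip().split(" (compatibility version")[0]
        if entry.endswith("libomp.dylib"):
            return entry
    return None


def relink_command(library, old_entry, new_entry="@rpath/libomp.dylib"):
    """Build the install_name_tool argv that rewrites one dependency path."""
    return ["install_name_tool", "-change", old_entry, new_entry, library]


# Sample input modeled on the `otool -L` output earlier in this comment.
SAMPLE_OTOOL = """\
../lib_lightgbm.so:
    @rpath/lib_lightgbm.so (compatibility version 0.0.0, current version 0.0.0)
    /usr/local/opt/libomp/lib/libomp.dylib (compatibility version 5.0.0, current version 5.0.0)
    /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1200.3.0)
"""

if __name__ == "__main__":
    old = find_libomp_entry(SAMPLE_OTOOL)
    if old is not None and not old.startswith("@rpath/"):
        # Pass this list to subprocess.run(..., check=True) to apply it.
        print(relink_command("lib_lightgbm.so", old))
```

Keeping the command construction separate from execution makes it easy to inspect what would change before touching an installed library.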

Just stopping here for now to post my notes. I'll continue working on this.

jameslamb avatar Dec 28 '23 21:12 jameslamb

Adding another relevant link: https://github.com/bacpop/pp-sketchlib/issues/42#issuecomment-748054538

jameslamb avatar Jan 23 '24 05:01 jameslamb

Fixing the search paths embedded in lib_lightgbm.so on macOS might eliminate some of these cases where programs segfault because multiple versions of libomp have been loaded.

We did this in #6391. As of lightgbm==4.4.0, lightgbm's macOS wheels should no longer segfault in the presence of other libomp.dylib already loaded in the process.

I'm going to mark this awaiting response, so it'll be closed automatically in 30 days if there are not any other comments. Doing that to leave some time for others to discover issues and post follow-up comments here.

Thank you all very much for the patience and helpful comments. Please come by and contribute again some time, we'd love the help!

jameslamb avatar Jun 15 '24 05:06 jameslamb