
Issue with simulate_scenarios

Open dpersaud opened this issue 1 year ago • 6 comments

When trying to run the "transfer learning" example from GitHub on my M1 Max, I get the following error:

> /opt/anaconda3/envs/env202407_BO/lib/python3.10/site-packages/baybe/utils/botorch_wrapper.py:22: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
>   x_tensor = Tensor(x)
> /opt/anaconda3/envs/env202407_BO/lib/python3.10/site-packages/baybe/utils/botorch_wrapper.py:22: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
>   x_tensor = Tensor(x)
>   0%|                                                                                                                                                                                                                                          | 0/50 [00:00<?, ?it/s]/opt/anaconda3/envs/env202407_BO/lib/python3.10/site-packages/linear_operator/utils/interpolation.py:71: UserWarning: torch.sparse.SparseTensor(indices, values, shape, *, device=) is deprecated.  Please use torch.sparse_coo_tensor(indices, values, shape, dtype=, device=). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/utils/tensor_new.cpp:623.)
>   summing_matrix = cls(summing_matrix_indices, summing_matrix_values, size)
> [1]    36434 segmentation fault  python bayBE_tl_test.py
> /opt/anaconda3/envs/env202407_BO/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown                                                                         
>   warnings.warn('resource_tracker: There appear to be %d '

dpersaud avatar Aug 21 '24 18:08 dpersaud

Thanks for reporting the issue :) Can you share some details about how you installed BayBE and list packages installed in your environment so that we can try to recreate/investigate?
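
A minimal sketch of how the requested environment details could be collected in one go. The package list is an assumption based on the libraries mentioned in this thread, not an official BayBE diagnostic; extend it as needed.

```python
import platform
import sys
from importlib import metadata

# Assumed list of relevant packages (taken from this thread); adjust freely.
PACKAGES = ("baybe", "torch", "botorch", "scikit-learn-extra", "pandas")


def environment_report(packages=PACKAGES):
    """Collect interpreter, platform, and package version information."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),  # e.g. "arm64" on Apple Silicon
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in the active environment.
            report[name] = None
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Run it with the interpreter of the affected environment and paste the output into the issue.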

AVHopp avatar Aug 22 '24 09:08 AVHopp

Hi @dpersaud, thanks for reaching out. The "segmentation fault" suggests to me that you could be simply running out of memory and the process gets killed. Hence a few questions/suggestions from my side:

  • Is the problem reproducible?
  • Have you monitored the memory consumption while running the process?
  • What is your memory size?
  • Are you using the exact code from the example or did you make any modifications?
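
One possible way to answer the memory question is to log the peak memory of the process from inside the script. This is only a sketch using the standard-library `resource` module (Unix only); the helper names `peak_rss_mib` and `log_peak_memory` are made up for illustration.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Return the peak resident set size of the current process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_peak_memory(label: str) -> None:
    """Print the current peak memory, flushed so it survives a crash."""
    print(f"[{label}] peak RSS: {peak_rss_mib():.1f} MiB", flush=True)


if __name__ == "__main__":
    log_peak_memory("startup")
```

Calling `log_peak_memory(...)` before and after the expensive steps of the example shows how close the process gets to the machine's memory limit before it dies.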

AdrianSosic avatar Aug 26 '24 06:08 AdrianSosic

This has been reported by at least two other M1 users before, so it seems to be something highly machine-specific.

Invoking @mhrmsn and @marcelmbn, perhaps you can recall when you had the semaphore issues?

Scienfitz avatar Aug 26 '24 06:08 Scienfitz

I had exactly the same error a few months ago. As far as I remember, in my case it was related to scikit-learn-extra: the library sklearn_extra/cluster/_commonnn_inner.cpython-312-darwin.so was not correctly linked to the appropriate macOS system library. This probably depends a lot on how the environment was set up.

To investigate this further, you can run the following command on your machine:

otool -l /Users/<user>/mambaforge/envs/<your_environment>/lib/python3.12/site-packages/sklearn_extra/cluster/_commonnn_inner.cpython-312-darwin.so

(Adapt the full path to your environment, obviously.)

At some point, the printout should look similar to this:

Load command 10
          cmd LC_LOAD_DYLIB
      cmdsize 48
         name /usr/lib/libc++.1.dylib (offset 24)
   time stamp 2 Thu Jan  1 01:00:02 1970
      current version 1700.255.0
compatibility version 1.0.0
Load command 11
          cmd LC_LOAD_DYLIB
      cmdsize 56
         name /usr/lib/libSystem.B.dylib (offset 24)
   time stamp 2 Thu Jan  1 01:00:02 1970
      current version 1345.100.2
compatibility version 1.0.0
Load command 12
          cmd LC_RPATH
      cmdsize 72
         path /Users/<user>/mambaforge/envs/<environment>/lib (offset 12)
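
For a long printout, the relevant entries can be filtered programmatically. The following is a rough sketch, not part of BayBE or scikit-learn-extra: the function names are hypothetical, and the parser assumes the `otool -l` layout shown above and paths without spaces.

```python
import subprocess


def parse_load_commands(otool_output: str) -> dict[str, list[str]]:
    """Extract linked dylibs and rpaths from `otool -l` output."""
    result: dict[str, list[str]] = {"dylibs": [], "rpaths": []}
    current_cmd = None
    for raw_line in otool_output.splitlines():
        parts = raw_line.split()
        if len(parts) < 2:
            continue
        key, value = parts[0], parts[1]
        if key == "cmd":
            # Remember which load command the following fields belong to.
            current_cmd = value
        elif key == "name" and current_cmd == "LC_LOAD_DYLIB":
            result["dylibs"].append(value)
        elif key == "path" and current_cmd == "LC_RPATH":
            result["rpaths"].append(value)
    return result


def inspect_library(path: str) -> dict[str, list[str]]:
    """Run `otool -l` on a shared library (macOS only) and parse the result."""
    completed = subprocess.run(
        ["otool", "-l", path], capture_output=True, text=True, check=True
    )
    return parse_load_commands(completed.stdout)
```

If libc++ shows up as `@rpath/libc++.1.dylib` instead of `/usr/lib/libc++.1.dylib`, and none of the listed rpaths contains that file, the library will fail to load.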

In a certain setup, I also got the following error, which is what led me to the above command:

ImportError: dlopen(/site-packages/sklearn_extra/cluster/_commonnn_inner.cpython-312-darwin.so, 0x0002): Library not loaded: @rpath/libc++.1.dylib

(I probably had to skip the multiprocessing part to get this error message.)

Honestly, I don't remember 100% why exactly the link to the system library (/usr/lib/libc++.1.dylib) failed in some cases, but that was basically the reason for the error. It seemed that the dependency had to be compiled from source when setting up the environment for the link to succeed (probably there are no macOS-arm64 binaries available).

TL;DR: Try installing BayBE from scratch by running pip install baybe in a completely empty environment that contains nothing but Python and pip.
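
As an illustration, a fresh environment could be set up like this. Note the assumption: the sketch uses Python's built-in `venv` module rather than the conda/mamba environments discussed in this thread, and the helper names and the directory name are hypothetical.

```python
import subprocess
import sys
import venv
from pathlib import Path


def pip_install_command(env_dir: Path, package: str = "baybe") -> list[str]:
    """Build the pip command that targets the interpreter inside env_dir."""
    if sys.platform == "win32":
        python = env_dir / "Scripts" / "python.exe"
    else:
        python = env_dir / "bin" / "python"
    return [str(python), "-m", "pip", "install", package]


def create_clean_env(env_dir: Path) -> None:
    """Create an empty virtual environment (Python and pip only), then install."""
    # clear=True wipes an existing directory so no stale packages remain.
    venv.create(env_dir, with_pip=True, clear=True)
    subprocess.run(pip_install_command(env_dir), check=True)


if __name__ == "__main__":
    create_clean_env(Path("baybe-clean-env"))  # hypothetical directory name
```

Afterwards, run the example with the interpreter from the new environment to check whether the segmentation fault persists.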

marcelmbn avatar Aug 26 '24 08:08 marcelmbn

@dpersaud, please try the above suggestion. I think we could add this to Known Issues if you can confirm that it works.

Scienfitz avatar Aug 26 '24 14:08 Scienfitz

Hello @dpersaud, repeating my request for your feedback.

Scienfitz avatar Sep 09 '24 15:09 Scienfitz

Closing due to inactivity. A new section has been added to known_issues.md roughly describing this issue and the suggested fix.

Scienfitz avatar Oct 11 '24 14:10 Scienfitz