Mac M1: libc++abi: terminating with uncaught exception of type pybind11::stop_iteration:

Open bexcite opened this issue 2 years ago • 6 comments

Describe the issue

The exception translator installed by the open3d pybind11 module crashes when other pybind11 modules are loaded together with open3d and another module's exception reaches the open3d exception translator first.

Ugh, it's complicated, so let's look at an example.

Consider this simple example with two pybind11 modules loaded (open3d and gtsam) that leads to a crash on a Mac M1 system:

import gtsam
import open3d

vals = gtsam.Values()
vals.insert(1, 1)
vals.insert(2, 2)
for v in vals.keys():
    print("v = ", v)

Crashes with an error:

libc++abi: terminating with uncaught exception of type pybind11::stop_iteration:

As a workaround, one needs to load open3d first and then gtsam, i.e.:

import open3d
import gtsam
...

What's going on here!?

I spent some time in lldb with another custom pybind11 module of mine that I had debug symbols for, and it led me to the chain of exception translators that pybind11 builds up as each new module is loaded: https://github.com/pybind/pybind11/blob/master/include/pybind11/detail/internals.h#L460

When a pybind11 exception is thrown in any native module, it goes down this chain, and the exception translators are tried in the reverse order of their loading: https://github.com/pybind/pybind11/blob/master/include/pybind11/pybind11.h#L1000

So in the crash above, the open3d exception translator is tried first. Instead of silently passing the exception through (re-throwing it so that the other translators get a chance to handle it), the process crashes with the error above.

I don't have deep knowledge of C++ runtimes, but it seems that some odd translation of the original exception happens, so it slips past the catch clause of the py::detail::apply_exception_translators() function and surfaces unhandled in the runtime.
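To make the chain behaviour described above concrete, here is a toy pure-Python model of it. This is only a sketch of the idea, not pybind11's actual code: StopIterationLike, make_translator and apply_translators are made-up names, and the "open3d" and "gtsam" entries are just labels for translators registered in load order.

```python
class StopIterationLike(Exception):
    """Toy stand-in for pybind11::stop_iteration (hypothetical name)."""


def make_translator(name, handled, visited):
    """Build a translator that handles `handled` types and re-raises the rest."""
    def translate(exc):
        visited.append(name)  # record the order in which translators are tried
        if isinstance(exc, handled):
            return f"{name} handled {type(exc).__name__}"
        raise exc  # not ours: pass it on to the next translator
    return translate


def apply_translators(registered, exc):
    """Try translators newest-first, i.e. in reverse order of loading."""
    for translate in reversed(registered):
        try:
            return translate(exc)
        except Exception as passed_on:
            exc = passed_on  # keep going with whatever was re-raised
    return None  # nobody handled it


visited = []
registered = [
    make_translator("gtsam", StopIterationLike, visited),  # loaded first
    make_translator("open3d", KeyError, visited),          # loaded last
]
result = apply_translators(registered, StopIterationLike())
print(visited, result)
```

In this model the last-loaded translator is asked first and the exception still reaches the one that can handle it. The crash reported here corresponds to the re-raise inside the first translator escaping the surrounding try/except instead of being caught.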

With pybind11 2.8.0+ one can install local exception translators, which are always tried first for the local module, and with that we were able to hack around the open3d translator crash on Mac M1. This does not help for other modules that are already built, though: they still can't be used with open3d if the open3d module is loaded last.

UPDATE: Btw, this does not happen on Linux, Windows, or Mac x64 systems; the only place where this weird behavior shows up is a Mac M1 system. Thanks!

Steps to reproduce the bug

import gtsam
import open3d

vals = gtsam.Values()
vals.insert(1, 1)
vals.insert(2, 2)
for v in vals.keys():
    print("v = ", v)

Error message

libc++abi: terminating with uncaught exception of type pybind11::stop_iteration:

Expected behavior

no crash

Open3D, Python and System information

- Operating system: macOS 11.6 (20G165)
- Python version: Python 3.9.9 (main, Nov 21 2021, 03:16:13), [Clang 13.0.0 (clang-1300.0.29.3)]
- Open3D version: 0.14.1 (installed into fresh venv with pip)
- System architecture: arm64
- Is this a remote workstation?: no
- How did you install Open3D?: pip
- Compiler version (if built from source): (was used to debug the issue with `lldb` and my other code)

cc --version
Apple clang version 12.0.5 (clang-1205.0.22.9)
Target: arm64-apple-darwin20.6.0
Thread model: posix

Additional information

The same situation does not happen when we swap open3d for some other pybind11-based module (for example, after pip install ouster-sdk, import ouster.client as client plays nicely with gtsam, and all exception translators pass through/re-throw exceptions without ABI crashes).

bexcite avatar Feb 26 '22 01:02 bexcite

I can confirm this issue on M1 mac. Thanks for the detailed info.

yxlao avatar Mar 03 '22 04:03 yxlao

Same issue on M1 pro. I had installed pybind11 for another project.

siddas27 avatar Mar 19 '22 23:03 siddas27

I also faced this issue on M1 mac. The issue for me was that some of the open3d variables were C++ containers (std::vector) and had to be converted using np.asarray().

My steps:

  1. Identify which line is causing the issue. For me, it was accessing vertices: my_var = mesh.vertices
  2. Print out the variable to confirm it is a C++ type (it will say something like std::vector)
  3. Cast the variable to a numpy array: np.asarray(mesh.vertices)

w3ichen avatar Jul 22 '22 03:07 w3ichen

I confirm that it also happens on my M1 and importing open3d first helps.

ducha-aiki avatar Aug 17 '22 13:08 ducha-aiki

I have a similar issue but it seems it's not related to multiple pybind modules.

Mac OS Monterey 12.5.1 / M1 Pro / Open3D 0.15.1

When I run this minimal code snippet, it works normally

import open3d as o3d
o3d.core.Tensor([1.0, 2.0], dtype=o3d.core.Dtype.Float32)

But when I hit line 2 in a debugger (e.g. PyCharm), I get: libc++abi: terminating with uncaught exception of type pybind11::stop_iteration:

matemijolovic avatar Sep 02 '22 21:09 matemijolovic

I wouldn't be surprised if the PyCharm debugger, when it runs, loads (via the regular import/load_module routines) some native modules that then interfere with the open3d code when the stop_iteration exception tries to propagate. Though that's a hypothesis (more like a guess).

-- Pavlo Bashmakov, Software Engineer, 3D Mapping @ Ouster https://capsulesbot.com/

bexcite avatar Sep 03 '22 01:09 bexcite

It's been almost a year since the bug was reported and it's still there. We just hit it again when the open3d module was imported after another Python module with a native pybind lib. The fix is still the same for anyone struggling with this:

  • make sure that import open3d happens before everything else (like ouster-sdk, gtsam, kiss-icp, and other libs that are using pybind under the hood)
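A quick way to check the import order at runtime might look like the sketch below. imported_before is a hypothetical helper, not part of open3d or pybind11, and it assumes that the key order of sys.modules reflects the order in which modules were first imported, which holds as long as nothing has removed and re-added entries.

```python
import sys


def imported_before(target, others, modules=None):
    """Return the names in `others` that were imported before `target`.

    `modules` defaults to sys.modules; a plain dict can be passed for testing.
    """
    names = list(sys.modules if modules is None else modules)
    if target not in names:
        return []  # target not imported yet, nothing to compare against
    earlier = set(names[:names.index(target)])
    return [name for name in others if name in earlier]


# Example with a stand-in for sys.modules where gtsam was imported first.
fake_modules = {"gtsam": None, "open3d": None, "ouster": None}
print(imported_before("open3d", ["gtsam", "ouster"], fake_modules))
```

Calling imported_before("open3d", ["gtsam", "ouster"]) right after the imports returns the modules that got in ahead of open3d; an empty list means the import order matches the workaround.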

bexcite avatar Feb 16 '23 18:02 bexcite

Any workaround? Even with open3d imported first, I experience this issue now.

themightyoarfish avatar Mar 06 '23 13:03 themightyoarfish

@w3ichen's workaround works for me.

themightyoarfish avatar Mar 06 '23 13:03 themightyoarfish

Thanks to @w3ichen. I had the same problem on Mac M1. When I enumerate the indices returned from a KDTree, I get the error:

[k, idx, _] = tree.search_radius_vector_3d(p, eps)

for i, pidx in enumerate(idx):
    pass

If I replace idx with numpy.asarray(idx) then it works with no error.

shayan-nikoo avatar Mar 21 '23 15:03 shayan-nikoo

This is fixed in PR #6008 and available in v0.17.0. If building from source, use master or tag v0.17.0-1fix6008.

ssheorey avatar Mar 21 '23 19:03 ssheorey

Hi @ssheorey, I was getting the same error when trying to run the pose_graph_optim.py file in the python examples:

libc++abi: terminating due to uncaught exception of type pybind11::stop_iteration:
zsh: abort python run_pose_adjustment.py -i data
/opt/homebrew/Cellar/[email protected]/3.10.13/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown

I tracked it down to self.dict_nodes, self.dict_edges = self._graph2dicts() in the def solve_(self, dist_threshold=0.07, preference_loop_closure=0.1): function.

Specifically, I had to change these lines:

for i, node in enumerate(nodes):
    dict_nodes[i] = node.pose
for edge in edges:
    dict_edges[(edge.source_node_id,
                edge.target_node_id)] = (edge.transformation,
                                         edge.information,
                                         edge.uncertain)

to

for i in range(len(nodes)):
    dict_nodes[i] = nodes[i].pose

for i in range(len(edges)):
    edge = edges[i]
    dict_edges[(edge.source_node_id,
                edge.target_node_id)] = (edge.transformation,
                                         edge.information,
                                         edge.uncertain)
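The index-based rewrite above can be wrapped in a small generator so that loops keep their shape. This is a minimal sketch: iter_by_index is a hypothetical helper, and it assumes that the crash is triggered through the bound container's own iterator, so that going through len() and indexing avoids it, as the change above suggests.

```python
def iter_by_index(seq):
    """Yield items of `seq` using len() and indexing only.

    The container's own __iter__ is never called, so the end of the loop
    is signalled by this Python generator rather than by native code.
    """
    for i in range(len(seq)):
        yield seq[i]


# Works with anything that supports len() and integer indexing.
print(list(enumerate(iter_by_index(["a", "b", "c"]))))
```

With this, for i, node in enumerate(iter_by_index(nodes)) and for edge in iter_by_index(edges) would replace the original loops.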

Should I create a new issue or just leave it here? It's basically the same problem. I'm using an M1 Pro and open3d (0.17.0 installed via pip) in a Python environment.

Thanks

jtressle avatar Oct 13 '23 21:10 jtressle