cgal icon indicating copy to clipboard operation
cgal copied to clipboard

convex_decomposition_3: Uncatchable exception causes crash after copying Nef_3

Open kintel opened this issue 9 years ago • 12 comments

This looks like a bug related to taking a copy of a Nef_polyhedron_3. To reproduce, I've edited one of the built-in examples (https://github.com/CGAL/cgal/blob/master/Convex_decomposition_3/examples/Convex_decomposition_3/list_of_convex_parts.cpp). See below for my code. I'm essentially taking a copy of the read Nef_polyhedron_3 before passing the copy to convex_decomposition_3.

Issue A: This causes an assertion failure. The assertion failure only happens if I make a copy of the Nef_polyhedron_3.

Issue B: The assertion isn't catchable, because the Nef_polyhedron_3 destructor crashes. It could be that the object is left in an unrecoverable state (which isn't very nice as it forces me to let my app crash).

The test data: https://raw.githubusercontent.com/openscad/openscad/openscad-2019.01-RC1/cgal/data/issue1455.nef3

FYI: The nef3 example is generated based on user input to OpenSCAD. My goal is to shield users from having their process crash due to feeding corner cases into the software.

Edit: Tested on Mac OS X using CGAL-4.6.3.

list_of_convex_parts.cpp:

#include <CGAL/Exact_predicates_exact_constructions_kernel.h>
#include <CGAL/Polyhedron_3.h>
#include <CGAL/Nef_polyhedron_3.h>
#include <CGAL/IO/Nef_polyhedron_iostream_3.h>
#include <CGAL/Nef_3/SNC_indexed_items.h>
#include <CGAL/convex_decomposition_3.h> 
#include <list>
#include <CGAL/assertions_behaviour.h>

typedef CGAL::Exact_predicates_exact_constructions_kernel Kernel;
typedef CGAL::Polyhedron_3<Kernel> Polyhedron_3;
typedef CGAL::Nef_polyhedron_3<Kernel, CGAL::SNC_indexed_items> Nef_polyhedron_3;
typedef Nef_polyhedron_3::Volume_const_iterator Volume_const_iterator;

int main() {

  CGAL::set_error_behaviour(CGAL::THROW_EXCEPTION);
#if 1
  Nef_polyhedron_3 tmp;
  std::cin >> tmp;
  Nef_polyhedron_3 N(tmp);
#else
  Nef_polyhedron_3 N;
  std::cin >> N;
#endif

  try {
    CGAL::convex_decomposition_3(N);
  }
  catch (...) {
    std::cerr << "Assertion failed\n";
    return 1;
  }
  std::list<Polyhedron_3> convex_parts;

  // the first volume is the outer volume, which is 
  // ignored in the decomposition
  Volume_const_iterator ci = ++N.volumes_begin();
  for( ; ci != N.volumes_end(); ++ci) {
    if(ci->mark()) {
      Polyhedron_3 P;
      N.convert_inner_shell_to_polyhedron(ci->shells_begin(), P);
      convex_parts.push_back(P);
    }
  }
  std::cout << "decomposition into " << convex_parts.size() << " convex parts " << std::endl;

}

kintel avatar Oct 16 '15 01:10 kintel

This problem also effects my project, I would be very interested in a solution such as that proposed in https://github.com/CGAL/cgal/pull/451 being adopted.

GilesBathgate avatar Oct 31 '15 19:10 GilesBathgate

I see that https://github.com/CGAL/cgal/pull/451 was merged into CGAL/master. Does it solve this issue? Can we close it then?

sgiraudot avatar Oct 28 '16 06:10 sgiraudot

Going through some old issues: This is still an issue as of CGAL-5.5

kintel avatar Mar 26 '24 14:03 kintel

@kintel I helped make some fixes to CGAL in this area. Is it still related to throwing an exception from a destructor? (i.e while there is already an exception, and it's calling destructors to unwind the stack)

GilesBathgate avatar Mar 26 '24 15:03 GilesBathgate

It looks like it does:

~Nef_polyhedron_3_rep() {
    CGAL_NEF_TRACEN( "Nef_polyhedron_3_rep: destroying SNC structure "<<&snc_<<
            ", point locator "<<pl_);
    snc_.clear();
    delete pl_;
  }

-> EXC_BAD_ACCESS (code=1, address=0x68) on the delete statement.

..and it never reaches our catch-all handler.

kintel avatar Mar 27 '24 00:03 kintel

@kintel I don't think you can catch memory access issues. There must be somewhere that pl_ is not initialised. or a double delete?

GilesBathgate avatar Mar 27 '24 08:03 GilesBathgate

@GilesBathgate Yeah, you're probably right. I was making assumptions based on the same test case crashing. I'll go back and do a proper investigation on our side.

kintel avatar Mar 28 '24 00:03 kintel

Went back and tested the originally reported program and data against CGAL-5.6.1, and it's still failing in the same way (I can catch the assertion, but the program later crashes in the Nef_polyhedron_3 destructor).

kintel avatar Aug 25 '24 00:08 kintel

I am not able to look at the code right now, but... When you make a copy, does it also copy pl_ or is it lazy initialised? Maybe the fix is simply a null check for pl_ in the destruction

GilesBathgate avatar Aug 25 '24 07:08 GilesBathgate

Looks lazy. Null check wouldn't do much as deleting nullptr is perfectly ok. I assume pl_ is not null, but somehow got corrupted.

kintel avatar Aug 25 '24 15:08 kintel

The problem appears to be in External_structure_builder.h, on line 128:

    delete old_pl;
    SNC_external_structure C(*sncp,pl);
    C.clear_external_structure();
    C.build_external_structure();

This deletes the point locator, which then gets deleted again in the ~Nef_polyhedron_3_rep() destruction.

This scenario only occurs because the exception code path causes the function to exit early in the C.build_external_structure(); line

GilesBathgate avatar Aug 25 '24 17:08 GilesBathgate

I think it could be fixed by simply re-ordering the operations:

    SNC_external_structure C(*sncp,pl);
    C.clear_external_structure();
    C.build_external_structure();
    delete old_pl;

GilesBathgate avatar Aug 25 '24 17:08 GilesBathgate