Segmentation fault when adding an edge

Open rgiot opened this issue 5 years ago • 7 comments

Hello, in an application of mine, I have just encountered a segmentation fault when adding an edge to a graph. I did not have this issue before. What I have changed is:

  • updating Tulip to the latest version
  • using another dataset

Maybe the issue comes from my side or from a corner case in my dataset. Here is my problematic code:
        assert(sgSource->isElement(origSource));
        assert(sgSource->isElement(artificialSource));
        sgSource->addEdge(origSource, artificialSource);

In debug mode, the 2 assertions pass without any problem, so the two nodes are supposed to be within the graph. However I obtain this failure:

Thread 1 "powerGraphViewe" received signal SIGSEGV, Segmentation fault.
tlp::GraphView::addEdgeInternal (this=0x55555a63ee10, e=...) at /home/rgiot/src/auber-code/library/tulip-core/src/GraphView.cpp:279
279	  _nodeData.get(src.id)->outDegreeAdd(1);

No need to show the stack trace, because frame 1 and above are the code I pasted.

Are there any well-known corner cases that make this addEdge fail? If it really is a bug in addEdge, I can provide more assertions to help debug it.

rgiot avatar Jun 20 '19 17:06 rgiot

Hello, I see Thread 1 in your trace. Do you have other threads that access the graph while it is being modified?

@+ Bruno

Bruno PINAUD

Maitre de conférences/Associate Professor

Université de Bordeaux CNRS UMR 5800 LaBRI

Tel : +33 (0)5 40 00 35 03

bpinaud avatar Jun 20 '19 21:06 bpinaud

The code sgSource->isElement(origSource) only checks that _nodeData.get(origSource.id) != nullptr. So if there is a segfault while accessing the memory pointed to by _nodeData.get(origSource.id), the corresponding pointer has been overwritten. I suggest you use valgrind.
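To illustrate why both assertions can pass right before the crash, here is a small self-contained sketch. It is not Tulip code: MiniView, NodeData and the alive flag are hypothetical stand-ins, and the flag only simulates "this memory is no longer valid" so that the example stays well defined.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-node record; `alive` simulates whether the memory is still valid.
struct NodeData {
  unsigned outDegree = 0;
  bool alive = true;
};

// Hypothetical miniature of a view that keeps one pointer per node id.
class MiniView {
public:
  explicit MiniView(std::size_t n) : slots_(n, nullptr) {}

  void setSlot(std::size_t id, NodeData *d) { slots_[id] = d; }

  // Membership test in the style described above: "is the slot non-null?"
  bool isElement(std::size_t id) const {
    return id < slots_.size() && slots_[id] != nullptr;
  }

  // Stricter check that is only possible in this model, thanks to the flag.
  bool isUsable(std::size_t id) const {
    return isElement(id) && slots_[id]->alive;
  }

private:
  std::vector<NodeData *> slots_;
};

// Returns true when a slot whose record was invalidated still passes isElement.
bool invalidatedSlotStillPasses() {
  NodeData record;
  MiniView view(1);
  view.setSlot(0, &record);
  record.alive = false; // simulate the record being clobbered or freed elsewhere
  return view.isElement(0) && !view.isUsable(0);
}
```

In a real program there is no alive flag to consult, which is why a memory checker is the practical way to find out where the pointer went bad.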

p-mary avatar Jun 21 '19 07:06 p-mary

@bpinaud: no, there are no threads. I suppose we see that because of the graphical interface.

@p-mary: OK, I'll try that now, but I doubt it will succeed. Usually everything is too slow to be usable.

rgiot avatar Jun 21 '19 11:06 rgiot

@rgiot, valgrind has quite evolved in terms of performance. The last time I used it, it was pretty fast to execute Tulip with it.

Another great solution to track memory errors is to use clang AddressSanitizer. The reports it produces are much simpler to analyze compared to valgrind output.

anlambert avatar Jun 21 '19 11:06 anlambert

@anlambert OK, thanks, I'll look into that. Valgrind fails to run my application.

--16365:0: aspacem Valgrind: FATAL: VG_N_SEGMENTS is too low.
--16365:0: aspacem   Increase it and rebuild.  Exiting now.

rgiot avatar Jun 21 '19 12:06 rgiot

Compiling Tulip with clang AddressSanitizer changes the error. I have tracked down 2 additional errors that occur before the one I submitted.

Issue 1 on my side

So I have fixed a first issue on my side.

The failure was within the EdgeBundling algorithm:

==13153==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020029675f0 at pc 0x0000007095f2 bp 0x7ffe8f3eef10 sp 0x7ffe8f3eef08
READ of size 8 at 0x6020029675f0 thread T0
    #0 0x7095f1 in bool tlp::DataSet::get<double>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, double&) const /home/rgiot/src/auber-code/library/tulip-core/include/tulip/cxx/DataSet.cxx:29:15
    #1 0x7f346d0d0da9 in EdgeBundling::run() /home/rgiot/src/auber-code/plugins/general/EdgeBundling/EdgeBundling.cpp:224:14
    #2 0x7f347ce56487 in tlp::Graph::applyAlgorithm(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, tlp::DataSet*, tlp::PluginProgress*) /home/rgiot/src/auber-code/library/tulip-core/src/Graph.cpp:697:23
    #3 0x7f346cf9305b in MultilevelEdgeBundling::bundleGraph(tlp::Graph*, tlp::LayoutProperty*) /home/rgiot/src/auber-code/externalplugins/BiometricPowerGraph/./lib/include/bpg/multilevelEdgeBundling.h:86:23
[...]

and the calling code is

    tlp::DataSet dataSet;
    dataSet.set("layout", viewLayout);
    dataSet.set("long_edges", 0.5);
    dataSet.set("iterations", 5);
    dataSet.set("split_ratio", 20);
    bool res = graph->applyAlgorithm(
      "Edge bundling",
      error,
      &dataSet,
      nullptr);

EdgeBundling.cpp:224 corresponds to dataSet->get("split_ratio", splitRatio); where splitRatio is declared as a double in the header, whereas I used an int in my code. I suggest a minor modification at line 213, replacing splitRatio = 10; with splitRatio = 10.0; in case it was the source of my trouble.

I have thus replaced dataSet.set("split_ratio", 20); with dataSet.set("split_ratio", 20.0); to fix this first issue.
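For readers who hit the same thing, here is a minimal C++17 sketch of the mismatch. MiniDataSet is a hypothetical stand-in built on std::any, not tlp::DataSet. Unlike the unchecked 8-byte read that the AddressSanitizer report points to, this model detects the wrong type and reports failure, which makes the int versus double difference visible.

```cpp
#include <any>
#include <map>
#include <string>

// Hypothetical, simplified type-erased parameter set (not tlp::DataSet).
class MiniDataSet {
public:
  // Stores the value with exactly the type of the argument:
  // set("k", 20) stores an int, set("k", 20.0) stores a double.
  template <typename T>
  void set(const std::string &key, const T &value) {
    values_[key] = value;
  }

  // Returns false when the key is missing or was stored with another type.
  template <typename T>
  bool get(const std::string &key, T &out) const {
    auto it = values_.find(key);
    if (it == values_.end())
      return false;
    const T *p = std::any_cast<T>(&it->second);
    if (p == nullptr)
      return false; // stored type differs from the requested one
    out = *p;
    return true;
  }

private:
  std::map<std::string, std::any> values_;
};
```

With this model, set("split_ratio", 20) followed by get<double> fails, while set("split_ratio", 20.0) succeeds, which mirrors the fix above.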

Issue 2 on Tulip side

The second issue appears later with this assertion that fails:

powerGraphViewer: /home/rgiot/src/auber-code/library/tulip-core/src/SimpleTest.cpp :112 : static void tlp::SimpleTest::makeSimple(tlp::Graph *, vector<tlp::edge> &, const bool):  l'assertion « SimpleTest::isSimple(graph, directed) 

So I guess I have a graph on which makeSimple fails. But I do not know whether SimpleTest is failing on my graph or whether my graph structure is inconsistent (I guess that is never supposed to happen, as the code is well tested). For example, I am unable to save the graph before calling makeSimple:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff29cd535 in __GI_abort () at abort.c:79
#2  0x00007ffff29cd40f in __assert_fail_base (fmt=0x7fffe8ec88eb "%s%s%s :%u : %s%s l'assertion « %s » a échoué.\n%n", assertion=0x7ffff52781c0 <str> "isElement(elt)", 
    file=0x7ffff5278200 <str> "/home/rgiot/src/auber-code/library/tulip-core/include/tulip/IdManager.h", line=358, function=<optimized out>) at assert.c:92
#3  0x00007ffff29dd012 in __GI___assert_fail (assertion=0x7ffff52781c0 <str> "isElement(elt)", file=0x7ffff5278200 <str> "/home/rgiot/src/auber-code/library/tulip-core/include/tulip/IdManager.h", line=358, 
    function=0x7ffff52783c0 <__PRETTY_FUNCTION__._ZNK3tlp17SGraphIdContainerINS_4edgeEE6getPosES1_> "unsigned int tlp::SGraphIdContainer<tlp::edge>::getPos(ID_TYPE) const [ID_TYPE = tlp::edge]") at assert.c:101
#4  0x00007ffff50700c9 in tlp::SGraphIdContainer<tlp::edge>::getPos (this=0x614000797578, elt=...) at /home/rgiot/src/auber-code/library/tulip-core/include/tulip/IdManager.h:358
#5  0x00007ffff51be5f1 in tlp::TLPExport::getEdge (this=<optimized out>, e=...) at /home/rgiot/src/auber-code/library/tulip-core/src/TLPExport.cpp:113
#6  0x00007ffff51bfc2f in tlp::TLPExport::saveLocalProperties (this=0x60d000018570, os=warning: RTTI symbol not found for class 'std::basic_ofstream<char, std::char_traits<char> >'
..., g=0x614000797440) at /home/rgiot/src/auber-code/library/tulip-core/src/TLPExport.cpp:369
#7  0x00007ffff51bd150 in tlp::TLPExport::saveProperties (this=0x60d000018570, os=warning: RTTI symbol not found for class 'std::basic_ofstream<char, std::char_traits<char> >'
..., g=0x614000797440) at /home/rgiot/src/auber-code/library/tulip-core/src/TLPExport.cpp:405
#8  0x00007ffff51bae88 in tlp::TLPExport::exportGraph (this=0x60d000018570, os=warning: RTTI symbol not found for class 'std::basic_ofstream<char, std::char_traits<char> >'
...) at /home/rgiot/src/auber-code/library/tulip-core/src/TLPExport.cpp:506
#9  0x00007ffff4f49e7b in tlp::exportGraph (graph=0x614000797440, outputStream=..., format=..., dataSet=..., progress=<optimized out>) at /home/rgiot/src/auber-code/library/tulip-core/src/Graph.cpp:460
#10 0x00007ffff4f49898 in tlp::saveGraph (graph=0x614000797440, Python Exception <class 'gdb.error'> There is no member named _M_dataplus.: 

Are there any tests I can run to check the validity of the graph?

rgiot avatar Jul 01 '19 09:07 rgiot

Some answers to what you spotted:

EdgeBundling.cpp:224 corresponds to dataSet->get("split_ratio", splitRatio); where splitRatio is declared as a double in the header, whereas I used an int in my code. I suggest a minor modification at line 213, replacing splitRatio = 10; with splitRatio = 10.0; in case it was the source of my trouble.

Here the issue comes from the fact that you are filling an empty tlp::DataSet that has no type information about the plugin parameters. In the EdgeBundling source code, the dataSet member has already been filled with the type information provided in the plugin constructor, so a provided int value gets converted to a double automatically.

One way to fix this kind of type issue is to get a tlp::DataSet instance filled with the plugin's default parameters (as is recommended when using the Python bindings). You can get such a dataset using the following function:

tlp::DataSet getDefaultPluginParameters(const std::string &pluginName, tlp::Graph *graph) {
  tlp::DataSet result;
  const tlp::ParameterDescriptionList &parameters = tlp::PluginLister::getPluginParameters(pluginName);
  parameters.buildDefaultDataSet(result, graph);
  return result;
}
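To show why starting from typed defaults avoids the mismatch, here is a small self-contained C++17 sketch. TypedParams, declare, setNumber and getDouble are hypothetical names and none of this is Tulip code; it only models the idea that a value is converted to the type the parameter was declared with.

```cpp
#include <map>
#include <string>
#include <utility>
#include <variant>

using ParamValue = std::variant<int, double, std::string>;

// Hypothetical parameter set that is pre-filled with typed defaults.
class TypedParams {
public:
  // The type of the default value becomes the declared type of the parameter.
  void declare(const std::string &name, ParamValue def) {
    values_[name] = std::move(def);
  }

  // Assigns a number, converting it to the declared type.
  // Returns false for unknown or non-numeric parameters.
  bool setNumber(const std::string &name, double v) {
    auto it = values_.find(name);
    if (it == values_.end())
      return false;
    if (std::holds_alternative<double>(it->second)) {
      it->second = v;
      return true;
    }
    if (std::holds_alternative<int>(it->second)) {
      it->second = static_cast<int>(v);
      return true;
    }
    return false;
  }

  // Reads a parameter declared as double; false if missing or of another type.
  bool getDouble(const std::string &name, double &out) const {
    auto it = values_.find(name);
    if (it == values_.end())
      return false;
    const double *p = std::get_if<double>(&it->second);
    if (p == nullptr)
      return false;
    out = *p;
    return true;
  }

private:
  std::map<std::string, ParamValue> values_;
};
```

Here a caller passing 20 for a parameter declared with a default of 10.0 ends up with 20.0 stored as a double, so the later read as double succeeds.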

So I guess I have a graph on which makeSimple fails. But I do not know whether SimpleTest is failing on my graph or whether my graph structure is inconsistent (I guess that is never supposed to happen, as the code is well tested). For example, I am unable to save the graph before calling makeSimple:

That's a really weird error; your graph structure is likely messed up. This is quite hard to debug without more material. Could you share your plugin code if possible (or put it in a private repository with limited access if you cannot make it public), or a TLP dataset, so that we can reproduce the issue?
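In the meantime, a generic way to narrow this down is to run a consistency check at several points in your code and see where it first fails. The sketch below shows the idea on a plain data structure. MiniGraph, Edge and firstInconsistency are hypothetical and do not use the Tulip API; an equivalent check on a tlp::Graph would have to be written against its own accessors.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical edge: indices of its source and target nodes.
struct Edge {
  std::size_t src;
  std::size_t dst;
};

// Hypothetical graph with cached degrees, similar in spirit to per-node counters.
struct MiniGraph {
  std::size_t nodeCount = 0;
  std::vector<Edge> edges;
  std::vector<unsigned> outDegree; // cached, one entry per node
  std::vector<unsigned> inDegree;  // cached, one entry per node
};

// Returns an empty string when the graph is consistent,
// otherwise a description of the first problem found.
std::string firstInconsistency(const MiniGraph &g) {
  if (g.outDegree.size() != g.nodeCount || g.inDegree.size() != g.nodeCount)
    return "degree cache size does not match node count";

  std::vector<unsigned> out(g.nodeCount, 0);
  std::vector<unsigned> in(g.nodeCount, 0);
  for (std::size_t i = 0; i < g.edges.size(); ++i) {
    const Edge &e = g.edges[i];
    if (e.src >= g.nodeCount || e.dst >= g.nodeCount)
      return "edge " + std::to_string(i) + " references a missing node";
    ++out[e.src];
    ++in[e.dst];
  }

  if (out != g.outDegree)
    return "cached out-degrees disagree with the edge list";
  if (in != g.inDegree)
    return "cached in-degrees disagree with the edge list";
  return "";
}
```

Calling such a check after each step that modifies the graph turns a late crash into an early, localized failure.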

anlambert avatar Jul 01 '19 21:07 anlambert