igraph icon indicating copy to clipboard operation
igraph copied to clipboard

Steiner Tree Feature

Open Nachiket18 opened this issue 4 years ago • 54 comments

The functionality that is being developed is calculating the Steiner tree in a graph. Steiner Tree is created on basis of subset of terminals in the graphs with the condition that length of the tree should be minimum.

We are following the approach suggested in this paper - https://arxiv.org/abs/1710.00668

Also, we are taking reference of an FPT algorithm known as Dreyfus- Wagner algorithm - https://onlinelibrary.wiley.com/doi/10.1002/net.3230010302


Edit by @szhorvat:

TODO list to be completed only after the first review:

  • [ ] interruptability
  • [ ] functions.yaml
  • [ ] documentation
  • [ ] checks on input parameters
  • [ ] overflow checks

Nachiket18 avatar Jan 22 '22 03:01 Nachiket18

This is a good starting point. I closed the other PR (the comments won't disappear).

The next step now is to fix all the errors that prevent the code from compiling. You will not be able to check if it works correctly until it compiles.

Are you comfortable with compiling igraph on your own computer without these changes? Do you know how to run the tests? You should be using the very same steps to compile and test the function implemented in this PR.

szhorvat avatar Jan 22 '22 08:01 szhorvat

This is a good starting point. I closed the other PR (the comments won't disappear).

The next step now is to fix all the errors that prevent the code from compiling. You will not be able to check if it works correctly until it compiles.

Are you comfortable with compiling igraph on your own computer without these changes? Do you know how to run the tests? You should be using the very same steps to compile and test the function implemented in this PR.

Thanks. I was able to compile the code using the instructions which specifies the usage for cmake and later the command cmake --build

. Earlier there were compile time errors but at present the code ( including the Steiner tree change) is getting build without any errors.

I am however, confused about running test case. I did create create a test case and updated the CmakeLists.txt but running it and understanding why the CI is failing is proving little difficult for me.

Nachiket18 avatar Jan 30 '22 22:01 Nachiket18

Earlier there were compile time errors but at present the code ( including the Steiner tree change) is getting build without any errors.

Actually, the code does not build without errors. This is the first reason why the CI is failing, well before the tests. Here's the first error due to a type mismatch: you are sometimes using int and sometimes igraph_integer_t, which are not the same.

https://dev.azure.com/igraph-team/igraph/_build/results?buildId=2196&view=logs&j=9365e2c0-1fe2-55d7-9193-5883fccc1323&t=06be1443-af2e-53a2-9221-964df5414162&l=931

Can you fix all the compiler errors, which you can see in the output of the CI runs?

Generally, all integers used as indices (indexing into vectors, vertex and edges IDs) must be consistently igraph_integer_t.

szhorvat avatar Jan 31 '22 09:01 szhorvat

Can you please format the source files for readability? Use consistent indentation. Your editor should have an auto-format command.

Eventually, the formatting will need to match that of the rest of igraph, but for now, let's just have something that is consistent and readable, and we can worry matching to the rest of the codebase late.

szhorvat avatar Feb 06 '22 08:02 szhorvat

Thanks @szhorvat

I made the changes that you had suggested. Could you review the issue that is caused at IGRAPH_ASSERT macro?

For next steps, I have something that I would need to discuss with you -

The igraph_steiner_dreyfus_wagner function calculates the minimum length of the Steiner tree for smaller graph as it's FPT algorithm for NP-complete problem. Should the code be designed in such a way that it outputs tree as result or would output have tree and minimum length?

Nachiket18 avatar Feb 15 '22 04:02 Nachiket18

Should the code be designed in such a way that it outputs tree as result or would output have tree and minimum length?

It would be nice if the function could (optionally) provide both.

Let's first make sure that the code compiles, then that it works. We can do tweaks to it later.

szhorvat avatar Feb 15 '22 08:02 szhorvat

@szhorvat - Thanks for the suggestions. I made some silly mistakes in the test case file. sorry about that. For building the code I'm using the following commands -

$ cmake --build . $ cmake --build . --target check

Which commands should I run to check the errors and resolve them locally? I will keep a note of the linux_static_vendored when the tests are run here on the platform.

Nachiket18 avatar Mar 02 '22 22:03 Nachiket18

Yes, these commands are correct. If you use them you should see the test failing, just like on the CI right now. Does the test not fail for you locally?

szhorvat avatar Mar 03 '22 08:03 szhorvat

@Nachiket18 Let us know if you need more help with this. The immediate goal is to get the test suite to pass, and to make sure that you can run the tests locally on your computer.

I don't want to put too much pressure on you, so if you feel that you have no time right now, just let us know.

szhorvat avatar Apr 05 '22 08:04 szhorvat

@szhorvat - I made the changes that you had suggested. I was able to build the code using the instructions from the official documentation. However, during the target check there was error generated.

177/424 Test #177: test::igraph_steiner_tree_fpt Test exited abnormally with error: Segmentation fault

Normally, I would add print statements to debug the code but for this project what should be the way to resolve it?

Nachiket18 avatar Jun 01 '22 01:06 Nachiket18

Either add printf statements to the tests if that's convenient for you and run it manually from the command line (the test should be built in test/test_igraph_steiner_tree_... in the build folder -- just make sure you rebuild if after adding the printf statements), or hook it up to a debugger and run through it step by step. I don't know what IDE / OS you are using, I usually use gdb or lldb from the command line on macOS, but I'm oldschool :)

ntamas avatar Jun 01 '22 10:06 ntamas

Resolved the conflicts with the upstream develop branch, FYI.

I haven't checked the PR thoroughly yet so this is not a full review, but one thing that surely needs to be changed is that we have switched to using igraph_vector_int_t for variables that represents vertex indices so the function needs to be changed to use igraph_vector_int_t for vectors containing vertex indices. As for the distance matrix, that can probably stay as igraph_matrix_t because it can contain floating-point numbers in the weighted case, but infinite distances should be represented with IGRAPH_INFINITY and not INT_MAX.

ntamas avatar Jun 01 '22 10:06 ntamas

FYI I resolved conflict with develop again and made a minor fix to allow it to build again. You should pull the changes.

szhorvat avatar Jun 06 '22 17:06 szhorvat

If you haven't yet, take a look at this wiki page and the links within: https://github.com/igraph/igraph/wiki/Quickstart-for-new-contributors

As for debugging, building with AddressSanitizer (ASan) is useful. See here: https://github.com/igraph/igraph/wiki/Using-sanitizers-to-find-bugs

The linux_static_vendored CI run has ASan enabled and will give more informative error messages.

https://dev.azure.com/igraph-team/igraph/_build/results?buildId=3578&view=logs&j=4c3a8381-b9de-5463-904f-918810769be8&t=e71559a0-54d3-55f3-64cc-74b67ad70398&l=400

szhorvat avatar Jun 06 '22 17:06 szhorvat

Let us know if you have more questions.

szhorvat avatar Jun 06 '22 17:06 szhorvat

@szhorvat - Thanks. I made the changes that were causing segmentation fault, changed the data type to 'igraph_vector_int_t' for steiner_terminals and also updated the test case. The test case is failing even though it passes through the function in the test case and fails in the end. We are looking at the results from linux_static_vendored pipeline and would try to fix it.

Nachiket18 avatar Jun 29 '22 00:06 Nachiket18

Codecov Report

Merging #1939 (040d243) into master (a229509) will increase coverage by 0.05%. The diff coverage is 93.63%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1939      +/-   ##
==========================================
+ Coverage   83.35%   83.40%   +0.05%     
==========================================
  Files         373      374       +1     
  Lines       61251    61486     +235     
==========================================
+ Hits        51054    51281     +227     
- Misses      10197    10205       +8     
Impacted Files Coverage Δ
src/random/rng_pcg32.c 100.00% <ø> (ø)
src/internal/glpk_support.c 56.25% <50.00%> (ø)
src/paths/johnson.c 90.47% <80.76%> (+0.19%) :arrow_up:
src/paths/steiner.cpp 93.85% <93.85%> (ø)
src/community/fluid.c 89.58% <100.00%> (ø)
src/community/label_propagation.c 95.18% <100.00%> (ø)
src/community/leading_eigenvector.c 84.68% <100.00%> (ø)
src/community/leiden.c 97.46% <100.00%> (ø)
src/connectivity/components.c 94.91% <100.00%> (+0.01%) :arrow_up:
src/core/heap.pmt 92.52% <100.00%> (-0.21%) :arrow_down:
... and 13 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update a229509...040d243. Read the comment docs.

codecov[bot] avatar Jun 29 '22 06:06 codecov[bot]

infinite distances should be represented with IGRAPH_INFINITY and not INT_MAX.

Just highlighting this comment from @ntamas, as this needs to be addressed.


There are some parts of the algorithm that are not covered by the tests. Can you make sure that everything is covered, except some checks on input parameters or other error checks?

Let us know when this is ready for the first round of review.

szhorvat avatar Jun 29 '22 13:06 szhorvat

infinite distances should be represented with IGRAPH_INFINITY and not INT_MAX.

Just highlighting this comment from @ntamas, as this needs to be addressed.

There are some parts of the algorithm that are not covered by the tests. Can you make sure that everything is covered, except some checks on input parameters or other error checks?

Let us know when this is ready for the first round of review.

Yes. We would write test case for checking the logic of the algorithm by printing and comparing the value for minimum steiner tree distance with one that's expected.

Nachiket18 avatar Jun 29 '22 15:06 Nachiket18

One more comment is that it will be useful to merge develop into this branch again, to prevent the two branches from diverging too much. This should also fix the coverage report, which incorrectly claims that this PR reduces coverage in many files.

szhorvat avatar Jun 29 '22 15:06 szhorvat

It's great to see this PR moving forward.

To reiterate, the immediate goals are:

  • [ ] Merge the develop branch into this branch.
  • [ ] Add expected test output, and manually verify that it is correct.
  • [ ] Fix test failures on all platforms. Let us know if it's unclear how to deal with a failure.
  • [ ] Ensure full test coverage of all parts of the code that implement the algorithm. You can see coverage here It is not necessary to cover error checks or debug print functions, just the core algorithm.
  • [ ] Add documentation. It does not need to be polished or well-written, but it should be complete (i.e. discuss all relevant details of the function) and correct (match what the function does).

Before the review, it would be great to reformat the code to fit with the rest of the igraph codebase. We use an editorconfig file to help with this. If you're not sure how to do this efficiently, let me know and I will run it through an auto-formatter.

Once these are done, we can start an earnest review.

szhorvat avatar Jun 30 '22 11:06 szhorvat

@szhorvat I tried to fix as many issues as possible and am at my last 2 as per what I see the last few ones deal with

  1. macOS -> Johnsons being deprecated which causes Mac to fail at this moment.
  2. Windows(both) - > it says the min function doesn't belong to the namespace std but when I click the definition it says it is and I don't know how to change it. And this is the only error left in windows at this moment.

Do you know how to deal with this cross-compatibility before I move on with the functionality and its test case? Thanks

RonakGSahu avatar Jun 30 '22 15:06 RonakGSahu

When a function is deprecated, look it up in the sources and see what it was replaced with. In this case it is igraph_distances_johnson().

As for min(), let us look up which header declares it: https://en.cppreference.com/w/cpp/algorithm/min It is <algorithm>. With some of the standard C++ library implementations, some other headers may declare min as well, which is why you see no failure on Windows. But only <algorithm> is guaranteed to declare it, thus the only standards-compliant way to proceed is to include it.

szhorvat avatar Jun 30 '22 16:06 szhorvat

When a function is deprecated, look it up in the sources and see what it was replaced with. In this case it is igraph_distances_johnson().

As for min(), let us look up which header declares it: https://en.cppreference.com/w/cpp/algorithm/min It is <algorithm>. With some of the standard C++ library implementations, some other headers may declare min as well, which is why you see no failure on Windows. But only <algorithm> is guaranteed to declare it, thus the only standards-compliant way to proceed is to include it.

Ok, thanks . i thought it was completely removed from the igraph library because i had a similar situation in my C class where they said it was completely removed. And for min, it took me to the functional definition so I thought it was already added to the headers. I'll update the changes and push it soon.

RonakGSahu avatar Jun 30 '22 18:06 RonakGSahu

I updated the TODO checklist above. When all these are done, just click the ready for review button.


Regarding the recently added igraph_i_directed_t mode: anything that has _i_ in it should not appear in public API. There are two typical ways to control the handling of directedness. One is a igraph_bool_t directed argument: when false, edge directions should be ignored. The other is an igraph_neimode_t mode argument. This is used when the result would change is the edge directions are reversed, and can be used to treat a directed graph either as undirected (mode = IGRAPH_ALL) or as reversed (mode = IGRAPH_IN).

igraph_i_directed_t is just a dummy name to allow for the IGRAPH_DIRECTED=1 and IGRAPH_UNDIRECTED=0 enum values. These are used for readability (i.e. not for functional reasons) when passing values to igraph_bool_t directed arguments. fun(g, 1) is quite cryptic and one might need to look up the meaning of the second argument. fun(g, IGRAPH_DIRECTED) is immediately clear.

szhorvat avatar Jul 01 '22 13:07 szhorvat

We were able to build and test the functionality. Our test case includes a graph given in the research paper and output that code is generating is same as what it's mentioned in the paper. In order to ensure correctness, how many such test cases need to be implemented ? Meanwhile, I'll start working on documentation.

Nachiket18 avatar Jul 07 '22 13:07 Nachiket18

There's a wiki page that has hints on what to pay attention to when testing, as well as a few other things that we'll need to get done eventually.

https://github.com/igraph/igraph/wiki/Checklist-for-new-%28and-old%29-functions

  • Consider the case of 0 and 1 vertex graphs. What is a reasonable result and does the function return that result?
  • What if the input is disconnected?
  • Does your implementation handle self-loops and multi-edges? Self-loops are easy as they can be ignored, but multi-edges are not. If the current implementation does not handle them, can be changed to support them?
  • Test both with and without providing weights
  • Are there any other special cases that need to be considered?
  • etc.

szhorvat avatar Jul 07 '22 13:07 szhorvat

Don't forget to merge the develop branch into this one.

Also, please go through the review comments and click "Resolve conversation" on those which have been addressed. This way we can keep track of what has been done and what still needs to be completed.

szhorvat avatar Jul 07 '22 13:07 szhorvat

It looks like the function currently only returns the total weights of the Steiner tree. Can you return the tree itself as well as a vector of edge indices, similar to how igraph_minimum_spanning_tree() does it?

szhorvat avatar Jul 08 '22 07:07 szhorvat

It looks like the function currently only returns the total weights of the Steiner tree. Can you return the tree itself as well as a vector of edge indices, similar to how igraph_minimum_spanning_tree() does it?

Yes. We are figuring out how we are going to do.

Nachiket18 avatar Jul 16 '22 13:07 Nachiket18