rigraph icon indicating copy to clipboard operation
rigraph copied to clipboard

fix: Return `parent` as numeric vector in `bfs()` and `dfs()` instead of `igraph.vs`

Open Copilot opened this issue 4 months ago • 0 comments

The bfs() and dfs() functions returned parent as an igraph.vs object containing NA values for root vertices, causing str() and indexing operations to fail with "Unknown vertex selected" errors.

g <- sample_gnm(10, 20)
res <- bfs(g, 1, parent = TRUE)$parent

# Before: Error in simple_vs_index(): Unknown vertex selected
# After: Works correctly
str(res)     # num [1:10] NA 1 2 1 2...
res[1]       # NA
is.na(res[1]) # TRUE

Changes

  • R/structural-properties.R:

    • Remove conversion of parent to igraph.vs in both bfs() and dfs()
    • Convert 0 values (no parent from C) to NA after index conversion
    • Result format: NA for roots, -1 for unreachable vertices, 1-based IDs for reachable vertices
  • tests/testthat/test-structural-properties.R:

    • Add tests verifying parent is numeric vector with proper NA handling
    • Verify str() and indexing operations work correctly
  • tests/testthat/_snaps/structural-properties.md:

    • Update snapshots to reflect numeric vector output

The parent result is now documented correctly as "Numeric vector" and enables use cases like Strahler number computation that require standard R vector operations.

Fixes #1576

[!WARNING]

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • cran.rstudio.com
    • Triggering command: /opt/R/4.5.1/lib/R/bin/exec/R --no-restore --no-echo --args nextArgigraph_2.2.1.9002.tar.gznextArg--as-crannextArg--timingsnextArg--no-manual (dns block)
  • packagemanager.posit.co
    • Triggering command: /opt/R/4.5.1/lib/R/bin/exec/R --no-restore --no-echo --args nextArgigraph_2.2.1.9002.tar.gznextArg--as-crannextArg--timingsnextArg--no-manual (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Original prompt

This section details on the original issue you should resolve

<issue_title>bfs()/dfs() adds NA into 'father' vector, which then causes trouble later</issue_title> <issue_description>Describe the bug

The bfs() and dfs() functions produce a 'father' vector that contains NA for the root. This is in fact not a vector, but an igraph.vs, where it is reasonable to expect that all values are proper vertex IDs and not NA. As a result, there will be problems when trying to use this result.

This is closely related to igraph/rigraph#186. Perhaps the same fix should be applied as there. While this will be a breaking change, the father results appears to be unusable at the moment, as the NAs are basically guaranteed to trigger an error (unless I'm missing something about how to work with NA).

To reproduce

g <- sample_gnm(10,20)
res <- bfs(g,1,father=T)$father
> str(res)
Error in simple_vs_index(x, ii, na_ok) : Unknown vertex selected
> res[1]
Error in simple_vs_index(x, args[[1]]$expr) : Unknown vertex selected

I am not sure how to even test whether a certain element or NA, which is what originally prevented me from being able to come up with a nice solution for the Strahler number computation here.

Version information

1.3.0, but also earlier. </issue_description>

Comments on the Issue (you are @copilot in this section)

@szhorvat Ultimately, the question is how to represent trees through parent vectors, see https://github.com/igraph/igraph/issues/1880 Also, what to call parent vectors? `parent`? `father`? `predecessor`? @ntamas I'm going to keep `father` for the 1.x series in order not to break the API; we can reconsider for 2.0 where we will need to break all sorts of things anyway.

As for NA, res[1, na_ok=T] works now after d88657c7. I'll look into enabling NA values for str(res).</comment_new> <comment_new>@ntamas

Okay, so supporting str(res) is not easy. In the end, it boils down to a call to [.igraph.vs from an internal part of the R code that I have no control over. So, either we make it possible for igraph.vs to remember whether it was constructed with na_ok in the first place so sub-slices created from it can "inherit" this property, or we make na_ok=T the default, which is a breaking change.</comment_new> <comment_new>@ntamas Postponing the rest of this issue to 2.x; indexing now works with the na_ok=... trick outlined above, and supporting str(res) would require deeper changes that we are not prepared to do now.</comment_new> <comment_new>@krlmlr str() is a generic, can we implement a method for it?

We could mention na_ok in the error message.</comment_new> <comment_new>@krlmlr

As for the presence of NA values, I don't see it as a big problem. This is idiomatic R.</comment_new> <comment_new>@ntamas I didn't know that str is a generic at the time I wrote it. If we can work around the issue with that, that's great, but I don't think it's a priority. Also, if NA values in a result vector like this is idiomatic R and people know who to do with it, that's OK and we can essentially close this issue.</comment_new> <comment_new>@szhorvat

As for the presence of NA values, I don't see it as a big problem. This is idiomatic R.

Also, if NA values in a result vector like this

Note that this result is not a vector, but a vertex sequence, i.e. igraph.vs.

I think this is problematic:

  • I am not currently comfortable with igraph.vs having NAs. On the C side, an igraph_vs_t, which an igraph.vs would typically map to, cannot have missing values. To make things worse, R arbitrarily designates a particular integer as the "NA value", but on the C side, this will appears as a normal integer. We don't want to have to know about R's NA weirdness in the C library. Starting with igraph 0.10, vertex indices are proper integers.
  • Semantically, the result here is not a vertex sequence, so it shouldn't be represented as an igraph.vs at all. What we have instead is a vertex -> vertex mapping. And yes, some vertices do not map to anything, we need to represent that somehow (with an NA or otherwise). To allow that, isn't it better to use a different result type than igraph.vs instead of trying to tolerate NAs in vertex sequences?

@krlmlr Before making up your mind on this, I strongly suggest you implement a Strahler index computation using bfs() to experience the issues first-hand.


As I said above, the first issue is whether the result should be an igraph.vs at all, give that there is a need for representing missing values. There is ...

  • Fixes igraph/rigraph#522

💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

Copilot avatar Oct 28 '25 21:10 Copilot