`graph_from_incidence_matrix()` errors when a sparse matrix contains `NA`s and `weighted = NULL`
Describe the bug
For any NA elements in a sparse incidence matrix, x[3] in the following line will be NA, which results in an error:
https://github.com/igraph/rigraph/blob/da27d5b5f79efb9e0a5bf1d9e5bdd91c24d30ca7/R/incidence.R#L70
To reproduce
library(igraph)
#>
#> Attaching package: 'igraph'
#> The following objects are masked from 'package:stats':
#>
#> decompose, spectrum
#> The following object is masked from 'package:base':
#>
#> union
library(Matrix)
m <- matrix(c(1:6), 2, 3)
M <- Matrix(m, sparse = TRUE)
n <- replace(m, m == 3, NA)
N <- Matrix(n, sparse = TRUE)
## Unweighted sparse without NA runs
graph_from_incidence_matrix(M)
#> IGRAPH 44beafe U--B 5 6 --
#> + attr: type (v/l)
#> + edges from 44beafe:
#> [1] 1--3 2--3 1--4 2--4 1--5 2--5
## Unweighted dense with NA runs
graph_from_incidence_matrix(n)
#> IGRAPH 29d12d4 U--B 5 6 --
#> + attr: type (v/l)
#> + edges from 29d12d4:
#> [1] 1--3 1--4 1--5 2--3 2--4 2--5
## Weighted sparse with NA runs
graph_from_incidence_matrix(N, weighted = TRUE)
#> IGRAPH 4609590 U-WB 5 6 --
#> + attr: type (v/l), weight (e/n)
#> + edges from 4609590:
#> [1] 1--3 2--3 1--4 2--4 1--5 2--5
## Unweighted sparse with NA crashes
graph_from_incidence_matrix(N)
#> Error in rep(unname(x[1:2]), x[3]): invalid 'times' argument
Created on 2023-10-25 with reprex v2.0.2
Version information
- R/igraph version: 1.5.1
- R version: 4.1.2
- Operating system: Ubuntu 20.04.6 LTS
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Matrix_1.6-1.1 igraph_1.5.1
loaded via a namespace (and not attached):
[1] rstudioapi_0.15.0 knitr_1.44 magrittr_2.0.3 lattice_0.20-45
[5] R.cache_0.16.0 R6_2.5.1 rlang_1.1.1 fastmap_1.1.1
[9] fansi_1.0.5 styler_1.10.2 tools_4.1.2 grid_4.1.2
[13] xfun_0.40 R.oo_1.25.0 utf8_1.2.3 cli_3.6.1
[17] clipr_0.8.0 withr_2.5.1 htmltools_0.5.6.1 yaml_2.3.7
[21] digest_0.6.33 tibble_3.2.1 lifecycle_1.0.3 processx_3.8.2
[25] callr_3.7.3 purrr_1.0.2 ps_1.7.5 vctrs_0.6.4
[29] R.utils_2.12.2 fs_1.6.3 glue_1.6.2 evaluate_0.22
[33] rmarkdown_2.25 unix_1.5.5 reprex_2.0.2 compiler_4.1.2
[37] pillar_1.9.0 R.methodsS3_1.8.2 pkgconfig_2.0.3
@szhorvat how should NA in the incidence matrix be handled here?
The practical advice is: Don't ever pass NA to igraph. You can store it in attributes, but that's about all that's safe to do with NA. It will take some time to clean up the NaN/NA handling in igraph ...
The long term plan on this is:
- Adjacency matrix operations should be handled in the C core, not in R (ongoing project). This function was already renamed to use the term "biadjacency", and it falls in the same category.
- We should figure out how to handle NaN values (which NA translates to) in the C core. Chances are they will just trigger an error, but maybe we pass them through to the edge weight vector. See https://github.com/igraph/igraph/issues/2413
in the meantime, should the function error immediately if passed an incidence matrix with NA then?
Yes, that is reasonable. If we can relax this restriction later, we will do that. But chances are we won't be able to.
we could fix that in #1654 maybe?
(graph_from_adjacency_matrix has the same issue though)
can I assign this issue to you? 😸
Yes please 🙂