
genind2df() doesn't work for 'rupica' when 'oneColPerAll = TRUE'

Open dfriend21 opened this issue 3 years ago • 1 comments

Running this code produces an error:

library(adegenet)
data(rupica)
genind2df(rupica, oneColPerAll = TRUE)
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 333, 335, 332, 334
In addition: Warning messages:
1: In matrix(unlist(e), ncol = x@ploidy[1], byrow = TRUE) :
  data length [665] is not a sub-multiple or multiple of the number of rows [333]
2: In matrix(unlist(e), ncol = x@ploidy[1], byrow = TRUE) :
  data length [669] is not a sub-multiple or multiple of the number of rows [335]
3: In matrix(unlist(e), ncol = x@ploidy[1], byrow = TRUE) :
  data length [667] is not a sub-multiple or multiple of the number of rows [334]
4: In matrix(unlist(e), ncol = x@ploidy[1], byrow = TRUE) :
  data length [665] is not a sub-multiple or multiple of the number of rows [333]
session info
─ Session info ────────────────────────────────────
 setting  value
 version  R version 4.1.2 (2021-11-01)
 os       macOS Big Sur 11.2
 system   x86_64, darwin17.0
 ui       RStudio
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       America/Los_Angeles
 date     2022-01-22
 rstudio  2021.09.2+382 Ghost Orchid (desktop)
 pandoc   NA

─ Packages ────────────────────────────────────────
 package     * version date (UTC) lib source
 ade4        * 1.7-18  2021-09-16 [1] CRAN (R 4.1.0)
 adegenet    * 2.1.5   2021-10-09 [1] CRAN (R 4.1.0)
 ape           5.6-1   2022-01-07 [1] CRAN (R 4.1.2)
 bench         1.1.2   2021-11-30 [1] CRAN (R 4.1.0)
 cli           3.1.1   2022-01-20 [1] CRAN (R 4.1.2)
 cluster       2.1.2   2021-04-17 [1] CRAN (R 4.1.2)
 colorspace    2.0-2   2021-06-24 [1] CRAN (R 4.1.0)
 crayon        1.4.2   2021-10-29 [1] CRAN (R 4.1.0)
 digest        0.6.29  2021-12-01 [1] CRAN (R 4.1.0)
 dplyr         1.0.7   2021-06-18 [1] CRAN (R 4.1.0)
 ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.1.0)
 fansi         1.0.2   2022-01-14 [1] CRAN (R 4.1.2)
 fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.1.0)
 generics      0.1.1   2021-10-25 [1] CRAN (R 4.1.0)
 ggplot2       3.3.5   2021-06-25 [1] CRAN (R 4.1.0)
 glue          1.6.0   2021-12-17 [1] CRAN (R 4.1.0)
 gtable        0.3.0   2019-03-25 [1] CRAN (R 4.1.0)
 hierfstat     0.5-10  2021-11-17 [1] CRAN (R 4.1.0)
 htmltools     0.5.2   2021-08-25 [1] CRAN (R 4.1.0)
 httpuv        1.6.5   2022-01-05 [1] CRAN (R 4.1.2)
 igraph        1.2.11  2022-01-04 [1] CRAN (R 4.1.2)
 later         1.3.0   2021-08-18 [1] CRAN (R 4.1.0)
 lattice       0.20-45 2021-09-22 [1] CRAN (R 4.1.2)
 lifecycle     1.0.1   2021-09-24 [1] CRAN (R 4.1.0)
 magrittr      2.0.1   2020-11-17 [1] CRAN (R 4.1.0)
 MASS          7.3-54  2021-05-03 [1] CRAN (R 4.1.2)
 Matrix        1.3-4   2021-06-01 [1] CRAN (R 4.1.2)
 mgcv          1.8-38  2021-10-06 [1] CRAN (R 4.1.2)
 mime          0.12    2021-09-28 [1] CRAN (R 4.1.0)
 munsell       0.5.0   2018-06-12 [1] CRAN (R 4.1.0)
 nlme          3.1-153 2021-09-07 [1] CRAN (R 4.1.2)
 pegas         1.1     2021-12-16 [1] CRAN (R 4.1.0)
 permute       0.9-5   2019-03-12 [1] CRAN (R 4.1.0)
 pillar        1.6.4   2021-10-18 [1] CRAN (R 4.1.0)
 pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.1.0)
 plyr          1.8.6   2020-03-03 [1] CRAN (R 4.1.0)
 promises      1.2.0.1 2021-02-11 [1] CRAN (R 4.1.0)
 purrr         0.3.4   2020-04-17 [1] CRAN (R 4.1.0)
 R6            2.5.1   2021-08-19 [1] CRAN (R 4.1.0)
 Rcpp          1.0.8   2022-01-13 [1] CRAN (R 4.1.2)
 reshape2      1.4.4   2020-04-09 [1] CRAN (R 4.1.0)
 rlang         0.4.12  2021-10-18 [1] CRAN (R 4.1.0)
 scales        1.1.1   2020-05-11 [1] CRAN (R 4.1.0)
 seqinr        4.2-8   2021-06-09 [1] CRAN (R 4.1.0)
 sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.1.0)
 shiny         1.7.1   2021-10-02 [1] CRAN (R 4.1.0)
 stringi       1.7.6   2021-11-29 [1] CRAN (R 4.1.0)
 stringr       1.4.0   2019-02-10 [1] CRAN (R 4.1.0)
 tibble        3.1.6   2021-11-07 [1] CRAN (R 4.1.0)
 tidyselect    1.1.1   2021-04-30 [1] CRAN (R 4.1.0)
 utf8          1.2.2   2021-07-24 [1] CRAN (R 4.1.0)
 vctrs         0.3.8   2021-04-29 [1] CRAN (R 4.1.0)
 vegan         2.5-7   2020-11-28 [1] CRAN (R 4.1.0)
 xtable        1.8-4   2019-04-21 [1] CRAN (R 4.1.0)

 [1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library

───────────────────────────────────────────────────

I noticed that there are two closed issues that might be relevant: #18 and #192.

dfriend21, Jan 22 '22 22:01

I encountered a similar problem when one allele of a diploid genotype is unknown. For example, with 3 individuals whose genotypes at a locus are A/A, T, and A/T, the same error occurs. For the second individual, only one allele was recovered and the other is unknown (e.g. because sequencing failed). In the attached code, I added a section that deals with this problem.

I'm attaching a text file that corrects this problem. The fix is based on version 2.1.5 of adegenet.

The additional argument rm.incompleteGeno controls what happens when a genotype has missing alleles. By default (rm.incompleteGeno=F), the genotype of the second individual in the example above becomes T/NA. If you set the option to TRUE, incomplete genotypes are removed and replaced with NA/NA (for diploids).
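
To make the failure mode and the two behaviours concrete, here is a minimal base-R sketch. It is not the code from the attached file: the genotype vector, the `pad` helper, and the `strict` object are hypothetical names chosen for illustration, and the only line taken from the thread is the `matrix(unlist(e), ncol = ..., byrow = TRUE)` call shown in the warnings.

```r
# Hypothetical genotypes at one diploid locus; individual 2 has only one allele.
geno   <- c("A/A", "T", "A/T")
ploidy <- 2L

# Split each genotype into its alleles: lengths are 2, 1, 2.
alleles <- strsplit(geno, "/", fixed = TRUE)

# Naive reshaping, as in the call shown in the warnings: 5 alleles do not fill
# a 3 x 2 matrix, so R recycles values and the rows no longer match individuals.
bad <- suppressWarnings(matrix(unlist(alleles), ncol = ploidy, byrow = TRUE))

# Pad every genotype to `ploidy` entries with NA before reshaping
# (illustrates the default behaviour described above: T becomes T/NA).
pad  <- function(a, n) { length(a) <- n; a }
good <- t(vapply(alleles, pad, character(ploidy), n = ploidy))

# Stricter variant (illustrates the rm.incompleteGeno = TRUE behaviour):
# any genotype with a missing allele becomes NA/NA.
strict <- good
strict[rowSums(is.na(strict)) > 0, ] <- NA
```

Padding per individual before building the matrix keeps one row per individual, which is what avoids the "differing number of rows" error when the per-locus matrices are later combined.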

I also noticed that the character string "NA" is used instead of a real NA when oneColPerAll=T, so I added another fix to the attached code that converts these strings to real NA.
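
The difference between the string "NA" and a real NA can be shown with a small base-R sketch. The allele matrix `m` below is made up for illustration and is not output from genind2df(); the conversion is one possible approach, assuming no allele is legitimately named "NA".

```r
# Hypothetical allele matrix in which a missing allele is the string "NA".
m <- matrix(c("A", "A",
              "T", "NA",
              "A", "T"), ncol = 2, byrow = TRUE)

# The string "NA" is not detected as missing.
n_missing_before <- sum(is.na(m))

# Replace the string with a real (character) NA.
m[!is.na(m) & m == "NA"] <- NA_character_
n_missing_after <- sum(is.na(m))
```

After the conversion, functions such as is.na() and complete.cases() treat the missing alleles as missing.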

With this modification, the sample code by @dfriend21 gives the expected result without the error.

genind2df.txt

ntakebay, Mar 10 '22 00:03