grpregOverlap
ExpandX is not working
Hi, I tried to run grpregOverlap but I have an issue in the ExpandX function. The problem is in this part:
# expand X to X.latent
X.latent <- NULL
names <- NULL
for(i in 1:nrow(incidence.mat)) {
  idx <- incidence.mat[i,]==1
  X.latent <- cbind(X.latent, X[, idx, drop=FALSE])
  names <- c(names, colnames(incidence.mat)[idx])
  # colnames(X.latent) <- c(colnames(X.latent), colnames(X)[incidence.mat[i,]==1])
}
Every time I try to run the function I get the same error: the number of rows is not the same, which looks expected because X.latent is NULL. If someone has had the same issue and has a solution, it would be great if you could share it. Thanks
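For readers following along, here is a minimal sketch of what that loop does when X is a matrix. The toy data and the layout of incidence.mat (rows are groups, columns are variables, 1 marks membership) are assumptions inferred from the quoted code, not taken from the package source.

```r
# Toy design matrix: 4 observations, 3 named columns
X <- matrix(1:12, nrow = 4, ncol = 3,
            dimnames = list(NULL, c("gene1", "gene2", "gene3")))

# Assumed incidence matrix: one row per group, one column per variable
incidence.mat <- rbind(pathway1 = c(1, 1, 0),
                       pathway2 = c(0, 1, 1))
colnames(incidence.mat) <- colnames(X)

# Same loop as in the post: append each group's columns in turn
X.latent <- NULL
names <- NULL
for (i in seq_len(nrow(incidence.mat))) {
  idx <- incidence.mat[i, ] == 1
  X.latent <- cbind(X.latent, X[, idx, drop = FALSE])
  names <- c(names, colnames(incidence.mat)[idx])
}

dim(X.latent)  # 4 rows, 4 columns: gene2 is duplicated, once per pathway
names          # "gene1" "gene2" "gene2" "gene3"
```

The variable shared by both groups ends up as two separate latent columns, which is the point of the expansion.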
I infer (from some correspondence we've had) that your data X is a data.frame, whereas I typically pass a matrix for X. Here's an example that I think will reproduce your problem and also show how things "work" when X is a matrix. I'm not yet clear on what's causing this, but I think this helps focus the troubleshooting.
## install grpregOverlap from github using devtools
install.packages("devtools")
devtools::install_github("YaoHuiZeng/grpregOverlap")
library(grpregOverlap)
## generate simple synthetic data with overlapping groups
n <- 10
p <- 3
X <- data.frame(
gene1 = rnorm(n),
gene2 = rnorm(n),
gene3 = rnorm(n)
)
group <- list(
"pathway1" = c("gene1", "gene2"),
"pathway2" = c("gene2", "gene3")
)
y <- rnorm(10)
## fitting fails with output:
## Error in data.frame(..., check.names = FALSE) :
## arguments imply differing number of rows: 0, 10
fm <- grpregOverlap(X, y, group, returnX.latent = TRUE)
## same, but now X is a matrix
n <- 10
p <- 3
X <- matrix(rnorm(n*p), n, p)
colnames(X) <- c("gene1","gene2","gene3")
group <- list(
"pathway1" = c("gene1", "gene2"),
"pathway2" = c("gene2", "gene3")
)
y <- rnorm(10)
## fitting works
fm <- grpregOverlap(X, y, group, returnX.latent = TRUE)
It seems like the error is happening at this line, the cbind call in the loop quoted above. From some testing, cbind(NULL, X) behaves differently depending on whether X is a matrix or a data.frame.
# cbind is able to bind NULL and matrix
> X = matrix(1:12, 4, 3)
> cbind(NULL, X)
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12
# cbind is unable to bind NULL and data.frame
> X = data.frame(a = 1:4, b = 5:8, c = 9:12)
> cbind(NULL, X)
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 0, 4
For now, I'd suggest "working around" this by making X a matrix with named columns, like I did in my example code:
X <- matrix(rnorm(n*p), n, p)
colnames(X) <- c("gene1","gene2","gene3")
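If the data already live in a data.frame, a conversion along these lines might be a safer route than rebuilding the matrix by hand. This is only a sketch: to_numeric_matrix is a hypothetical helper, not part of grpregOverlap, and it assumes every column should be numeric.

```r
# Hypothetical helper (not part of grpregOverlap): coerce X to a numeric
# matrix with column names, failing early if a column is not numeric.
to_numeric_matrix <- function(X) {
  if (is.data.frame(X)) {
    non_numeric <- !vapply(X, is.numeric, logical(1))
    if (any(non_numeric)) {
      stop("non-numeric columns: ",
           paste(names(X)[non_numeric], collapse = ", "))
    }
    X <- as.matrix(X)
  }
  stopifnot(is.matrix(X), is.numeric(X), !is.null(colnames(X)))
  X
}

set.seed(1)
n <- 10
X_df <- data.frame(gene1 = rnorm(n), gene2 = rnorm(n), gene3 = rnorm(n))
X_mat <- to_numeric_matrix(X_df)

is.matrix(X_mat)  # TRUE
colnames(X_mat)   # column names carried over from the data.frame
```

The early check matters because as.matrix() on a data.frame with a character or factor column silently returns a character matrix, which would fail later in a less obvious way.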
Hi Dan, thank you for your help. I finally resolved the problem. I will share the fix here for anyone who has the same problem.
gene_present # is my dataframe
X=as.matrix(gene_present)
y=traitData$labs
groups=sub_pathway$pathways
I had a binary outcome (R or NR), which I recoded as 1=R and 0=NR, and I named the column labs. With that, this call should work:
res=grpregOverlap::grpregOverlap(X, y, groups, returnX.latent = TRUE)
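The recoding step described above could look roughly like this. The traitData frame and its response column are made-up stand-ins, since the original data are not shown; only the labs column name comes from the post.

```r
# Made-up stand-in for traitData; the `response` column name is an assumption
traitData <- data.frame(response = c("R", "NR", "NR", "R", "R"))

# Recode the binary outcome: R -> 1, NR -> 0
traitData$labs <- ifelse(traitData$response == "R", 1, 0)
y <- traitData$labs

# Cross-tabulate to confirm the mapping before fitting
table(traitData$response, y)
```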
Thanks
Glad it worked!
Just a statistical note: if your response is binary, you might want to consider something other than the default setting of family = "gaussian" in your call to grpregOverlap and might want to instead do something like logistic regression (with family = "binomial", I think), but you know your data better than me so feel free to ignore this suggestion.
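As a rough sketch of what that might look like, reusing the synthetic setup from earlier in the thread: the simulated y, the sample size, and the penalty = "grLasso" choice are illustrative assumptions, and the fit only runs if grpregOverlap is installed.

```r
set.seed(42)
n <- 50
p <- 3
X <- matrix(rnorm(n * p), n, p)
colnames(X) <- c("gene1", "gene2", "gene3")
group <- list(
  "pathway1" = c("gene1", "gene2"),
  "pathway2" = c("gene2", "gene3")
)

# Simulated 0/1 response whose log-odds depend on gene2 (illustrative only)
y <- rbinom(n, size = 1, prob = plogis(X[, "gene2"]))

# Fit only when the package is available; family = "binomial" requests a
# logistic model instead of the default family = "gaussian"
if (requireNamespace("grpregOverlap", quietly = TRUE)) {
  fm <- grpregOverlap::grpregOverlap(X, y, group,
                                     penalty = "grLasso",
                                     family = "binomial")
}
```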
Yes, that is exactly what I did, with the grLasso penalty. Thanks for the suggestion.