ArchR icon indicating copy to clipboard operation
ArchR copied to clipboard

AddModuleScore with Peaks

Open waltno opened this issue 3 years ago • 26 comments

Unable to generate Module Score with peak data

I am using ArchR/release 1.0.3. which should have the most updated version of AddModuleScore... to my knowledge.

gr<-getFeatures(project "PeakMatrix")
peaks_gr<-peaks_that_i_want %>% Signac::StringToGRanges(sep = c(":", "-"))
hit<-findOverlaps(gr,peaks_gr)
peaks<-gr[hit@from]

head(peaks)
GRanges object with 6 ranges and 1 metadata column:
      seqnames          ranges strand |       idx
         <Rle>       <IRanges>  <Rle> | <integer>
  [1]     chr1 1020975-1021475      * |        75
  [2]     chr1 2229090-2229590      * |       332
  [3]     chr1 2255455-2255955      * |       339
  [4]     chr1 2290641-2291141      * |       346
  [5]     chr1 2290641-2291141      * |       346
  [6]     chr1 3652206-3652706      * |       482
project<-addModuleScore(project, useMatrix = "PeakMatrix", name = "peaks_of_interest", features = list(peaks))
Computing Module 1 of 1
Error in sample.int(length(x), size, replace, prob) : 
  invalid first argument

waltno avatar Sep 15 '22 19:09 waltno

Hi @waltno! Thanks for using ArchR! Please make sure that your post belongs in the Issues section. Only bugs and error reports belong in the Issues section. Usage questions and feature requests should be posted in the Discussions section, not in Issues.
Before we help you, you must respond to the following questions unless your original post already contained this information: 1. If you've encountered an error, have you already searched previous Issues to make sure that this hasn't already been solved? 2. Can you recapitulate your error using the tutorial code and dataset? If so, provide a reproducible example. 3. Did you post your log file? If not, add it now. 4. Remove any screenshots that contain text and instead copy and paste the text using markdown's codeblock syntax (three consecutive backticks). You can do this by editing your original post.

rcorces avatar Sep 15 '22 19:09 rcorces

My suspicion is that some recent changes have altered how this would work. In particular, my guess is that your PeakMatrix has a column called "name" which was recently added. If that is the case, then addModuleScore() is expecting you to pass the values of the name column rather than a GRanges object. You can check this with:

featureDF <- ArchR:::.getFeatureDF(head(getArrowFiles(ArchRProj),2), subGroup="PeakMatrix")
head(featureDF)

rcorces avatar Sep 16 '22 13:09 rcorces

Hmmmm I don't see something called name

DataFrame with 6 rows and 4 columns
  seqnames       idx     start       end
     <Rle> <integer> <integer> <integer>
1     chr1         1    807573    808073
2     chr1         2    816903    817403
3     chr1         3    827308    827808
4     chr1         4    858581    859081
5     chr1         5    869642    870142
6     chr1         6    904536    905036

waltno avatar Sep 16 '22 17:09 waltno

Thanks for providing that information.

I'm unable to recapitulate this error with the tutorial data. Can you provide a traceback() showing where the error is occurring?

My peak matrix looks like yours:

> featureDF <- ArchR:::.getFeatureDF(head(getArrowFiles(projHeme5),2), subGroup="PeakMatrix")
> head(featureDF)
DataFrame with 6 rows and 4 columns
  seqnames       idx     start       end
     <Rle> <integer> <integer> <integer>
1     chr1         1    752502    753002
2     chr1         2    762685    763185
3     chr1         3    779917    780417
4     chr1         4    793283    793783
5     chr1         5    801006    801506
6     chr1         6    805039    805539

But I am able to get a module score on a GRanges of peaks.

> ps <- getPeakSet(ArchRProj = projHeme5)
> pax5_peaks <- ps[which(mcols(ps)$nearestGene == "PAX5")]
> features <- list(pax5_score = pax5_peaks)
> projHeme5 <- addModuleScore(ArchRProj = projHeme5, useMatrix = "PeakMatrix", name = "Module", features = features)
Computing Module 1 of 1
> head(projHeme5@cellColData$Module.pax5_score)
[1] -0.04875000  0.01041667 -0.01875000  0.05166667 -0.01916667 -0.04500000
> plotEmbedding(ArchRProj = projHeme5, embedding = "UMAP", colorBy = "cellColData", name = "Module.pax5_score", imputeWeights = getImputeWeights(projHeme5))

image

rcorces avatar Sep 19 '22 12:09 rcorces

Thank you so much for helping out!

When I run that code I get this error

> m2 <- addModuleScore(ArchRProj = m2, useMatrix = "PeakMatrix", name = "Module", features = pax5_peaks)
Error in .validInput(input = features, name = "features", valid = c("list")) : 
  Input value for 'features' is not a list, (features = GRanges) please supply valid input!
> traceback()
3: stop("Input value for '", name, "' is not a ", paste(valid, collapse = ","), 
       ", (", name, " = ", class(input), ") please supply valid input!")
2: .validInput(input = features, name = "features", valid = c("list"))
1: addModuleScore(ArchRProj = m2, useMatrix = "PeakMatrix", name = "Module", 
       features = pax5_peaks)

waltno avatar Sep 20 '22 17:09 waltno

sorry, copy and paste error. I've edited my post above to include the line where features is created.

rcorces avatar Sep 20 '22 17:09 rcorces

Here's my traceback

> m2 <- addModuleScore(ArchRProj = m2, useMatrix = "PeakMatrix", name = "Module", features)
Computing Module 1 of 1
Error in sample.int(length(x), size, replace, prob) : 
  invalid first argument
> traceback()
8: sample.int(length(x), size, replace, prob)
7: sample(x, nBgd)
6: FUN(X[[i]], ...)
5: lapply(binx, function(x) sample(x, nBgd))
4: unlist(lapply(binx, function(x) sample(x, nBgd)), use.names = FALSE)
3: FUN(X[[i]], ...)
2: lapply(seq_along(featureList), function(x) {
       message("Computing Module ", x, " of ", length(featureList))
       binx <- binList[moduleList[[x]]]
       idxFgd <- featureList[[x]]
       idxBgd <- unlist(lapply(binx, function(x) sample(x, nBgd)), 
           use.names = FALSE)
       m <- ArchR:::.getPartialMatrix(ArrowFiles = getArrowFiles(ArchRProj), 
           featureDF = featureDF[c(idxFgd, idxBgd), ], useMatrix = useMatrix, 
           cellNames = ArchRProj$cellNames, threads = threads, verbose = FALSE, 
           doSampleCells = FALSE)
       Matrix::colMeans(m[seq_along(idxFgd), ]) - Matrix::colMeans(m[-seq_along(idxFgd), 
           ])
   })
1: addModuleScore(ArchRProj = m2, useMatrix = "PeakMatrix", name = "Module", 
       features)

waltno avatar Sep 20 '22 17:09 waltno

can you post your log file as well?

rcorces avatar Sep 21 '22 17:09 rcorces

actually, I just realized that the logging functionality hasnt been fully implemented in addModuleScore.

rcorces avatar Sep 21 '22 17:09 rcorces

yeaaaa i dont get a log file :/ Do you think I should try different ArchR releases? Are you using 1.0.3?

waltno avatar Sep 21 '22 17:09 waltno

yes - I'm using 1.0.3 at the moment. Will look into this a bit further.

rcorces avatar Sep 21 '22 17:09 rcorces

I'm having trouble understanding whats going on here. I quickly added some code to dev to make sure a logfile will be created but haven added more logging of other info for the function. Perhaps you can start by upgrading to the latest dev branch and testing it on the tutorial dataset (projHeme5 like I show above). Otherwise, I'm not really sure what the problem is.

rcorces avatar Sep 21 '22 18:09 rcorces

Sounds good! I’ll keep you posted

Get Outlook for iOShttps://aka.ms/o0ukef


From: Ryan Corces @.> Sent: Wednesday, September 21, 2022 11:27:53 AM To: GreenleafLab/ArchR @.> Cc: Waltner, Olivia G @.>; Mention @.> Subject: Re: [GreenleafLab/ArchR] AddModuleScore with Peaks (Issue #1631)

I'm having trouble understanding whats going on here. I quickly added some code to dev to make sure a logfile will be created but haven added more logging of other info for the function. Perhaps you can start by upgrading to the latest dev branch and testing it on the tutorial dataset (projHeme5 like I show above). Otherwise, I'm not really sure what the problem is.

— Reply to this email directly, view it on GitHubhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_GreenleafLab_ArchR_issues_1631-23issuecomment-2D1254074821&d=DwMCaQ&c=eRAMFD45gAfqt84VtBcfhfEazhEXT91ASHynm_9f1N0&r=lgKUj5yLo8eWcSt76mcSQn__77wdiajNCMMkicUKdjI&m=ItAEBcg74FX19s28rxhv_xiPr8YJ-tuOi9NcMjXSEWzrhJQBKTQDPBYQY7E0mMM-&s=KnrRVHy9Lhv51NIbx922TN2QVnQqwndj2hpgnPo80aE&e=, or unsubscribehttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_AWT2R3REDQCCKWZXEBLPQADV7NHSTANCNFSM6AAAAAAQNW2R3E&d=DwMCaQ&c=eRAMFD45gAfqt84VtBcfhfEazhEXT91ASHynm_9f1N0&r=lgKUj5yLo8eWcSt76mcSQn__77wdiajNCMMkicUKdjI&m=ItAEBcg74FX19s28rxhv_xiPr8YJ-tuOi9NcMjXSEWzrhJQBKTQDPBYQY7E0mMM-&s=5VvRfzxghWjRQOAx6K0lALvFyjhrRTiCj5CJVVfyX0w&e=. You are receiving this because you were mentioned.Message ID: @.***>

waltno avatar Sep 21 '22 18:09 waltno

@waltno - any update on this? any idea if it fails for you on the tutorial dataset?

rcorces avatar Sep 30 '22 12:09 rcorces

Hi @rcorces just ran it on the tutorial dataset and did not get any errors! I guess that's good new for you and more troubleshooting for me :) Looking at differences between projHeme5's and my object's peak matrices now!

waltno avatar Sep 30 '22 20:09 waltno

Hi @rcorces I remade my object in case I did something buggy. I really just followed the standard work flow. I still get the same error &e traceback as above., the log file is telling much either.

ArchR-addModuleScore-5c701caf811b-Date-2022-10-03_Time-10-12-35.log

waltno avatar Oct 03 '22 17:10 waltno

I added some additional logging outputs in the dev_moduleScore branch. Could you try installing that branch and seeing if you get more info in the log file?

devtools::install_github("GreenleafLab/ArchR", ref="dev_moduleScore", repos = BiocManager::repositories())
#to unload a package and reload
detach("package:ArchR", unload=TRUE)
library(ArchR)

rcorces avatar Oct 05 '22 00:10 rcorces

thank you for helping out so much! I think this branch is a little buggy as I get this error with both my object and projheme

> p3 <- getPeakSet(ArchRProj = m3)
> pax5_peaks <- p3[which(mcols(p3)$nearestGene == "PAX5")]
> features <- list(pax5_score = pax5_peaks)
> m3 <- addModuleScore(ArchRProj = m3, useMatrix = "PeakMatrix", name = "Module", features = features)
ArchR logging to : ArchRLogs/ArchR-addModuleScore-1b89472c0b4d-Date-2022-10-05_Time-11-32-17.log
If there is an issue, please report to github with logFile!
Error in paste0("Feature List - ", x) : object 'x' not found
> traceback()
3: paste0("Feature List - ", x)
2: .logThis(nBgd, name = paste0("Feature List - ", x), logFile = logFile)
1: addModuleScore(ArchRProj = m3, useMatrix = "PeakMatrix", name = "Module", 
       features = features)

ArchR-addModuleScore-1b89472c0b4d-Date-2022-10-05_Time-11-32-17.log

waltno avatar Oct 05 '22 18:10 waltno

sorry. will fix today and make sure it works on my end before sending it back to you.

rcorces avatar Oct 05 '22 18:10 rcorces

I fixed the typo. Can you re-install and re-test?

rcorces avatar Oct 05 '22 19:10 rcorces


> m3 <- addModuleScore(ArchRProj = m3, useMatrix = "PeakMatrix", name = "Module", features = features)
ArchR logging to : ArchRLogs/ArchR-addModuleScore-1b89790ec56c-Date-2022-10-05_Time-12-46-23.log
If there is an issue, please report to github with logFile!
Computing Module 1 of 1
Error in sample.int(length(x), size, replace, prob) : 
  invalid first argument

ArchR-addModuleScore-1b89790ec56c-Date-2022-10-05_Time-12-46-23.log

waltno avatar Oct 05 '22 19:10 waltno

I've made a few more logging additions. If you could re-install and test again and post your log file, that would be helpful.

devtools::install_github("GreenleafLab/ArchR", ref="dev_moduleScore", repos = BiocManager::repositories())
#to unload a package and reload
detach("package:ArchR", unload=TRUE)
library(ArchR)

rcorces avatar Oct 06 '22 17:10 rcorces

I'm fairly confident that the latest change to dev_moduleScore (https://github.com/GreenleafLab/ArchR/commit/12849f328789811c072ebfdf71148c6354219bd7) will fix your problem. I believe it was an off-by-one indexing error that only manifested when trying to sample a background peak from the top-most or bottom-most bin of accessibility. Can you re-install and report back?

rcorces avatar Oct 06 '22 21:10 rcorces

That worked! Thank you so so much!!!!!! Should I close this thread or keep it open?

waltno avatar Oct 06 '22 21:10 waltno

we can leave open for now until the change I made has been reviewed and merged into dev

rcorces avatar Oct 06 '22 22:10 rcorces