CoGAPS
CoGAPS does not learn the specified nPatterns when running in distributed mode
I am running CoGAPS on a small single-cell data set: 11623 genes x 900 cells. I have noticed that when I run CoGAPS in distributed mode, it does not produce the number of patterns I specified in nPatterns. Here are the full parameters stored in the result object:
cogapsresult@metadata$params
-- Standard Parameters --
nPatterns 6
nIterations 500
seed 1234
sparseOptimization TRUE
distributed genome-wide
-- Sparsity Parameters --
alpha 0.01
maxGibbsMass 100
-- Distributed CoGAPS Parameters --
nSets 7
cut 6
minNS 4
maxNS 11
However, as you can see, only 4 patterns were learned:
cogapsresult
[1] "CogapsResult object with 11623 features and 900 samples"
[1] "4 patterns were learned"
If I run in standard (non-distributed) mode instead, it takes longer, but I get the number of patterns I asked for. Here are the parameters for this run:
cogapsresult@metadata$params
-- Standard Parameters --
nPatterns 6
nIterations 500
seed 1234
sparseOptimization TRUE
-- Sparsity Parameters --
alpha 0.01
maxGibbsMass 100
And the object itself:
cogapsresult
[1] "CogapsResult object with 11623 features and 900 samples"
[1] "6 patterns were learned"
I don't know why this is happening. I assumed I was overwriting some parameters when I created the distributed params object, but as you can see, the intended number of patterns is indeed being passed on to the CoGAPS function.
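For reference, here is a minimal sketch of a setup that should produce the distributed parameters printed above. It is an approximation, not the exact script: `mat` is a hypothetical placeholder for the 11623 x 900 expression matrix, and it assumes a CoGAPS version in which `distributed` is stored on the params object (as the printed output suggests), so the exact way to set it may differ between releases.

```r
library(CoGAPS)

# `mat` is a placeholder name for the genes x cells matrix (assumption).
params <- CogapsParams(
  nPatterns = 6,
  nIterations = 500,
  seed = 1234,
  sparseOptimization = TRUE
)

# Assumes `distributed` is a settable parameter in this CoGAPS version.
params <- setParam(params, "distributed", "genome-wide")

# Distributed settings copied from the printed parameters above.
params <- setDistributedParams(params, nSets = 7, cut = 6, minNS = 4, maxNS = 11)

cogapsresult <- CoGAPS(mat, params)

# Inspect what was actually stored on the result object.
cogapsresult@metadata$params
```

Dropping the `setParam` and `setDistributedParams` lines from this sketch corresponds to the standard-mode run.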
This data set is small, so I can afford to run in standard mode, but that will not scale to larger data sets unless I can run in distributed mode and still get the intended number of patterns. Could you please help me understand what's going on here? I'm hoping there's something simple I'm overlooking. Thanks!