Gene sets load slowly for large sets / large datasets

Open ambrosejcarr opened this issue 3 years ago • 6 comments

The performance issue

On the Azimuth dataset, which has approximately 1 million cells and 77,000 genes, a gene set of 194 genes takes 105s to fully load.
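For scale, the figures above work out to roughly half a second per gene. A quick back-of-the-envelope check, assuming (the report does not say this) that the load time is spread evenly across the genes in the set; `seconds_per_gene` is just an illustrative helper name:

```python
# Figures taken from the report above.
TOTAL_SECONDS = 105
GENES_IN_SET = 194


def seconds_per_gene(total_seconds: float, n_genes: int) -> float:
    """Average load time per gene, assuming the cost is spread evenly (an assumption)."""
    if n_genes <= 0:
        raise ValueError("n_genes must be positive")
    return total_seconds / n_genes


per_gene = seconds_per_gene(TOTAL_SECONDS, GENES_IN_SET)
print(f"{per_gene:.2f} s per gene")  # prints "0.54 s per gene"
```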

To Reproduce

Steps to reproduce the behavior:

  1. Download the Azimuth dataset
  2. Download the hallmark.csv file containing hallmark gene sets in cellxgene format.
  3. cellxgene launch local.h5ad --gene-sets-file hallmark.csv
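To see which sets in a gene-sets file are large enough to hit this, one could count the genes per set before launching. A minimal sketch, assuming the CSV has `gene_set_name` and `gene_symbol` columns (these column names are an assumption not confirmed in this thread, so check the header of your file); `genes_per_set` is a hypothetical helper and the sample rows are made up:

```python
import csv
import io
from collections import Counter


def genes_per_set(csv_file) -> Counter:
    """Count genes per gene set in an open CSV file.

    Assumes 'gene_set_name' and 'gene_symbol' columns (an assumption).
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        # Skip rows that name a set but carry no gene symbol.
        if row.get("gene_symbol"):
            counts[row["gene_set_name"]] += 1
    return counts


# Tiny in-memory example with made-up rows. For a real file, use:
#   with open("hallmark.csv", newline="") as f:
#       counts = genes_per_set(f)
sample = io.StringIO(
    "gene_set_name,gene_set_description,gene_symbol,gene_description\n"
    "SET_A,,GENE1,\n"
    "SET_A,,GENE2,\n"
    "SET_B,,GENE3,\n"
)
counts = genes_per_set(sample)
for name, n in counts.most_common():
    print(name, n)
```

Sets with a couple of hundred genes, like the 194-gene set in the report, are the ones to try unfurling first.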

Version (please complete the following information):

  • Desktop
  • Version 0.17.0

ambrosejcarr avatar Jul 13 '21 03:07 ambrosejcarr

I believe this isn't necessarily a regression. If you load the Azimuth dataset without gene sets, it takes just as long because the dataset is so large. In contrast, if you run cellxgene launch local.h5ad --gene-sets-file hallmark.csv --backed, the time to load is quite short. Of course, the slowness is then passed on to other parts.

@bkmartinjr am I roughly on target with the assessment here? If so, my question to @signechambers1 would be whether to invest in this for 1.0 or not.

maniarathi avatar Jul 14 '21 18:07 maniarathi

> @bkmartinjr am I roughly on target with the assessment here? If so, my question to @signechambers1 would be whether to invest in this for 1.0 or not.

I don't know - would need to explore a bit to give you an idea. Historically, we almost always have regressions (initially) when making changes like this, so it isn't all that unlikely. LMK if you want me to investigate.

Any fixes here would benefit all deployments. And it has been a long time since I did a performance sweep of the component rendering (i.e., there are likely some wins available).

bkmartinjr avatar Jul 14 '21 18:07 bkmartinjr

OK, I think I may have misunderstood -- @ambrosejcarr what did you mean by load? As in the time it takes before localhost:5000 is available, or the time it takes to render the UI once you land on localhost:5000?

maniarathi avatar Jul 14 '21 19:07 maniarathi

My bad, chatted offline with Ambrose. This is the time it takes for all the genes to load when you open a large gene set.

maniarathi avatar Jul 14 '21 22:07 maniarathi

Arathi is correct - the time from unfurling a large gene set to the time when the last gene in the gene set loads. Sorry my initial description wasn't clear!

ambrosejcarr avatar Jul 15 '21 00:07 ambrosejcarr

Removing from Desktop 1.0 epic since we made more granular tickets for the fixes that are in for Desktop 1.0. Keeping open for tracking purposes and will reevaluate after testing.

signechambers1 avatar Aug 02 '21 23:08 signechambers1