Module request: CellRanger COUNT

Open santiagorevale opened this issue 6 years ago • 14 comments

Related to #413.

  • Name of tool:

    • CellRanger COUNT (from 10X Genomics)
  • Tool description:

    • "The Chromium Single Cell Gene Expression Solution provides high-throughput, single cell expression measurements that enable discovery of gene expression dynamics and molecular profiling of individual cell types."
  • Tool homepage:

    • https://support.10xgenomics.com/single-cell-gene-expression/software/overview/welcome
  • Complete log file output:

  • Log filename pattern:

    • web_summary.html
  • Most interesting data for General Stats table:

    • Most of the data on the Summary tab
  • Data suitable for MultiQC plot(s):

    • From Summary tab:
      • Knee plot (Barcodes vs UMI counts)
      • Mapping data could be somehow turned into a stacked column plot
    • From Analysis tab:
      • Sequencing Saturation plot
      • Median Genes per Cell plot

santiagorevale avatar May 03 '18 13:05 santiagorevale

Thanks! Parsing the HTML is horrible though; it's far better to use CSV or JSON summary files if at all possible (see, for example, the longranger input that's used).

Phil

ewels avatar May 04 '18 10:05 ewels

Note to self: can probably steal what I wrote for http://10xqc.com/

Looks like I did it all in JavaScript: https://github.com/ewels/10XQC/blob/master/www/js/10XQC_submission.js

Also, I remember now - the HTML contains a compressed blob string with the data that we need, and I think we struggled to find this anywhere else. Still far from ideal (slow to parse, susceptible to breaking with future releases, probably needs a new dependency library to decompress), but could be a last resort if there are no other files with the data that we need.
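
For illustration only, here is a rough Python sketch of what that last-resort decoding could look like. The blob encoding is not specified in this thread, so the sketch assumes a base64-encoded, zlib-compressed JSON string; that is an assumption and would need checking against a real web_summary.html. `decode_blob` and `encode_blob` are hypothetical helper names, and the sample data is made up. If the assumption held, no new dependency would be needed because everything here is in the standard library.

```python
import base64
import json
import zlib


def decode_blob(blob):
    """Decode a blob ASSUMED to be base64-encoded, zlib-compressed JSON."""
    compressed = base64.b64decode(blob)
    return json.loads(zlib.decompress(compressed).decode("utf-8"))


def encode_blob(data):
    """Inverse of decode_blob; only here to build a test blob for the demo."""
    raw = json.dumps(data).encode("utf-8")
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


# Made-up payload standing in for plot data embedded in the report.
sample = {"knee_plot": {"x": [1, 2, 3], "y": [900, 450, 12]}}
round_trip = decode_blob(encode_blob(sample))
```

The round trip only shows that the decoding steps are consistent with each other; it says nothing about the actual format used in the 10X reports.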

ewels avatar May 04 '18 11:05 ewels

Hi Phil,

Yes, you are right. You have done the work on 10xqc. I'm also attaching the CSV file from 10X for the General Metrics because, as you mentioned, it will be much easier to parse. However, the interesting data for plotting is in the HTML file. Maybe General Metrics could be read from the CSV and only the data for plotting from the HTML?

  • Complete log file output:
    • metrics_summary.txt Here is the metrics text file for this report, also provided by 10X. I renamed it from CSV to TXT to be able to upload it here.
    • Some more examples can be downloaded from 10X website here
  • Log filename pattern:
    • web_summary.html
    • metrics_summary.csv
  • Most interesting data for General Stats table:
    • Most data from metrics_summary.csv.
  • Data suitable for MultiQC plot(s):
    • From metrics_summary.csv:
      • Mapping data could be somehow turned into a stacked column plot
    • From web_summary.html:
      • From Summary tab:
        • Knee plot (Barcodes vs UMI counts)
      • From Analysis tab:
        • Sequencing Saturation plot
        • Median Genes per Cell plot

santiagorevale avatar May 08 '18 13:05 santiagorevale

Ok great, thanks! Yes, the parsing from the HTML is a difficult task, so that will not be near the top of the list for a long time I'm afraid. The CSV is much quicker.

Phil

ewels avatar May 08 '18 15:05 ewels

+1 for this. At least for the metrics_summary.csv part for now.

AMChalkie avatar Oct 03 '18 06:10 AMChalkie

This would be a great feature for MultiQC! Thanks for working on this and for a great tool overall.

FerrenaAlexander avatar Jan 04 '19 19:01 FerrenaAlexander

Really eager to see this released

chansigit avatar May 12 '19 07:05 chansigit

+1 metrics_summary.csv provides the required info..

rgranit avatar Jan 23 '21 21:01 rgranit

Hi both,

This module request is in the queue, along with a lot of others. I am struggling to cope with the maintenance of MultiQC recently and my focus is on merging pull-requests and high priority bugs. So realistically I am not going to have time to write this module any time soon. If you'd like to have a stab at it, the documentation should hopefully be fairly comprehensive: https://multiqc.info/docs/#writing-new-modules

Phil

ewels avatar Jan 24 '21 11:01 ewels

Hi Phil,

I'll try and have a go at it.. 🤓

rgranit avatar Jan 24 '21 21:01 rgranit

@rgranit Did you make progress on this?

chris-rands avatar Mar 16 '21 10:03 chris-rands

Hi @chris-rands, unfortunately I haven't made much progress (beyond preparing my work ENV according to the MultiQC spec).. I hope to find some time in the next few weeks

rgranit avatar Mar 16 '21 12:03 rgranit

Hello, is there any progress on this issue? Maybe I can help? Let me know! And point me to any git branch or fork that I can join to help you.

drbecavin avatar Jun 23 '22 12:06 drbecavin

I haven't gotten around to reading the MultiQC documentation in order to implement the feature here, but I wrote a simple package with some of this functionality.. https://github.com/compugen-ltd/supercells (sorry, it was just simpler for me)

rgranit avatar Jun 23 '22 19:06 rgranit

Finally added in #1689 and #1821! 🎉 Thanks @edg1983!

ewels avatar Jan 08 '23 23:01 ewels

Hello @ewels and @edg1983, Fantastic to see this module come to life. Thank you very much for putting this together! Would it be too much work to have the module also work on files generated by the cellranger-arc-2.0.0 pipeline (at least the cellranger-arc count part that concerns the RNAseq data layer of the multiome assay)? 😇 The HTML files are indeed structured a bit differently. Thank you!!

andrea-de-micheli avatar Feb 06 '23 15:02 andrea-de-micheli

@andrea-de-micheli - please open a new issue and attach some zipped example report files 😊

ewels avatar Feb 06 '23 16:02 ewels