MultiQC
Module request: CellRanger COUNT
Related to #413.

Name of tool:
- CellRanger COUNT (from 10X Genomics)

Tool description:
- "The Chromium Single Cell Gene Expression Solution provides high-throughput, single cell expression measurements that enable discovery of gene expression dynamics and molecular profiling of individual cell types."

Tool homepage:
- https://support.10xgenomics.com/single-cell-gene-expression/software/overview/welcome

Complete log file output:
- Attached file web_summary.html.gz. GZipped only for uploading.

Log filename pattern:
- web_summary.html

Most interesting data for General Stats table:
- Most of the data on the Summary tab

Data suitable for MultiQC plot(s):
- From Summary tab:
  - Knee plot (Barcodes vs UMI counts)
  - Mapping data could be somehow turned into a stacked column plot
- From Analysis tab:
  - Sequencing Saturation plot
  - Median Genes per Cell plot

Thanks! Parsing the HTML is horrible though. It's far better to use CSV or JSON summary files if at all possible (see, for example, the longranger input that's used).
Phil
Note to self: can probably steal what I wrote for http://10xqc.com/
Looks like I did it all in JavaScript: https://github.com/ewels/10XQC/blob/master/www/js/10XQC_submission.js
Also, I remember now - the HTML contains a compressed blob string with the data that we need, and I think we struggled to find this anywhere else. Still far from ideal (slow to parse, susceptible to breaking with future releases, probably needs a new dependency library to decompress), but could be a last resort if there are no other files with the data that we need.
Hi Phil,
Yes, you are right. You have done the work on 10xqc. I'm also attaching the CSV file from 10X for the General Metrics because, as you mentioned, it will be much easier to parse. However, the interesting data for plotting is in the HTML file. Maybe General Metrics could be read from the CSV and only data for plotting from the HTML?

Complete log file output:
- metrics_summary.txt: here is the metrics text file for this report, also provided by 10X. I renamed it from CSV to TXT to be able to upload it here.
- Some more examples can be downloaded from the 10X website here

Log filename pattern:
- web_summary.html
- metrics_summary.csv

Most interesting data for General Stats table:
- Most data from metrics_summary.csv.

Data suitable for MultiQC plot(s):
- From metrics_summary.csv:
  - Mapping data could be somehow turned into a stacked column plot
- From web_summary.html:
  - From Summary tab:
    - Knee plot (Barcodes vs UMI counts)
  - From Analysis tab:
    - Sequencing Saturation plot
    - Median Genes per Cell plot

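
As a rough illustration of the stacked column idea, here is a minimal Python sketch that reshapes per-sample mapping percentages into one set of stacked categories per sample. The column names in `CATEGORY_COLUMNS`, the assumption that those categories do not overlap, the helper name `stacked_mapping_data` and the sample values are all assumptions made for illustration. They are not taken from the attached files and should be checked against a real metrics_summary.csv header.

```python
# Assumed metrics_summary.csv column names (hypothetical; check the real header row).
CATEGORY_COLUMNS = {
    "Exonic": "Reads Mapped Confidently to Exonic Regions",
    "Intronic": "Reads Mapped Confidently to Intronic Regions",
    "Intergenic": "Reads Mapped Confidently to Intergenic Regions",
}


def stacked_mapping_data(samples):
    """Return {sample: {category: percent}} for a stacked column plot.

    `samples` maps a sample name to a dict of already-numeric metrics,
    with percentages on a 0-100 scale. Missing columns count as 0.
    """
    plot_data = {}
    for sample, metrics in samples.items():
        parts = {
            category: float(metrics.get(column, 0.0))
            for category, column in CATEGORY_COLUMNS.items()
        }
        # Whatever is left over is grouped into a single remainder category,
        # assuming the categories above are mutually exclusive.
        parts["Other"] = max(0.0, 100.0 - sum(parts.values()))
        plot_data[sample] = parts
    return plot_data


# Made-up values, for illustration only.
samples = {
    "sample_1": {
        "Reads Mapped Confidently to Exonic Regions": 60.0,
        "Reads Mapped Confidently to Intronic Regions": 20.0,
        "Reads Mapped Confidently to Intergenic Regions": 5.0,
    }
}
plot_data = stacked_mapping_data(samples)
print(plot_data["sample_1"])
```

Each inner dict then holds one value per stack segment, which is the general shape a stacked bar plot needs.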
Ok great, thanks! Yes, the parsing from the HTML is a difficult task, so that will not be near the top of the list for a long time I'm afraid. The CSV is much quicker.
Phil
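
For anyone wanting to try the CSV route, here is a minimal sketch of reading such a file with the Python standard library. It assumes the file has one header row and one row of values, with numbers written like "1,234" and percentages like 70.5%. That layout, the column names, the values and the function names are illustrative assumptions, not copied from the attached file.

```python
import csv
import io


def parse_metric_value(raw):
    """Convert strings such as "1,234", "70.5%" or "0.87" to numbers.

    Percentages stay on a 0-100 scale. Anything that does not look
    numeric is returned as stripped text.
    """
    text = raw.strip()
    is_percent = text.endswith("%")
    cleaned = text.rstrip("%").replace(",", "")
    try:
        number = float(cleaned)
    except ValueError:
        return text
    if is_percent:
        return number
    return int(number) if number.is_integer() else number


def parse_metrics_summary(handle):
    """Read a header row plus one value row into a {metric: value} dict."""
    reader = csv.reader(handle)
    header = next(reader)
    values = next(reader)
    return {name: parse_metric_value(value) for name, value in zip(header, values)}


# Made-up example content; with a real file, pass open(path, newline="") instead.
example = io.StringIO(
    "Estimated Number of Cells,Mean Reads per Cell,Sequencing Saturation\n"
    '"1,234","56,789",70.5%\n'
)
metrics = parse_metrics_summary(example)
print(metrics)
```

The `csv` module handles the quoted thousands separators, so only the comma and percent sign need stripping before the numeric conversion.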
+1 for this. At least for the metrics_summary.csv part for now.
This would be a great feature for MultiQC! Thanks for working on this and for a great tool overall.
Really eager to see this released
+1, metrics_summary.csv provides the required info.
Hi both,
This module request is in the queue, along with a lot of others. I am struggling to cope with the maintenance of MultiQC recently and my focus is on merging pull-requests and high priority bugs. So realistically I am not going to have time to write this module any time soon. If you'd like to have a stab at it, the documentation should hopefully be fairly comprehensive: https://multiqc.info/docs/#writing-new-modules
Phil
Hi Phil,
I'll try and have a go at it.. 🤓
@rgranit Did you make progress on this?
Hi @chris-rands, unfortunately I haven't made much progress (beyond preparing my work environment according to the MultiQC spec). I hope to find some time in the next few weeks.
Hello, is there any progress on this issue? Maybe I can help? Let me know, and point me to any git branch or fork that I can join to help you.
I haven't gotten to read the MultiQC documentation in order to implement the feature here, but I wrote a simple package with some of this functionality: https://github.com/compugen-ltd/supercells (sorry, it was just simpler for me)
Finally added in #1689 and #1821! 🎉 Thanks @edg1983!
Hello @ewels and @edg1983, Fantastic to see this module come to life. Thank you very much for putting this together! Would it be too much work to have the module also work on files generated by the cellranger-arc-2.0.0 pipeline (at least the cellranger-arc count part that concerns the RNAseq data layer of the multiome assay)? 😇 The HTML files are indeed structured a bit differently. Thank you!!
@andrea-de-micheli - please open a new issue and attach some zipped example report files 😊