
Picard Target Region Coverage hard to read

Open Redmar-van-den-Berg opened this issue 2 years ago • 6 comments

Description of bug

In recent versions of Picard CollectHsMetrics (>2.23.8), PCT_TARGET_BASES is reported up to 100000x coverage to support high-depth amplicon sequencing (see https://github.com/broadinstitute/picard/pull/1542).

However, this extends the x-axis of the MultiQC graph up to 100000x, when most 'regular' sequencing projects have a coverage of around 100x.

MultiQC graph for picard 2.23.5 image

MultiQC graph for picard 2.26.10 (same data) image

File that triggers the error

No response

MultiQC Error log

No response

Redmar-van-den-Berg, Feb 09 '22 13:02

MultiQC graph for picard 2.26.10 (same data), with the code from https://github.com/ewels/MultiQC/pull/1626. Note that the x-axis is now logarithmic. image

Redmar-van-den-Berg, Feb 14 '22 04:02

We've come across something similar with InsertSizeMetrics and WgsMetrics before and have these config options as a result:

https://multiqc.info/docs/#insertsizemetrics https://multiqc.info/docs/#wgsmetrics

Instead of a new solution, could we mimic the same pattern for this module for consistency?

ewels, Feb 22 '22 10:02

I can add the config options for consistency, but I would still like to keep the default behaviour as is. That way, a regular 'naive' user will get the third plot by default, but they are free to modify it using the fold_coverage_xmax option. Otherwise, they will get the second plot by default, which would make MultiQC more difficult to use for anyone but power users.

Redmar-van-den-Berg, Feb 23 '22 09:02

I'm sure that MultiQC has functionality to cut off long tails for plots though 🤔 That's what I was originally thinking about with the above post. I'm sure that there is some function or config to set the xmax automatically based on say 90% of the data. Need to sit down and try to find this again..

ewels, Mar 07 '22 21:03

Do you have an update for this issue? My current solution simply cuts off 0 values at the end of the range, so it should mimic the MultiQC functionality to cut off long tails. Note that this does not strip out 0 values that are between higher counts.
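
To illustrate the idea, here is a minimal sketch of trimming trailing zeros. It is not the code from the PR: the function name `strip_trailing_zeros` and the data layout (a dict mapping fold coverage to the percentage of target bases) are assumptions made for this example.

```python
def strip_trailing_zeros(coverage):
    """Drop zero-valued points at the high end of the x-axis range.

    `coverage` is assumed to map fold coverage (x) to a percentage (y).
    Zeros that sit between non-zero values are deliberately kept.
    """
    trimmed = dict(sorted(coverage.items()))
    # Walk down from the highest x and delete zeros until a non-zero value appears.
    for x in sorted(trimmed, reverse=True):
        if trimmed[x] != 0:
            break
        del trimmed[x]
    return trimmed
```

With this approach the x-axis ends at the last coverage value that actually has data, while a 0 in the middle of the range is left in place.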

Redmar-van-den-Berg, Sep 16 '22 06:09

So the insertsize module simply has a config option to set xmax: https://github.com/ewels/MultiQC/blob/81dd59cb6f582bf198e3058b02996d39b02b8175/multiqc/modules/picard/InsertSizeMetrics.py#L188-L191

So a short-term fix for just you would be to customise the plot config for this plot when you run MultiQC (see docs).

The WgsMetrics module is a bit more clever and is the one I was thinking about. Unless a threshold is manually defined in a config, it runs over the data and sets the xmax at 99%:

https://github.com/ewels/MultiQC/blob/81dd59cb6f582bf198e3058b02996d39b02b8175/multiqc/modules/picard/WgsMetrics.py#L148-L159
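
Roughly, the idea looks like the sketch below. This is not the linked code: the function name `auto_xmax`, its parameters, and the assumed data layout (sample name mapped to a histogram of x value to count) are hypothetical and only meant to show the pattern of "use the configured value if there is one, otherwise cut at a fraction of the data".

```python
def auto_xmax(data, fraction=0.99, configured_xmax=None):
    """Pick an x-axis maximum that covers `fraction` of each sample's data.

    `data` is assumed to look like {sample_name: {x_value: count}}.
    A manually configured xmax always takes precedence.
    """
    if configured_xmax is not None:
        return configured_xmax

    xmax = 0
    for hist in data.values():
        total = sum(hist.values())
        if total == 0:
            continue  # nothing to measure for this sample
        running = 0
        for x in sorted(hist):
            running += hist[x]
            # Stop at the first x where the cumulative share reaches the threshold.
            if running / total >= fraction:
                xmax = max(xmax, x)
                break
    return xmax
```

The largest cut-off across samples is used so that no sample is truncated below its own threshold. Note that this assumes histogram-like counts; cumulative percentages such as PCT_TARGET_BASES would need a different criterion.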

What would be ideal would be to take this code (or something like it) and move it into the core line graph plotting code. Then it could be switched on or off for any line graph plot with a config option. There's already some similar stuff, like the data smoothing code, so it could follow a similar model (probably just setting xmax if it's not already set).

If you fancy having a stab at writing this, that would be great. As you can see by the number of open PRs I have a huge backlog currently (new job + paternity leave) but I'm doing my best to work through it.

Phil

ewels, Sep 16 '22 08:09

Is anyone owning this issue since Phil has been out?

tmelman, Feb 28 '23 18:02

I'm back tomorrow! 🎉 I'll try to take a look when I can. It'll likely still be a little while, so if anyone fancies having a go at what I proposed above then please go ahead 👍🏻 (just make a comment here first so that we don't duplicate work).

ewels, Feb 28 '23 20:02

This is fixed with https://github.com/ewels/MultiQC/pull/1626 :)

vladsavelyev, Nov 17 '23 11:11