Add --platypus option to PMDtools

Open jfy133 opened this issue 5 years ago • 8 comments

Is your feature request related to a problem? Please describe.

@pontussk suggests letting users run PMDtools with the --platypus flag. This is apparently used for mammalian nDNA, as CpG cytosines are not affected by UDG treatment (so, when used alongside --deamination, it still lets you see damage patterns in UDG-treated data!).

He also recommends including --first and --CpG.

I will add those flags and maybe ask him to contribute some documentation (or 'recommendations' when to use those flags).

jfy133 avatar Feb 10 '20 14:02 jfy133

Note that --CpG is already implemented in https://github.com/nf-core/eager/blob/da407f8c589da286ff54904106801a7be2d4b05c/main.nf#L1610, where it is always selected if the UDG type is 'full'. If UDG is set to 'half', the --UDGhalf parameter is used instead.

--deamination is already added automatically at https://github.com/nf-core/eager/blob/da407f8c589da286ff54904106801a7be2d4b05c/main.nf#L1623

So I will just add --first and --platypus
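
As a rough illustration of the logic described above (not the actual code in main.nf), the flag selection could be sketched in shell as follows. The function name `build_pmd_flags` and its two arguments are hypothetical; only the PMDtools flags themselves are taken from this thread, and the assumption is that any UDG type other than 'full' or 'half' adds no UDG-specific flag.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of eager or PMDtools): assemble the
# PMDtools flags from a UDG treatment type and an optional --platypus switch.
build_pmd_flags() {
    udg_type="$1"            # assumed values: "full", "half", or anything else
    platypus="$2"            # "true" appends --platypus
    flags="--deamination"    # added regardless of UDG type

    case "$udg_type" in
        full) flags="$flags --CpG" ;;
        half) flags="$flags --UDGhalf" ;;
    esac

    if [ "$platypus" = "true" ]; then
        flags="$flags --platypus"
    fi

    printf '%s\n' "$flags"
}

# Example: full-UDG library with --platypus requested
build_pmd_flags full true
```

The resulting string would then be appended to the pmdtools call; --first is left out of the sketch on purpose.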

jfy133 avatar Feb 10 '20 15:02 jfy133

Thanks for investigating this!

apeltzer avatar Feb 11 '20 18:02 apeltzer

I am preparing a PR that adds the option to include --platypus in the command.

I cannot get --first to work. It seems that the latest pmdtools release lacks some code changes that this option needs to work as intended (at least in my limited tests).

tagged release code: https://github.com/pontussk/PMDtools/blob/0.60/pmdtools.0.60.py#L1074-L1082

vs code at master (cc91fb8) https://github.com/pontussk/PMDtools/blob/master/pmdtools.0.60.py#L1075-L1081

I suspect a new release will need to be pushed before --first can be implemented in eager.

TCLamnidis avatar Mar 22 '21 13:03 TCLamnidis

Can you make an issue on PMDtools to ask, and ref back here, so we can keep track of the convo?

jfy133 avatar Mar 22 '21 13:03 jfy133

https://github.com/pontussk/PMDtools/issues/8

TCLamnidis avatar Mar 22 '21 14:03 TCLamnidis

Hi Thiseas,

It is unfortunate that --first does not work. I can have a look at fixing this if you are able to send me any error messages.

Best, Pontus

pontussk avatar Mar 22 '21 20:03 pontussk

Using Conda

$ conda create --name pmdtools pmdtools -c bioconda
$ conda activate pmdtools
pmdtools v0.50
$ samtools calmd -b JK2067_rmdup.bam hs37d5.fa | samtools view -h - | pmdtools --deamination --first --platypus --range 10 --CpG -n 10000 > "JK2067".cpg.range."10".1st.txt

Traceback (most recent call last):
  File "/projects1/users/lamnidis/miniconda3/envs/pmdtools/bin/pmdtools", line 1074, in <module>
    if n>0:
NameError: name 'n' is not defined
samtools view: writing to standard output failed: Broken pipe
samtools view: error closing standard output: -1
[E::bgzf_flush] File write failed (wrong size)
samtools calmd: failed to write to output file: Broken pipe
[E::bgzf_close] File write failed

Master branch (cc91fb8)

$ cd ~/Software/
$ git clone git@github.com:pontussk/PMDtools.git
$ python2 ~/Software/PMDtools/pmdtools.0.60.py --version

pmdtools.0.60.py v0.50
$ cd -
$ samtools calmd -b JK2067_rmdup.bam hs37d5.fa | samtools view -h - | python2 ~/Software/PMDtools/pmdtools.0.60.py --deamination --first --platypus --range 10 --CpG  -n 10000 > "JK2067".cpg.range."10".1st.github.txt

samtools view: writing to standard output failed: Broken pipe
samtools view: error closing standard output: -1
[bam_fillmd1] different MD for read 'K00233:65:HKTTVBBXX:6:2119:1387:28446': '30^T24' -> '30^N24'
[bam_fillmd1] different MD for read 'K00233:65:HKTTVBBXX:5:1112:4066:2070': '0A0C0T53' -> '0A0C0N53'

The latter command provides a valid output file, but the former does not.

$ ls -l 

-rw-r--r-- 1 lamnidis domain users 4.2K Mar 24 19:12 JK2067.cpg.range.10.1st.github.txt
-rw-r--r-- 1 lamnidis domain users    0 Mar 24 19:05 JK2067.cpg.range.10.1st.txt
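
A zero-byte result like the one listed above can also be caught in a script rather than by eye. This is a minimal sketch, assuming a POSIX-like shell; the function name `require_nonempty` is hypothetical and not part of PMDtools or eager. It relies on `[ -s FILE ]`, which is true only if the file exists and is non-empty.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fail if an output file is missing or has zero size.
require_nonempty() {
    if [ -s "$1" ]; then
        echo "OK: $1"
    else
        echo "EMPTY or missing: $1" >&2
        return 1
    fi
}

# Self-contained demo with temporary stand-in files
tmpdir=$(mktemp -d)
printf 'some output\n' > "$tmpdir/good.txt"   # stands in for a valid result
: > "$tmpdir/bad.txt"                         # stands in for a 0-byte result

require_nonempty "$tmpdir/good.txt"
require_nonempty "$tmpdir/bad.txt" || echo "empty output detected"

rm -rf "$tmpdir"
```

Applied to the two files above, the check would be expected to pass for JK2067.cpg.range.10.1st.github.txt and fail for JK2067.cpg.range.10.1st.txt.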

I attach the BAM and BAI files used above: Archive.zip

TCLamnidis avatar Mar 24 '21 18:03 TCLamnidis

So the pipeline is able to do it; once the PMDtools tool / recipe is fixed, this will be runnable 👍🏻

apeltzer avatar Mar 29 '21 15:03 apeltzer