
dmr multi doesn't handle replicates

Open Ge0rges opened this issue 1 year ago • 1 comments

Hi @ArtRand,

Running the following:

modkit dmr multi \
  -s methylation_10/brevundimonas_r-contigs/barcode01.bed.gz top \
  -s methylation_10/brevundimonas_r-contigs/barcode02.bed.gz middle \
  -s methylation_10/brevundimonas_r-contigs/barcode03.bed.gz bottom \
  -s methylation_10/brevundimonas_r-contigs/barcode05.bed.gz top \
  -s methylation_10/brevundimonas_r-contigs/barcode06.bed.gz middle \
  -s methylation_10/brevundimonas_r-contigs/barcode07.bed.gz bottom \
  -s methylation_10/brevundimonas_r-contigs/barcode08.bed.gz top \
  -s methylation_10/brevundimonas_r-contigs/barcode09.bed.gz middle \
  -s methylation_10/brevundimonas_r-contigs/barcode10.bed.gz bottom \
  -s methylation_10/brevundimonas_r-contigs/barcode11.bed.gz barcode11 \
  -s methylation_10/brevundimonas_r-contigs/barcode12.bed.gz barcode12 \
  -s methylation_10/brevundimonas_r-contigs/barcode13.bed.gz barcode13 \
  -s methylation_10/brevundimonas_r-contigs/barcode14.bed.gz barcode14 \
  -r methylation_10/brevundimonas_r-contigs/gene-coordinates.txt \
  -o methylation_10/brevundimonas_r-contigs/dmr_by_gene/ \
  -t 20 \
  --ref mags/brevundimonas_r-contigs.fna \
  --base C \
  --base A \
  --min-valid-coverage 10

I get the following error:

> creating directory at "methylation_10/brevundimonas_r-contigs/dmr_by_gene/"
> loaded 3361948 regions
> processing 30 regions concurrently
> Error! refusing to overwrite "methylation_10/brevundimonas_r-contigs/dmr_by_gene/top_middle.bed"

This happens after the progress bar shows an initial top/middle comparison, and then the same comparison comes up again. I expected (maybe wrongly?) that the replicates would somehow be aggregated and then compared. I currently achieve this by aggregating the modBAMs before creating the pileup that I pass to this command.
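
For anyone hitting the same error, here is a rough Python sketch of why repeated sample names would collide. It assumes, without having checked the modkit source, that each pair of samples is written to a file named `<a>_<b>.bed`, which is what the error message suggests. The `names` list mirrors the order of the `-s` arguments above; everything else is illustrative.

```python
from collections import Counter
from itertools import combinations

# Sample names in the order they appear after each -s argument above.
names = ["top", "middle", "bottom"] * 3 + [
    "barcode11", "barcode12", "barcode13", "barcode14",
]

# Assumption (not confirmed against modkit): one output file per pair of
# samples, named "<a>_<b>.bed". Count how often each file name would occur.
outputs = Counter(f"{a}_{b}.bed" for a, b in combinations(names, 2))

# Any file name requested more than once would trigger an overwrite.
collisions = sorted(name for name, count in outputs.items() if count > 1)

print("top_middle.bed requested", outputs["top_middle.bed"], "times")
print("colliding outputs:", collisions)
```

Under that naming assumption, every pair of replicate names maps to the same output path, so the second top/middle comparison is refused.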

Ge0rges avatar Jul 15 '24 20:07 Ge0rges

Hey @Ge0rges,

I agree that modkit dmr multi should handle replicates the way that the other dmr commands do when you've annotated them like you have here. I can work on implementing that feature.

ArtRand avatar Jul 15 '24 22:07 ArtRand

Hello @Ge0rges,

As of version 0.4.0, you can use multiple replicates in modkit dmr multi by giving the samples to be combined the same name. Using your example:

modkit dmr multi
-s methylation_10/brevundimonas_r-contigs/barcode01.bed.gz top   <-- this will be combined with
-s methylation_10/brevundimonas_r-contigs/barcode02.bed.gz middle
-s methylation_10/brevundimonas_r-contigs/barcode03.bed.gz bottom
-s methylation_10/brevundimonas_r-contigs/barcode05.bed.gz top   <-- this one
-s methylation_10/brevundimonas_r-contigs/barcode06.bed.gz middle
-s methylation_10/brevundimonas_r-contigs/barcode07.bed.gz bottom
-s methylation_10/brevundimonas_r-contigs/barcode08.bed.gz top   <-- and this one
-s methylation_10/brevundimonas_r-contigs/barcode09.bed.gz middle
-s methylation_10/brevundimonas_r-contigs/barcode10.bed.gz bottom
-s methylation_10/brevundimonas_r-contigs/barcode11.bed.gz barcode11
-s methylation_10/brevundimonas_r-contigs/barcode12.bed.gz barcode12
-s methylation_10/brevundimonas_r-contigs/barcode13.bed.gz barcode13
-s methylation_10/brevundimonas_r-contigs/barcode14.bed.gz barcode14
-r methylation_10/brevundimonas_r-contigs/gene-coordinates.txt
-o methylation_10/brevundimonas_r-contigs/dmr_by_gene/
-t 20 --ref mags/brevundimonas_r-contigs.fna --base C --base A --min-valid-coverage 10

and so forth.
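
If it helps to check which files end up grouped together before running the command, something like the following Python sketch could be used. It only groups the paths by the name given after each `-s` and assembles the same `-s <path> <name>` arguments shown above; the `prefix`, `samples`, `groups`, and `args` variables are illustrative names and not part of modkit.

```python
from collections import defaultdict

prefix = "methylation_10/brevundimonas_r-contigs"

# (barcode, sample name) pairs, in the same order as the command above.
samples = [
    ("barcode01", "top"), ("barcode02", "middle"), ("barcode03", "bottom"),
    ("barcode05", "top"), ("barcode06", "middle"), ("barcode07", "bottom"),
    ("barcode08", "top"), ("barcode09", "middle"), ("barcode10", "bottom"),
    ("barcode11", "barcode11"), ("barcode12", "barcode12"),
    ("barcode13", "barcode13"), ("barcode14", "barcode14"),
]

# Group pileup files by sample name: files sharing a name are the
# replicates that get combined.
groups = defaultdict(list)
for barcode, name in samples:
    groups[name].append(f"{prefix}/{barcode}.bed.gz")

# Assemble the -s arguments exactly as they are written in the command.
args = ["modkit", "dmr", "multi"]
for barcode, name in samples:
    args += ["-s", f"{prefix}/{barcode}.bed.gz", name]

for name, paths in groups.items():
    print(f"{name}: {len(paths)} replicate(s)")
```

Here `top`, `middle`, and `bottom` each collect three pileups, while the `barcode11` to `barcode14` samples stay as single-replicate groups.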

ArtRand avatar Sep 19 '24 21:09 ArtRand