
`dmr multi` still looks for `--regions`

Open Ge0rges opened this issue 2 years ago • 8 comments

I was testing out the latest rc release to do dmr multi without specifying --regions and still got the error:

error: the following required arguments were not provided:                   
  --regions-bed <REGIONS_BED>

I ran: modkit dmr multi $samples -o methylation/$genome_name/ -t 120 --ref $genome --base C --base A where samples defines the series of -s arguments.

Ge0rges avatar Nov 28 '23 02:11 Ge0rges

@Ge0rges yes that's correct. In 0.2.3 (out this week) dmr multi will work at the individual base level.

ArtRand avatar Nov 28 '23 16:11 ArtRand

Apologies, I was unaware of the status of RC releases.

Ge0rges avatar Nov 28 '23 16:11 Ge0rges

Seems like I've run into this again on v0.2.6. Running the following command:

modkit dmr multi -s methylation_5/34h_assembly/top.bed.gz top -s methylation_5/34h_assembly/barcode01.bed.gz barcode01 -s methylation_5/34h_assembly/barcode02.bed.gz barcode02 -s methylation_5/34h_assembly/barcode03.bed.gz barcode03 -s methylation_5/34h_assembly/barcode04.bed.gz barcode04 -s methylation_5/34h_assembly/barcode05.bed.gz barcode05 -s methylation_5/34h_assembly/barcode06.bed.gz barcode06 -s methylation_5/34h_assembly/barcode07.bed.gz barcode07 -s methylation_5/34h_assembly/barcode08.bed.gz barcode08 -s methylation_5/34h_assembly/barcode09.bed.gz barcode09 -s methylation_5/34h_assembly/barcode10.bed.gz barcode10 -s methylation_5/34h_assembly/barcode11.bed.gz barcode11 -s methylation_5/34h_assembly/barcode12.bed.gz barcode12 -s methylation_5/34h_assembly/barcode13.bed.gz barcode13 -s methylation_5/34h_assembly/barcode14.bed.gz barcode14 -s methylation_5/34h_assembly/barcode15.bed.gz barcode15 -s methylation_5/34h_assembly/barcode16.bed.gz barcode16 -s methylation_5/34h_assembly/barcode17.bed.gz barcode17 -s methylation_5/34h_assembly/barcode18.bed.gz barcode18 -o methylation_5/34h_assembly/dmr_by_position/ -t 80 --ref 34h_assembly.fna --base C --base A --min-valid-coverage 5

I get:

error: the following required arguments were not provided:
  --regions-bed <REGIONS_BED>

I'm fairly confident this exact command worked prior to v0.2.6 as I don't recall making any changes to my script.

Ge0rges avatar Apr 11 '24 16:04 Ge0rges

Hello @Ge0rges,

Are you trying to perform single-site analysis with multiple samples? My previous comment ended up being incorrect. There are two things you can do:

  1. To test two conditions across multiple samples, use modkit dmr pair with multiple -a and -b options, for example:
modkit dmr pair \
  -a ${norm_pileup_1}.gz \
  -a ${norm_pileup_2}.gz \
  -b ${tumor_pileup_1}.gz \
  -b ${tumor_pileup_2}.gz \
  -o ${dmr_result_replicates} \
  --ref ${ref} \
  --base C \
  --threads ${threads} \
  --log-filepath dmr.log
  2. To test "all pairwise" combinations of multiple samples (looks like this is what your script does), use modkit dmr multi, but you must pass --regions-bed, as you've discovered. To perform single-site analysis on all pairwise combinations, you have to write a script with a loop.
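
A minimal sketch of such a loop, written as a dry run that only prints one modkit dmr pair command per unordered pair of samples. The flags are the ones from the dmr pair example above; the function name pairwise_dmr_cmds, the <dir>/<sample>.bed.gz layout, the output file names, and the thread count are hypothetical and should be adapted to your setup.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print one `modkit dmr pair` command per unordered sample pair.
# Assumptions (not from modkit itself): pileups live at <dir>/<sample>.bed.gz,
# and results go to <dir>/dmr_by_position/<a>_vs_<b>.bed.

pairwise_dmr_cmds() {
  local dir=$1 ref=$2
  shift 2
  local samples=("$@")
  local n=${#samples[@]}
  local i j a b
  # j starts at i + 1, so each pair is visited once and no sample meets itself.
  for ((i = 0; i < n; i++)); do
    for ((j = i + 1; j < n; j++)); do
      a=${samples[i]}
      b=${samples[j]}
      printf 'modkit dmr pair -a %s/%s.bed.gz -b %s/%s.bed.gz -o %s/dmr_by_position/%s_vs_%s.bed --ref %s --base C --threads 8\n' \
        "$dir" "$a" "$dir" "$b" "$dir" "$a" "$b" "$ref"
    done
  done
}

# Example: three samples yield three commands.
pairwise_dmr_cmds methylation_5/34h_assembly 34h_assembly.fna top barcode01 barcode02
```

Printing the commands first makes it easy to check the pairs before running anything; once they look right, the output can be piped into bash to execute them.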

ArtRand avatar Apr 11 '24 17:04 ArtRand

Hi @ArtRand,

I am indeed trying to do an all pairwise comparison in the context of a single site analysis.

Is there a reason multi doesn't package that loop?

Ge0rges avatar Apr 11 '24 18:04 Ge0rges

@Ge0rges,

You'll need to script up the loop then. There is no reason that dmr multi couldn't do the loop for you. The implementation in multi isn't my favorite so I'd like to refactor it a bit instead of bolting on more features.

You might also be interested in the entropy feature (still alpha). Right now you'd have to combine your mod-BAMs together, but then you could find high-entropy intervals of the assembly, which would indicate differential methylation between your samples.

ArtRand avatar Apr 11 '24 18:04 ArtRand

Thanks @ArtRand! No worries about implementing that immediately, writing the loop is easy.

I'm curious, have you compared DMR results to the entropy feature?

Ge0rges avatar Apr 12 '24 01:04 Ge0rges

Hello @Ge0rges,

I've added the ability to use multiple mod-BAMs as input to the modkit entropy alpha feature (see the above thread for a build).

I took a quick look at a simple correlation between the two metrics; they aren't very well correlated. I think this makes sense because the model in DMR looks at differences in the frequencies of each modification, whereas the entropy calculation is really about how different the modification patterns in the reads are. In fact, you could have equivalent modification rates and high entropy (see Figure 2 of doi:10.1038/ng.3805). I'd be keen to hear what you find in your samples; I'm doing some experiments as well.
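
A toy illustration of that last point, assuming plain Shannon entropy over read-level patterns at two adjacent sites (this is an illustrative assumption, not necessarily the exact quantity modkit entropy reports). Write each read as a two-character pattern, 1 for modified and 0 for unmodified. Samples X and Y below have the same 50% modification rate at both sites, yet different pattern entropy:

```latex
\begin{align*}
H &= -\sum_{p} P(p)\,\log_2 P(p) \\[4pt]
\text{Sample X: } & P(11) = P(00) = \tfrac{1}{2}
  &&\Rightarrow\ \text{rate at each site} = \tfrac{1}{2},\quad
  H_X = -2 \cdot \tfrac{1}{2}\log_2\tfrac{1}{2} = 1 \text{ bit} \\[4pt]
\text{Sample Y: } & P(00) = P(01) = P(10) = P(11) = \tfrac{1}{4}
  &&\Rightarrow\ \text{rate at each site} = \tfrac{1}{2},\quad
  H_Y = -4 \cdot \tfrac{1}{4}\log_2\tfrac{1}{4} = 2 \text{ bits}
\end{align*}
```

A frequency-based comparison sees no difference between X and Y at either site, while a pattern-based measure separates them, which is consistent with the two metrics being only weakly correlated.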

ArtRand avatar Apr 16 '24 00:04 ArtRand