pypiper
Hard coded total chromosome sizes
Total chromosome sizes are hardcoded in the functions "macs2CallPeaksATACSeq" and "macs2CallPeaks" of ngstk.py, so I ran into problems when I ran the analysis with mm9. Maybe these values could be moved to atacseq.yaml.
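To illustrate the idea, here is a minimal sketch of looking up the genome size from the pipeline config instead of hardcoding it. The `genome_sizes` key and the `get_genome_size` helper are hypothetical (neither exists in ngstk.py or atacseq.yaml as far as this thread shows), and the two values are only the human and mouse sizes listed in the MACS documentation, used here as placeholders.

```python
# Hypothetical config contents, as they might look after parsing atacseq.yaml.
# The "genome_sizes" key is an assumption made for this sketch.
config = {
    "genome_sizes": {
        "hg19": 2.7e9,   # MACS value for human ("hs")
        "mm9": 1.87e9,   # MACS value for mouse ("mm")
    }
}


def get_genome_size(config, genome):
    """Return the configured genome size for `genome` as an integer.

    Raises a KeyError with a descriptive message when the genome is not
    configured, instead of silently falling back to another genome's size.
    """
    sizes = config.get("genome_sizes", {})
    if genome not in sizes:
        raise KeyError(
            f"No genome size configured for '{genome}'; "
            "add it under 'genome_sizes' in the config."
        )
    return int(sizes[genome])


# The returned value could then be passed to macs2 via its -g option.
mm9_size = get_genome_size(config, "mm9")
```

Failing loudly for an unconfigured genome seems safer than a default, since a wrong genome size would otherwise go unnoticed.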
Also, I wonder whether those genome sizes are correct. For mm9, I summed the chromosome sizes from the chromosome sizes file /data/prod/ngs_resources/genomes/mm9/mm9_chromlength.txt, and the total I get corresponds exactly to the one listed here: http://genomewiki.ucsc.edu/index.php/Genome_size_statistics
However, if I do the same for the other genomes (e.g. hg19), I get 3.1e9 bases, which is similar to the value at the link above but different from what is defined in ngstk.py.
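For reference, the summation I mean can be sketched as below. This assumes the chromosome sizes file is whitespace-delimited with the chromosome name in the first column and its length in the second (the usual UCSC chrom.sizes layout); I have not confirmed that every file under ngs_resources follows it. The function name `total_genome_size` is just for illustration.

```python
import io


def total_genome_size(lines):
    """Sum the second column (chromosome length) over all data lines.

    `lines` is any iterable of strings, e.g. an open file handle.
    Blank lines, comment lines and lines with fewer than two fields
    are skipped.
    """
    total = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        total += int(fields[1])
    return total


# Tiny made-up example instead of a real genome file.
example = io.StringIO("chr1\t1000\nchr2\t500\n\n# comment line\nchrM\t16\n")
example_total = total_genome_size(example)
```

With a real file this would be `with open(path) as handle: total_genome_size(handle)`. Note that a total computed this way counts every base, including N gaps.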
Those numbers are taken straight from MACS (https://github.com/taoliu/MACS). I guess one could be more accurate, but I don't think it is that critical.
@afrendeiro what do you think about changing these to use refgenieconf? All we would need is a chrom_sizes asset, and then you would just use `refgenieconf.get_asset(genome, "chrom_sizes")` to get the chrom sizes file.
That way it works with any genome.