TADbit icon indicating copy to clipboard operation
TADbit copied to clipboard

valid read pairs is too few compared with other tools

Open shiyi-pan opened this issue 3 years ago • 4 comments

Hi, I want to use TADbit to detect TADs, the TADbit detect 25,521,413 valid read pairs only ,but other tools such as HiCExplorer could detect 58949201 read pairs using the same data, I think maybe my code has some problem, could you give me some advise? I ues GEM as mapper and here is my code, thank you very much:

from pytadbit.mapping.full_mapper import full_mapping species = 'soybean'

for the first side of the reads

full_mapping(mapper_index_path='NN1138-2.v1.0.genome.gem', out_map_dir='results/fragment/{0}/01_mapping/mapped_{0}_r1/'.format(species), fastq_path='soybean_R1.fastq.gz', r_enz='MboI', frag_map=True, clean=True, nthreads=8, temp_dir='results/fragment/{0}/01_mapping/mapped_r1_tmp/'.format(species))

shiyi-pan avatar Jan 08 '21 15:01 shiyi-pan

Hi,

The number of reads that you get with any pipeline depends on several factors but mainly on two: the mismatch you allow in the mapping of the reads and the filters you applied to the mapped reads. By default we use GEM, but even different versions of GEM have different default mismatches. Version 2 is about 4% or 6% if I remember correctly, other mappers allow higher mismatches like 15%. We also support bowtie2 and hisat2 and those will also produce a different number of reads depending on the parameters you use. By default we use the --very-sensitive option. Then you have the filters to discard what you consider non-informative or errors. Each pipeline has his own (although conceptually similar) filters.

Regards

David

david-castillo avatar Jan 10 '21 15:01 david-castillo

Thank you for your reply, David. I will try other mappers like bowtie2 and hisat2 . I don't know clearly how to use bowtie2 into the TADbit pipeline . Is it just mapping Hi-C reads with bowtie2 and parse the bam file with parse_map function or not ?

shiyi-pan avatar Jan 11 '21 13:01 shiyi-pan

You can use the same code that you already have but specify another mapper in the full_mapper function (like mapper='bowtie2'). Be sure you input the right mapper_index_path because for bowtie2 is not a single file, you have to specify the path and the prefix to the index.

http://3dgenomes.github.io/TADbit/reference/reference_mapping.html#mapping

You can also customize bowtie2 parameters with the mapper_params parameter if you need.

After that the parse is exactly the same.

Regards

David

david-castillo avatar Jan 12 '21 09:01 david-castillo

Thanks a lot , David.

shiyi-pan avatar Jan 12 '21 14:01 shiyi-pan