TADbit
TADbit copied to clipboard
Restriction with Arima kit
Dear team,
I would like to clarify if the --renz parameter in tadbit map supports the restriction site for the ARIMA kit (https://arimagenomics.com/products/genome-wide-hic/) - ^GATC,G^ANTC. If so, what is the string I need to use for the --renz parameter?
Many thanks, Jeff
Hi Jeff,
yes, this would be the enzymes DpnII (or MboI) and HinfI. So you can just input --renz MboI Hinfl
.
Let us know how this works.
cheers
Hi @fransua,
Thank you for the reply.
I ran tadbit map with --renz MboI Hinfl and --renz MboI DpnII HinfI and it produced the following error -
Writing log to /home/jgodwin/Data/Hi-C/RK_data/2_Tadbit_Out/process.log
Writing versions of TADbit and dependencies
Generating Hi-C QC plot
Traceback (most recent call last):
File "/home/jgodwin/miniconda3/envs/tadbit_add/bin/tadbit", line 170, in
The mapping runs fine with just --renz DpnII but the addition of HinfI produces the above error. How do I troubleshoot this?
Thanks
It appears the "HinfI" enzyme (STRING) is the issue. Any other combinations of enzymes proceeds normally. Running with just --renz HinfI produces the same error as above Curiously it does appear in the list of available enzymes yet somehow is not identified by the fastq_utils.
this is indeed unexpected...
I had a look to it, with the current version of TADbit in this repo.
Seems to work fine with me (in the mapping stage at least).
but I am perhaps missing something, can you share your command line?
My TADbit version is v1.1
I'm attaching my bash script for your reference -
fastq_dir="/home/jgodwin/Data/Hi-C/RK_data/1_Raw_data" workdir="/home/jgodwin/Data/Hi-C/RK_data/2_Tadbit_Out" yeast_ref="/home/jgodwin/Data/Hi-C/Yeast_ref"
tadbit map -w ${workdir} --fastq ${fastq_dir}/WT_L1_1.dsrc --read 1 --index ${yeast_ref}/sacCer3.gem --renz DpnII HinfI --genome ${yeast_ref}/sacCer3.fa -C 3 tadbit map -w ${workdir} --fastq ${fastq_dir}/WT_L1_2.dsrc --read 2 --index ${yeast_ref}/sacCer3.gem --renz DpnII HinfI --genome ${yeast_ref}/sacCer3.fa -C 3
I have tried with --renz MboI HinfI as well.
ok, I have tried with Saccharomyces reference genome and the QC step is generated properly:
(uggly because using mock data)
but it's not the genome either... can you try to reinstall TADbit? Perhaps it is something that has been fixed already.
(Note: MboI and DpnII are the same internally, it should not matter if you use one or the other... but perhaps you could try MboI and HindIII)
Unfortunately the problem exists after re-installation too.
Running with MboI and HindIII proceeds without a problem. Attaching the log here -
Writing log to /home/jgodwin/Data/Hi-C/RK_data/2_Tadbit_Out/process.log Writing versions of TADbit and dependencies Generating Hi-C QC plot
- Dangling-ends (sensu-stricto): 10.191%
- Dangling-ends (sensu-stricto): 0.138%
- Ligation sites: 3.069%
- Ligation sites: 0.011%
- Ligation sites: 0.006%
- Ligation sites: 0.000% mapping /home/jgodwin/Data/Hi-C/RK_data/1_Raw_data/WT_L1_1.dsrc read 1 to /home/jgodwin/Data/Hi-C/RK_data/2_Tadbit_Out Preparing FASTQ file
- conversion to MAP format Mapping reads in window 1-end_5f7b312935... TO GEM 3 /home/jgodwin/Data/Hi-C/RK_data/2_Tadbit_Out_tmp_r1_5f7b312935/WT_L1_1_a90dcp1y /home/jgodwin/miniconda3/envs/tadbit-3d/bin/gem-mapper -I /home/jgodwin/Data/Hi-C/Yeast_ref/sacCer3.gem -t 3 -F SAM -o /home/jgodwin/Data/Hi-C/RK_data/2_Tadbit_Out_tmp_r1_5f7b312935/WT_L1_1_a90dcp1y_full_1-end_5f7b312935.map -i /home/jgodwin/Data/Hi-C/RK_data/2_Tadbit_Out_tmp_r1_5f7b312935/WT_L1_1_a90dcp1y Parsing result... x removing GEM input /home/jgodwin/Data/Hi-C/RK_data/2_Tadbit_Out_tmp_r1_5f7b312935/WT_L1_1_a90dcp1y x removing map /home/jgodwin/Data/Hi-C/RK_data/2_Tadbit_Out_tmp_r1_5f7b312935/WT_L1_1_a90dcp1y_full_1-end_5f7b312935.map
- splitting into restriction enzyme (RE) fragments using ligation sites
- ligation sites are replaced by RE sites to match the reference genome
- enzymes: MboI & MboI, ligation site: GATCGATC, RE site: GATC & GATC
- enzymes: MboI & HindIII, ligation site: GATCAGCTT, RE site: GATC & AAGCTT
- enzymes: HindIII & MboI, ligation site: AAGCTGATC, RE site: AAGCTT & GATC
- enzymes: HindIII & HindIII, ligation site: AAGCTAGCTT, RE site: AAGCTT & AAGCTT Preparing MAP file x removing pre-GEM input /home/jgodwin/Data/Hi-C/RK_data/2_Tadbit_Out_tmp_r1_5f7b312935/WT_L1_1_a90dcp1y_filt_1-end_5f7b312935.map Mapping fragments of remaining reads... TO GEM 3 /home/jgodwin/Data/Hi-C/RK_data/2_Tadbit_Out_tmp_r1_5f7b312935/WT_L1_1_a56ec3qt /home/jgodwin/miniconda3/envs/tadbit-3d/bin/gem-mapper -I /home/jgodwin/Data/Hi-C/Yeast_ref/sacCer3.gem -t 3 -F SAM -o /home/jgodwin/Data/Hi-C/RK_data/2_Tadbit_Out_tmp_r1_5f7b312935/WT_L1_1_a56ec3qt_frag_1-end_5f7b312935.map -i /home/jgodwin/Data/Hi-C/RK_data/2_Tadbit_Out_tmp_r1_5f7b312935/WT_L1_1_a56ec3qt Parsing result... x removing GEM input /home/jgodwin/Data/Hi-C/RK_data/2_Tadbit_Out_tmp_r1_5f7b312935/WT_L1_1_a56ec3qt x removing failed to map /home/jgodwin/Data/Hi-C/RK_data/2_Tadbit_Out_tmp_r1_5f7b312935/WT_L1_1_a90dcp1y_fail_5f7b312935.map x removing tmp mapped /home/jgodwin/Data/Hi-C/RK_data/2_Tadbit_Out_tmp_r1_5f7b312935/WT_L1_1_a56ec3qt_frag_1-end_5f7b312935.map ,---------------. | MAPPED_INPUTs | ,----.--------.------------.------.------.------.--------------.-----------------------------.---------------------------------------------------------------------------------.----------.-----------------.---------. | Id | PATHid | Entries | Trim | Frag | Read | Enzyme | Dangling_Ends | Ligation_Sites | WRKDIRid | MAPPED_OUTPUTid | INDEXid | |----+--------+------------+------+------+------+--------------+-----------------------------+---------------------------------------------------------------------------------+----------+-----------------+---------| | 1 | 2 | 23,709,683 | None | full | 1 | MboI-HindIII | MboI:10.191% HindIII:0.138% | MboI-MboI:3.069% MboI-HindIII:0.011% HindIII-MboI:0.006% HindIII-HindIII:0.000% | 1 | 5 | 3 | | 2 | 2 | 6,487,346 | None | frag | 1 | MboI-HindIII | MboI:10.191% HindIII:0.138% | MboI-MboI:3.069% MboI-HindIII:0.011% HindIII-MboI:0.006% HindIII-HindIII:0.000% | 1 | 6 | 3 | '----^--------^------------^------^------^------^--------------^-----------------------------^---------------------------------------------------------------------------------^----------^-----------------^---------' ,-------. | PATHs | ,----.-------.------------------------------------------------.--------------. | Id | JOBid | Path | Type | |----+-------+------------------------------------------------+--------------| | 1 | 1 | /home/jgodwin/Data/Hi-C/RK_data/2_Tadbit_Out | WORKDIR | | 2 | 1 | ../1_Raw_data/WT_L1_1.dsrc | MAPPED_FASTQ | | 3 | 1 | ../../Yeast_ref/sacCer3.gem | INDEX | | 4 | 1 | WT_L1_1.dsrc_MboI-HindIII_5f7b312935.png | FIGURE | | 5 | 1 | 01_mapped_r1/WT_L1_1_full_1-end_5f7b312935.map | SAM/MAP | | 6 | 1 | 01_mapped_r1/WT_L1_1_frag_1-end_5f7b312935.map | SAM/MAP | '----^-------^------------------------------------------------^--------------' ,------. | JOBs | ,----.------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------.---------------------.---------------------.------.----------------. | Id | Parameters | Launch_time | Finish_time | Type | Parameters_md5 | |----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+------+----------------| | 1 | skip_mapping:0 genome:[/home/jgodwin/Data/Hi-C/Yeast_ref/sacCer3.fa] read:1 chr_name:[] noX:0 fast_fragment:0 cpus:3 mapper:gem mapper_binary:/home/jgodwin/miniconda3/envs/tadbit-3d/bin/gem-mapper mapper_param:{} gem_version:3 | 19/04/2024 17:50:30 | 19/04/2024 18:00:13 | Map | 5f7b312935 | '----^------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------^---------------------^---------------------^------^----------------' ,---------------. | MAPPED_INPUTs | ,----.--------.------------.------.------.------.--------------.-----------------------------.---------------------------------------------------------------------------------.----------.-----------------.---------. | Id | PATHid | Entries | Trim | Frag | Read | Enzyme | Dangling_Ends | Ligation_Sites | WRKDIRid | MAPPED_OUTPUTid | INDEXid | |----+--------+------------+------+------+------+--------------+-----------------------------+---------------------------------------------------------------------------------+----------+-----------------+---------| | 1 | 2 | 23,709,683 | None | full | 1 | MboI-HindIII | MboI:10.191% HindIII:0.138% | MboI-MboI:3.069% MboI-HindIII:0.011% HindIII-MboI:0.006% HindIII-HindIII:0.000% | 1 | 5 | 3 | | 2 | 2 | 6,487,346 | None | frag | 1 | MboI-HindIII | MboI:10.191% HindIII:0.138% | MboI-MboI:3.069% MboI-HindIII:0.011% HindIII-MboI:0.006% HindIII-HindIII:0.000% | 1 | 6 | 3 | '----^--------^------------^------^------^------^--------------^-----------------------------^---------------------------------------------------------------------------------^----------^-----------------^---------' ,-------. | PATHs | ,----.-------.------------------------------------------------.--------------. | Id | JOBid | Path | Type | |----+-------+------------------------------------------------+--------------| | 1 | 1 | /home/jgodwin/Data/Hi-C/RK_data/2_Tadbit_Out | WORKDIR | | 2 | 1 | ../1_Raw_data/WT_L1_1.dsrc | MAPPED_FASTQ | | 3 | 1 | ../../Yeast_ref/sacCer3.gem | INDEX | | 4 | 1 | WT_L1_1.dsrc_MboI-HindIII_5f7b312935.png | FIGURE | | 5 | 1 | 01_mapped_r1/WT_L1_1_full_1-end_5f7b312935.map | SAM/MAP | | 6 | 1 | 01_mapped_r1/WT_L1_1_frag_1-end_5f7b312935.map | SAM/MAP | '----^-------^------------------------------------------------^--------------' ,------. | JOBs | ,----.------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------.---------------------.---------------------.------.----------------. | Id | Parameters | Launch_time | Finish_time | Type | Parameters_md5 | |----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+------+----------------| | 1 | skip_mapping:0 genome:[/home/jgodwin/Data/Hi-C/Yeast_ref/sacCer3.fa] read:1 chr_name:[] noX:0 fast_fragment:0 cpus:3 mapper:gem mapper_binary:/home/jgodwin/miniconda3/envs/tadbit-3d/bin/gem-mapper mapper_param:{} gem_version:3 | 19/04/2024 17:50:30 | 19/04/2024 18:00:13 | Map | 5f7b312935 | '----^------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------^---------------------^---------------------^------^----------------' cleaning temporary files
It appears it's a problem with the reads. The mapping proceeds with the demo data used for the 3DARORC tutorial (SRR5077821) without a problem. I'm rechecking the reads although I do believe it was prepared with the ARIMA double digestion kit.
Hi, yes, it works for me too with other dataset... I suspect it is either the reference genome, can you try with this index: https://drive.google.com/file/d/1vBqQ3CjlEDnVo9p5cIHt81aB5blFNSDD (guessing you are using gem v3).
or an heterogeneous read lengths, can you check that all your reads are the same size?
Hi Francois,
The same issue shows up with the new index as well. My read lengths are the same (35bp - wonder if this is tripping something). I tried mapping with bowtie and hisat as well. The reads are from publicly available data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM5918358) I was able to process them through the nextflow HiC pipeline (https://nf-co.re/hic/2.1.0/docs/usage) with the --digestion 'arima' parameter. I will let you know if anything else comes up.
Thanks again.