HiCUP
ERROR: The restriction site (re1) needs to be a valid DNA sequence
Hi, Steve, I received this error: The restriction site (re1) needs to be a valid DNA sequence
My digest header looks like this:
Genome:hg38 Restriction_Enzyme1:DpnII [^GATC] Arima [G^ANTC] Restriction_Enzyme2:None Hicup Digester version 0.8.0
Chromosome Fragment_Start_Position Fragment_End_Position Fragment_Number RE1_Fragment_Number 5'_Restriction_Site 3'_Restriction_Site
chr1 1 11159 1 1 None Re1
chr1 11160 11507 2 2 Re1 Re1
I am using hicup-0.9.2.
This digest is also not working:
Genome:hg38 Restriction_Enzyme1:DpnII [^GATC] HinfI [G^AATC] Restriction_Enzyme2:None [None] Hicup Digester version 0.6.1
I'm not sure what is going on, since the first digest was provided by a company and should have been used by many others. Thanks for your time.
Jun
Hi Jun,
Thank you for your message. I'm not sure why this is happening. However, before I investigate further, I suggest that you re-make your digest file using hicup_digester v0.9.2; an incompatibility between HiCUP v0.6.1 and v0.9.2 may be causing the error.
Does that resolve the issue?
Regards, Steven
Hi, Steven, Thanks for your reply (I do not check my Gmail very often). I did run a digest using HiCUP v0.9.2 and it gives the same error. The error presumably comes from the line where you check the enzyme description for valid base letters (A, C, T, G, N), but I can't find anything wrong with the header, which is very strange and frustrating.
I will try to run a test dataset as described in the tutorial. I will also ask our admin to install an older version of HiCUP, if possible, to test that. Thanks, Jun
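For anyone debugging a similar header, the kind of check described above can be sketched in Python. This is not HiCUP's actual code (HiCUP is written in Perl); the function names and the allowed alphabets are assumptions made for illustration only. The sketch pulls the bracketed recognition sequences out of the `Restriction_Enzyme1:` field of the digest header and reports any character that falls outside a chosen alphabet.

```python
import re

def extract_sites(header_line, field="Restriction_Enzyme1:", stop="Restriction_Enzyme2:"):
    """Return the bracketed recognition sequences listed after `field`.

    Hypothetical helper: works whether the header columns are separated
    by tabs or by spaces, because it only looks at the text between the
    two field names.
    """
    start = header_line.find(field)
    if start == -1:
        return []
    rest = header_line[start + len(field):]
    end = rest.find(stop)
    if end != -1:
        rest = rest[:end]
    return re.findall(r"\[([^\]]*)\]", rest)

def invalid_characters(site, allowed="ACGTN"):
    """Return the characters in `site` that are not in `allowed`.

    One '^' cut marker is removed before checking. The default alphabet
    is an assumption, not necessarily what HiCUP accepts.
    """
    bases = site.replace("^", "", 1)
    return sorted(set(bases) - set(allowed))

header = ("Genome:hg38\tRestriction_Enzyme1:DpnII [^GATC] Arima [G^ANTC]"
          "\tRestriction_Enzyme2:None\tHicup Digester version 0.8.0")

for site in extract_sites(header):
    # Compare a strict A/C/G/T alphabet with one that also allows N.
    print(site, invalid_characters(site, allowed="ACGT"), invalid_characters(site))
```

If a validator of this kind only accepted A, C, G and T, the `G^ANTC` site in the company's header would be rejected because of the N, while `^GATC` would pass. Whether that is what happens inside HiCUP v0.9.2 is not established in this thread.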
Hi,
Ah, okay. Would you send me the commands you used to:
1. Make the digest file
2. Run the HiCUP pipeline (and attach the config file).
Thanks,
Steven
Hi, Steve, Really appreciate your help on this.
Files I used are attached:
- header_10 files: digests, either from the company (probably old) or made by myself using the current v0.9.2.
- LSF command files (we use bsub to submit jobs)
- the conf file for HiCUP
Of note, the digest command I used contained a G^ANTC site with N in it:
hicup_digester --genome hg38 --re1 ^GATC,DpnII:G^AATC,HinfI:G^ACTC,H2:G^AGTC,H3:G^ATTC,H4 hg38.fa
I was trying to break the HinfI site down into explicit sequences containing only G, A, T and C, because the digester cannot take an N (is it true that the digester cannot take N?). Even so, the digester used only G^AATC, the first HinfI site I listed, and ignored all the other possibilities (H2-H4), so the digestion is only partial (is it true that the digester only takes two enzymes for re1?).
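The enumeration described here, replacing the N in G^ANTC with each of the four bases, can be sketched in Python as follows. The `expand_site` helper and the small `IUPAC` table are hypothetical and only cover A, C, G, T and N; this is an illustration of the expansion, not part of HiCUP.

```python
from itertools import product

# Minimal lookup: only the codes needed here (assumption: no other
# IUPAC ambiguity codes appear in the site).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT"}

def expand_site(site):
    """List every unambiguous site matching `site`, keeping the '^' cut marker."""
    # Characters not in the table (such as '^') map to themselves.
    choices = [IUPAC.get(ch, ch) for ch in site]
    return ["".join(combo) for combo in product(*choices)]

print(expand_site("G^ANTC"))
# ['G^AATC', 'G^ACTC', 'G^AGTC', 'G^ATTC']
```

These are the same four sites that appear in the `--re1` argument of the command above. Whether hicup_digester will use more than two of them in a single run is exactly the open question in this message.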
So for this digestion to be complete, I would have to enumerate the HinfI sites, run the digestions one by one like this, and then merge the result files into one big file (I'm not sure how the company did this). I checked my digest, and it looks like a partial version of the company's complete one.
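As a rough illustration of the merge step, the Python sketch below combines several partial digests of one chromosome by taking the union of their fragment start positions and rebuilding the fragments from those cut points. The `merge_digests` function is hypothetical and the coordinates are toy values, not taken from the attached files. A real merged digest file would also need its fragment numbers and header rewritten, and it is an assumption that HiCUP would accept such a file.

```python
def merge_digests(*digests):
    """Merge fragment lists for ONE chromosome into a single fragment list.

    Each digest is a list of (start, end) tuples with 1-based, inclusive
    coordinates, as in the digest file columns Fragment_Start_Position
    and Fragment_End_Position.
    """
    # Every fragment start after the first marks a cut position.
    starts = sorted({start for digest in digests for start, _ in digest})
    chrom_end = max(end for digest in digests for _, end in digest)
    merged = []
    for i, start in enumerate(starts):
        # A fragment ends one base before the next start, or at the chromosome end.
        end = starts[i + 1] - 1 if i + 1 < len(starts) else chrom_end
        merged.append((start, end))
    return merged

# Toy coordinates for illustration only.
digest_a = [(1, 100), (101, 250), (251, 400)]
digest_b = [(1, 180), (181, 400)]

print(merge_digests(digest_a, digest_b))
# [(1, 100), (101, 180), (181, 250), (251, 400)]
```

The partial digests are assumed to cover the same chromosome from position 1 to the same end; the function would be applied once per chromosome.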
The attached conf file points to the digest file produced by the digestion above (the partial digestion); the error was still the unrecognized-letter (GATCN) error I posted. Please check whether I have made some simple mistake. Thanks,
Jun
Genome:hg38 Restriction_Enzyme1:DpnII [^GATC] Arima [G^ANTC] Restriction_Enzyme2:None Hicup Digester version 0.8.0
Chromosome Fragment_Start_Position Fragment_End_Position Fragment_Number RE1_Fragment_Number 5'_Restriction_Site 3'_Restriction_Site
chr1 1 11159 1 1 None Re1
chr1 11160 11507 2 2 Re1 Re1
chr1 11508 11522 3 3 Re1 Re1
chr1 11523 11685 4 4 Re1 Re1
chr1 11686 12410 5 5 Re1 Re1
chr1 12411 12460 6 6 Re1 Re1
chr1 12461 12685 7 7 Re1 Re1
chr1 12686 12828 8 8 Re1 Re1

Genome:hg38 Restriction_Enzyme1:DpnII [^GATC] HinfI [G^AATC] Restriction_Enzyme2:None [None] Hicup Digester version 0.6.1
Chromosome Fragment_Start_Position Fragment_End_Position Fragment_Number RE1_Fragment_Number 5'_Restriction_Site 3'_Restriction_Site
chr1 1 11159 1 1 None Re1
chr1 11160 12410 2 2 Re1 Re1
chr1 12411 12460 3 3 Re1 Re1
chr1 12461 12685 4 4 Re1 Re1
chr1 12686 12828 5 5 Re1 Re1
chr1 12829 13314 6 6 Re1 Re1
chr1 13315 13419 7 7 Re1 Re1
chr1 13420 13565 8 8 Re1 Re1