Integron_Finder

[BUG] Coordinates error

Open rpalcab opened this issue 6 months ago • 2 comments

Hi,

I have observed a coordinate error that differs from the one previously addressed in issue #114.

When searching for integrons in a set of samples, I found that the integrase promoter is not correctly predicted in a couple of them. Specifically, P_intI1 is reported at coordinates beyond the end of the contig.

To Reproduce

# integron_finder 2.0.6 
# cmd: integron_finder Kpne_VR_40131.fasta --promoter-attI --cpu 2 --outdir .

ID_integron	ID_replicon	element	pos_beg	pos_end	strand	evalue	type_elt	annotation	model	type	default	distance_2attC	considered_topology
integron_01	contig_3	contig_3_205	186675	187688	-1	1.8999999999999998e-24	protein	intI	intersection_tyr_intI	In0	Yes	NA	lin
integron_01	contig_3	Pc_int1	187691	187717	1	NA	Promoter	Pc_1	NA	In0	Yes	NA	lin
integron_01	contig_3	attI1	187768	187826	1	NA	attI	attI_1	NA	In0	Yes	NA	lin
integron_01	contig_3	P_intI1	188068	188102	-1	NA	Promoter	Pint_1	NA	In0	Yes	NA	lin
integron_01	contig_4	contig_4_7	4724	5071	-1	NA	protein	protein	NA	complete	Yes	NA	lin
integron_01	contig_4	attc_001	5174	5233	-1	1.5e-09	attC	attC	attc_4	complete	Yes	NA	lin
integron_01	contig_4	contig_4_8	5235	6014	-1	NA	protein	protein	NA	complete	Yes	NA	lin
integron_01	contig_4	attc_002	6030	6089	-1	1e-06	attC	attC	attc_4	complete	Yes	797.0	lin
integron_01	contig_4	attc_003	6350	6439	-1	2.9e-05	attC	attC	attc_4	complete	Yes	261.0	lin
integron_01	contig_4	contig_4_9	6434	6931	-1	NA	protein	protein	NA	complete	Yes	NA	lin
integron_01	contig_4	attI1	6937	6995	-1	NA	attI	attI_1	NA	complete	Yes	NA	lin
integron_01	contig_4	P_intI1	7020	7054	1	NA	Promoter	Pint_1	NA	complete	Yes	NA	lin
integron_01	contig_4	Pc_int1	7046	7072	-1	NA	Promoter	Pc_1	NA	complete	Yes	NA	lin
integron_01	contig_4	contig_4_10	7076	8089	1	2.6e-25	protein	intI	intersection_tyr_intI	complete	Yes	NA	lin

# integron_finder 2.0.6 
# cmd: integron_finder Ecol_VR_96159.fasta --promoter-attI --cpu 2 --outdir .
ID_integron	ID_replicon	element	pos_beg	pos_end	strand	evalue	type_elt	annotation	model	type	default	distance_2attC	considered_topology
integron_01	contig_8	contig_8_1	2	502	1	NA	protein	protein	NA	CALIN	Yes	NA	lin
integron_01	contig_8	attc_001	497	629	1	1.8e-06	attC	attC	attc_4	CALIN	Yes	NA	lin
integron_01	contig_8	contig_8_2	633	1421	1	NA	protein	protein	NA	CALIN	Yes	NA	lin
integron_01	contig_8	attc_002	1468	1524	1	0.0023	attC	attC	attc_4	CALIN	Yes	839.0	lin
integron_01	contig_8	contig_8_3	1627	1974	1	NA	protein	protein	NA	CALIN	Yes	NA	lin
integron_02	contig_8	contig_8_93	77027	77788	-1	1.2e-22	protein	intI	intersection_tyr_intI	In0	Yes	NA	lin
integron_02	contig_8	Pc_int1	77791	77817	1	NA	Promoter	Pc_1	NA	In0	Yes	NA	lin
integron_02	contig_8	P_intI1	78180	78214	-1	NA	Promoter	Pint_1	NA	In0	Yes	NA	lin

Find the original FASTA files at https://zenodo.org/records/15720417

Expected behavior

I would expect no P_intI1 in the reports, since there is no possible match (neither complete nor partial). In both cases, P_intI1 starts after the contig ends.

  • First sample (Kpne_VR_40131): P_intI1 at (188068-188102), contig ends at 187829
  • Second sample (Ecol_VR_96159): P_intI1 at (78180-78214), contig ends at 77917
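
For anyone who wants to screen other samples for the same symptom, the check described above (comparing reported coordinates against contig lengths) can be sketched roughly as follows. This is a minimal standalone sketch, not part of Integron_Finder: the helper names `fasta_lengths` and `out_of_bounds` are hypothetical, and it only assumes the tab-separated columns shown in the reports above (`ID_replicon`, `element`, `pos_beg`, `pos_end`) and that `#` lines are comments.

```python
import csv


def fasta_lengths(handle):
    """Return {sequence_id: length} for every record in a FASTA handle."""
    lengths = {}
    current = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            current = line[1:].split()[0]
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def out_of_bounds(report_handle, lengths):
    """Yield (replicon, element, pos_beg, pos_end, replicon_length) for every
    report row whose 1-based coordinates fall outside its replicon."""
    # Drop comment lines (e.g. "# cmd: ...") and blank lines before parsing.
    rows = (l for l in report_handle if l.strip() and not l.startswith("#"))
    for row in csv.DictReader(rows, delimiter="\t"):
        length = lengths.get(row["ID_replicon"])
        if length is None:
            continue  # replicon not present in the FASTA, nothing to compare
        beg, end = int(row["pos_beg"]), int(row["pos_end"])
        if beg < 1 or end > length:
            yield row["ID_replicon"], row["element"], beg, end, length
```

Calling `fasta_lengths` on the open FASTA file and passing the result to `out_of_bounds` together with the open report file should list only the rows that extend past a contig end, such as the two P_intI1 rows above.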

Please complete the following information:

OS:

  • [x] Linux
  • [ ] Windows
  • [ ] Mac

Integron_Finder Version:

integron_finder: 2.0.6

rpalcab, Jun 23 '25 10:06

Hi! Any updates about this issue?

rpalcab, Oct 13 '25 15:10

Hello,

Thanks for the reminder. I started to have a look and then forgot about it. I think there is a bug in how we compute the position of the promoter (and likely attI) when it lies at the edge of the replicon. @bneron, if you can have a look, great; otherwise I'll try to have a look at it, but I don't have much time.

Best,
Jean

jeanrjc, Oct 13 '25 16:10
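
The edge-of-replicon hypothesis can be checked against the numbers already posted in this thread. The sketch below is only an arithmetic consistency check, not a description of Integron_Finder's code. It assumes (this is not confirmed anywhere in the thread) that promoters are searched within 500 bp of intI and that the reported P_intI1 is offset by exactly the part of that window that runs past the contig end. The constant `WINDOW` and the function names are hypothetical.

```python
WINDOW = 500  # assumed search distance around intI; not confirmed in this thread


def overhang(inti_end, contig_len, window=WINDOW):
    """Bases by which (inti_end + window) extends past the contig end."""
    return max(0, inti_end + window - contig_len)


def shifted(p_int, inti_end, contig_len, window=WINDOW):
    """Shift a reported (pos_beg, pos_end) back by the window overhang."""
    k = overhang(inti_end, contig_len, window)
    return p_int[0] - k, p_int[1] - k


def overlap(a, b):
    """Length of the overlap between two inclusive intervals (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


# Values copied from the reports above; contig lengths from "Expected behavior".
cases = {
    "Kpne_VR_40131 contig_3": dict(inti_end=187688, contig_len=187829,
                                   p_int=(188068, 188102), pc=(187691, 187717)),
    "Ecol_VR_96159 contig_8": dict(inti_end=77788, contig_len=77917,
                                   p_int=(78180, 78214), pc=(77791, 77817)),
}

# Reference: the complete integron on contig_4, where coordinates look sane.
reference_overlap = overlap((7020, 7054), (7046, 7072))

results = {}
for name, c in cases.items():
    pos = shifted(c["p_int"], c["inti_end"], c["contig_len"])
    results[name] = (pos, overlap(pos, c["pc"]))
```

Under these assumptions, the shifted P_intI1 positions are 187709-187743 and 77809-77843, both inside their contigs. Each overlaps Pc_int1 by 9 bp, the same overlap seen in the complete integron on contig_4. That is consistent with an offset error at the replicon edge, but it does not prove where in the code the offset is introduced.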