funannotate
Imperfect solution to antiSMASH cluster numbering
This is the solution I am currently using for the issue described in #736 where antiSMASH cluster numbering starts over from 1 on each contig. It isn't terribly elegant, but at least each cluster ends up with a different number.
I think the issue here is in the parsing of the clusters. Originally the feature in the GenBank file for each cluster was `protocluster`; I think that has now changed to either `cluster` or `candidate_cluster`, I can't recall off the top of my head. So perhaps a better fix is to change the parsing of the antiSMASH GBK so that it aligns with the antiSMASH HTML output.
If I understand the GBK file correctly, they are using both `protocluster` and `candidate_cluster`; the issue is that both numbering schemes are at the contig level instead of the genome level.
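To illustrate the contig-level versus genome-level distinction, here is a minimal sketch of the renumbering idea. It is not the code from this PR. It assumes the cluster features have already been pulled out of the GBK as `(contig_id, per_contig_number)` pairs in file order; the function name `renumber_clusters` and the contig names are hypothetical.

```python
from typing import Dict, Iterable, Tuple

ClusterKey = Tuple[str, int]  # (contig_id, per-contig cluster number)


def renumber_clusters(features: Iterable[ClusterKey]) -> Dict[ClusterKey, int]:
    """Map each (contig_id, per_contig_number) pair to a genome-wide number.

    Numbers are handed out in the order the pairs are first seen, so the
    result follows the contig order of the input file.
    """
    mapping: Dict[ClusterKey, int] = {}
    next_number = 1
    for contig_id, local_number in features:
        key = (contig_id, local_number)
        if key not in mapping:  # the same cluster may appear more than once
            mapping[key] = next_number
            next_number += 1
    return mapping


# Hypothetical input: numbering restarts at 1 on every contig.
example = [("contig_1", 1), ("contig_1", 2), ("contig_2", 1), ("contig_3", 1)]
global_numbers = renumber_clusters(example)
```

Keying on the pair rather than on the per-contig number alone is what keeps `("contig_1", 1)` and `("contig_2", 1)` from collapsing into the same cluster.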
I think we just want to use the same numbers they use on the antiSMASH HTML output, correct? In v4 this was not in the GBK file, but I think in >v4 they started to add that value into the GBK file? I don't have an example in front of me to validate.
You mean the 1.1, 1.2, 2.1, etc. numbers in the HTML output? Those are not printed to any field in the GBK file.
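If those labels are not stored in the GBK, they could in principle be rebuilt while parsing. The sketch below rests on an assumption that should be checked against real antiSMASH output: that the first number is the 1-based position of the contig in the file and the second is the per-contig cluster number. The function name `html_style_labels` and the input format are hypothetical.

```python
from typing import List, Tuple


def html_style_labels(cluster_counts: List[Tuple[str, int]]) -> List[Tuple[str, str]]:
    """Build 'record.cluster' labels from (contig_id, n_clusters) pairs.

    Assumption: contigs are counted in file order, including contigs that
    have no clusters, so a cluster-free contig still uses up an index.
    """
    labels: List[Tuple[str, str]] = []
    for record_index, (contig_id, n_clusters) in enumerate(cluster_counts, start=1):
        for local_number in range(1, n_clusters + 1):
            labels.append((contig_id, f"{record_index}.{local_number}"))
    return labels


# Hypothetical genome: two clusters, then a contig with none, then one cluster.
labels = html_style_labels([("contig_1", 2), ("contig_2", 0), ("contig_3", 1)])
```

Under that assumption the labels stay unique across the genome without any separate global counter, because the contig index is part of the label.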