DBSCAN-SWA

Error in determining bac_sequence

Open peng-ye opened this issue 2 years ago • 1 comments

Dear authors,

I found that some IDs have no sequence in the resulting fna file. From what I saw, all of the affected sequences have a start coordinate of 0, i.e., the corresponding IDs look like `xxxxx|0:\d+|DBSCAN-SWA` (see below). I think something is wrong in how the boundaries for bac_sequence are determined.

Another observation supporting this is that many sequences start with "[T|G|C]ATG" rather than "ATG". It seems the extraction window should be shifted one base to the right.
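A minimal sketch for checking an output file for both symptoms: records with no sequence, and records that begin with one extra base before "ATG". The function name `scan_fasta` and the example records are hypothetical and only illustrate the pattern; they are not taken from DBSCAN-SWA output.

```python
import re


def scan_fasta(text):
    """Return (empty_ids, shifted_ids) for FASTA-formatted text.

    empty_ids:   records whose header is not followed by any sequence.
    shifted_ids: records whose sequence starts with [TGC]ATG, i.e. one
                 extra base in front of a start codon.
    """
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:]
            records[current] = []
        elif current is not None:
            records[current].append(line)

    empty, shifted = [], []
    for rec_id, parts in records.items():
        seq = "".join(parts).upper()
        if not seq:
            empty.append(rec_id)
        elif re.match(r"[TGC]ATG", seq):
            shifted.append(rec_id)
    return empty, shifted


# Made-up records for illustration only.
example = """>contig1|0:9|DBSCAN-SWA
>contig1|20:29|DBSCAN-SWA
TATGAAATAA
>contig1|40:49|DBSCAN-SWA
ATGCCCTAA
"""

empty_ids, shifted_ids = scan_fasta(example)
print("empty:", empty_ids)
print("shifted:", shifted_ids)
```

A sequence can legitimately begin with "[T|G|C]ATG", so the second list is only a hint; it becomes meaningful when most records show the pattern.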

Would you please help check it out? Thanks.

[Screenshot: Screen Shot 2022-05-13 at 00 16 44]

peng-ye avatar May 12 '22 16:05 peng-ye

I am sorry for the mistake. I parsed protein locations using the Python package "Bio", and the start location was adjusted by 1 base automatically. I have now updated dbscan-swa.py at https://github.com/gancao/DBSCAN-SWA-1
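As an illustration of how a one-base offset in the start coordinate could produce both reported symptoms, here is a minimal sketch in plain Python. It assumes 0-based, end-exclusive coordinates (the convention Biopython uses for feature locations) and assumes the offset came from subtracting 1 from a start that was already 0-based. That cause is a guess for illustration, not the confirmed code path in dbscan-swa.py, and the helper names and toy genome are hypothetical.

```python
# Toy genome: a gene at 0:9, a 2-base spacer, and a gene at 11:20.
gene_a = "ATGAAATAA"
spacer = "GG"
gene_b = "ATGCCCTGA"
genome = gene_a + spacer + gene_b


def extract_buggy(seq, start, end):
    # Assumed bug: treats an already 0-based start as if it were 1-based.
    return seq[start - 1:end]


def extract_fixed(seq, start, end):
    # 0-based, end-exclusive coordinates map directly onto a Python slice.
    return seq[start:end]


# Start 0: the slice becomes seq[-1:9], which is empty in Python.
print(repr(extract_buggy(genome, 0, 9)))
# Start 11: one extra upstream base appears before the start codon.
print(extract_buggy(genome, 11, 20))

print(extract_fixed(genome, 0, 9))
print(extract_fixed(genome, 11, 20))
```

Under these assumptions, a start of 0 yields an empty sequence because the negative index wraps to the end of the string, and every other start picks up one upstream base, matching the "[T|G|C]ATG" pattern.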

Thanks for your interest in DBSCAN-SWA. If you have any other questions, please comment on GitHub.

gancao avatar May 14 '22 07:05 gancao