Add support for chewBBACA v3's new hashed output

Open KHajji opened this issue 1 year ago • 0 comments

chewBBACA recently got an update that adds an option to output hashed sequences instead of allele numbers. Because cgmlst-dists applies specific filters to its input data, it is incompatible with this new format.

This PR addresses this issue by adding a -H parameter to the command line interface to indicate the new hashed sequence format. When the parameter is supplied, a 64-bit integer is calculated from the hexadecimal hash and is used instead of the allele number when calculating distances.

Furthermore, when either allele in a pairwise comparison is "-" or "NA", that locus is ignored and does not add to the distance.
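To illustrate the behaviour described above, here is a minimal Python sketch. It is not the code from this PR (cgmlst-dists itself is written in C). The helper names `hash_to_u64`, `parse_allele` and `distance` are hypothetical, and reducing the hash to 64 bits by keeping its first 16 hexadecimal digits is an assumption; the PR may derive the integer differently.

```python
# Hypothetical sketch of the hashed-profile comparison described above.
MISSING = {"-", "NA"}  # allele calls that are skipped in comparisons


def hash_to_u64(hex_hash):
    """Reduce a hexadecimal hash to an unsigned 64-bit integer.

    Assumption: only the first 16 hex digits (64 bits) are kept.
    """
    return int(hex_hash[:16], 16)


def parse_allele(cell):
    """Return None for a missing call, otherwise the 64-bit value of the hash."""
    return None if cell in MISSING else hash_to_u64(cell)


def distance(profile_a, profile_b):
    """Count loci where both alleles are present and differ."""
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a is not None and b is not None and a != b
    )


sample_a = [parse_allele(c) for c in ["0a1b2c", "ffee01", "-", "9d8c7b"]]
sample_b = [parse_allele(c) for c in ["0a1b2c", "ffee02", "abc123", "NA"]]
print(distance(sample_a, sample_b))  # only the second locus counts
```

In this example the first locus matches, the second differs, and the last two are skipped because one side is "-" or "NA", so the distance is 1. Truncating to 64 bits means two different hashes could in principle collide and be counted as the same allele.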

KHajji avatar May 22 '23 10:05 KHajji