cgmlst-dists
Add functionality to use chewBBACA's v3 new hashed output
chewBBACA was recently updated with an option to output hashed sequences instead of allele numbers. Because cgmlst-dists applies specific filters to its input data, it is incompatible with this new format.
This PR addresses this issue by adding a -H parameter to the command line interface to indicate the new hashed sequence format. When the parameter is supplied, a 64-bit integer is calculated from the hexadecimal hash and is used instead of the allele number when calculating distances.
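To illustrate the idea of reducing a hexadecimal hash to a 64-bit integer, here is a minimal Python sketch. It is not the code from this PR: the function name `hash_to_uint64` is hypothetical, and keeping the low 64 bits of the digest is an assumption about how the reduction could be done.

```python
MASK_64 = 0xFFFFFFFFFFFFFFFF  # all ones in the low 64 bits


def hash_to_uint64(hex_hash: str) -> int:
    """Reduce a hexadecimal digest to an unsigned 64-bit integer.

    Assumption: the low 64 bits of the digest's integer value are kept.
    The actual reduction used by the tool may differ.
    """
    return int(hex_hash, 16) & MASK_64
```

Two alleles with the same hash always map to the same integer, so equality comparisons behave as they do for allele numbers. Different hashes could in principle collide after the reduction to 64 bits, although this is very unlikely for a well-distributed hash.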
Furthermore, when any allele in the pairwise comparison is of the form "-" or "NA", it is ignored and does not add to the distance.
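The missing-allele rule can be sketched in Python as follows. This is an illustration only, not the code from this PR; the names `MISSING` and `pairwise_distance` are hypothetical, and profiles are assumed to be equal-length lists of allele strings.

```python
MISSING = {"-", "NA"}  # markers treated as missing alleles


def pairwise_distance(profile_a, profile_b):
    """Count loci where both alleles are present and differ.

    A locus is skipped when either allele is "-" or "NA",
    so missing data never adds to the distance.
    """
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a not in MISSING and b not in MISSING and a != b
    )


# Locus 2 differs; loci 3 and 4 each contain a missing allele and are skipped.
print(pairwise_distance(["1", "2", "-", "4"], ["1", "3", "5", "NA"]))  # prints 1
```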