Aaron Rosenfeld
When collapsing similar clones with the same copy number, V-identity should be used to break the tie. https://github.com/arosenfeld/immunedb/blob/24d34edbc6b732fdc2848d05d5c6919ba39fe1df/immunedb/aggregation/clones.py#L91-92
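A minimal sketch of the proposed tie-break, assuming each clone exposes a copy number and a V-identity value. The `CloneCandidate` class, its field names, and `pick_representative` are hypothetical and are not taken from ImmuneDB's code:

```python
from dataclasses import dataclass


@dataclass
class CloneCandidate:
    # Hypothetical stand-in for a clone record; field names are assumptions.
    clone_id: int
    copy_number: int
    v_identity: float


def pick_representative(candidates):
    """Return the candidate with the highest copy number.

    Ties on copy number are broken by the higher V-identity, because
    tuples compare element by element.
    """
    return max(candidates, key=lambda c: (c.copy_number, c.v_identity))
```

Sorting on a `(copy_number, v_identity)` tuple keeps copy number as the primary criterion, so V-identity only matters when copy numbers are equal.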
When swapping sample names, this duplication check does not prevent an integrity error: https://github.com/arosenfeld/immunedb/blob/0cfc896d7cdacfa38b7a7516b7d20cce16dfa043/immunedb/common/modify.py#L38
When trees are regenerated but no sequences are found, the tree should be set to null. https://github.com/arosenfeld/immunedb/blob/0cfc896d7cdacfa38b7a7516b7d20cce16dfa043/immunedb/trees/__init__.py#L58-L65
V-ties across genes with different IMGT gapping at the allele level cause odd behavior. E.g.:
```
>IGHV4-4*04
...CCATCAGC---------AGTA...
>IGHV4-4*07
...CCATC------------AGTA...
>TIE
...CCATCNNN---------AGTA...
```
These genes should not tie at all...
Add a CDR3 amino-acid (AA) filter for clones and sequences in the API.
If baseline fails during clone_stats, the only error reported is that the output file was not found. The baseline error itself should be displayed instead.
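One way to surface the underlying error is sketched below, assuming the external tool is launched as a subprocess. How clone_stats actually invokes baseline is not shown here, so `run_tool` and `ExternalToolError` are hypothetical names:

```python
import subprocess


class ExternalToolError(RuntimeError):
    """Raised when an external command exits with a non-zero status."""


def run_tool(cmd):
    """Run cmd (a list of arguments) and return its stdout.

    On failure, raise an error that carries the tool's own stderr so the
    caller sees the real cause instead of a later missing-file error.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise ExternalToolError(
            f"{cmd[0]} exited with status {result.returncode}:\n"
            f"{result.stderr.strip()}"
        )
    return result.stdout
```

Checking the return code before looking for the output file means the first error the user sees is the one the tool itself printed.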
Mutation stats for a given region are not populated if there are no mutations in that region. Instead, every mutation type should be set to 0 for that region.
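The expected zero-filled output could look like the sketch below. The region and mutation-type labels, and the `summarize_mutations` helper, are assumptions chosen for illustration and may not match ImmuneDB's actual names:

```python
from collections import Counter

# Assumed labels for illustration only.
REGIONS = ("FR1", "CDR1", "FR2", "CDR2", "FR3", "CDR3")
MUTATION_TYPES = ("synonymous", "conservative", "nonconservative")


def summarize_mutations(mutations):
    """Count mutations per region and type.

    mutations is an iterable of (region, mutation_type) pairs. Every
    region gets an entry for every mutation type, with 0 where nothing
    was observed (Counter returns 0 for missing keys).
    """
    counts = Counter(mutations)
    return {
        region: {mtype: counts[(region, mtype)] for mtype in MUTATION_TYPES}
        for region in REGIONS
    }
```

Building the result from the fixed lists of regions and types, rather than from the observed mutations, guarantees that regions without mutations still appear with explicit zeros.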