Add support for structural alphabets
Structural alphabets are a fusion of structure and sequence methods and can greatly benefit from the functionality already implemented in Biotite. In summary, they tokenize each residue into a symbol from an alphabet of limited size. The resulting sequence can then be used as input to sequence-based methods. Support for the following structural alphabets is planned:
- [x] Protein Blocks (#676)
- [ ] ~CLePAPS (#681)~ (on hold due to inconsistency with reference implementation)
- [x] 3Di (#665)
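
To illustrate the tokenization idea described above, here is a minimal sketch in plain Python. It does not use Biotite's API or any of the alphabets listed: the three-letter alphabet, the angle thresholds, the `*` placeholder for undefined residues and the `tokenize()` helper are all assumptions made up for this example. It only shows how per-residue structural features can be turned into a string that a generic sequence method can consume.

```python
from difflib import SequenceMatcher

# Hypothetical placeholder for residues whose angles are not defined
# (e.g. chain termini); chosen for illustration only.
UNDEFINED = "*"


def tokenize(dihedrals):
    """Map each residue's (phi, psi) angles in degrees to one symbol.

    Toy alphabet (an assumption, not a published one):
    'H' = helix-like, 'E' = strand-like, 'C' = everything else.
    """
    symbols = []
    for phi, psi in dihedrals:
        if phi is None or psi is None:
            symbols.append(UNDEFINED)
        elif phi < 0 and -120 <= psi <= 30:
            symbols.append("H")
        elif phi < 0 and (psi > 90 or psi < -150):
            symbols.append("E")
        else:
            symbols.append("C")
    return "".join(symbols)


# Made-up backbone angles for two short peptides
structure_a = [(None, 140), (-60, -45), (-62, -41), (-58, -47), (-120, 130), (-115, None)]
structure_b = [(None, -40), (-65, -40), (-60, -45), (60, 40), (-125, 135), (-118, None)]

seq_a = tokenize(structure_a)  # "*HHHE*"
seq_b = tokenize(structure_b)  # "*HHCE*"

# Any sequence-based method can now operate on the symbol strings;
# here a generic string similarity stands in for a real alignment.
similarity = SequenceMatcher(None, seq_a, seq_b).ratio()
print(seq_a, seq_b, round(similarity, 2))
```

A real structural alphabet would use a richer structural description per residue and a dedicated substitution matrix for alignment, but the overall flow (structure to symbols, symbols to sequence method) is the same.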
Furthermore, the following tasks need to be done for all of them:
- [x] Add common `undefined_symbol` for all structural alphabets
- [x] Add benchmark for each method
- [x] Move their test modules into `tests/structure/alphabet`
- [x] Generate color schemes for their substitution matrices with `gecos` and update `color_schemes` example
- [ ] Add docstring for `biotite.structure.alphabet`
- [ ] Add tutorial for structural alphabets
- [ ] Add at least one example
- [x] Mention structural alphabets in the Sequence section on the home page