progen

How is the sequence identity calculated in an efficient manner?

Open eric-jm-lang opened this issue 2 years ago • 1 comment

Hello, in your excellent paper, a key aspect is the sequence identity between the artificial sequences and any known natural sequences. May I ask how this sequence identity is calculated efficiently, given that it requires screening entire databases for each sequence? Many thanks in advance.

eric-jm-lang avatar Feb 01 '23 11:02 eric-jm-lang

These values are calculated using the MMseqs2 tool to find the closest matches between the generated sequences and the protein databases. We report the identity to the top database hit for each generated sequence.

jeffreyruffolo avatar Feb 01 '23 18:02 jeffreyruffolo
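
For readers who want to reproduce the "identity to the top database hit" step described above, here is a minimal Python sketch of the post-processing only. It is not the authors' script. It assumes the search was run with something like `mmseqs easy-search` and written in the default tab-separated, 12-column BLAST-style format, and that "top hit" means the hit with the highest bit score; the thread does not state the exact parameters or ranking used. The function name `top_hit_identity` and the sequence names in the example are hypothetical.

```python
import csv

# Column order of MMseqs2's default tab-separated (BLAST m8-style) output.
# Assumption: the search was run without a custom --format-output.
M8_COLUMNS = [
    "query", "target", "identity", "alnlen", "mismatch", "gapopen",
    "qstart", "qend", "tstart", "tend", "evalue", "bits",
]


def top_hit_identity(m8_lines):
    """Return {query: (target, identity)} for the best hit of each query.

    "Best" is taken to be the highest bit score, which is an assumption;
    the identity value is passed through in whatever scale the tool wrote.
    """
    best = {}
    for row in csv.reader(m8_lines, delimiter="\t"):
        if len(row) < len(M8_COLUMNS):
            continue  # skip blank or malformed lines
        hit = dict(zip(M8_COLUMNS, row))
        bits = float(hit["bits"])
        query = hit["query"]
        if query not in best or bits > best[query][2]:
            best[query] = (hit["target"], float(hit["identity"]), bits)
    return {q: (target, ident) for q, (target, ident, _) in best.items()}


# Hypothetical search results for two generated sequences.
example = "\n".join([
    "gen_1\tnatA\t0.620\t120\t45\t1\t1\t120\t3\t122\t1e-40\t150",
    "gen_1\tnatB\t0.874\t129\t16\t0\t1\t129\t1\t129\t1e-70\t245",
    "gen_2\tnatC\t0.431\t98\t55\t2\t5\t102\t10\t107\t1e-12\t61.2",
])
result = top_hit_identity(example.splitlines())
print(result)
```

Generated sequences with no database hit at all do not appear in the result, so they would need to be handled separately (for example, reported as having no detectable homolog).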