ColabFold
Using embeddings (single residue and pairwise)
How can I connect the embedding to the original sequence? For example, for an input sequence of 120 residues, I got a single-residue embedding of shape 132x384. Does it include insertions in the MSA? Thanks!
The first 120 positions are correct. The remaining 12 are padding. We should have trimmed the representation.
To avoid padding, you can use --recompile-padding 1.0
Thank you for pointing out the issue and providing the solution. I ran into the same issue: the first dimension of the generated single representation is larger than the length of the protein FASTA sequence. I summed over the last dimension of the single representation with np.sum(single_representation, 1) and found that, as you pointed out, the positions beyond the sequence length are all padding. I think the fastest fix is to simply drop those positions.
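For anyone who already has padded outputs, trimming them afterwards could look roughly like the sketch below. It is only an illustration: `trim_padding` is a hypothetical helper (not part of ColabFold), the arrays are synthetic stand-ins rather than real outputs, the padded rows are filled with zeros purely for the demo, and the pair channel size of 128 is an assumption. The only thing taken from this thread is that the first `seq_len` positions correspond to the input sequence.

```python
import numpy as np


def trim_padding(single, pair, seq_len):
    """Keep only the first seq_len positions of padded representations.

    single: array of shape (L_pad, C_single)
    pair:   array of shape (L_pad, L_pad, C_pair)
    Hypothetical helper for illustration; not a ColabFold function.
    """
    if single.shape[0] < seq_len or min(pair.shape[:2]) < seq_len:
        raise ValueError("seq_len is larger than the padded representation")
    # Residue i of the sequence maps to row i; everything after is padding.
    return single[:seq_len], pair[:seq_len, :seq_len]


# Synthetic stand-ins with the shapes from this thread (120 residues, 132 padded).
seq_len, padded_len = 120, 132
rng = np.random.default_rng(0)

single = np.zeros((padded_len, 384), dtype=np.float32)
single[:seq_len] = rng.normal(size=(seq_len, 384))

# Pair channel size 128 is an assumption made for this demo.
pair = np.zeros((padded_len, padded_len, 128), dtype=np.float32)
pair[:seq_len, :seq_len] = rng.normal(size=(seq_len, seq_len, 128))

single_trimmed, pair_trimmed = trim_padding(single, pair, seq_len)
print(single_trimmed.shape, pair_trimmed.shape)
```

After trimming, row i of the single representation and entry (i, j) of the pair representation line up with residues i and j of the input sequence, so the sequence length from the FASTA file is all that is needed to pick the cut-off.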