esm
atom name all left justified
Bug description In all PDB files provided by the ESM Atlas, the atom names in columns 13-16 of the ATOM records are always left-justified, which is not the standard formatting specified by the PDB format at https://www.wwpdb.org/documentation/file-format-content/format33/sect9.html#ATOM
Reproduction steps For example, the first residue of MGYP000911143359 is
ATOM      1 N    MET A   1     -26.091  68.903   7.841  1.00  0.90           N
ATOM      2 CA   MET A   1     -26.275  67.677   7.069  1.00  0.91           C
ATOM      3 C    MET A   1     -24.933  67.025   6.755  1.00  0.90           C
ATOM      4 CB   MET A   1     -27.033  67.967   5.773  1.00  0.89           C
ATOM      5 O    MET A   1     -24.314  67.331   5.734  1.00  0.90           O
ATOM      6 CG   MET A   1     -28.544  67.973   5.934  1.00  0.86           C
ATOM      7 SD   MET A   1     -29.390  68.904   4.598  1.00  0.86           S
ATOM      8 CE   MET A   1     -29.202  67.734   3.224  1.00  0.83           C
Expected behavior In standard PDB format, the above residue should have been
ATOM      1  N   MET A   1     -26.091  68.903   7.841  1.00  0.90           N
ATOM      2  CA  MET A   1     -26.275  67.677   7.069  1.00  0.91           C
ATOM      3  C   MET A   1     -24.933  67.025   6.755  1.00  0.90           C
ATOM      4  CB  MET A   1     -27.033  67.967   5.773  1.00  0.89           C
ATOM      5  O   MET A   1     -24.314  67.331   5.734  1.00  0.90           O
ATOM      6  CG  MET A   1     -28.544  67.973   5.934  1.00  0.86           C
ATOM      7  SD  MET A   1     -29.390  68.904   4.598  1.00  0.86           S
ATOM      8  CE  MET A   1     -29.202  67.734   3.224  1.00  0.83           C
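To make the expected justification concrete, here is a minimal Python sketch of the usual PDB convention for columns 13-16: names of one-letter elements start in column 14, while four-character names and names of two-letter elements start in column 13. The function name `format_atom_name` is hypothetical, and this is not the code ESM uses to write its files; it assumes the element symbol of each atom is known.

```python
def format_atom_name(name: str, element: str) -> str:
    """Return the 4-character atom name field (columns 13-16 of an ATOM record).

    Hypothetical helper for illustration; assumes `element` is the atom's
    element symbol (e.g. "C", "N", "FE").
    """
    name = name.strip()
    element = element.strip()
    if not name or len(name) > 4:
        raise ValueError(f"atom name must be 1-4 characters, got {name!r}")
    # Four-character names (e.g. "HD11") fill the whole field, and names of
    # two-letter elements (e.g. iron "FE", calcium "CA") start in column 13.
    if len(name) == 4 or len(element) == 2:
        return name.ljust(4)
    # Names of one-letter elements (C, N, O, S, H, ...) start in column 14.
    return (" " + name).ljust(4)


if __name__ == "__main__":
    # Alpha carbon vs. calcium: same name, different columns.
    for name, element in [("N", "N"), ("CA", "C"), ("SD", "S"), ("CA", "CA")]:
        print(f"{name:>4} ({element:>2}) -> {format_atom_name(name, element)!r}")
```

The alpha carbon / calcium pair shows why the column matters: both are named `CA`, and only the justification (together with the element column) tells them apart.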
Additional context Although the non-standard atom name justification does not affect visualization of the PDB file, it does affect some structure analysis tools such as US-align and REDUCE.
Thanks for flagging!
@nikitos9000 Let's fix this in our internal and released esmfold infer_pdb
functions simultaneously so new predictions are formatted correctly.
Unfortunately we won't be able to fix the existing predictions in the Atlas.
I have prepared a C++ program at https://github.com/pylelab/USalign/blob/master/pdbAtomName.cpp which can very quickly fix the atom names in old ESM Atlas PDB files. With this program, it should be pretty easy to fix all existing predictions in a few days. The program can be compiled with:
git clone https://github.com/pylelab/USalign.git
cd USalign
make pdbAtomName
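For anyone who would rather not compile anything, the same kind of repair can be roughly sketched in Python. This is not a port of pdbAtomName.cpp, whose exact logic may differ; the function names `fix_atom_line` and `fix_pdb_file` are hypothetical, and the sketch assumes the element symbol is present in columns 77-78 of every ATOM/HETATM record. Lines without an element symbol are left untouched.

```python
def fix_atom_line(line: str) -> str:
    """Re-justify the atom name (columns 13-16) of one ATOM/HETATM line.

    Assumption: the element symbol sits in columns 77-78. Any other line,
    or a record with no element symbol, is returned unchanged.
    """
    if not line.startswith(("ATOM  ", "HETATM")):
        return line
    body = line.rstrip("\r\n")
    ending = line[len(body):]            # preserve the original line ending
    name = body[12:16].strip()           # columns 13-16 (0-based slice 12:16)
    element = body[76:78].strip()        # columns 77-78
    if len(body) < 16 or not name or not element:
        return line
    if len(name) < 4 and len(element) == 1:
        field = (" " + name).ljust(4)    # one-letter element: start in column 14
    else:
        field = name.ljust(4)            # 4-char name or two-letter element
    return body[:12] + field + body[16:] + ending


def fix_pdb_file(src: str, dst: str) -> None:
    """Copy the PDB file `src` to `dst`, fixing atom name justification."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            fout.write(fix_atom_line(line))
```

Only columns 13-16 are rewritten, so coordinates, B-factors and every other field pass through byte for byte. A call such as `fix_pdb_file("in.pdb", "out.pdb")` (file names are placeholders) processes one file.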