graphein
Add support for DSSP version and add rsa node features to residues with insertion codes
Reference Issues/PRs
Fixes Issue #353 Fixes Issue #354
What does this implement/fix? Explain your changes
For Issue #353, this PR adds a line that determines the installed DSSP version and passes it to the function Bio.PDB.DSSP.dssp_dict_from_pdb_file. Otherwise, Biopython falls back to its default version number, 3.9.9, which leads to an empty DSSP DataFrame for users with DSSP version >= 4.0.0.
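A minimal sketch of the version-detection idea, not the exact line added in this PR. It assumes the DSSP executable prints a dotted version number when called with --version; the helper names parse_dssp_version and detect_dssp_version are hypothetical.

```python
import re
import subprocess
from typing import Optional


def parse_dssp_version(version_output: str) -> Optional[str]:
    """Return the first dotted version number (e.g. '4.0.4') found in the text."""
    match = re.search(r"(\d+\.\d+\.\d+)", version_output)
    return match.group(1) if match else None


def detect_dssp_version(executable: str = "mkdssp", default: str = "3.9.9") -> str:
    """Ask the DSSP executable for its version; fall back to `default` on failure.

    Assumption: the executable accepts `--version` and prints a dotted version.
    """
    try:
        result = subprocess.run(
            [executable, "--version"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # Executable not found or not runnable: keep Biopython's default.
        return default
    return parse_dssp_version(result.stdout + result.stderr) or default


# Hypothetical usage with Biopython (requires Biopython and a DSSP install):
# from Bio.PDB.DSSP import dssp_dict_from_pdb_file
# dssp_dict, keys = dssp_dict_from_pdb_file(
#     "input_pdb_cryst1.pdb",
#     DSSP="mkdssp",
#     dssp_version=detect_dssp_version("mkdssp"),
# )
```

Keeping the parsing separate from the subprocess call makes the version logic testable on machines where DSSP is not installed.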
For Issue #354, this modification fixes errors caused by the rsa feature missing from nodes with insertion codes. The cause is that insertion codes were skipped when creating node_id. This modification adds insertion codes to node_id if insertions is set to True in ProteinGraphConfiguration.
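To illustrate why skipping insertion codes loses residues, here is a small sketch. The helper make_node_id and the colon-separated ID layout are assumptions for illustration, not graphein's actual implementation.

```python
def make_node_id(
    chain_id: str,
    residue_name: str,
    residue_number: int,
    insertion: str = "",
    insertions: bool = False,
) -> str:
    """Build a colon-separated node ID (hypothetical layout).

    The insertion code is appended only when `insertions` is True and the
    residue actually carries a non-blank insertion code.
    """
    node_id = f"{chain_id}:{residue_name}:{residue_number}"
    if insertions and insertion.strip():
        node_id = f"{node_id}:{insertion.strip()}"
    return node_id


# Two residues sharing number 52, one of them with insertion code "A".
residues = [("H", "SER", 52, ""), ("H", "GLY", 52, "A"), ("H", "GLY", 52, "B")]

# Without insertion codes, 52A and 52B collapse onto the same ID.
without = {make_node_id(*r, insertions=False) for r in residues}
# With insertion codes, every residue keeps a distinct ID.
with_codes = {make_node_id(*r, insertions=True) for r in residues}
```

When IDs collide, per-residue features such as rsa cannot be matched back to every node, which is consistent with the missing-feature error described above.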
What testing did you do to verify the changes in this PR?
Added a script test_dssp.py to tests/features
The example input pdb file (with cryst1 line and insertion codes) is attached here: input_pdb_cryst1.pdb.gz
GitHub does not support uploading files with the .pdb suffix, hence the compressed version. Uncompress it before running the test.
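If a command-line gunzip is not at hand, the attachment can be uncompressed from Python. This is a small sketch using only the standard library; the helper name decompress_gz is hypothetical, and the file is assumed to sit in the current working directory.

```python
import gzip
import shutil
from pathlib import Path


def decompress_gz(src, dst=None) -> Path:
    """Decompress a .gz file; by default, write next to it without the .gz suffix."""
    src = Path(src)
    dst = Path(dst) if dst is not None else src.with_suffix("")
    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dst


# Only attempt the decompression if the attachment has been downloaded.
attachment = Path("input_pdb_cryst1.pdb.gz")
if attachment.exists():
    decompress_gz(attachment)  # writes input_pdb_cryst1.pdb
```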
Pull Request Checklist
- [ ] Added a note about the modification or contribution to the `./CHANGELOG.md` file (if applicable)
- [x] Added appropriate unit test functions in the `./graphein/tests/*` directories (if applicable)
- [x] Modify documentation in the corresponding Jupyter Notebook under `./notebooks/` (if applicable)
- [x] Ran `python -m py.test tests/` and make sure that all unit tests pass (for small modifications, it might be sufficient to only run the specific test file, e.g., `python -m py.test tests/protein/test_graphs.py`)
- [x] Checked for style issues by running `black .` and `isort .`
Thanks @biochunan LGTM! Only comment re: pdb files - I'm quite sure these can be uploaded uncompressed. See: https://github.com/a-r-j/graphein/tree/master/tests/protein/test_data
On second thought, it's likely the .gitignore that's preventing you from adding the .pdb file. Try: git add input_pdb_cryst1.pdb --force.
Hi @a-r-j thanks for going through the PR.
For the pdb file, I was trying to upload it through the PR window and GitHub doesn't support files with the suffix pdb.
Will use existing pdb files in ./test_data in the future.
I had a look at the three pdb files in tests/protein/test_data, and none of them has insertion codes. Maybe it's a good idea to add a PDB file with insertion codes for insertion-related testing.
Kudos, SonarCloud Quality Gate passed!
- 0 Bugs
- 0 Vulnerabilities
- 0 Security Hotspots
- 2 Code Smells (2 new issues, 0 accepted issues)
- No Coverage information
- 0.0% Duplication on New Code
Codecov Report
Attention: Patch coverage is 90.00%, with 1 line in your changes missing coverage. Please review.
Project coverage is 45.07%. Comparing base (8123f42) to head (26cfdbb). Report is 166 commits behind head on master.
| Files | Patch % | Lines |
|---|---|---|
| graphein/protein/features/nodes/dssp.py | 90.00% | 1 Missing :warning: |
Additional details and impacted files
Coverage Diff

| Metric | master | #355 | +/- |
|---|---|---|---|
| Coverage | 40.27% | 45.07% | +4.80% |
| Files | 48 | 113 | +65 |
| Lines | 2811 | 7916 | +5105 |
| Hits | 1132 | 3568 | +2436 |
| Misses | 1679 | 4348 | +2669 |