
Molecular volume and (solvent-accessible) surface area

Open schneiderfelipe opened this issue 5 years ago • 40 comments

Hi, I would like to suggest two features. I couldn't find how to calculate the molecular volume and SASA (solvent-accessible surface area) for a trajectory. Do you think those would be useful additions?

schneiderfelipe avatar Jan 04 '20 22:01 schneiderfelipe

These would indeed be great tools to have in MDAnalysis. We would be happy to look at contributions in that direction.

jbarnoud avatar Jan 05 '20 18:01 jbarnoud

A crude estimate of molecular volume might be the volume of the ConvexHull of the molecule using SciPy.

This would likely be reasonably accurate for, e.g., globular structures, but you could probably come up with some structures for which you might need to calculate an alpha shape or something more exotic. I suppose there is likely some literature on this; maybe summing the constituent particle volumes could also be used in some approaches (summing all the Martini sphere volumes? plus some adjustment factor for bonded areas or something).
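As a minimal sketch of the convex-hull estimate, assuming atom positions are available as an (N, 3) NumPy array (in MDAnalysis this would typically be `atomgroup.positions`): the helper name `convex_hull_volume` is made up for illustration, and atoms are treated as points, so atomic radii are ignored.

```python
import numpy as np
from scipy.spatial import ConvexHull


def convex_hull_volume(positions):
    """Return (volume, surface_area) of the convex hull of an (N, 3) array.

    Hypothetical helper: atoms are treated as points, radii are ignored.
    """
    hull = ConvexHull(np.asarray(positions, dtype=float))
    return hull.volume, hull.area


# The eight corners of a unit cube stand in for atom positions.
cube = np.array(
    [[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)],
    dtype=float,
)
volume, area = convex_hull_volume(cube)
```

For a concave or elongated molecule the hull also encloses empty space, so this should be read as an upper-bound style estimate rather than a true molecular volume.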

tylerjereddy avatar Jan 07 '20 23:01 tylerjereddy

There’s also pysasa (I think). Would be great to integrate it.

orbeckst avatar Jan 10 '20 15:01 orbeckst

Would even be a fun GSOC project!

richardjgowers avatar Jan 17 '20 10:01 richardjgowers

FreeSASA comes under MIT license and has a C core and python bindings:

  • C-library: https://github.com/mittinatten/freesasa
  • Python bindings: https://github.com/freesasa/freesasa-python

By default Lee & Richards' algorithm is used, but Shrake & Rupley's is also available.

Simon Mitternacht (2016) FreeSASA: An open source C library for solvent accessible surface area calculation. F1000Research 5:189 (doi: 10.12688/f1000research.7931.1)
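To make the Shrake & Rupley idea concrete, here is a rough NumPy-only sketch. It is not FreeSASA's implementation or API: the function names, the 1.4 Å probe radius default, and the golden-spiral point distribution are assumptions chosen for illustration. Each atom gets a sphere of test points at radius (atom radius + probe radius), and the accessible area is the fraction of points not buried inside any neighbouring expanded sphere.

```python
import numpy as np


def sphere_points(n):
    """Roughly uniform points on the unit sphere (golden-spiral lattice)."""
    i = np.arange(n) + 0.5
    z = 1.0 - 2.0 * i / n
    r = np.sqrt(1.0 - z * z)
    phi = np.pi * (1.0 + 5.0 ** 0.5) * i
    return np.column_stack((r * np.cos(phi), r * np.sin(phi), z))


def shrake_rupley(coords, radii, probe=1.4, n_points=200):
    """Per-atom solvent-accessible surface area (same length unit squared).

    Hypothetical O(N^2) sketch without neighbour lists or periodic images.
    """
    coords = np.asarray(coords, dtype=float)
    ext = np.asarray(radii, dtype=float) + probe  # expanded radii
    unit = sphere_points(n_points)
    areas = np.empty(len(coords))
    for i in range(len(coords)):
        pts = coords[i] + ext[i] * unit  # test points on atom i
        accessible = np.ones(n_points, dtype=bool)
        for j in range(len(coords)):
            if j == i:
                continue
            d2 = np.sum((pts - coords[j]) ** 2, axis=1)
            accessible &= d2 > ext[j] ** 2  # buried if inside sphere j
        areas[i] = 4.0 * np.pi * ext[i] ** 2 * accessible.sum() / n_points
    return areas


# One isolated atom versus two overlapping atoms (radius 1.7 assumed).
isolated = shrake_rupley([[0.0, 0.0, 0.0]], [1.7])
pair = shrake_rupley([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]], [1.7, 1.7])
```

An isolated atom keeps every test point, so its area equals the full expanded sphere; in the overlapping pair each atom loses the points that fall inside its neighbour.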

Todo:

  1. evaluate whether freesasa is suitable; look at alternatives (e.g. MDTraj implements Shrake-Rupley)
  2. figure out whether the above can be installed as pip and conda packages; if necessary, create the packages
  3. create test case (use existing files in MDA and run external implementation for reference)
  4. integrate analysis module into MDAnalysis with tests
  5. benchmark performance

orbeckst avatar Jan 17 '20 16:01 orbeckst

I refined the above and added it as a potential GSOC 2020 project to the GSoC 2020 wiki page.

orbeckst avatar Jan 19 '20 19:01 orbeckst

FWIW (and maybe less biologically relevant) I'd also consider zeo++ and poreblazer similar tools that need a similar treatment.

Maybe more interestingly, good bindings to voro++ would probably allow a range of different volume/surface tools to be constructed.

richardjgowers avatar Jan 28 '20 10:01 richardjgowers

I'd be curious what voro++ provides that you can't get from the scipy.spatial interface to Qhull. If more things need to be exposed there, that would be good to know in terms of upstream vs. here, depending on how specialized.

tylerjereddy avatar Jan 29 '20 00:01 tylerjereddy

@tylerjereddy periodic boundary conditions
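To illustrate why periodicity matters here: `scipy.spatial.ConvexHull` and `scipy.spatial.Voronoi` operate on the raw coordinates and have no box argument, so atoms that are neighbours across a box face look far apart. The sketch below shows the minimum-image convention for an orthorhombic box; the function name and the box dimensions are assumptions for illustration only.

```python
import numpy as np


def minimum_image_distance(a, b, box):
    """Distance between a and b in an orthorhombic periodic box.

    Hypothetical helper; `box` holds the three box edge lengths.
    """
    delta = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    box = np.asarray(box, dtype=float)
    delta -= box * np.round(delta / box)  # wrap to the nearest image
    return float(np.linalg.norm(delta))


box = [10.0, 10.0, 10.0]
p = np.array([0.5, 5.0, 5.0])
q = np.array([9.5, 5.0, 5.0])

d_plain = float(np.linalg.norm(p - q))      # ignores periodicity
d_pbc = minimum_image_distance(p, q, box)   # nearest periodic image
```

Here the two points sit on opposite sides of the box, so the plain distance is 9.0 while the periodic distance is 1.0.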

richardjgowers avatar Jan 29 '20 09:01 richardjgowers

@orbeckst @richardjgowers @IAlibay I am Syed Farhan, an Undergraduate Computer and Electronics Engineering Student from Bangalore, India. Linkedin

I am very much interested in working on this project as part of GSoC'20 this summer. I am associated with the 'Centre of Excellence in Computational Genomics' of my college and am good at Python and C.

I have done some basic calculations with FreeSASA on the 3LAU PDB Structure. I am also familiarizing myself with the MDAnalysis codebase.

Making FreeSASA a part of MDAnalysis (especially for MDSimulations) to calculate the SAS scores for each and every frame, also seems good to me.

It would be really appreciated if you could guide me with the further steps. I would also like to know if this project will make it to the final GSoC Project List, and if the project is up-for-grabs.

Kind regards Syed Farhan RVCE

born-2learn avatar Mar 03 '20 10:03 born-2learn

I have written a basic script to save the protein atom name and atom Surface Area in a text file, using FreeSASA. Here is the link to the repo: https://github.com/born-2learn/molecular-dynamics-sasa

Text file generated: here PDB file used: here

born-2learn avatar Mar 03 '20 13:03 born-2learn

Hello Syed Farhan,

Welcome to MDAnalysis and GSoC.

Please see the blog post https://www.mdanalysis.org/2020/02/22/gsoc2020/ and the FAQ https://github.com/MDAnalysis/mdanalysis/wiki/GSoC-FAQ if you haven't yet. As detailed in the blog, we have a series of issues tagged as easy bugs & GSOC starters which are worth looking at if you are still trying to find an issue to contribute to. We’re also happy to talk about projects when you have specific questions.

If possible, could you also provide us with your GitHub handle? This way we can consolidate any issues/PRs you contribute to with this introduction and any other interactions we have with you on the developers list.

On Mar 3, 2020, at 3:55 AM, Syed Farhan [email protected] wrote:

@orbeckst https://github.com/orbeckst @richardjgowers https://github.com/richardjgowers @IAlibay https://github.com/IAlibay I am Syed Farhan, an Undergraduate Computer and Electronics Engineering Student from Bangalore, India. Linkedin https://www.linkedin.com/in/syedfarhanahmad/ I am very much interested to work on this project as part of GSoC'20 this summer. I am associated with the 'Centre of Excellence in Computational Genomics' of my college and am good at python, C.

I have done some basic calculations with FreeSASA on the 3LAU PDB Structure https://www.rcsb.org/structure/3LAU. I am also familiarizing myself with the MDAnalysis codebase.

That’s great but also note that you need to have at least one merged pull request in order to be eligible to apply to work with MDAnalysis. See the linked GSoC docs above.

Making FreeSASA a part of MDAnalysis (especially for MDSimulations) to calculate the SAS scores for each and every frame, also seems good to me.

It would be really appreciated if you could guide me with the further steps.

Please read our docs.

I would also like to know if this project will make it to the final GSoC Project List, and if the project is up-for-grabs.

This is not how it works. In order to be eligible to apply you need to work on our code base, as explained in our docs. You will then write your application. It would be great if you were to select the SASA project, but anyone else can do the same. We will read and assess the applications and select the best candidate (again, read our docs to see what we are looking for). We are happy to discuss projects and help you to write a strong proposal – just remember that we also want to get a sense of how you would work with us before you submit the application.

Best, Oliver

Kind regards Syed Farhan RVCE

orbeckst avatar Mar 03 '20 16:03 orbeckst

@orbeckst Thank you for your reply.

My GitHub handle is born-2learn and here is my GitHub profile.

Thank you for the links. I have gone through the MDAnalysis Blog and the FAQs, and will shortly start contributing to the MDAnalysis repo by working on issues/PRs.

I would like to know if the discussion of projects/issues has to be done in the MDAnalysis Google Groups apart from the GitHub comments or if only the comments in GitHub are sufficient.

I am very much interested in the SASA project and am also looking into other Project Ideas.

Kind Regards Syed Farhan

born-2learn avatar Mar 03 '20 19:03 born-2learn

On Mar 3, 2020, at 12:26 PM, Syed Farhan [email protected] wrote:

I would like to know if the discussion of projects/issues has to be done in the MDAnalysis Google Groups https://groups.google.com/forum/#!forum/mdnalysis-discussion apart from the GitHub comments or if only the comments in GitHub are sufficient.

We typically keep general discussions on the mailing list.

If there’s an issue open (like the one for SASA) then it’s good practice to keep specific questions on the issue tracker. (But don’t open issues to ask questions – open issues for bugs/enhancements. Ask questions on the list.)

Discussions on active PRs are all on the PR itself in the issue tracker.

I am very much interested in the SASA project and am also looking into other Project Ideas.

Great!

Oliver

orbeckst avatar Mar 03 '20 19:03 orbeckst

Thank you. I will keep them in mind.

born-2learn avatar Mar 03 '20 19:03 born-2learn

I removed the "starter" label as this became a proposed GSoC 2020 project.

orbeckst avatar Mar 05 '20 23:03 orbeckst

@orbeckst Sir, I'm aiming for this as my GSoC project, as it has seemed interesting to me for a while. Could I please get an overview of it and of where to start with the application? Thanks!

arunsetty avatar Mar 08 '20 05:03 arunsetty

Hi @arunsetty , read the blog post https://www.mdanalysis.org/2020/02/22/gsoc2020/ which explains everything and provides links to all related documentation. Importantly, we only accept applications from students with at least one merged PR.

For further general GSOC discussions please ask on the developer mailing list https://groups.google.com/group/mdnalysis-devel.

orbeckst avatar Mar 08 '20 07:03 orbeckst

Is there any help needed with this ticket? I haven't seen any update since March. I feel it is a high-priority one for the next release :)

mieczyslaw avatar Aug 13 '20 21:08 mieczyslaw

Yes, please, help is welcome. We were tending towards relying on pysasa. Look at the GSoC description https://github.com/MDAnalysis/mdanalysis/wiki/Project-Ideas-2020#project-1-molecular-volume-and-surface-analysis .

orbeckst avatar Aug 13 '20 22:08 orbeckst

Thanks @orbeckst ! Is anyone actively working on this one? I saw some students interested in participating (as GSOC)? I mean, either I can sort it out myself, or if anyone already started, I can join the team.

mieczyslaw avatar Aug 13 '20 22:08 mieczyslaw

Nobody is actively working on it, i.e., there's no active PR.

If you want to add a new analysis.sasa module then I would recommend you have a look at the https://userguide.mdanalysis.org/1.0.0/contributing.html guide (in case you haven't done so already).

orbeckst avatar Aug 13 '20 22:08 orbeckst

Sure, I will have a go then! My other contribution is close to being merged, so I have learnt a lot about dev practices in this project, but I will go through the guide again, especially because this would be additional functionality rather than a fix.

mieczyslaw avatar Aug 13 '20 22:08 mieczyslaw

@mieczyslaw give me a ping if you get stuck getting the information you need for pysasa out of MDAnalysis!

richardjgowers avatar Aug 14 '20 09:08 richardjgowers

@richardjgowers sorry for the delay in starting work on this. I just want to finish another MDA contribution first to avoid doing two things at once, but I am definitely interested in doing this one.

mieczyslaw avatar Sep 18 '20 18:09 mieczyslaw

No chance to work on that in my free time right now, but happy to return to this task later on if there is no one else completing it. Sorry.

mieczyslaw avatar Dec 16 '20 17:12 mieczyslaw

Thanks for the update.

orbeckst avatar Dec 16 '20 18:12 orbeckst

Is this still available, or has the feature already been created? I am new to MDAnalysis and have joined the Google group. Could you clarify a few things so that I can try to implement this feature? Is the goal just to calculate the molecular volume of an element? @orbeckst @mieczyslaw

sparkingdark avatar Jan 04 '21 11:01 sparkingdark

@sparkingdark As I mentioned above, I didn't have time to do this task (even though I really like it). I suppose @orbeckst and @richardjgowers can explain what is expected.

mieczyslaw avatar Jan 04 '21 11:01 mieczyslaw

Okay, looking forward to it. One more thing: does MDAnalysis have a Slack group?

sparkingdark avatar Jan 04 '21 11:01 sparkingdark