Spectral decon

Open Alexander-Sol opened this issue 2 years ago • 4 comments

This is the minimal viable implementation of a new deconvolution method, spectral deconvolution.

Spectral deconvolution works by generating a theoretical isotopic distribution for each peptide/proteoform in a given library, generating an isotopic envelope for each charge state, and then comparing the envelopes in each spectrum to those in the library.
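For readers unfamiliar with the approach, here is a rough Python sketch of the idea, not the code in this PR. It projects a theoretical isotopic distribution (neutral masses plus relative abundances) onto m/z for one charge state, then scores it against observed peaks. The function names `envelope_mz` and `match_score`, the 10 ppm tolerance, the cosine score, and the example masses and abundances are all assumptions made for illustration.

```python
import math

PROTON_MASS = 1.007276466  # Da

def envelope_mz(neutral_masses, charge):
    """Convert neutral isotopologue masses to m/z for a positive charge state."""
    return [(m + charge * PROTON_MASS) / charge for m in neutral_masses]

def match_score(theoretical, observed, ppm_tolerance=10.0):
    """Cosine similarity between a theoretical envelope and observed peaks.

    Both arguments are lists of (mz, intensity) pairs. Each theoretical peak
    is paired with the most intense observed peak within the ppm tolerance,
    or with zero intensity if no observed peak is close enough.
    """
    pairs = []
    for t_mz, t_int in theoretical:
        tol = t_mz * ppm_tolerance / 1e6
        hits = [o_int for o_mz, o_int in observed if abs(o_mz - t_mz) <= tol]
        pairs.append((t_int, max(hits) if hits else 0.0))
    dot = sum(t * o for t, o in pairs)
    norm_t = math.sqrt(sum(t * t for t, _ in pairs))
    norm_o = math.sqrt(sum(o * o for _, o in pairs))
    if norm_t == 0.0 or norm_o == 0.0:
        return 0.0
    return dot / (norm_t * norm_o)

# Made-up library entry: three isotopologues with illustrative abundances.
neutral_masses = [1000.0, 1001.0, 1002.0]
abundances = [0.55, 0.30, 0.15]

# One theoretical envelope per charge state, keyed by charge.
library = {
    z: list(zip(envelope_mz(neutral_masses, z), abundances))
    for z in (1, 2, 3)
}
```

Each spectrum would then be scored against every envelope in `library`, with the best-scoring entries reported as candidate identifications. How the PR itself selects and thresholds matches is not shown here.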

Note that this PR includes a significant refactor (see PR #671) that moves the Proteomics project into the MassSpectrometry project. Because of this change, loading this PR requires a clean and rebuild and/or a restart of Visual Studio.

I also added a Scorer class that implements the Strategy design pattern to score experimental spectra against theoretical spectra. I still need to write tests for this class.
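As a generic illustration of the Strategy pattern applied to spectrum scoring, here is a minimal Python sketch. It is not the `Scorer` class from this PR: the class names, the two scoring functions, and the assumption that both spectra arrive as already-aligned intensity vectors are all hypothetical.

```python
import math
from abc import ABC, abstractmethod

class ScoringStrategy(ABC):
    """Interface that every scoring algorithm implements."""

    @abstractmethod
    def score(self, experimental, theoretical):
        """Return a similarity score for two aligned intensity vectors."""

class DotProductScore(ScoringStrategy):
    def score(self, experimental, theoretical):
        return sum(e * t for e, t in zip(experimental, theoretical))

class CosineScore(ScoringStrategy):
    def score(self, experimental, theoretical):
        dot = sum(e * t for e, t in zip(experimental, theoretical))
        norm_e = math.sqrt(sum(e * e for e in experimental))
        norm_t = math.sqrt(sum(t * t for t in theoretical))
        if norm_e == 0.0 or norm_t == 0.0:
            return 0.0
        return dot / (norm_e * norm_t)

class Scorer:
    """Context object: holds a strategy and delegates scoring to it."""

    def __init__(self, strategy):
        self.strategy = strategy

    def score(self, experimental, theoretical):
        return self.strategy.score(experimental, theoretical)

# The caller picks the algorithm; the calling code stays the same.
cosine_scorer = Scorer(CosineScore())
dot_scorer = Scorer(DotProductScore())
```

The benefit of this layout is that a new scoring method only needs a new `ScoringStrategy` subclass, and each strategy can be unit tested in isolation.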

Alexander-Sol avatar Nov 11 '22 20:11 Alexander-Sol

Codecov Report

Merging #675 (b744f49) into master (623dbc4) will decrease coverage by 0.14%. The diff coverage is 94.25%.

:exclamation: Current head b744f49 differs from pull request most recent head f1a4f28. Consider uploading reports for the commit f1a4f28 to get more accurate results


@@            Coverage Diff             @@
##           master     #675      +/-   ##
==========================================
- Coverage   81.12%   80.98%   -0.14%     
==========================================
  Files         147      148       +1     
  Lines       25690    25762      +72     
==========================================
+ Hits        20840    20863      +23     
- Misses       4850     4899      +49     
| Impacted Files | Coverage Δ | |
|---|---|---|
| ...b/MassSpectrometry/MzSpectra/SpectralSimilarity.cs | 100.00% <ø> (ø) | |
| ...teomics/ProteolyticDigestion/ProductTypeMethods.cs | 0.00% <0.00%> (ø) | |
| mzLib/MzLibUtil/DoubleRange.cs | 100.00% <ø> (ø) | |
| mzLib/Test/TestIsotopicEnvelope.cs | 0.00% <0.00%> (ø) | |
| ...lution/Algorithms/ClassicDeconvolutionAlgorithm.cs | 96.29% <50.00%> (-0.89%) | :arrow_down: |
| ...Lib/MassSpectrometry/Deconvolution/Deconvoluter.cs | 85.18% <60.00%> (+1.85%) | :arrow_up: |
| .../MassSpectrometry/Deconvolution/MinimalSpectrum.cs | 60.60% <60.60%> (ø) | |
| ...etry/Proteomics/Modifications/ModificationMotif.cs | 66.66% <66.66%> (ø) | |
| ...b/MassSpectrometry/Deconvolution/Scoring/Scorer.cs | 69.23% <69.23%> (ø) | |
| ...ectrometry/Deconvolution/DeconvoluterExtensions.cs | 75.00% <75.00%> (ø) | |

... and 109 more

codecov[bot] avatar Dec 01 '22 16:12 codecov[bot]

  • Looks like there are a few places to add tests for default value returns and exception throws.

    • Was there a dependency problem that led to needing to change where Proteomics is located?

The new deconvolution method relies on generating a library of theoretical isotopic distributions for all species in a given protein database. To do so, I need access to classes inside Proteomics (e.g. PeptideWithSetModifications).

Alexander-Sol avatar Dec 02 '22 19:12 Alexander-Sol

Within the Proteomics folder, it appears as though some things have specific namespaces while others do not:
AminoAcidPolymer -> MassSpectrometry.Proteomics.AminoAcidPolymer
Fragmentation -> MassSpectrometry.Proteomics.Fragmentation
Modification -> MassSpectrometry.Proteomics
Protein -> MassSpectrometry.Proteomics
ProteolyticDigestion -> MassSpectrometry.Proteomics.ProteolyticDigestion
RetentionTimePrediction -> MassSpectrometry.Proteomics.RetentionTimePrediction

Would it make more sense to keep the namespace of each file the same as it was before the refactoring? That would keep continuous integration working, since we would not need to change any namespace references in programs that depend on mzLib, while still allowing the files to live in the same project, as your PR requires.

nbollis avatar Dec 05 '22 20:12 nbollis

Historically, this comes from borrowing code from https://github.com/dbaileychess/CSMSL, which has a bit more specific namespace usage.

acesnik avatar Dec 06 '22 02:12 acesnik