Contributions
Contributions copied to clipboard
MsExperiment submission
Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor
- Repository: https://github.com/RforMassSpectrometry/MsExperiment/
Confirm the following by editing each check box to '[x]'
-
[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
-
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
-
[x] I understand Bioconductor Package Naming Policy and acknowledge Bioconductor may retain use of package name.
-
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
-
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
-
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
-
[x] I am familiar with the Bioconductor code of conduct and agree to abide by it.
I am familiar with the essential aspects of Bioconductor software management, including:
- [x] The 'devel' branch for new packages and features.
- [x] The stable 'release' branch, made available every six months, for bug fixes.
- [x] Bioconductor version control using Git (optionally via GitHub).
For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.
Hi @lgatto
Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.
The DESCRIPTION file for this package is:
Package: MsExperiment
Title: Infrastructure for Mass Spectrometry Experiments
Version: 0.99.0
Description: Infrastructure to store and manage all aspects related to
a complete proteomics or metabolomics mass spectrometry (MS) experiment. The
MsExperiment package provides light-weight and flexible containers for MS
experiments building on the new MS infrastructure provided by the Spectra,
QFeatures and related packages. Along with raw data representations, links
to original data files and sample annotations, additional metadata or
annotations can also be stored within the MsExperiment container. To
guarantee maximum flexibility only minimal constraints are put on the type
and content of the data within the containers.
Authors@R: c(person(given = "Laurent", family = "Gatto",
comment = c(ORCID = "0000-0002-1520-2268"),
email = "[email protected]",
role = c("aut", "cre")),
person(given = "Johannes", family = "Rainer",
comment = c(ORCID = "0000-0002-6977-7147"),
email = "[email protected]",
role = "aut"),
person(given = "Sebastian", family = "Gibb",
comment = c(ORCID = "0000-0001-7406-4443"),
email = "[email protected]",
role = "aut"))
Depends:
R (>= 4.2),
ProtGenerics (>= 1.9.1),
Imports:
methods,
S4Vectors,
IRanges,
Spectra,
SummarizedExperiment,
QFeatures
Suggests:
testthat,
knitr (>= 1.1.0),
roxygen2,
BiocStyle (>= 2.5.19),
rmarkdown,
rpx,
mzR,
msdata
License: Artistic-2.0
LazyData: no
VignetteBuilder: knitr
BugReports: https://github.com/RforMassSpectrometry/MsExperiment/issues
URL: https://github.com/RforMassSpectrometry/MsExperiment
biocViews: Infrastructure, Proteomics, MassSpectrometry, Metabolomics, ExperimentalDesign, DataImport
RoxygenNote: 7.2.0
Roxygen: list(markdown=TRUE)
Collate:
'MsExperiment-functions.R'
'MsExperimentFiles.R'
'MsExperiment.R'
'existMsExperimentFiles.R'
While building vignette by hand I hit
/home/stvjc/.cache/R/rpx
does not exist, create directory? (yes/no): yes
I wonder if the vignette should fail at this point, directing the user to deal with rpx cache -- which does not seem standard for BiocFileCache, should it use BiocFileCache?
That looks like a standard R caching directory -- likely there is a flag for creating this automatically when it is not manually run (ie. an interactive vs non interactive setting) -- or if there isn't there should be
> R_user_dir("cache")
[1] "/home/stvjc/.local/share/R/cache"
seems more standard to me. Also
label: unnamed-chunk-23
Quitting from lines 316-319 (MsExperiment.Rmd)
Error in .local(object, ...) :
Parameter 'files' is mandatory for 'MsBackendMzR'
... and named chunks are nice ... :)
It's the standard location for package caching:
> tools::R_user_dir(package = "rpx", which = "cache")
[1] "/home/lgatto/.cache/R/rpx"
rpx (and its cached) is used to download files from the PRIDE mass spectrometry database.
I'm surprised it fails when building the vignette, as the function should ask only in an interactive environment:
> rpx::rpxCache
function ()
{
cache <- tools::R_user_dir(package = "rpx", which = "cache")
BiocFileCache::BiocFileCache(cache, ask = interactive())
}
<bytecode: 0x55d2ec9e6140>
<environment: namespace:rpx>
Sorry about the chunk names - I will fix these.
it didn't fail, it just paused during my manual compilation of vignette. as programmed. it just doesn't happen very often.
As for cache location, ignore my comment but I expected to see .local/share ... as part of the path. My problem.
It did fail however at unnamed chunk 23.
I had a quick look and the error complains about an empty filenames vector, that should have been created in the previous (unnamed) chunk:
fls <- dir(system.file("sciex", package = "msdata"), full.names = TRUE)
basename(fls)
experimentFiles(lmse) <- MsExperimentFiles(
mzML_files = fls,
annotations = "internal_standards.txt")
producing
> fls <- dir(system.file("sciex", package = "msdata"), full.names = TRUE)
> basename(fls)
[1] "20171016_POOL_POS_1_105-134.mzML" "20171016_POOL_POS_3_105-134.mzML"
The msdata is in the Suggests - could it be possible that it isn't available on your system?
Sorry for so many bumps. I thought
Quitting from lines 158-160 (MsExperiment.Rmd)
Error: processing vignette 'MsExperiment.Rmd' failed with diagnostics:
error in evaluating the argument 'object' in selecting a method for function 'backendInitialize': The use of 'MsBackendMzR' requires package 'mzR'. Please install with 'BiocInstaller::install("mzR")'
--- failed re-building ‘MsExperiment.Rmd’
SUMMARY: processing the following file failed:
‘MsExperiment.Rmd’
was odd. I will keep pushing.
A reviewer has been assigned to your package. Learn what to expect during the review process.
IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.
Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
[email protected]:packages/MsExperiment to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Hi @nturaga - the error in the build report is due to an URL that seemed temporarily inaccessible (it works now). The first code chunk (other than loading packages) pulls data from an online repo using the rpx package. Could you trigger a new build, or do I need to push a version bump?
You need to push a version bump @lgatto
Weird, I pushed a version bump (plus code chunk labels) but it didn't triggered the spb. Any idea @nturaga?
Did you push to upstream at git.bioconductor.org? When I cloned the package the version bump was not there?
Received a valid push on git.bioconductor.org; starting a build for commit id: 9e5f85252c102b0188d67ce9440274907e8bbbae
Indeed, thank you @lshep, and sorry for missing the obvious reason.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
[email protected]:packages/MsExperiment to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Hi Laurent,
Taking a first look at this, here is some immediate feedback:
-
Imports:
- Importing selectively from the methods package is strongly discouraged. Note that you already import the whole methods package (
import(methods)) so there's actually no need to selectively import things withimportFrom(methods,"slot<-"),importFrom(methods,as),importMethodsFrom(methods,show), etc... - Same situation with ProtGenerics: you have
import(ProtGenerics)so no need toimportFrom(ProtGenerics,spectra). - I would also strongly suggest that you import the whole S4Vectors and IRanges, just to avoid unnecessary maintenance complications in the future.
- Importing selectively from the methods package is strongly discouraged. Note that you already import the whole methods package (
-
I suggest that you extend the
Annotatedclass from S4Vectors. This will give you themetadata()getter and setter for free, and also will ensure that they behave consistently across all objects in the Bioconductor ecosystem. -
Given that the
assayslot contains "quantification data" and that the slot is accessed via theqdata()getter and setter, why isn't the slot calledqdata? -
Can I suggest that you use the preferred
_OR_convention for all your union classes? E.g.MsExperimentFiles_OR_NULLinstead ofMsExperimentFilesOrNull, andQFeatures_OR_SummarizedExperiment_OR_NULLinstead ofQFeaturesOrSummarizedExperimentOrNull, etc... BTW I'm not sure why you need theMsExperimentFilesOrNullclass at all. Couldn't an empty (i.e. zero-length)MsExperimentFilesobject be used when there are no files to store? -
For a MsExperiment object
msewith 2 samples, I was expecting thatmse[ , 1:2]would be a no-op but I get:> mse Object of class MsExperiment Files: mzML_files, annotations Spectra: MS1 (1862) Experiment data: 2 sample(s) Sample data links: - experimentFiles.mzML_files: 2 sample(s) to 2 element(s). - experimentFiles.annotations: 2 sample(s) to 1 element(s). - spectra: 2 sample(s) to 1862 element(s). > identical(mse[,1:2], mse) [1] FALSEThis makes me wonder if I really understand what subsetting does.
-
I also wonder why enforce the use of the 2D syntax for subsetting (e.g.
mse[ , 1]) instead of just using the simpler 1D syntax (e.g.mse[1]). This kind of suggests that MsExperiment objects are somehow 2D objects but they are not (dim()returns NULL). Have you considered giving these objects a 1D semantic? This would be achieved by: (a) giving them alength()(the number of samples), (b) allowing them to carrynames()(the names/ids of the samples), and (c) allowing subsetting withmse[i]whereiis either an integer vector (index), or a logical vector of the same length asmse, or a character vector of sample names/ids. -
Files created with
saveRDS()are RDS files and thus should have the.rdsextension. Looks like you're using.rdain the vignette (and maybe in other places). -
Some cosmetic:
- Make sure to use
`r Biocpkg("BiocManager")`instead of bare backticks for package names in the text of the vignette. - Right now the
show()method printsObject of class MsExperimentfor an empty MsExperiment object. Would be nice if it could tell that the object is empty, like you did for QFeatures objects.
- Make sure to use
Thanks, H.
Received a valid push on git.bioconductor.org; starting a build for commit id: 49ce72383208ec6379a221a5023b5b6b8bc16d37
Hi @hpages
Thank you for your insightful comments and suggestions. Here's how we have addressed them:
- Imports:
- Importing selectively from the methods package is strongly discouraged. Note that you already import the whole methods package (
import(methods)) so there's actually no need to selectively import things withimportFrom(methods,"slot<-"),importFrom(methods,as),importMethodsFrom(methods,show), etc... - Same situation with ProtGenerics: you have
import(ProtGenerics)so no need toimportFrom(ProtGenerics,spectra). - I would also strongly suggest that you import the whole S4Vectors and IRanges, just to avoid unnecessary maintenance complications in the future.
- Importing selectively from the methods package is strongly discouraged. Note that you already import the whole methods package (
The global imports of methods and ProtGenerics were a mistake and have been removed. We don't mind selectively importing symbols from S4Vectors and IRanges - hope that's OK.
- I suggest that you extend the
Annotatedclass from S4Vectors. This will give you themetadata()getter and setter for free, and also will ensure that they behave consistently across all objects in the Bioconductor ecosystem.
Done.
> showClass("MsExperiment")
Class "MsExperiment" [package "MsExperiment"]
Slots:
Name: experimentFiles
Class: MsExperimentFiles
Name: spectra
Class: Spectra_OR_Null
Name: qdata
Class: QFeatures_OR_SummarizedExperiment_OR_Null
Name: otherData
Class: List
Name: sampleData
Class: DataFrame
Name: sampleDataLinks
Class: List
Name: metadata
Class: list
Extends: "Annotated"
- Given that the
assayslot contains "quantification data" and that the slot is accessed via theqdata()getter and setter, why isn't the slot calledqdata?
Indeed, we have changed as suggested:
> slotNames("MsExperiment")
[1] "experimentFiles" "spectra" "qdata" "otherData"
[5] "sampleData" "sampleDataLinks" "metadata"
- Can I suggest that you use the preferred
_OR_convention for all your union classes? E.g.MsExperimentFiles_OR_NULLinstead ofMsExperimentFilesOrNull, andQFeatures_OR_SummarizedExperiment_OR_NULLinstead ofQFeaturesOrSummarizedExperimentOrNull, etc... BTW I'm not sure why you need theMsExperimentFilesOrNullclass at all. Couldn't an empty (i.e. zero-length)MsExperimentFilesobject be used when there are no files to store?
We are now following the _OR_ convention and got rid of MsExperimentFilesOrNull as it wasn't needed.
- For a MsExperiment object
msewith 2 samples, I was expecting thatmse[ , 1:2]would be a no-op but I get:
This makes me wonder if I really understand what subsetting does.> mse Object of class MsExperiment Files: mzML_files, annotations Spectra: MS1 (1862) Experiment data: 2 sample(s) Sample data links: - experimentFiles.mzML_files: 2 sample(s) to 2 element(s). - experimentFiles.annotations: 2 sample(s) to 1 element(s). - spectra: 2 sample(s) to 1862 element(s). > identical(mse[,1:2], mse) [1] FALSE
See below for the list semantic.
As for the non-identical mseand mse[1:2] objects, this is expected, as we need to copy filenames that are pointed to by more than one sample to guarantee the consistency between samples and their associated files. This has the side effect that you point out. This was mentioned/demonstratd in the vignette and now also in the man page.
- I also wonder why enforce the use of the 2D syntax for subsetting (e.g.
mse[ , 1]) instead of just using the simpler 1D syntax (e.g.mse[1]). This kind of suggests that MsExperiment objects are somehow 2D objects but they are not (dim()returns NULL). Have you considered giving these objects a 1D semantic? This would be achieved by: (a) giving them alength()(the number of samples), (b) allowing them to carrynames()(the names/ids of the samples), and (c) allowing subsetting withmse[i]whereiis either an integer vector (index), or a logical vector of the same length asmse, or a character vector of sample names/ids.
You are absolutely right, and we have changed the semantic accordingly:
> mse
Object of class MsExperiment
Files: mzML_files, annotations
Experiment data: 2 sample(s)
Sample data links:
- experimentFiles.mzML_files: 2 sample(s) to 2 element(s).
- experimentFiles.annotations: 2 sample(s) to 1 element(s).
> length(mse)
[1] 2
> mse[1]
Object of class MsExperiment
Files: mzML_files, annotations
Experiment data: 1 sample(s)
Sample data links:
- experimentFiles.mzML_files: 1 sample(s) to 1 element(s).
- experimentFiles.annotations: 1 sample(s) to 1 element(s).
- Files created with
saveRDS()are RDS files and thus should have the.rdsextension. Looks like you're using.rdain the vignette (and maybe in other places).
This was a silly mistake - fixed now.
- Some cosmetic:
- Make sure to use
`r Biocpkg("BiocManager")`instead of bare backticks for package names in the text of the vignette. - Right now the
show()method printsObject of class MsExperimentfor an empty MsExperiment object. Would be nice if it could tell that the object is empty, like you did for QFeatures objects.
- Make sure to use
Both fixed.
> MsExperiment()
Empty object of class MsExperiment
Thank you!
Laurent & Johannes
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
[email protected]:packages/MsExperiment to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Excellent! Thanks for all the changes.
Cheers, H.
Your package has been accepted. It will be added to the Bioconductor nightly builds.
Thank you for contributing to Bioconductor!
Reviewers for Bioconductor packages are volunteers from the Bioconductor community. If you are interested in becoming a Bioconductor package reviewer, please see Reviewers Expectations.
The master branch of your GitHub repository has been added to Bioconductor's git repository.
To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/lgatto.keys is not empty), then no further steps are required. Otherwise, do the following:
See further instructions at
https://bioconductor.org/developers/how-to/git/
for working with this repository. See especially
https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/
to keep your GitHub and Bioconductor repositories in sync.
Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at
https://bioconductor.org/checkResults/
(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("MsExperiment"). The package 'landing page' will be created at
https://bioconductor.org/packages/MsExperiment
If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.