Contributions
mulea
Dear Bioconductor Team,
With this issue we would like to submit the mulea package. mulea is a comprehensive overrepresentation and functional enrichment analyser R package which reads ontologies (gene and protein sets) in a standardised GMT (Gene Matrix Transposed) format.
Kind regards, Tamás
Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor
- Repository: https://github.com/ELTEbioinformatics/mulea
Confirm the following by editing each check box to '[x]'
- [x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
- [x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
- [x] I understand Bioconductor Package Naming Policy and acknowledge Bioconductor may retain use of package name.
- [x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
- [x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
- [x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
- [x] I am familiar with the Bioconductor code of conduct and agree to abide by it.
I am familiar with the essential aspects of Bioconductor software management, including:
- [x] The 'devel' branch for new packages and features.
- [x] The stable 'release' branch, made available every six months, for bug fixes.
- [x] Bioconductor version control using Git (optionally via GitHub).
For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.
Hi @stitam
Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.
The DESCRIPTION file for this package is:
Package: mulea
Type: Package
Encoding: UTF-8
Title: mulea - an R package for enrichment analysis using
multiple ontologies and empirical FDR correction
Version: 0.99.10
Date: 2016-04-08
Authors@R: c(
person("Cezary", "Turek", role = c("aut", "ctb"),
comment = c(ORCID = "0000-0002-1445-5378")),
person("Márton", "Ölbei", role = c("aut", "ctb"),
comment = c(ORCID = "0000-0002-4903-6237")),
person("Tamás", "Stirling", email = "[email protected]",
role = c("aut", "cre"),
comment = c(ORCID = "0000-0002-8964-6443")),
person("Gergely", "Fekete", role = c("aut"),
comment = c(ORCID = "0000-0001-9939-4860")),
person("Ervin", "Tasnádi", role = c("aut"),
comment = c(ORCID = "0000-0002-4713-5397")),
person("Leila", "Gul", role = c("aut")),
person("Balázs", "Bohár", role = c("aut"),
comment = c(ORCID = "0000-0002-3033-5448")),
person("Balázs", "Papp", role = c("aut"),
comment = c(ORCID = "0000-0003-3093-8852")),
person("Wiktor", "Jurkowski", role = c("aut"),
comment = c(ORCID = "0000-0002-7820-1991")),
person("Eszter", "Ari", role = c("aut", "ctb"),
comment = c(ORCID = "0000-0001-7774-1067")))
Description: Traditional gene set enrichment analyses are
typically limited to a few ontologies and do not account for
the interdependence of gene sets or terms, resulting in
overcorrected p-values. To address these challenges,
we introduce mulea, an R package offering comprehensive
overrepresentation and functional enrichment analysis. mulea employs
an innovative empirical false discovery
rate (eFDR) correction method, specifically designed for
interconnected biological data, to accurately identify significant
terms within diverse ontologies. Beyond conventional tools,
mulea incorporates a wide range of
ontologies encompassing Gene Ontology,
pathways, regulatory elements, genomic locations, and protein domains.
This flexibility empowers researchers to tailor enrichment analysis
to their specific questions, such as identifying enriched
transcriptional regulators in gene expression data or
overrepresented protein domains in protein sets. To facilitate
seamless analysis, mulea provides
gene sets (in standardized GMT format) for 27 model organisms,
covering 16 databases and various identifiers. Additionally,
the muleaData ExperimentData Bioconductor package simplifies
access to these 879 pre-defined ontologies. Furthermore,
mulea's architecture allows for easy integration of
user-defined ontologies, expanding its applicability
across diverse research areas.
biocViews: Annotation, DifferentialExpression, GeneExpression,
GeneSetEnrichment, GO, GraphAndNetwork, MultipleComparison, Pathways,
Reactome, Software, Transcription, Visualization
License: MIT + file LICENSE
Depends:
R (>= 4.0.0)
Imports:
devtools,
data.table (>= 1.13.0),
dplyr,
fgsea (>= 1.0.2),
ggplot2,
ggraph (>= 2.0.3),
magrittr (>= 2.0.3),
methods,
parallel (>= 4.0.2),
plyr (>= 1.8.4),
Rcpp,
readr,
rlang,
scales,
stats,
stringi,
tibble,
tictoc,
tidygraph,
tidyverse
Suggests:
knitr,
rmarkdown,
testthat (>= 3.1.4)
LinkingTo:
Rcpp
VignetteBuilder: knitr
URL: https://github.com/ELTEbioinformatics/mulea
BugReports: https://github.com/ELTEbioinformatics/mulea/issues
RoxygenNote: 7.3.1
Roxygen: list(markdown = TRUE)
Config/testthat/edition: 3
Some comments:
We have already submitted muleaData, which is the data package for mulea: https://github.com/Bioconductor/Contributions/issues/3291
mulea has an upstream which is archived and will not be updated in the future.
BiocCheck::BiocCheck() returns the following NOTES:
NOTE: Update R version dependency from 4.0.0 to 4.3.0.
I would rather not do this because then R CMD check will fail on oldrel-1 (https://github.com/ELTEbioinformatics/mulea/pull/24). The package does not require R 4.3.0. (I have turned off GitHub workflows for the Bioconductor submission process)
NOTE: Cannot determine whether maintainer is subscribed to the Bioc-Devel mailing list (requires admin credentials). Subscribe here: https://stat.ethz.ch/mailman/listinfo/bioc-devel
I am registered to both the mailing list and the support forum.
Your package has been added to git.bioconductor.org to continue the pre-review process. A build report will be posted shortly. Please fix any ERROR and WARNING in the build report before a reviewer is assigned or provide a justification on why you feel the ERROR or WARNING should be granted an exception.
IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. All changes should be pushed to git.bioconductor.org moving forward. It is required to push a version bump to git.bioconductor.org to trigger a new build report.
Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to activate your Bioconductor Git Credentials Account
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): mulea_0.99.10.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/mulea
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Please fix ERROR in build report before the package will move forward in review.
Hi @lshep, it seems I cannot interact with the upstream (Permission denied (publickey)). Can you please check if everything looks fine on your end? Is it possible that because of an earlier submission (which was not done by me) my GitHub account is not on the whitelist? FYI, I'm following this guide: https://contributions.bioconductor.org/git-version-control.html#new-package-workflow. SSH seems fine on my end.
Everything is fine on our end. Please see the note above about activating your Bioconductor Git Credentials account; once it is activated, add additional ssh keys if need be.
Received a valid push on git.bioconductor.org; starting a build for commit id: 0417dfeadc43a7a1b52933236f2525e38af17be7
Thanks @lshep, my Bioconductor Git Credentials account was not activated.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): mulea_0.99.11.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/mulea
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 7514578d23c591d0ff75460cf5cad3b0d4a5e832
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): mulea_0.99.12.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/mulea
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
A quick note about the warning: library() is necessary for importing cpp stuff into nodes when using multiple threads. I've tried to implement a solution without library() but could not find one. We're happy to receive suggestions.
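One pattern that sometimes removes the need for library() on worker nodes is to load the package namespace on each node and then call functions with an explicit pkg:: prefix. The sketch below is untested against mulea and assumes the compiled routines are reachable through the package namespace; stats is only a stand-in package so that the example runs without extra installs.

```r
# Stand-in for the package whose compiled code the workers need.
# "stats" is an assumption used purely for illustration.
pkg <- "stats"

cl <- parallel::makeCluster(2L, type = "PSOCK")

# Load (but do not attach) the namespace on every worker.
# requireNamespace() returns TRUE on success, so no library() call is needed.
loaded <- parallel::clusterCall(cl, requireNamespace, pkg, quietly = TRUE)

# Worker code reaches the package through an explicit pkg:: prefix.
res <- parallel::parLapply(cl, 1:4, function(i) {
    stats::median(c(i, i + 1, i + 2))
})

parallel::stopCluster(cl)
```

Whether this is enough depends on how the C++ functions are registered and exported, so it is a starting point rather than a confirmed fix.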
A reviewer has been assigned to your package for an in-depth review. Please respond accordingly to any further comments from the reviewer.
I have trialled the package and noted some issues to be addressed.
- The package lacks sufficient integration with existing Bioconductor infrastructure. It involves enrichment analysis but does not make use of GSEABase data structures. Also, the package depends on data structures such as data.frame, which have a Bioconductor equivalent, DataFrame.
> library(S4Vectors)
> d <- DataFrame(x = 1:5)
> is(d, "data.frame")
FALSE
Also, the code uses the parallel package instead of BiocParallel. For example, cl <- makeCluster(spec=nthread, type="PSOCK"). Please refer to Parallel Recommendations.
Please greatly improve interoperability with existing Bioconductor conventions.
- The vignette imports an undocumented data set from the inst directory and uses eval = FALSE. No code chunks are allowed to use eval = FALSE. Please see the Package Data chapter for how to best include data sets.
- Some of the steps in the vignette are long and seem like they should be a function. For example,
# if there are duplicated Gene.symbols keep the first one only
geo2r_result_tab_filtered <- geo2r_result_tab %>%
# grouping by Gene.symbol to be able to filter
group_by(Gene.symbol) %>%
# keeping the first row for each Gene.symbol from rows with the same
# Gene.symbol
filter(row_number()==1) %>%
# ungrouping
ungroup() %>%
# arranging by logFC in descending order
arrange(desc(logFC)) %>%
select(Gene.symbol, logFC)
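As a rough illustration of this suggestion, the vignette step quoted above could be wrapped in a small helper. The function name keepFirstPerSymbol and the toy table are hypothetical, and the sketch uses base R in place of the dplyr pipeline so that it runs without extra packages.

```r
# Hypothetical helper: keep the first row per gene symbol, then sort by
# logFC in descending order and return only the two columns of interest.
keepFirstPerSymbol <- function(tab) {
    stopifnot(all(c("Gene.symbol", "logFC") %in% names(tab)))
    # duplicated() flags every repeat after the first occurrence
    firstRows <- tab[!duplicated(tab$Gene.symbol), c("Gene.symbol", "logFC")]
    firstRows <- firstRows[order(firstRows$logFC, decreasing = TRUE), ]
    rownames(firstRows) <- NULL
    firstRows
}

# Toy input (made-up values) with one duplicated symbol
toy <- data.frame(
    Gene.symbol = c("A", "B", "A", "C"),
    logFC = c(1.5, -0.3, 2.0, 0.7),
    P.Value = c(0.01, 0.2, 0.03, 0.04)
)
result <- keepFirstPerSymbol(toy)
```

With a helper like this, the vignette could call a single documented function on geo2r_result_tab instead of repeating the pipeline.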
- Functions and variables need to use camelCase rather than snake_case format. See R Code.
- Namespace file has both selective and complete imports for a particular package.
import(magrittr)
importFrom(magrittr,"%<>%")
importFrom(magrittr,"%>%")
- Don't create empty vectors or lists and incrementally grow them. For example,
create_random_db <- function() {
DB <- list()
for (cat_i in seq_len(10)) {
DB_cat_values <- c()
Refer to Vectorize.
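A minimal sketch of what avoiding incremental growth could look like is given below. It is not the package's actual create_random_db; the function name, set sizes, and gene labels are made up for illustration.

```r
# Hypothetical rewrite: build the whole list in one lapply() call
# instead of starting from list() / c() and appending inside loops.
createRandomDb <- function(nCategories = 10L, maxSize = 20L) {
    db <- lapply(seq_len(nCategories), function(i) {
        # Pick a set size, then draw all values for this category at once
        size <- sample.int(maxSize, 1L)
        paste0("gene", sample.int(1000L, size))
    })
    names(db) <- paste0("category", seq_len(nCategories))
    db
}

set.seed(1)
db <- createRandomDb()
```

When a loop is unavoidable, allocating the result up front with vector("list", n) and assigning by index gives the same benefit.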
Many thanks @DarioS for reviewing the package, we'll address these ASAP.
It was unclear to us whether Bioconductor requires camel case or whether snake case is also accepted. We decided to go with snake case and only harmonised this for user-facing functions. Please advise: for a successful review, 1. should we move to camel case? 2. should we harmonise internal functions as well?
Regarding your observation on the namespace file: if there is a complete import, there is no need to include a selective import as well. This is the issue, right?
camelCase for user-facing functions is sufficient. Yes, import either completely or selectively, depending on how many functions.
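Since the DESCRIPTION lists a RoxygenNote, the NAMESPACE is presumably generated by roxygen2, in which case the fix would go into the roxygen tags and not the NAMESPACE file itself. A sketch of the two alternatives, assuming only the two magrittr operators are needed:

```r
# Option A: selective import. roxygen2 turns this tag into
#   importFrom(magrittr,"%<>%")
#   importFrom(magrittr,"%>%")
#' @importFrom magrittr %>% %<>%
NULL

# Option B: complete import, generating import(magrittr).
# Use it instead of Option A, not alongside it:
#   #' @import magrittr
#   NULL
```

Option A keeps the namespace smaller when only a couple of functions are used, which seems to be the case here.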
Thank you Bioconductor Team for all your work! It was not feasible for us to implement these changes so we decided to publish the package on CRAN instead of Bioconductor. The package has been released: https://cran.r-project.org/web/packages/mulea/index.html.