Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Repository: https://github.com/na396/SGCP

Confirm the following by editing each check box to '[x]'

[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[x] I understand Bioconductor Package Naming Policy and acknowledge Bioconductor may retain use of package name.
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
[x] I am familiar with the Bioconductor code of conduct and agree to abide by it.

I am familiar with the essential aspects of Bioconductor software management, including:

[x] The 'devel' branch for new packages and features.
[x] The stable 'release' branch, made available every six months, for bug fixes.
[x] Bioconductor version control using Git (optionally via GitHub).

For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.

Oct 14 '22 15:10 na396

Hi @na396

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: SGCP
Type: Package
Title: SGCP: A semi-supervised pipeline for gene clustering using self-training approach in gene co-expression networks
Version: 0.99.0
Authors@R: c(person("Niloofar", "AghaieAbiane", email = "[email protected]" ,role = c("aut", "cre")),
			 person("Ioannis", "Koutis", email = " [email protected]",role = c("aut")))
Description: SGC is a semi-supervised pipeline for gene clustering in gene co-expression networks.
   SGC consists of multiple novel steps that enable the computation of highly enriched modules 
   in an unsupervised manner. But unlike all existing frameworks, it further incorporates a 
   novel step that leverages Gene Ontology information in a semi-supervised clustering method 
   that further improves the quality of the computed modules.
License: GPL-3
Encoding: UTF-8
LazyData: true
Imports: ggplot2, expm, caret, plyr, dplyr, GO.db, annotate, SummarizedExperiment, 
        genefilter, GOstats, RColorBrewer, xtable, Rgraphviz, reshape2, openxlsx,
        ggridges, DescTools, org.Hs.eg.db, methods, grDevices, stats
Suggests: knitr
Depends: R (>= 4.2.0)
biocViews: GeneExpression, GeneSetEnrichment, NetworkEnrichment, SystemsBiology,
   Classification, Clustering, DimensionReduction, GraphAndNetwork,
   NeuralNetwork, Network, mRNAMicroarray, RNASeq, Visualization
VignetteBuilder: knitr
NeedsCompilation: no
URL: https://github.com/na396/SGC
Date/Publication: 2022-10-06
RoxygenNote: 7.2.1

Oct 14 '22 15:10 bioc-issue-bot

A reviewer has been assigned to your package. Learn what to expect during the review process.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.

Bioconductor utilized your GitHub ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to activate your Bioconductor Git Credentials Account

Oct 17 '22 12:10 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "TIMEOUT, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/SGCP to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

Oct 17 '22 12:10 bioc-issue-bot

Greetings @jianhong @lshep, thank you for the comment. The timeout happens in the "creating vignettes" step, because my package in general takes hours or even days to complete. This is the nature of my package. The example I provided in the vignettes is the smallest dataset I could use as an example for my package.

Here is how I wrote the vignettes: I provided a small dataset and then explained how to use the functions in my package with that dataset. As a result, the "creating vignettes" step may take up to 3 hours to complete. Is there any solution for this scenario? Thank you so much.

Oct 17 '22 15:10 na396

Tagging @vjcitn / @hpages for additional thoughts and comments. In general, packages cannot take that long to build on our builders. Packages need to be able to be built daily by our daily builder, with a smaller example dataset. Perhaps storing intermediate data objects to load at various steps, while running the more in-depth, long tests separately, might be an option. The other option would be to convert it into a workflow package, but I believe the timeout limit for a workflow package is 2 hours. @hpages, I would appreciate your input as well.

Oct 17 '22 16:10 lshep

@lshep I checked my code one more time; it takes about one hour to run. Can you tell me what you recommend? Thank you so much, and I appreciate your help in advance.

Oct 17 '22 19:10 na396

You should have code and "pre-cooked" data that allow the package to build and check in under (20?) minutes. That's good for you and for us -- you can get a meaningful result in 20 minutes -- you will know if something has gone wrong with your use of the ecosystem almost interactively. Then accompany this with a workflow package that can consume an hour of build time but is run infrequently. It would have more realistic computations.
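
A minimal sketch of the "pre-cooked" idea, assuming the slow step returns an object that can be serialized with saveRDS(). The names cachedResult and slowStep are hypothetical and not part of SGCP; in a package the stored object would more likely ship under data/ or inst/extdata.

```r
# Hypothetical helper: reuse a stored result when it exists,
# otherwise run the slow computation once and store it.
cachedResult <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  result <- compute()
  saveRDS(result, path)
  result
}

# Stand-in for a long-running step (hypothetical).
slowStep <- function() {
  Sys.sleep(0.1)
  sum(seq_len(100))
}

cacheFile <- tempfile(fileext = ".rds")
first <- cachedResult(cacheFile, slowStep)   # computes and writes the file
second <- cachedResult(cacheFile, slowStep)  # reads the stored result
```

With this pattern the vignette only pays for the slow step when the stored file is missing.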

Oct 17 '22 19:10 vjcitn

@vjcitn Thank you so much for your comment. I appreciate it a lot. The excess time is due to the nature of the algorithm inside the package, not the data. Please see https://arxiv.org/abs/2209.10545. In this package my algorithm needs to call another library 11 times, and each call takes up to 7-8 minutes regardless of the input size. So from my side, there is no way I could change the algorithm. Is there any solution you recommend?

Oct 17 '22 23:10 na396

I can't provide detailed information at this time. Perhaps this will have to wait for inclusion in a future release of Bioconductor. Do the best you can.

Oct 18 '22 15:10 vjcitn

@vjcitn Thank you so much. I do appreciate your help. I was wondering if you know the estimated time of the next Bioconductor release? Or can I change the package into a workflow?

Oct 18 '22 16:10 na396

Greetings @vjcitn @lshep, I have changed the package, and it now takes roughly 13 minutes to run. However, it takes more space, less than 5 MB in total, as I need to store some results. All rda files are compressed, and on my local computer I did not get any errors or warnings. I pushed it to "[email protected]/SGCP.git". Please let me know if it's fine or whether I need to do anything else. Many thanks in advance for your consideration.

Oct 25 '22 23:10 na396

Hi @lshep I was wondering if you have seen my previous message?

Oct 28 '22 23:10 na396

You would probably want to store the results on the experiment hub to get the package down to a reasonable size. Also then users would only need to store/download the data when they were interested in running your examples rather than all the time.

Oct 31 '22 11:10 lshep

@lshep Thank you for the message. I have a quick question. When I was looking at the Bioconductor guidelines, I noticed that my package size, which is 3.12 MB, is acceptable for a Bioconductor package. So my question is: do I still need to use ExperimentHub? I also have one more question: is there anything I need to do for the next steps? Will my package be evaluated for inclusion in Bioconductor? Thank you so much for your time and consideration.

Nov 16 '22 01:11 na396

You need to get the package to not TIMEOUT. Please push any changes to see how the package runs on the system. I suggested ExperimentHub, but looking back I misread your comment: I thought you said that in order to get the package to run you were over the 5 MB limit. So no, ExperimentHub is not necessary.

Nov 16 '22 23:11 lshep

@lshep The timeout problem is resolved, and I have pushed the changes. I think everything is ready.

Nov 17 '22 01:11 na396

Please push changes to git.bioconductor.org with a version bump. You need to trigger a new build. See https://github.com/Bioconductor/Contributions/issues/2840#issuecomment-1280774435
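
For illustration only: the submission starts at Version 0.99.0, and a version bump here means incrementing the last component of the Version field in DESCRIPTION before pushing. A small sketch of that arithmetic, where bumpPatch is a hypothetical helper and not a Bioconductor tool:

```r
# Hypothetical helper: increment the third component of an x.y.z version string.
bumpPatch <- function(version) {
  parts <- unlist(package_version(version))  # e.g. c(0, 99, 0)
  parts[3] <- parts[3] + 1L
  paste(parts, collapse = ".")
}

nextVersion <- bumpPatch("0.99.0")
```

The new string would then be written into DESCRIPTION by hand or by whatever tooling the maintainer prefers.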

Nov 17 '22 02:11 lshep

Ok, will do soon, thanks

Nov 17 '22 02:11 na396

@lshep Sorry to keep asking questions. I just checked my package and noticed that the package directory size is 3.2 MB, while its installed size is 7.1 MB. Do I need to use ExperimentHub? Thank you in advance.

Nov 18 '22 02:11 na396

Received a valid push on git.bioconductor.org; starting a build for commit id: 147671fb991e5446858eb113742a0ea1cd693dc5

Nov 18 '22 05:11 bioc-issue-bot

@lshep Many, many thanks. The space and time issues are resolved. I have bumped the version and pushed the changes. Everything is ready now; please let me know if I need to take any further steps. Thank you so much.

Nov 18 '22 05:11 na396

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/SGCP to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

Nov 18 '22 05:11 bioc-issue-bot

Received a valid push on git.bioconductor.org; starting a build for commit id: fb737d0753cd7414625e488eda30d7a5e03e07b7

Nov 18 '22 18:11 bioc-issue-bot

@lshep Pushed another. Thanks

Nov 19 '22 01:11 na396

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/SGCP to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

Nov 21 '22 11:11 bioc-issue-bot

Received a valid push on git.bioconductor.org; starting a build for commit id: e0f0bd7edeb102c860d3485843c48945817df63d

Nov 21 '22 21:11 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/SGCP to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

Nov 21 '22 22:11 bioc-issue-bot

@lshep Hi, Do I need to do anything at this stage?

Dec 01 '22 14:12 na396

Please wait for the reviewer to do an in-depth review of the package. This normally occurs within 2-3 weeks of a clean build report.

Dec 01 '22 14:12 lshep

Package 'SGCP' Review

Thank you for submitting your package to Bioconductor. The package passed check and build. It is in pretty good shape. However, there are several things that need to be fixed. Please try to answer the comments line by line when you are ready for a second review.

Code: Note: please consider; Important: must be addressed.

The NAMESPACE file

[ ] Note: Use selective imports with importFrom instead of importing everything with import.
- in line 18 import("org.Hs.eg.db")
- in line 19 import("ggplot2")
- in line 20 import("expm")
- in line 21 import("dplyr")
- in line 22 import("GO.db")
- in line 23 import(annotate, except=c(toFile))
- in line 24 import("genefilter")
- in line 25 import("GOstats")
- in line 26 import("RColorBrewer")
- in line 27 import("xtable")
- in line 28 import("Rgraphviz")
- in line 29 import("reshape2")
- in line 30 import("openxlsx")
- in line 31 import("ggridges")
- in line 32 import("caret")
- in line 33 import("magick")
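
As a sketch of what selective imports could look like, the NAMESPACE fragment below lists only symbols that this review itself finds in the source (group_by, summarise, n and %>% in R/SGCP_plot.R, train in R/SGCP_semiSupervised.R). The full set of symbols the package needs is an assumption to be checked against the code.

```r
# NAMESPACE (sketch): import only the symbols that are actually called
importFrom(dplyr, group_by, summarise, n, "%>%")
importFrom(caret, train)
```

With roxygen2, the same directives would be generated from `@importFrom` tags rather than edited by hand.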

General package development

[ ] NOTE: Consider adding the maintainer's ORCID iD in 'Authors@R' with 'comment=c(ORCID="...")'
[ ] NOTE: Consider adding unit tests. We strongly encourage them. See https://contributions.bioconductor.org/tests.html
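
A minimal sketch of the ORCID suggestion, using placeholder values: the e-mail address is hypothetical, and the iD shown is ORCID's documented example iD for a fictitious researcher, so both need replacing with the maintainer's real details.

```r
# Placeholder e-mail and example ORCID iD: replace both with real values.
maintainer <- person(
  given = "Niloofar",
  family = "AghaieAbiane",
  email = "maintainer@example.org",
  role = c("aut", "cre"),
  comment = c(ORCID = "0000-0002-1825-0097")
)

# Printing shows how the entry would be rendered.
print(maintainer)
```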

R code

[ ] NOTE: No direct slot access with @ or slot(); accessors should be implemented and used. Please see the help page for HyperGResult-accessors.
- In file R/SGCP_go.R:
  - at line 20 found ' GO_Genes <- hg@goDag@nodeData@data'
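
To illustrate the accessor pattern in isolation, here is a toy S4 class. ToyResult and dag() are hypothetical and not from GOstats. For the real object, GOstats appears to document a goDag() accessor for GOHyperGResult; whether it covers everything line 20 needs should be checked on the HyperGResult-accessors help page.

```r
library(methods)

# Toy S4 class standing in for a result object (hypothetical, not from GOstats).
setClass("ToyResult", representation(dag = "list"))

# Accessor generic and method: callers use dag(x) rather than x@dag.
setGeneric("dag", function(x) standardGeneric("dag"))
setMethod("dag", "ToyResult", function(x) x@dag)

res <- new("ToyResult", dag = list(nodes = c("GO:0008150", "GO:0003674")))

# Calling code goes through the accessor, so it survives changes to the slots.
nodes <- dag(res)$nodes
```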
[ ] Important: No paste in message(), message, stop
- In file R/SGCP_ezSGC.R:
  - at line 191 found ' message(paste0("cluster ", remain, " is wiped out"))'
  - at line 193 found ' message(paste0("clusters", remain, " are wiped out"))}'
- In file R/SGCP_adjacencyMatrix.R:
  - at line 25 found ' caption_sym <- paste0(" output of ", stp, " , is not symmetric")'
  - at line 29 found ' caption_01 <- paste0(" output of ", stp, "are not in (0,1)")'
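
For the message() calls, the fix can be as small as passing the pieces as separate arguments, since message() concatenates them itself. A sketch using the text from line 191, with remain set to an arbitrary value:

```r
remain <- 3

# message() joins its arguments without a separator, so paste0() is unnecessary.
# tryCatch() captures the text here only so that it can be inspected.
msg <- tryCatch(
  message("cluster ", remain, " is wiped out"),
  message = function(m) conditionMessage(m)
)
```

The same holds for warning() and stop(), which also accept several pieces in `...`.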
[ ] NOTE: :: is not suggested in source code unless you can make sure all the packages are imported. Some people think it is better to keep ::. However, please note that you need to manually double-check the imported items whenever you change the DESCRIPTION file during development. My recommendation is to remove one or two repeats to force the dependency check.
- In file R/globals.R:
  - at line 1 found 'utils::globalVariables(c("GOtype", "Method", "Pvalue", "Var1", "Var2",'
- In file R/SGCP_plot.R:
  - at line 355 found ' dplyr::group_by(clusterNum, GOtype) %>%'
  - at line 356 found ' dplyr::summarise(max = max(logPvalue), count = n())'
- In file R/SGCP_semiSupervised.R:
  - at line 56 found ' caret::train(label~., method = "knn", tuneGrid = expand.grid(k = kn),'
  - at line 72 found ' semiSuper <- caret::train(label ~., method = "multinom", data = train)'
[ ] NOTE: Vectorize: for loops are present; try to replace them with *apply functions.
- In file R/SGCP_clustering.R:
  - at line 24 found ' for (inclus in clusters) {'
  - at line 36 found ' for(outclus in clusters){'
  - at line 281 found ' for(clus in unique(clusLab)){'
- In file R/SGCP_ezPlot.R:
  - at line 531 found ' for(plt in pdf.out){'
- In file R/SGCP_go.R:
  - at line 24 found ' for(ind in seq_len(nrow(hg_summary))){'
  - at line 101 found ' for(direct in direction){'
  - at line 103 found ' for(onto in ontology){'
  - at line 226 found ' for(lab in unique(geneClus$clusterLabel)){'
- In file R/SGCP_plot.R:
  - at line 307 found ' for(clus in levels(df$clusterLabel)){'
  - at line 361 found ' for(c in cluslabs){'
- In file R/SGCP_semiLabeling.R:
  - at line 84 found ' for(lab in clusterNums){'
  - at line 92 found ' for(go in GOIDs){'
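
A small sketch of the loop-to-vapply conversion on toy data. The vectors clusterLabel and score are made up for illustration; the loops flagged above may need different handling depending on what each body does.

```r
clusterLabel <- c("a", "b", "a", "c", "b", "a")
score <- c(1, 2, 3, 4, 5, 6)

# Loop version: grow a named result vector one cluster at a time.
meansLoop <- numeric(0)
for (clus in unique(clusterLabel)) {
  meansLoop[clus] <- mean(score[clusterLabel == clus])
}

# vapply version: same result, with the output type declared up front.
meansApply <- vapply(
  unique(clusterLabel),
  function(clus) mean(score[clusterLabel == clus]),
  numeric(1)
)
```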
[ ] Important: Remove unused code.
- In file R/SGCP_adjacencyMatrix.R:
  - at line 42 found ' #res <- which(vapply(x, class, numeric(1)) != "numeric")'
- In file R/SGCP_clustering.R:
  - at line 7 found ' # as.dist(y)'
  - at line 11 found ' #checkSym(dis.y, stp = "silhouette index")'
  - at line 88 found ' #M <- t(M)'
  - at line 91 found ' #M <- t(M)'
  - at line 183 found ' #k <- seq(2, maxNum)'
  - at line 189 found ' #dfgap <- dfgap[-1, ]'
  - at line 212 found ' #df$indices <- seq(1:nrow(df))'
  - at line 213 found ' #print(head(df))'
  - at line 233 found ' #if(plt){'
  - at line 448 found ' #df = summary(nl.t)'
- In file R/SGCP_go.R:
  - at line 121 found ' #pvalueCutoff = hgCutoff,'
  - at line 202 found ' # if(!is.character(annotation_db)){'
  - at line 203 found ' # stop("type of annotation_db must be character") }'
- In file R/SGCP_plot.R:
  - at line 123 found ' #geo_heatmap <- heatmap(as.matrix(melted_m), Rowv = NULL, Colv = NULL)'
  - at line 132 found ' # theme(axis.text.y = element_text(size = 10, face = 'bold','
  - at line 133 found ' # lineheight = 0.9)) +'
  - at line 138 found ' #theme(legend.position = c(.9, .9) ) +'
  - at line 423 found ' #wb = createWorkbook()'
[ ] NOTE: Avoid using '=' for assignment and use '<-' instead
- In file R/SGCP_ezSGC.R:
  - at line 27 found ' semilabel = FALSE}'
  - at line 30 found ' geneID = paste0(rep("gene", nrow(expData)), seq(1,nrow(expData)))}'
  - at line 46 found ' kopt = NULL}'
  - at line 52 found ' method_k = NULL }'
  - at line 57 found ' maxIteration = 1e+8'
  - at line 58 found ' numberStart = 1000}'
  - at line 83 found ' condTest = TRUE}'
  - at line 88 found ' percent = 0.10 }'
  - at line 93 found ' stp = 0.01 }'
  - at line 97 found ' model = "knn" }'
  - at line 103 found ' semilabel = FALSE}'
  - at line 108 found ' semilabel = FALSE }'
  - at line 128 found ' geneID = resClus$geneID'
  - at line 170 found ' geneLabel = resSup$FinalLabeling'
[ ] Important: Please consider adding drop=FALSE to avoid the reduction of dimension for matrices and arrays.
- In file R/SGCP_clustering.R:
  - at line 286 found ' M <- adja[indin, indout]'
  - at line 334 found ' Yf <- Y[, seq_len(min(2*krelativeGap, ncol(Y)))]'
  - at line 335 found ' Xf <- X[seq_len(min(2*krelativeGap, length(X)))]'
  - at line 342 found ' Ys <- Y[, seq_len(min(2*ksecondOrderGap, ncol(Y)))]'
  - at line 343 found ' Xs <- X[seq_len(min(2*ksecondOrderGap, length(X)))]'
  - at line 350 found ' Yg <- Y[, seq_len(min(2*kadditiveGap, ncol(Y)))]'
  - at line 351 found ' Xg <- X[seq_len(min(2*kadditiveGap, length(X)))]'
  - at line 430 found ' Yt <- Y[, seq_len(min(2*k, ncol(Y)))]'
  - at line 431 found ' Xt <- X[seq_len(min(2*k, length(X)))]'
  - at line 631 found ' adjaMat <- adjaMat[-ind, ]'
  - at line 632 found ' adjaMat <- adjaMat[, -ind]'
  - at line 662 found ' eg <- eg[order(eg$eigenvalues, decreasing = TRUE), ]'
  - at line 673 found ' adjaMat <- adjaMat[-nois_ind, ]'
  - at line 674 found ' adjaMat <- adjaMat[, -nois_ind]'
  - at line 675 found ' D <- D[-nois_ind, ]'
  - at line 676 found ' D <- D[, -nois_ind]'
  - at line 801 found ' Y <- Y[, -1]'
  - at line 808 found ' Yorig <- Y[, seq_len(min(n_egvec, ncol(Y)))]'
  - at line 809 found ' Xorig <- X[seq_len(min(n_egvec, length(X)))]'
  - at line 864 found ' Yf <- Y[, seq_len(min(2*krelativeGap, ncol(Y)))]'
  - at line 865 found ' Xf <- X[seq_len(min(2*krelativeGap, length(X)))]'
  - at line 884 found ' Ys <- Y[, seq_len(min(2*ksecondOrderGap, ncol(Y)))]'
  - at line 885 found ' Xs <- X[seq_len(min(2*ksecondOrderGap, length(X)))]'
  - at line 903 found ' Yg <- Y[, seq_len(min(2*kadditiveGap, ncol(Y)))]'
  - at line 904 found ' Xg <- X[seq_len(min(2*kadditiveGap, length(X)))]'
  - at line 925 found ' Yopt <- Y[, seq_len(min(2*kopt, ncol(Y)))]'
  - at line 926 found ' Xopt <- X[seq_len(min(2*kopt, length(X)))]'
  - at line 953 found ' sil <- sil[ , !(names(sil) %in% "geneIndices")]'
- In file R/SGCP_ezSGC.R:
  - at line 131 found ' expData = expData[-resClus$dropped.indices, ] }'
- In file R/SGCP_ezSGCP.R:
  - at line 125 found ' expData <- expData[-resClus$dropped.indices, ] }'
- In file R/SGCP_go.R:
  - at line 139 found ' df_hg <- df_hg[,c(8,9,1,2,3,4,5,6,7)]'
- In file R/SGCP_semiSupervised.R:
  - at line 23 found ' train <- specExp[rownames(specExp) %in% geneLab$geneID & !is.na(geneLab$label), ]'
  - at line 24 found ' test <- specExp[rownames(specExp) %in% geneLab$geneID & is.na(geneLab$label), ]'
  - at line 37 found ' train <- train[, which(names(train) %!in% "geneID" )]'
  - at line 49 found ' gg <- geneLab[complete.cases(geneLab), ]'
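
The dimension reduction matters most when a selection happens to contain a single row or column, for example when seq_len(min(2*k, ncol(Y))) has length one. A toy illustration:

```r
Y <- matrix(seq_len(6), nrow = 3)

dropped <- Y[, seq_len(1)]             # silently becomes a plain vector
kept <- Y[, seq_len(1), drop = FALSE]  # stays a 3 x 1 matrix

# Code that later calls ncol(), rowSums() or kmeans() on the result
# only behaves consistently with the second form.
```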
[ ] NOTE: Functional programming: code repetition.
- repetition in clustering and cvConductance
  - in clustering
    - line 114: if (method == "relativeGap") {
    - line 115: krelativeGap <- k$relativeGap
    - line 116: Yf <- Y[, seq_len(min(2 * krelativeGap, ncol(Y)))]
    - line 117: Xf <- X[seq_len(min(2 * krelativeGap, length(X)))]
    - line 118: Yf <- divideNorm(Yf, rowWise = TRUE)
    - line 119: clusf <- kmeans(Yf, krelativeGap, iter.max = maxIter,
    - line 120: nstart = numStart)
    - line 121: conf <- conductance(adja = adjaMat, clusLab = clusf$cluster)
    - line 131: ksecondOrderGap <- k$secondOrderGap
    - line 132: Ys <- Y[, seq_len(min(2 * ksecondOrderGap, ncol(Y)))]
    - line 133: Xs <- X[seq_len(min(2 * ksecondOrderGap, length(X)))]
    - line 134: Ys <- divideNorm(Ys, rowWise = TRUE)
    - line 135: cluss <- kmeans(Ys, ksecondOrderGap, iter.max = maxIter,
    - line 136: nstart = numStart)
    - line 137: cons <- conductance(adja = adjaMat, clusLab = cluss$cluster)
    - line 147: kadditiveGap <- k$additiveGap
    - line 148: Yg <- Y[, seq_len(min(2 * kadditiveGap, ncol(Y)))]
    - line 149: Xg <- X[seq_len(min(2 * kadditiveGap, length(X)))]
    - line 150: Yg <- divideNorm(Yg, rowWise = TRUE)
    - line 151: clusg <- kmeans(Yg, kadditiveGap, iter.max = maxIter,
    - line 152: nstart = numStart)
    - line 153: cong <- conductance(adja = adjaMat, clusLab = clusg$cluster)
  - in cvConductance
    - line 3: message("Conductance Validation...")
    - line 4: krelativeGap <- k$relativeGap
    - line 5: Yf <- Y[, seq_len(min(2 * krelativeGap, ncol(Y)))]
    - line 6: Xf <- X[seq_len(min(2 * krelativeGap, length(X)))]
    - line 7: Yf <- divideNorm(Yf, rowWise = TRUE)
    - line 8: clusf <- kmeans(Yf, krelativeGap, iter.max = maxIter, nstart = numStart)
    - line 10: ksecondOrderGap <- k$secondOrderGap
    - line 11: Ys <- Y[, seq_len(min(2 * ksecondOrderGap, ncol(Y)))]
    - line 12: Xs <- X[seq_len(min(2 * ksecondOrderGap, length(X)))]
    - line 13: Ys <- divideNorm(Ys, rowWise = TRUE)
    - line 14: cluss <- kmeans(Ys, ksecondOrderGap, iter.max = maxIter,
    - line 15: nstart = numStart)
    - line 16: cons <- conductance(adja = adja, clusLab = cluss$cluster)
    - line 17: kadditiveGap <- k$additiveGap
    - line 18: Yg <- Y[, seq_len(min(2 * kadditiveGap, ncol(Y)))]
    - line 19: Xg <- X[seq_len(min(2 * kadditiveGap, length(X)))]
    - line 20: Yg <- divideNorm(Yg, rowWise = TRUE)
    - line 21: clusg <- kmeans(Yg, kadditiveGap, iter.max = maxIter, nstart = numStart)
- repetition in clustering and ezSGCP
  - in clustering
    - line 34: }
    - line 35: if (!is.null(kopt) && kopt != round(kopt)) {
    - line 36: warning("kopt must be either null or an integer", call. = FALSE)
    - line 37: message("making kopt null")
    - line 38: kopt <- NULL
    - line 39: }
    - line 40: if (length(setdiff(method, c("relativeGap", "secondOrderGap",
    - line 41: "additiveGap"))) != 0) {
    - line 42: warning("method can be either relativeGap, secondOrderGag, or additiveGap",
    - line 43: call. = FALSE)
    - line 44: message("making method to NULL")
    - line 45: method <- NULL
    - line 46: }
    - line 47: if (!is.numeric(maxIter) || !is.numeric(numStart)) {
    - line 48: warning("maxIter and numStart must be numeric and integer",
  - in ezSGCP
    - line 27: }
    - line 28: if (!is.null(kopt) && kopt != round(kopt)) {
    - line 29: warning("kopt must be either null or an integer")
    - line 30: message("making k null")
    - line 31: kopt <- NULL
    - line 32: }
    - line 33: if (length(setdiff(method_k, c("relativeGap", "secondOrderGap",
    - line 34: "additiveGap"))) != 0) {
    - line 35: warning("method_k can be either relativeGap, secondOrderGap, or additiveGap",
    - line 36: call. = FALSE)
    - line 37: message("making method to NULL")
    - line 38: method_k <- NULL
    - line 39: }
    - line 40: if (!is.numeric(maxIteration) || !is.numeric(numberStart)) {
    - line 41: warning("maxIteration and numStart must be numeric and integer")
- repetition in clustering and sigClusGO and cvConductance
  - in clustering
    - line 167: Yopt <- Y[, seq_len(min(2 * kopt, ncol(Y)))]
    - line 168: Xopt <- X[seq_len(min(2 * kopt, length(X)))]
    - line 169: Yopt <- divideNorm(Yopt, rowWise = TRUE)
    - line 170: clusopt <- kmeans(Yopt, kopt, iter.max = maxIter, nstart = numStart)
    - line 171: conopt <- conductance(adja = adjaMat, clusLab = clusopt$cluster)
  - in sigClusGO
    - line 3: Yt <- Y[, seq_len(min(2 * k, ncol(Y)))]
    - line 4: Xt <- X[seq_len(min(2 * k, length(X)))]
    - line 5: Yt <- divideNorm(Yt, rowWise = TRUE)
    - line 6: clust <- kmeans(Yt, k, iter.max = maxIter, nstart = numStart)
    - line 7: cont <- conductance(adja = adja, clusLab = clust$cluster)
  - in cvConductance
    - line 6: Xf <- X[seq_len(min(2 * krelativeGap, length(X)))]
    - line 7: Yf <- divideNorm(Yf, rowWise = TRUE)
    - line 8: clusf <- kmeans(Yf, krelativeGap, iter.max = maxIter, nstart = numStart)
    - line 9: conf <- conductance(adja = adja, clusLab = clusf$cluster)
    - line 19: Xg <- X[seq_len(min(2 * kadditiveGap, length(X)))]
    - line 20: Yg <- divideNorm(Yg, rowWise = TRUE)
    - line 21: clusg <- kmeans(Yg, kadditiveGap, iter.max = maxIter, nstart = numStart)
    - line 22: cong <- conductance(adja = adja, clusLab = clusg$cluster)
- repetition in DOM and TOM
  - in DOM
    - line 1:{
    - line 2: diag(mat) <- 0
    - line 3: degreeRow <- replicate(dim(mat)[1], rowSums(mat))
    - line 4: degreeCol <- t(replicate(dim(mat)[1], colSums(mat)))
    - line 5: degreeMin <- pmin(degreeRow, degreeCol)
    - line 6: rm(degreeRow, degreeCol)
    - line 7: degreeRow <- replicate(dim(mat)[1], rowSums(mat)^2)
    - line 8: degreeCol <- t(replicate(dim(mat)[1], colSums(mat)^2))
    - line 9: degreeMin2 <- pmin(degreeRow, degreeCol)
    - line 10: numerator <- mat + (mat %^% 2) + (mat %^% 3)
    - line 11: denominator <- degreeMin2 + degreeMin + (1 - mat)
    - line 12: res <- numerator/denominator
    - line 13: diag(res) <- 1
    - line 14: rm(degreeCol, degreeRow, degreeMin, degreeMin2)
    - line 15: return(as.matrix(res))
  - in TOM
    - line 1:{
    - line 2: diag(mat) <- 0
    - line 3: degreeRow <- replicate(dim(mat)[1], rowSums(mat))
    - line 4: degreeCol <- t(replicate(dim(mat)[1], colSums(mat)))
    - line 5: degreeMin <- pmin(degreeRow, degreeCol)
    - line 6: numerator <- (mat %^% 2) + mat
    - line 7: denominator <- degreeMin + (1 - mat)
    - line 8: res <- numerator/denominator
    - line 9: diag(res) <- 1
    - line 10: rm(degreeCol, degreeRow, degreeMin)
    - line 11: return(as.matrix(res))
- repetition in ezSGCP and geneOntology
  - in ezSGCP
    - line 45: }
    - line 46: if (all(dir %!in% c("under", "over"))) {
    - line 47: warning("dir must be in c(under or over) \n making to default",
    - line 48: call. = FALSE)
    - line 49: dir <- c("over", "under")
    - line 50: }
    - line 51: if (length(dir) > 2) {
    - line 52: warning("dir must be in c(under or over) \n making to default",
    - line 53: call. = FALSE)
    - line 54: dir <- c("over", "under")
    - line 55: }
    - line 56: if (all(onto %!in% c("BP", "CC", "MF"))) {
    - line 57: warning(" onto must be in BP CC MF \n making to default",
    - line 58: call. = FALSE)
    - line 59: onto <- c("BP", "CC", "MF")
    - line 60: }
    - line 61: if (length(onto) > 3) {
    - line 62: warning(" onto must be in BP CC MF \n making to default",
    - line 63: call. = FALSE)
    - line 64: onto <- c("BP", "CC", "MF")
    - line 65: }
    - line 66: if (!is.null(hgCut) && (hgCut >= 1 || hgCut <= 0)) {
    - line 67: warning(" not correct hgCutoff value \n making to default",
    - line 68: call. = FALSE)
    - line 69: }
    - line 70: if (condTest != TRUE && condTest != FALSE) {
    - line 71: warning(" condTest must be boolean! \n making to deafult",
    - line 72: call. = FALSE)
    - line 73: condTest <- TRUE
    - line 74: }
  - in geneOntology
    - line 3:{
    - line 4: if (all(direction %!in% c("under", "over"))) {
    - line 5: warning("direction must be in c(under or over) \n making to default",
    - line 6: call. = FALSE)
    - line 7: direction <- c("over", "under")
    - line 8: }
    - line 9: if (length(direction) > 2) {
    - line 10: warning("direction must be in c(under or over) \n making to default",
    - line 11: call. = FALSE)
    - line 12: direction <- c("over", "under")
    - line 13: }
    - line 14: if (all(ontology %!in% c("BP", "CC", "MF"))) {
    - line 15: warning(" ontology must be in BP CC MF \n making to default",
    - line 16: call. = FALSE)
    - line 17: ontology <- c("BP", "CC", "MF")
    - line 18: }
    - line 19: if (length(ontology) > 3) {
    - line 20: warning(" ontology must be in BP CC MF \n making to default",
    - line 21: call. = FALSE)
    - line 22: ontology <- c("BP", "CC", "MF")
    - line 23: }
    - line 24: if (!is.null(hgCutoff) && (hgCutoff >= 1 || hgCutoff <= 0)) {
    - line 25: warning(" not correct hgCutoff value \n making to default",
    - line 26: call. = FALSE)
    - line 27: }
    - line 28: if (cond != TRUE && cond != FALSE) {
    - line 29: warning(" cond must be boolean! \n making to deafult",
    - line 30: call. = FALSE)
    - line 31: cond <- TRUE
    - line 32: }
- repetition in ezSGCP and semiLabeling
  - in ezSGCP
    - line 74: }
    - line 75: if (percent >= 1 || percent <= 0) {
    - line 76: warning("percent must be in (0,1) \n making percent to default",
    - line 77: call. = FALSE)
    - line 78: percent <- 0.1
    - line 79: }
    - line 80: if (stp >= 1 || stp <= 0) {
    - line 81: warning("step must be in (0,1) \n making stp to default",
    - line 82: call. = FALSE)
    - line 83: stp <- 0.01
    - line 84: }
  - in semiLabeling
    - line 11: }
    - line 12: if (percent >= 1 || percent <= 0) {
    - line 13: warning("percent must be in (0,1) \n making percent to default",
    - line 14: call. = FALSE)
    - line 15: percent <- 0.1
    - line 16: }
    - line 17: if (stp >= 1 || stp <= 0) {
    - line 18: warning("stp must be in (0,1) \n making percent to default",
    - line 19: call. = FALSE)
    - line 20: stp <- 0.01
    - line 21: }
- repetition in ezSGCP and semiSupervised
  - in ezSGCP
    - line 84: }
    - line 85: if (!is.null(model) && model != "knn" & model != "lr") {
    - line 86: warning("model must be either NULL, knn, or lr \n setting to knn",
    - line 87: call. = FALSE)
    - line 88: model <- "knn"
  - in semiSupervised
    - line 5: }
    - line 6: if (!is.null(model) & model != "knn" & model != "lr") {
    - line 7: warning("model must be either NULL, knn, or lr \n setting to knn")
    - line 8: model <- "knn"
    - line 9: }
- repetition in GeneOfGOTerm and GOenrichment
  - in GeneOfGOTerm
    - line 15: labeledGenes <- c(labeledGenes, temp)
    - line 16: }
    - line 17: labeledGenes <- labeledGenes[-1]
    - line 18: newList <- list(labeledGenes = labeledGenes, GOTermGenes = GOTermGenes)
  - in GOenrichment
    - line 91: labeledGenes <- labeledGenes[-1]
    - line 92: labeledGenes <- unique(labeledGenes)
    - line 93: }
    - line 94: newList <- list(labeledGenes = labeledGenes, GOTermGenes = GOTermGenes,
- repetition in geneOntology and GOenrichment
  - in geneOntology
    - line 17: ontology <- c("BP", "CC", "MF")
    - line 18: }
    - line 19: if (length(ontology) > 3) {
    - line 20: warning(" ontology must be in BP CC MF \n making to default",
    - line 22: ontology <- c("BP", "CC", "MF")
    - line 23: }
    - line 24: if (!is.null(hgCutoff) && (hgCutoff >= 1 || hgCutoff <= 0)) {
    - line 25: warning(" not correct hgCutoff value \n making to default",
  - in GOenrichment
    - line 32: ontology <- c("BP", "CC", "MF")
    - line 33: }
    - line 34: if (length(ontology) > 3) {
    - line 35: warning(" ontology must be in BP CC MF", call. = FALSE)
    - line 37: ontology <- c("BP", "CC", "MF")
    - line 38: }
    - line 39: if (!is.null(hgCutoff) && (hgCutoff >= 1 || hgCutoff <= 0)) {
    - line 40: warning(" not correct hgCutoff value", call. = FALSE)
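
One way the repeated blocks above could be folded into helpers is sketched below. All names are hypothetical. rowNormalize is only a stand-in for the package's divideNorm(..., rowWise = TRUE), assumed here to scale rows to unit length, and the conductance step is left to the caller.

```r
# Stand-in for divideNorm(..., rowWise = TRUE): assumed to scale each row
# to unit Euclidean length.
rowNormalize <- function(M) M / sqrt(rowSums(M^2))

# One helper for the repeated "take up to 2k columns, normalize, k-means" block.
clusterEmbedding <- function(Y, k, maxIter = 100, numStart = 10) {
  Yk <- Y[, seq_len(min(2 * k, ncol(Y))), drop = FALSE]
  Yk <- rowNormalize(Yk)
  kmeans(Yk, centers = k, iter.max = maxIter, nstart = numStart)
}

# One helper for the repeated "invalid argument, fall back to default" checks.
# It mirrors the quoted conditions: no valid entry, or too many entries.
checkChoices <- function(value, choices, name) {
  if (all(!(value %in% choices)) || length(value) > length(choices)) {
    warning(name, " must be in ", toString(choices),
            "\n making to default", call. = FALSE)
    return(choices)
  }
  value
}

# Toy usage on simulated data with two well-separated groups.
set.seed(1)
Y <- rbind(matrix(rnorm(40, mean = 5), ncol = 4),
           matrix(rnorm(40, mean = -5), ncol = 4))
fit <- clusterEmbedding(Y, k = 2)
direction <- suppressWarnings(
  checkChoices("sideways", c("over", "under"), "direction")
)
```

Each call site in clustering, cvConductance and sigClusGO would then reduce to one call plus its own conductance computation.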
[ ] NOTE: Functional programming: code repetition Type 2. In function df2mat you already removed the colnames and rownames, but you remove them again at lines 158-159.

Documentation

[ ] Important: Please include Bioconductor installation instructions using BiocManager.
- rmd file vignettes/SGCP.Rmd
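
The usual form of these instructions, as it could appear in an installation chunk of the vignette (typically shown but not evaluated during the build):

```r
# Install BiocManager from CRAN if it is not already available,
# then install the package from Bioconductor.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install("SGCP")
```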
[ ] Note: Vignette includes motivation for submitting to Bioconductor as part of the abstract/intro of the main vignette.
- rmd file vignettes/SGCP.Rmd

Dec 05 '22 16:12 jianhong
