Contributions icon indicating copy to clipboard operation
Contributions copied to clipboard

SPIAT

Open fuerzhou opened this issue 2 years ago • 26 comments

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

  • Repository: https://github.com/TrigosTeam/SPIAT

Confirm the following by editing each check box to '[x]'

  • [x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.

  • [x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.

  • [x] I understand Bioconductor Package Naming Policy and acknowledge Bioconductor may retain use of package name.

  • [x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.

  • [x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.

  • [x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.

  • [x] I am familiar with the Bioconductor code of conduct and agree to abide by it.

I am familiar with the essential aspects of Bioconductor software management, including:

  • [x] The 'devel' branch for new packages and features.
  • [x] The stable 'release' branch, made available every six months, for bug fixes.
  • [x] Bioconductor version control using Git (optionally via GitHub).

For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.

fuerzhou avatar Apr 01 '22 01:04 fuerzhou

Hi @fuerzhou

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: SPIAT
Type: Package
Title: Spatial Image Analysis of Tissues
Version: 0.99.0
Authors@R: c(
    person(given = "Anna",
 family="Trigos",
 role=c("aut"),
 email="[email protected]"),
    person(given = "Yuzhou",
 family = " Feng",
 role=c("aut", "cre"),
 email="[email protected]"),
    person(given = "Tianpei",
 family = " Yang",
 role="aut",
 email="[email protected]"),
    person(given = "Mabel",
 family = " Li",
 role="aut"),
    person(given = "John",
 family = " Zhu",
 role="aut"),
    person(given = "Volkan",
 family=" Ozcoban",
 role="aut",
 email = "[email protected]"),
    person(given = "Maria",
 family= "Doyle",
 role="aut",
 email = "[email protected]"))
Description: SPIAT (Spatial Image Analysis of Tissues) is a suite of data processing, 
    quality control, visualization, data handling and data analysis tools for 
    multiplex immunohistochemistry. Converts INFORM or HALO images into a 
    SingleCellExperiment class. SPIAT includes novel algorithms for the 
    identification of cell clusters, cell margins and cell gradients, the 
    calculation of neighbourhood proportions, and algorithms for the prediction 
    of cell phenotypes in tissue images. SPIAT also includes speedy 
    implementations of the calculation of cell distances and detection of cell 
    communities.
License: Artistic-2.0
Encoding: UTF-8
Depends: 
    R (>= 4.1.0),
    SingleCellExperiment (>= 1.4.1),
    SummarizedExperiment
Imports: 
    apcluster (>= 1.4.7),
    ggplot2 (>= 3.2.1),
    gridExtra (>= 2.3),
    gtools (>= 3.8.1),
    reshape2 (>= 1.4.3),
    dplyr (>= 0.8.3),
    plotly (>= 4.9.0),
    RANN (>= 2.6.1),
    pracma (>= 2.2.5),
    dbscan (>= 1.1-5),
    mmand (>= 1.5.4),
    tibble (>= 2.1.3),
    grDevices,
    stats,
    utils,
    vroom,
    ComplexHeatmap,
    dittoSeq,
    spatstat.geom,
    alphahull,
    methods,
    spatstat.core,
    raster,
    sp,
    xROI,
    Seurat,
    elsa,
    Rtsne,
    umap,
    rlang
Suggests:
    BiocStyle,
    knitr,
    rmarkdown,
    pkgdown,
    testthat
biocViews: BiomedicalInformatics, CellBiology, Spatial, Clustering, DataImport, ImmunoOncology, QualityControl, SingleCell, Software, Visualization
BugReports: https://github.com/trigosteam/SPIAT/issues
RoxygenNote: 7.1.2
LazyData: true
VignetteBuilder: knitr
URL: https://trigosteam.github.io/SPIAT/

bioc-issue-bot avatar Apr 01 '22 01:04 bioc-issue-bot

Your github action for R CMD check is failing on mac and windows ...? Is it ready for submission from the portability perspective?

vjcitn avatar Apr 03 '22 16:04 vjcitn

Now I see that the reason for failure in github actions is due to lack of R 4.2 on those platforms.

vjcitn avatar Apr 03 '22 16:04 vjcitn

Yes, R 4.2 is not available for Mac and windows. Do I need to change anything for that?

fuerzhou avatar Apr 03 '22 22:04 fuerzhou

Hi @vjcitn,

May I ask if there is anything that I need to change for this package? If not, could we proceed to the next stage?

Thank you!

fuerzhou avatar Apr 11 '22 04:04 fuerzhou

A reviewer has been assigned to your package. Learn what to expect during the review process.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.

Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account

bioc-issue-bot avatar Apr 18 '22 14:04 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/SPIAT to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Apr 18 '22 15:04 bioc-issue-bot

Hi @fuerzhou,

Sorry for how long it is taking me to review your package. I was only able to start looking at SPIAT this week, following the latest Bioconductor release and NHMRC grant deadlines. I'd hoped to have finished an initial review for you this week, but unfortunately I'm now sick and so that's not looking likely. Thank you for your understanding and I will provide my review as soon as possible.

Thanks, Pete

PeteHaitch avatar May 11 '22 23:05 PeteHaitch

@PeteHaitch

Thank you for your message. Please don't worry about this and rest well!

Take care, Yuzhou

fuerzhou avatar May 12 '22 00:05 fuerzhou

Hi @fuerzhou,

Thank you for submitting SPIAT to Bioconductor. SPIAT is quite an ambitious package in its scope and has many functions. Again, I apologise it has taken so long to post an initial review.

Technically, the package is in a good state because it is passing most of the automated checks. It's great to see you seeking to operate with key Bioconductor classes. However, my main concern is that with this being used to spatial and imaging data that the main data object could/should be a SpatialExperiment object rather than a SingleCellExperiment object. @lmweber, @drighelli, and @HelenaLC, who develop the SpatialExperiment package, can likely advise better than I can on how best to transition from SingleCellExperiment to SpatialExperiment. For example, my initial impression is that the Cell.X.Position and Cell.Y.Position data, currently stored in the colData of the SingleCellExperiment, would be stored in the spatialCoords of a SpatialExperiment (see https://lmweber.org/OSTA-book/spatialexperiment.html for an illustration).

I'd like to better understand your choice of SingleCellExperiment over SpatialExperiment. I think that this choice of data class is a fundamental point that must be first addressed, but I have also provided additional comments in my review (separated into Required and Recommended) that I would also ask you to address. Please add your point-by-point replies to the review in this thread.

Cheers, Pete

Required

  • [ ] Please take a reasonable attempt to address all NOTES raised by BiocCheck::BiocCheck().
  • [ ] Upon acceptance, please remove github installation instructions from vignette/README and only include the BiocManager::install()-based instructions.
  • [ ] Please proofread the documentation and vignette.
    • [ ] E.g., there is a comment in the vignette "#maybe you no longer need this if we are using the general format?" which doesn't seem like it should be there.
    • [ ] Another example is the documentation of dimensionality_reduction_plot() contains text that seems like it shouldn't be there: "Generates the dimensionality reduction plots (UMAP or tSNE) based on marker intensities. Cells are grouped by the categories under the selected column. – have you tried doing PCA on the matrix and then doing the UMAP/tSNE? Does it help? scRNAseq workflows do this. (?)"
    • [ ] Another example: §2.5 of vignette starts with "Plot the image." but nothing further and no image.
  • [ ] Please limit the use of unevaluated code chunks in the vignette (i.e. eval=FALSE). For example, there is an unevaluated code chunk in §2.1.2.1 of vignette and so the last 3 lines of the chunk aren't evaluated
# Unevaluated code in §2.1.2.1 of vignette
class(formatted_image) # The formatted image is a SingleCellExperiment object
dim(colData(formatted_image))
dim(assay(formatted_image))
  • [ ] Based on on the vignette, getting the phenotype format correct seems critical. Are the rules for defining cell phenotypes documented somewhere?
    • [ ] Are these rules defined by SPIAT or by the underlying assays (HALO, InFORM, etc).
    • [ ] Are non-binary or more complex phenotypes supported? E.g., 'CD3-low', 'CD3-mid', 'CD3-high'?
  • [ ] I don't quite understand the image_splitter() function and it has some problems.
    • [ ] It's not clear to me if users need access to this function or if it should be an internal function of SPIAT? (If it's intended as an internal function, then my concerns aren't so relevant.)
    • [ ] image_splitter() returns a list of data frames, which doesn't seem very convenient or consistent with using SingleCelLExperiment or SpatialExperiment. (Naively, I was expecting perhaps a list of SingleCellExperiment or SpatialExperiment objects.). When might users be calling this function and does it returning a 'list of data frames' make sense for that purpose?
    • [ ] An optional purpose of image_splitter() is to "plot the cell positions in each sub image", but this doesn't seem to work, e.g., this errors:
suppressPackageStartupMessages(library(SPIAT))
data("simulated_image")
image_splitter(simulated_image, number_of_splits=3, plot = TRUE)
#> Error in `[.data.frame`(cell_loc, , feature_colname): undefined columns selected
  • [ ] knitr::include_graphics() shouldn't appear in the rendered vignette because that's not what a user would run to view the plot. If you need to include a static plot from file, then have the code in the vignette display what the user would actually need to run to generate the plot (but set eval = FALSE so that the code isn't run) and then have a hidden chunk (echo = FALSE) that is actually evaluated and inserts the image with knitr::include_graphics().

Recommended

  • [ ] Good job on writing unit tests. Take a look at running covr::report() on the package to help identify functions with less testing coverage (e.g., format_image_to_sce() is relatively under-tested, which could be a problem in the long-term since it seems a key function of SPIAT).
  • [ ] Strongly recommend splitting the very long format_image_to_sce() function up into format-specific internal functions. Experience is that this will help code maintenance into the future. E.g., in pseudocode:
format_image_to_sce <- function(format, path, etc...) {
  if (format == "HALO") {
    format_HALO_to_sce(...) 
  } else if (format == "INFORM") {
    format_INFORM_to_sce(...)
  } else if (format == "CODEX") {
    format_CODEX_to_sce(...)
  } else if (foramt == "Visium") {
    SpatialExperiment::read10xVisium(...)
  }
    # etc.
  }
}
  • [ ] The format_image_to_sce() makes the default assay name counts; is that intentional? I ask because these immunohistochemistry assays don't necessarily return integer counts do they?
  • [ ] format_image_to_sce(format = "Visium") uses Seurat::Read10X(). You could instead use DropletUtils::read10xCounts() or SpatialExperiment::read10xVisium() (depending on the exact input and output format requirements) for a more Bioconductor-focused workflow (and potentially to remove the dependency on Seurat).
  • [ ] Proofread vignette to make sure code is visible and formatting of text is as desired (e.g., end of §2.2.1 'Structure of a SPIAT SingleCellExperiment object' has strange formatting).
  • [ ] Please run a spellcheck on the vignette and documentation.
  • [ ] Is print_feature() really needed? print_feature(simulated_image, feature_colname = "Phenotype") seems to be equivalent to unique(simulated_image$Phenotype) or unique(simulated_image[["Phenotype"]]). More generally, it's probably best not to re-invent the wheel by creating functions that could be achieved with short, basic R code.
  • [ ] predicted_image() creates multiple figures but the user may only see the last one if run interactively.
  • [ ] Legend of marker_prediction_plot() is barely legible (§2.2.5 of vignette).
  • [ ] plot_cell_marker_levels() (§2.3.2 of vignette) produced figure with very small points on my laptop, making it very hard to interpret. To change the point size, a user needs to know a bit about ggplot2 to change this (e.g., plot_cell_marker_levels(formatted_image, "Immune_marker1") + ggplot2::geom_point(size = 2)). For comparison, the plotting functions in the scater package (which also make use of ggplot2) expose a point_size argument (as well as other arguments to help make common modifications to the plot); see help("scater-plot-args", "scater")
  • [ ] Regarding AUC_of_cross_function(), my understanding is that usually 0 <= AUC <= 1; so what does AUC < 0 mean (this happens for the example in the vignette).
  • [ ] Recommend proofreading the Description field in the DESCRIPTION file.
  • [ ] There are quite a lot of dependencies (see note from R CMD check, below). Consider the recommendation made by R CMD check and think about whether there are packages that could eliminated by using base R functionality.
❯ checking package dependencies ... NOTE
  Imports includes 26 non-default packages.
  Importing from so many packages makes the package vulnerable to any of
  them becoming unavailable.  Move as many as possible to Suggests and
  use conditionally.
  • [ ] Not really helping the above point, but you could drop SummarizedExperiment from Depends because SingleCellExperiment already depends on it.
  • [ ] For packages that include data, we recommend not including LazyData: TRUE (https://contributions.bioconductor.org/description.html#description-lazydata)
  • [ ] Please update the NEWS file (it only refers to v0.2)
  • [ ] Check documentation of data. E.g., help("defined_image", "SPIAT") says the dataset is called simulated_image rather than (presumaby) defined_image.
  • [ ] Consider adding a package-level man page (https://contributions.bioconductor.org/docs.html#package-level-documentation)

PeteHaitch avatar May 19 '22 04:05 PeteHaitch

hi @PeteHaitch and @fuerzhou,

about the SpatialExperiment I'm not sure how to load data with other technologies, but for loading Visium data we have the read10xVisum function.

Additionally, the SpatialExperiment class has the imgData data structure specifically designed for image data handling. I think this could suit very well the case of the SPIAT package.

Hope this helps!

Dario

drighelli avatar May 19 '22 14:05 drighelli

Hi all, not sure if my comment is relevant here but I was in a similar situation when writing imcRtools. I decided to not support handling of multiplexed images as this can be done using cytomapper. With imcRtools I wanted to support the SpatialExperiment class (storing coordinates in spatialCoords) but for backward compatibility reasons (we have been using the SingleCellExperiment class for quite some years) the spatial analysis functions also support the SingleCellExperiment class (coordinates stored in colData, e.g. see here and for reader functions here). I agree with @drighelli that the SpatialExperiment class is quite suited for SPIAT but you might want to consider also supporting the SingleCellExperiment class if this is commonly used by the community. But please feel free to disagree :) Cheers, Nils

nilseling avatar May 20 '22 08:05 nilseling

Hi @nilseling,

thanks for your comments, it is true that the community is using the SingleCellExperiment class by storing the coordinates into the colData, but of course now with the SpatialExperiment it is possible to handle also the images directly with the new class, and we are coordinating with the community to support future spatial technologies with other images type as well. Additionally, some new functionalities for image handling are coming thanks to the contribution of other developers, and I'm pretty confident that more functionalities will be available in the future. So, please consider switching to the SpatialExperiment class when it comes to handling spatial technologies data into Bioconductor. Bests, Dario

drighelli avatar May 20 '22 08:05 drighelli

Thank you @PeteHaitch for your comprehensive review of SPIAT. We will read the comments carefully in the following days and address each one of the required actions and the recommended ones as well.   Thank you @nilseling @drighelli for your comments on SingleCellExperiment vs. SpatialExperiment. We started writing the package more than two years ago before SpatialExperiment was distributed in Bioconductor. We chose SingleCellExperiment because almost everyone in this field is familiar with it; therefore, it requires little additional knowledge for people to use our package. I agree that using SpatialExperiment is a trend for handling data of the emerging spatial technologies. The reasons why we still want to stick to SingleCellExperiment are:

  1. SPIAT is not designed specifically for spatial transcriptomics (ST) data while SpatialExperiment is mainly for ST, especially for 10X Visium. SPIAT can also read processed spatial proteomics data from platforms like MIBI, CODEX, and inForm and HALO (for Opal images) etc…
  2. SpatialExperiment assumes you have a microscopy image, while SPIAT is not meant to analyse raw microscopy images (e.g. analysis from pixel data), and therefore there is no practical reason to have an image associated with the object.
  3. In the future we will consider adding specific functions that read in microscopy images and plot the data onto it, but only as an optional extension as it is not a core intention of our project.

From these three points, SpatialExperiment does not suit the purpose of SPIAT and does not yield significant improvement of SPIAT. Since SpatialExperiment is built upon SingleCellExperiment, we believe that it’s better to use a more basic class if possible.

@PeteHaitch As such, we hope to continue using SingleCellExperiment. Please let us know your thoughts!

fuerzhou avatar May 23 '22 10:05 fuerzhou

Ultimately, the choice is yours. However, I would still argue that a SpatialExperiment is the better fit for the type of data you are processing with SPIAT.


We started writing the package more than two years ago before SpatialExperiment was distributed in Bioconductor.

A totally fair reason for why when you initially developed the package you used SingleCellExperiment. However, as you note, that was more than 2 years ago and now is when you are submitting to Bioconductor. SpatialExperiment was introduced 1.5 years ago and, as a new user of a package titled 'Spatial Image Analysis of Tissues', I was expecting SPIAT to interoperate with the standard Bioconductor data class for that type of data, a SpatialExperiment,

We chose SingleCellExperiment because almost everyone in this field is familiar with it; therefore, it requires little additional knowledge for people to use our package.

A SpatialExperiment is a SingleCellExperiment (in the formal S4 sense).

library(SpatialExperiment)
spe <- SpatialExperiment()
is(spe, "SingleCellExperiment")
#> [1] TRUE

Therefore, SpatialExperiment inherits the same API for _SingleCellExperiment, and then adds some very useful spatial-specific bits. Because of this relationship, if a user is familiar with SingleCellExperiment, then they are 90% of the way to being familiar with a SpatialExperiment.

SPIAT is not designed specifically for spatial transcriptomics (ST) data while SpatialExperiment is mainly for ST, especially for 10X Visium.

I can see why you may think that, e.g., the opening sentence of the 'Description' in help("SpatialExperiment", "SpatialExperiment") says, "The SpatialExperiment class is designed to represent spatially resolved transcriptomics (ST) data". However, that's an inaccuracy in the documentation rather than an actual limitation (I'll raise this with the SpatialExperiment developers). The SpatialExperiment class isn't restricted to spatial transcriptomics data just as the SingleCellExperiment class isn't restricted to single-cell transcriptomics.

SPIAT can also read processed spatial proteomics data from platforms like MIBI, CODEX, and inForm and HALO (for Opal images) etc…

Proteomics data are fine to go in a SpatialExperiment, there's no restriction there. Again, these are all spatial and/or image related technologies, hence SpatialExperiment seems a better fit than SingleCellExperiment.

SpatialExperiment assumes you have a microscopy image, while SPIAT is not meant to analyse raw microscopy images (e.g. analysis from pixel data), and therefore there is no practical reason to have an image associated with the object.

SpatialExperiment does not assume you have a microscropy image, i.e. an image isn't a requirement for a SpatialExperiment (see help("SpatialExperiment", "SpatialExperiment") where imgData is documented as 'Optional'). Your data type do have spatial co-ordinates, however, which are not a natural feature of a SingleCellExperiment but are a natural feature of a SpatialExperiment.

In the future we will consider adding specific functions that read in microscopy images and plot the data onto it, but only as an optional extension as it is not a core intention of our project.

If you are even considering this, then it's going to be easier if your main data structure already supports images (i.e. SpatialExperiment).


My point in all this is that now (during the Bioconductor submission and review process) is the time to get those decisions 'right', rather than having to modify a Bioconductor-released version of SPIAT at a later time when you have to preserve previous behaviour for compatibility with previous Bioconductor-released versions. At a high-level I think the changes would involve:

  • Your data import functions would return a SpatialExperiment (with an 'empty' image slot when there is no image data)
  • Functions that already work with a SingleCellExperiment (and don't make use of spatial co-ordinates) will continue to work with a SpatialExperiment (because a SpatialExperiment is a SingleCellExperiment) so it'd just be a matter of updating documentation.
  • The spatial co-ordinates would be accessible by the Bioconductor/SpatialExperiment-standard spatialCoords() accessors rather than SPIAT-specific x$Cell.X.Position and x$Cell.Y.Position.
  • Functions that make use of the spatial co-ordinates in addition to a SingleCellExperiment would instead extract the spatial co-ordinates from a SpatialExperiment using the spatialCoords() accessor.

As I said, ultimately, the choice is yours. Please proceed with your other responses to the initial review, as whether you choose SingleCellExperiment or SpatialExperiment as the main data class won't affect whether your package is accepted or not. I will be on leave and offline until June 9.

PeteHaitch avatar May 23 '22 22:05 PeteHaitch

Thank you for your reply. We are convinced that we should switch to SpatialExperiment and will start making the changes along with other responses to the review.

Have a good break!

fuerzhou avatar May 25 '22 00:05 fuerzhou

Thank you again for the comments. We have been working on all the recommended and required updates, including switching to SpatialExperiment, and we will soon be uploading the updated version. Thanks

annatrigos avatar Jul 05 '22 01:07 annatrigos

Closing this for now. @fuerzhou Please re-open the issue if and when you would like to proceed with the submission.

PeteHaitch avatar Aug 08 '22 23:08 PeteHaitch

This issue is being closed because there has been no progress for an extended period of time. You may reopen the issue when you have the time to actively participate in the review / submission process. Please also keep in mind that a package accepted to Bioconductor requires a commitment on your part to ongoing maintenance.

Thank you for your interest in Bioconductor.

bioc-issue-bot avatar Aug 08 '22 23:08 bioc-issue-bot

Test: Does the issue reopen automatically with a new comment?

fuerzhou avatar Aug 29 '22 01:08 fuerzhou

Hi @vjcitn @PeteHaitch,

I have submitted the updated version to Bioconductor. Shall we reopen the issue? I will submit my response to the review after seeing the build is successful.

Thank you!

fuerzhou avatar Aug 29 '22 01:08 fuerzhou

Dear @fuerzhou ,

We have reopened the issue to continue the review process. Please remember to push a version bump to git.bioconductor.org to trigger a new build.

bioc-issue-bot avatar Aug 29 '22 11:08 bioc-issue-bot

Received a valid push on git.bioconductor.org; starting a build for commit id: b997ca5198f38c38fe83e2539d87e928a243be00

bioc-issue-bot avatar Aug 29 '22 12:08 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/SPIAT to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Aug 29 '22 13:08 bioc-issue-bot

@PeteHaitch Thank you again for the review of our package and for your insightful feedback. I have tried to resolve all the required and recommended issues. The checked items without comments were directly fixed following the suggestions; some items have follow up explanations under the comments. Please do let us know if you have further comments or suggestions.

Required

  • [X] Please take a reasonable attempt to address all NOTES raised by BiocCheck::BiocCheck().
  1. NOTE: Consider multiples of 4 spaces for line indents; 884 lines (10%) are not.

I have tried my best to change the indentation generated from Roxygen2, but they cannot be changed manually. Please do advise if there is a way to do so.

  1. NOTE: Consider shorter lines; NOTE: The recommended function length is 50 lines or less.

These two notes are related to each other and I have tried to reduce them.

  1. NOTE: Avoid using '=' for assignment and use '<-' instead.

I checked all the files and there shouldn't be any. Do you have any suggestions on how to automatically find out where the issue is?

  1. NOTE: Cannot determine whether maintainer is subscribed to the Bioc-Devel mailing list (requires admin credentials).

I have subscribed.

  • [X] Upon acceptance, please remove github installation instructions from vignette/README and only include the BiocManager::install()-based instructions.

  • [X] Please proofread the documentation and vignette. E.g., there is a comment in the vignette "#maybe you no longer need this if we are using the general format?" which doesn't seem like it should be there. Another example is the documentation of dimensionality_reduction_plot() contains text that seems like it shouldn't be there: "Generates the dimensionality reduction plots (UMAP or tSNE) based on marker intensities. Cells are grouped by the categories under the selected column. – have you tried doing PCA on the matrix and then doing the UMAP/tSNE? Does it help? scRNAseq workflows do this. (?)" Another example: §2.5 of vignette starts with "Plot the image." but nothing further and no image.

  • [X] Please limit the use of unevaluated code chunks in the vignette (i.e. eval=FALSE). For example, there is an unevaluated code chunk in §2.1.2.1 of vignette and so the last 3 lines of the chunk aren't evaluated

  • [x] Based on the vignette, getting the phenotype format correct seems critical. Are the rules for defining cell phenotypes documented somewhere?

I have added the following details to the documentation of format_image_to_spe(). It is also stated in the vignette. The format of "Phenotype" column: For example, a cell positive for both "CD3" and "CD4" markers has the "CD3,CD4" cell phenotype. The phenotype has to be strictly formatted in such a way where each positive marker has to be separated by a comma, with no space in between, and the order of the positive markers has to be the same as the order in the assay.

  • [x] Are these rules defined by SPIAT or by the underlying assays (HALO, InFORM, etc).

By SPIAT

  • [x] Are non-binary or more complex phenotypes supported? E.g., 'CD3-low', 'CD3-mid', 'CD3-high'?

We use “Phenotype” field to only represent the combination of markers positive. Other definitions have to go to other defined columns. This can be done by define_celltypes(). This is described in the tutorial.

  • [x] I don't quite understand the image_splitter() function and it has some problems. It's not clear to me if users need access to this function or if it should be an internal function of SPIAT? (If it's intended as an internal function, then my concerns aren't so relevant.)

  • [x] image_splitter() returns a list of data frames, which doesn't seem very convenient or consistent with using SingleCelLExperiment or SpatialExperiment. (Naively, I was expecting perhaps a list of SingleCellExperiment or SpatialExperiment objects.). When might users be calling this function and does it returning a 'list of data frames' make sense for that purpose?

The response is for the two comments above. I have made this both internal and external. The external one will return a list of spe for clarity. And the internal one will return a list of data frames for convenient computation of further analyses like computing the spatial autocorrelation. We have also received issues raised from the community using this function and the current version should have fixed it.

  • [x] An optional purpose of image_splitter() is to "plot the cell positions in each sub image", but this doesn't seem to work, e.g., this errors:
suppressPackageStartupMessages(library(SPIAT))
data("simulated_image")
image_splitter(simulated_image, number_of_splits=3, plot = TRUE)
#> Error in `[.data.frame`(cell_loc, , feature_colname): undefined columns selected

The arg feature_colname has to be present in the data columns. The error occurred because the default value Phenotype is not present in simulated_image.

  • [X] knitr::include_graphics() shouldn't appear in the rendered vignette because that's not what a user would run to view the plot. If you need to include a static plot from file, then have the code in the vignette display what the user would actually need to run to generate the plot (but set eval = FALSE so that the code isn't run) and then have a hidden chunk (echo = FALSE) that is actually evaluated and inserts the image with knitr::include_graphics().

Recommended

  • [X] Good job on writing unit tests. Take a look at running covr::report() on the package to help identify functions with less testing coverage (e.g., format_image_to_sce() is relatively under-tested, which could be a problem in the long-term since it seems a key function of SPIAT).

Thanks for the suggestion! I have added more tests. And format_image_to_spe is now reformatted.

  • [X] Strongly recommend splitting the very long format_image_to_sce() function up into format-specific internal functions. Experience is that this will help code maintenance into the future. E.g., in pseudocode:

We now split this function into several functions and wrap them up in the main one. The main function format_image_to_spe() is made of format_inform_to_spe(), format_halo_to_spe(), …

  • [X] The format_image_to_sce() makes the default assay name counts; is that intentional? I ask because these immunohistochemistry assays don't necessarily return integer counts do they?

We have transformed to SpatialExperiment and use the default “data” assay.

  • [ ] format_image_to_sce(format = "Visium") uses Seurat::Read10X(). You could instead use DropletUtils::read10xCounts() or SpatialExperiment::read10xVisium() (depending on the exact input and output format requirements) for a more Bioconductor-focused workflow (and potentially to remove the dependency on Seurat).

We decided to remove this functionality because we do not have competitive tools for ST data yet. We decided to focus only on spatial proteomics at the moment.

  • [X] Proofread vignette to make sure code is visible and formatting of text is as desired (e.g., end of §2.2.1 'Structure of a SPIAT SingleCellExperiment object' has strange formatting).

  • [X] Please run a spellcheck on the vignette and documentation.

  • [X] Is print_feature() really needed? print_feature(simulated_image, feature_colname = "Phenotype") seems to be equivalent to unique(simulated_image$Phenotype) or unique(simulated_image[["Phenotype"]]). More generally, it's probably best not to re-invent the wheel by creating functions that could be achieved with short, basic R code.

I have deleted this function.

  • [X] predicted_image() creates multiple figures but the user may only see the last one if run interactively.

I have changed it to one plot.

  • [X] Legend of marker_prediction_plot() is barely legible (§2.2.5 of vignette).

I have made it clearer.

  • [ ] plot_cell_marker_levels() (§2.3.2 of vignette) produced figure with very small points on my laptop, making it very hard to interpret. To change the point size, a user needs to know a bit about ggplot2 to change this (e.g., plot_cell_marker_levels(formatted_image, "Immune_marker1") + ggplot2::geom_point(size = 2)). For comparison, the plotting functions in the scater package (which also make use of ggplot2) expose a point_size argument (as well as other arguments to help make common modifications to the plot); see help("scater-plot-args", "scater")

This is a very good suggestion. We would like to incorporate this into all of our plot functions in the next update.

  • [ ] Regarding AUC_of_cross_function(), my understanding is that usually 0 <= AUC <= 1; so what does AUC < 0 mean (this happens for the example in the vignette).

This function is calculating the difference of AUC of two curves. Hence, AUC values can be negative. We give an explanation of AUC negative values in the tutorial.

  • [X] Recommend proofreading the Description field in the DESCRIPTION file.

  • [ ] There are quite a lot of dependencies (see note from R CMD check, below). Consider the recommendation made by R CMD check and think about whether there are packages that could eliminated by using base R functionality.

We have removed redundant packages and have just kept all necessary packages.

  • [X] Not really helping the above point, but you could drop SummarizedExperiment from Depends because SingleCellExperiment already depends on it.

  • [ ] For packages that include data, we recommend not including LazyData: TRUE (https://contributions.bioconductor.org/description.html#description-lazydata)

I could not set LazyData to false - if set to false, our external data cannot be accessed by the user.

  • [X] Please update the NEWS file (it only refers to v0.2)

  • [ ] Check documentation of data. E.g., help("defined_image", "SPIAT") says the dataset is called simulated_image rather than (presumaby) defined_image.

The documentation for defined_image was correct but yes confusing. I have made it clearer.

  • [X] Consider adding a package-level man page (https://contributions.bioconductor.org/docs.html#package-level-documentation)

fuerzhou avatar Aug 29 '22 13:08 fuerzhou

Thank you for responding to the review, @fuerzhou. I'm a little behind on my reviews but I will try to finalise the review before I go on leave next Thursday.

PeteHaitch avatar Aug 31 '22 22:08 PeteHaitch

Received a valid push on git.bioconductor.org; starting a build for commit id: c36dde356a66aa66d9778181e996d9b044cede4e

bioc-issue-bot avatar Sep 05 '22 01:09 bioc-issue-bot

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to [email protected]:packages/SPIAT to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot avatar Sep 05 '22 01:09 bioc-issue-bot

Thank you for making the required changes and carefully considering the recommended changes, @fuerzhou.

I have a few final minor requests/queries before I accept SPIAT into Bioconductor:

  • [ ] Why does format_image_to_spe() emit a message Note: 4 rows removed due to NA intensity values. for the example in §2.1.1 of the vignette?
  • [ ] I suggest choosing a different background colour or 'Other' colour for plot_cell_categories(), otherwise it's grey under grey (§2.4.1 of the vignette).
  • [ ] Why is output of plot_distance_heatmap() shown twice (§2.5.2.1, §2.5.2.2)?
  • [ ] Please consider https://github.com/TrigosTeam/SPIAT/pull/17 Thanks, Pete

PeteHaitch avatar Sep 28 '22 00:09 PeteHaitch

Received a valid push on git.bioconductor.org; starting a build for commit id: dafe00a944f3cbeb978cb3b987ed849f073ad3d2

bioc-issue-bot avatar Sep 28 '22 07:09 bioc-issue-bot