software-submission
software-submission copied to clipboard
SLEPLET: Slepian Scale-Discretised Wavelets in Python (#148)
Submitting Author: Patrick J. Roddy (@paddyroddy)
All current maintainers: (@paddyroddy)
Package Name: SLEPLET
One-Line Description of Package: Slepian Scale-Discretised Wavelets in Python
Repository Link: https://github.com/astro-informatics/sleplet
Version submitted: v1.4.4
Editor: Szymon Moliński (@SimonMolinsky )
Reviewer 1: Shannon Quinn (@magsol )
Reviewer 2: Jakub Tomasz Gnyp (@gnypit )
Archive: 10.5281/zenodo.7268074
JOSS DOI: 10.21105/joss.05221
Version accepted: TBD
Date accepted (month/day/year): TBD
Code of Conduct & Commitment to Maintain Package
- [X] I agree to abide by pyOpenSci's Code of Conduct during the review process and in maintaining my package after should it be accepted.
- [X] I have read and will commit to package maintenance after the review as per the pyOpenSci Policies Guidelines.
Description
- Include a brief paragraph describing what your package does:
SLEPLET is a Python package for the construction of Slepian wavelets in the spherical and manifold (via meshes) settings. SLEPLET handles any spherical region as well as the general manifold setting. The API is documented and easily extendible, designed in an object-orientated manner. Upon installation, SLEPLET comes with two command line interfaces - sphere and mesh - that allow one to easily generate plots on the sphere and a set of meshes using plotly. Whilst these scripts are the primary intended use, SLEPLET may be used directly to generate the Slepian coefficients in the spherical/manifold setting and use methods to convert these into real space for visualisation or other intended purposes. The construction of the sifting convolution was required to create Slepian wavelets. As a result, there are also many examples of functions on the sphere in harmonic space (rather than Slepian) that were used to demonstrate its effectiveness. SLEPLET has been used in the development of various papers.
Scope
-
Please indicate which category or categories. Check out our package scope page to learn more about our scope. (If you are unsure of which category you fit, we suggest you make a pre-submission inquiry):
- [ ] Data retrieval
- [ ] Data extraction
- [X] Data processing/munging
- [ ] Data deposition
- [ ] Data validation and testing
- [X] Data visualization[^1]
- [ ] Workflow automation
- [ ] Citation management and bibliometrics
- [ ] Scientific software wrappers
- [ ] Database interoperability
Domain Specific & Community Partnerships
- [ ] Geospatial
- [ ] Education
- [ ] Pangeo
Community Partnerships
If your package is associated with an existing community please check below:
- [ ] Pangeo
- [ ] My package adheres to the Pangeo standards listed in the pyOpenSci peer review guidebook
[^1]: Please fill out a pre-submission inquiry before submitting a data visualization package.
-
For all submissions, explain how the and why the package falls under the categories you indicated above. In your explanation, please address the following points (briefly, 1-2 sentences for each):
- Who is the target audience and what are scientific applications of this package?
Many fields in science and engineering measure data that inherently live on non-Euclidean geometries, such as the sphere. Techniques developed in the Euclidean setting must be extended to other geometries. Due to recent interest in geometric deep learning, analogues of Euclidean techniques must also handle general manifolds or graphs. Often, data are only observed over partial regions of manifolds, and thus standard whole-manifold techniques may not yield accurate predictions. Slepian wavelets are designed for datasets like these. Slepian wavelets are built upon the eigenfunctions of the Slepian concentration problem of the manifold: a set of bandlimited functions that are maximally concentrated within a given region. Wavelets are constructed through a tiling of the Slepian harmonic line by leveraging the existing scale-discretised framework. Whilst these wavelets were inspired by spherical datasets, like in cosmology, the wavelet construction may be utilised for manifold or graph data.
- Are there other Python packages that accomplish the same thing? If so, how does yours differ?
To the author's knowledge, there is no public software that allows one to compute Slepian wavelets
(or a similar approach) on the sphere or general manifolds/meshes. SHTools is a Python code used for spherical harmonic transforms, which allows one to compute the Slepian functions of the spherical polar cap. A series of MATLAB scripts exist in slepian_alpha, which permits the calculation of the Slepian functions on the sphere. However, these scripts are very specialised and hard to generalise.
- If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or
@tagthe editor you contacted: #148
Technical checks
For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:
- [X] does not violate the Terms of Service of any service it interacts with.
- [X] uses an OSI approved license.
- [X] contains a README with instructions for installing the development version.
- [X] includes documentation with examples for all functions.
- [X] contains a tutorial with examples of its essential functions and uses.
- [X] has a test suite.
- [X] has continuous integration setup, such as GitHub Actions CircleCI, and/or others.
Publication Options
- [ ] Do you wish to automatically submit to the Journal of Open Source Software? If so:
JOSS Checks
- [ ] The package has an obvious research application according to JOSS's definition in their submission requirements. Be aware that completing the pyOpenSci review process does not guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
- [ ] The package is not a "minor utility" as defined by JOSS's submission requirements: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
- [ ] The package contains a
paper.mdmatching JOSS's requirements with a high-level description in the package root or ininst/. - [ ] The package is deposited in a long-term repository with the DOI:
Note: JOSS accepts our review as theirs. You will NOT need to go through another full review. JOSS will only review your paper.md file. Be sure to link to this pyOpenSci issue when a JOSS issue is opened for your package. Also be sure to tell the JOSS editor that this is a pyOpenSci reviewed package once you reach this step.
Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?
This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.
- [x] Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.
Confirm each of the following by checking the box.
- [X] I have read the author guide.
- [X] I expect to maintain this package for at least 2 years and can help find a replacement for the maintainer (team) if needed.
Please fill out our survey
- [X] Last but not least please fill out our pre-review survey. This helps us track submission and improve our peer review process. We will also ask our reviewers and editors to fill this out.
P.S. Have feedback/comments about our review process? Leave a comment here
Editor and Review Templates
Hi @paddyroddy I'm adding the initial Editor-in-Chief checks.
The tl;dr is that we are about ready to start but I need you to do two things first:
- [ ] Add a code of conduct
- [ ] Add a "layperson" summary of the package to the docs page
- The first thing someone should read on the docs is a statement of what the package does--I wouldn't assume someone will bother to click again to go to PyPI. Provide enough background that someone outside your domain will have some understanding of how the package fits into a bigger picture, kind of like the first couple of sentences of the abstract of your thesis.
I will add one more comment below but it is not required that you address it before we start the review
Editor in Chief checks
Hi there! Thank you for submitting your package for pyOpenSci review. Below are the basic checks that your package needs to pass to begin our review. If some of these are missing, we will ask you to work on them before the review process begins.
Please check our Python packaging guide for more information on the elements below.
- [x] Installation The package can be installed from a community repository such as PyPI (preferred), and/or a community channel on conda (e.g. conda-forge, bioconda).
- [x] The package imports properly into a standard Python environment
import package.
- [x] The package imports properly into a standard Python environment
- [x] Fit The package meets criteria for fit and overlap.
- [ ] Documentation The package has sufficient online documentation to allow us to evaluate package function and scope without installing the package. This includes:
- [x] User-facing documentation that overviews how to install and start using the package.
- [x] Short tutorials that help a user understand how to use the package and what it can do for them.
- [x] API documentation (documentation for your code's functions, classes, methods and attributes): this includes clearly written docstrings with variables defined using a standard docstring format.
- [x] Core GitHub repository Files
- [x] README The package has a
README.mdfile with clear explanation of what the package does, instructions on how to install it, and a link to development instructions. - [x] Contributing File The package has a
CONTRIBUTING.mdfile that details how to install and contribute to the package. - [ ] Code of Conduct The package has a
CODE_OF_CONDUCT.mdfile. - [x] License The package has an OSI approved license. NOTE: We prefer that you have development instructions in your documentation too.
- [x] README The package has a
- [x] Issue Submission Documentation All of the information is filled out in the
YAMLheader of the issue (located at the top of the issue template). - [x] Automated tests Package has a testing suite and is tested via a Continuous Integration service.
- [x] Repository The repository link resolves correctly.
- [x] Package overlap The package doesn't entirely overlap with the functionality of other packages that have already been submitted to pyOpenSci.
- [x] Initial onboarding survey was filled out We appreciate each maintainer of the package filling out this survey individually. :raised_hands: Thank you authors in advance for setting aside five to ten minutes to do this. It truly helps our organization. :raised_hands:
Editor comments
- when I first tested that I could import the package, it downloaded a lot of data. Many people might find this surprising. I would suggest adding data directly to the package if possible--if the files are small enough--and/or deferring download of large files
@paddyroddy related to that to-do item for the docs: my understanding is that the goal for SLEPLET is to provide general access to the wavelet basis/transform you have developed for incomplete data on manifolds (feel free to correct me if I'm wrong)
Part of what we can do with this review is help you reach a broader audience. But that means we need to know where to look for editors and reviewers.
Can you please help me understand the main domains where you have applied and can apply your methods? I see you are an RSE in an astro group and that you mention cosmology in your thesis; in the papers you apply your methods to topographic data from Earth. Are these methods mainly used by people in cosmology, geophysics, computer imaging, all of the above? Could this work relate to geodesy in earth science?
As examples of places where we might look for editors/reviewers: we work with astropy and software underground. I can also imagine asking people that work on manifolds in general and geometric deep learning, and another place we might look would be users/developers of packages you rely on or mention, e.g. pyssht/py2let. But my sense is that we will want domain experts that can help you translate the methods you've developed for a broader audience.
Any more concrete detail you can give me about domains to look at for editors + reviewers would be helpful. We are looking now, but I want to get your input to make sure this review helps you.
The tl;dr is that we are about ready to start but I need you to do two things first:
* [ ] Add a code of conduct * [ ] Add a "layperson" summary of the package to the docs page * The first thing someone should read on the docs is a statement of what the package does--I wouldn't assume someone will bother to click again to go to PyPI. Provide enough background that someone outside your domain will have some understanding of how the package fits into a bigger picture, kind of like the first couple of sentences of the abstract of your thesis.
I've added these. Mine isn't particularly lay, but they are a fairly specialised application, so hopefully from the description it's obvious who the target audience are!
Editor comments
* when I first tested that I could import the package, it downloaded a lot of data. Many people might find this surprising. I would suggest adding data directly to the package if possible--if the files are small enough--and/or deferring download of large files
Annoyingly, I hadn't spotted this, as I cache after the first import. I've had a look and the following are created:
-rw------- 1 paddy staff 74M Nov 24 18:37 EGM2008_Topography_flms_L2190.mat
-rw------- 1 paddy staff 5.6K Nov 24 18:37 meshes_laplacians_basis_functions_bird_b697_eigenvalues.npy
-rw------- 1 paddy staff 15M Nov 24 18:37 meshes_laplacians_basis_functions_bird_b697_eigenvectors.npy
-rw------- 1 paddy staff 5.6K Nov 24 18:37 meshes_laplacians_slepian_functions_bird_b697_N194_eigenvalues.npy
-rw------- 1 paddy staff 1.1M Nov 24 18:37 meshes_laplacians_slepian_functions_bird_b697_N194_eigenvectors.npy
-rw------- 1 paddy staff 132 Nov 24 18:37 slepian_masks_africa_L1.npy
-rw------- 1 paddy staff 132 Nov 24 18:37 slepian_masks_south_america_L1.npy
The problem (in particular) is PyPI doesn't allow over 50MB. I need to work out a way to delay the downloading of EGM2008_Topography_flms_L2190.mat.
The problem (in particular) is PyPI doesn't allow over 50MB. I need to work out a way to delay the downloading of
EGM2008_Topography_flms_L2190.mat.
Fixed this in astro-informatics/sleplet#336 (v1.4.5), turns out I had introduced the bug using pydantic.Field(default= values. The data will still be downloaded as and when - but not on initial import. And, as before, it is cached so will only happen once.
@paddyroddy related to that to-do item for the docs: my understanding is that the goal for SLEPLET is to provide general access to the wavelet basis/transform you have developed for incomplete data on manifolds (feel free to correct me if I'm wrong)
Yep that's right!
Can you please help me understand the main domains where you have applied and can apply your methods? I see you are an RSE in an astro group and that you mention cosmology in your thesis; in the papers you apply your methods to topographic data from Earth. Are these methods mainly used by people in cosmology, geophysics, computer imaging, all of the above? Could this work relate to geodesy in earth science?
So I'm actually a general RSE (not strictly working on this), but was formerly a PhD student in an astrophysics group, specialising in cosmology. In reality, these methods could be used by any field in which data is measured only in certain parts of a manifold. Be that: oceanography, geophysics, solar physics, astrophysics, cosmology, computer graphics, machine learning etc. In reality, there are many fields which could use methods such as these.
As examples of places where we might look for editors/reviewers: we work with astropy and software underground. I can also imagine asking people that work on manifolds in general and geometric deep learning, and another place we might look would be users/developers of packages you rely on or mention, e.g. pyssht/py2let. But my sense is that we will want domain experts that can help you translate the methods you've developed for a broader audience.
Those all sound like reasonable places. Would be very interested to have others people's insights. Sorry, I'm not sure how else to help on that front!
The problem (in particular) is PyPI doesn't allow over 50MB. I need to work out a way to delay the downloading of EGM2008_Topography_flms_L2190.mat.
Sounds like you fixed the issue overall but jic you're still figuring this out: are you familiar with pooch? A pattern a lot of core scientific Python packages seem to be converging on is "put the data in the package and use importlib-resources if it fits on PyPI; for everything else use pooch". See for example scikit-image: https://github.com/scikit-image/scikit-image/issues/5146 or librosa: https://github.com/librosa/librosa/tree/main/librosa/util/example_data
Thank you @paddyroddy for your replies about the domain, very helpful.
I am very sympathetic to developing methods that are useful across domains (FFT anyone?) but just want to make sure we're helping you reach that broader audience.
We are looking for an editor + reviewers now.
Sounds like you fixed the issue overall but jic you're still figuring this out: are you familiar with pooch?
Yes, I'm using pooch...
Whoops, guess I could've just looked 😝 😇 We can say I'm trying to be helpful
Hi @paddyroddy, happy new year's 🙂 I'm happy to report that @SimonMolinsky will be the editor for this review, I'll let him take it from here. We are now actively seeking reviewers.
Hi @paddyroddy !
From now on, I'll be responsible for the review process. :) I've looked into your package, the JOSS publication, and related publications, and everything is outstanding. I agree with @NickleDave that your package needs more usability review from potential users than a technical review. I will look for reviewers in specific communities interested in applying your algorithms in their solution.
Thank you, @SimonMolinsky! Looking forward to the next steps
@paddyroddy I'm still looking for reviewers. Hopefully, we will start the review at the beginning of next week! (We have one good soul on board, waiting for the response from other people). Thank you for your patience!
Hi @paddyroddy , we have our reviewers. We can start a review process.
Editor response to review:
Editor comments
:wave: Hi @magsol and @gnypit ! Thank you for volunteering to review for pyOpenSci! I'm here to help you with the review, so if you need any support, then let me know immediately! But first, look at the steps below :point_down: @magsol you may start as soon as you want; @gnypit will join us after 22nd of February!
Please fill out our pre-review survey
Before beginning your review, please fill out our pre-review survey. This helps us improve all aspects of our review and better understand our community. No personal data will be shared from this survey - it will only be used in an aggregated format by our Executive Director to improve our processes and programs.
- [ ] reviewer 1 survey completed.
- [ ] reviewer 2 survey completed.
The following resources will help you complete your review:
- Here is the reviewers guide. This guide contains all of the steps and information needed to complete your review.
- Here is the review template that you will need to fill out and submit here as a comment, once your review is complete.
Please get in touch with any questions or concerns! Your review is due: 2024-03-17
Reviewers: Shannon Quinn, Jakub Tomasz Gnyp Due date: 2024-03-17
Hi @magsol , @gnypit !
Please let us know how it's going :)
@SimonMolinsky Just finished the reviewer survey, and am starting to work my way through the Reviewer Guide. Hopefully will have some more detailed updates soon :) (sorry for the delay in starting-- tl;dr it's been a very chaotic past month)
Package Review
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
- [x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).
Documentation
The package includes all the following forms of documentation:
- [x] A statement of need clearly stating problems the software is designed to solve and its target audience in README.
- [x] Installation instructions: for the development version of the package and any non-standard dependencies in README.
- [x] Vignette(s) demonstrating major functionality that runs successfully locally.
- [x] Function Documentation: for all user-facing functions.
- [x] Examples for all user-facing functions.
- [x] Community guidelines including contribution guidelines in the README or CONTRIBUTING.
- [x] Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a
pyproject.tomlfile or elsewhere.
Readme file requirements The package meets the readme requirements below:
- [x] Package has a README.md file in the root directory.
The README should include, from top to bottom:
- [x] The package name
- [x] Badges for:
- [x] Continuous integration and test coverage,
- [x] Docs building (if you have a documentation website),
- [x] A repostatus.org badge,
- [x] Python versions supported,
- [x] Current package version (on PyPI / Conda).
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)
- [x] Short description of package goals.
- [x] Package installation instructions
- [x] Any additional setup required to use the package (authentication tokens, etc.)
- [x] Descriptive links to all vignettes. If the package is small, there may only be a need for one vignette which could be placed in the README.md file.
- [x] Brief demonstration of package usage (as it makes sense - links to vignettes could also suffice here if package description is clear)
- [x] Link to your documentation website.
- [x] If applicable, how the package compares to other similar packages and/or how it relates to other packages in the scientific ecosystem.
- [x] Citation information
Usability
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:
- [x] Package documentation is clear and easy to find and use.
- [x] The need for the package is clear
- [x] All functions have documentation and associated examples for use
- [x] The package is easy to install
Functionality
- [x] Installation: Installation succeeds as documented.
- [x] Functionality: Any functional claims of the software been confirmed.
- [x] Performance: Any performance claims of the software been confirmed.
- [x] Automated tests:
- [x] All tests pass on the reviewer's local machine for the package version submitted by the author. Ideally this should be a tagged version making it easy for reviewers to install.
- [x] Tests cover essential functions of the package and a reasonable range of inputs and conditions.
- [x] Continuous Integration: Has continuous integration setup (We suggest using Github actions but any CI platform is acceptable for review)
- [x] Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.
A few notable highlights to look at:
- [x] Package supports modern versions of Python and not End of life versions.
- [x] Code format is standard throughout package and follows PEP 8 guidelines (CI tests for linting pass)
For packages also submitting to JOSS
- [x] The package has an obvious research application according to JOSS's definition in their submission requirements.
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.
The package contains a paper.md matching JOSS's requirements with:
- [x] A short summary describing the high-level functionality of the software
- [x] Authors: A list of authors with their affiliations
- [x] A statement of need clearly stating problems the software is designed to solve and its target audience.
- [x] References: With DOIs for all those that have one (e.g. papers, datasets, software).
Final approval (post-review)
- [x] The author has responded to my review and made changes to my satisfaction. I recommend approving this package.
Estimated hours spent reviewing:
Somewhere in the 2-3 hours range.
Review Comments
Sorry, it was a very busy past week + weekend. I plan on finishing this review tomorrow. Apologies for the delay!
Ok, I've finished my initial review. I have to admit up front: this is largely outside my area of expertise. While I understand the high-level use case (my research involves a lot of manifold learning and walking, and may occasionally reside in non-Euclidean space), the specifics of Slepian wavelets are not familiar to me.
That said, the package itself seems extremely well put together and maintained. I was able to run the examples and tests without any* problems, and the documentation--while clearly provided for people who understand Slepian wavelets--is well organized.
The only checkbox I couldn't verify was the presence of a repostatus badge, though given that the package itself is already two minor versions more recent than this review version, it's pretty safe to say this package is actively maintained.
* The only caveat here was the CLI example in the README--one of the Zenodo downloads failed mid-stream and crashed the whole example. However, there is already an issue open about this, and it seems like a Zenodo problem, so other than investigating other hosting options for the needed example files, it's easy to say this isn't the fault of the package itself.
Thanks, @magsol! I wasn't aware of repostatus.org, I have added one now in astro-informatics/sleplet#360.
I'm pretty happy with it! I recommend approval :) Thanks @paddyroddy! Great work!
@magsol Thank you for your review! It's great to hear that SLEPLET is a well-written package :)
@paddyroddy I heard from @gnypit, and he will review the package soon.
Dear All,
First and foremost I'd like to apologize for such a late answer. Though unprofessional and thus with no excuse, it has an explanation. I started the review in February with papers at @paddyroddy's Research Gate profile. I've been distracted from March onwards due to a combination of everything that could happen at the same time. I had to change my job, reorganise my wedding party and budget for my moving to London for the next academic year, as part of my family that was supposed to sponsor my postgraduate degree (and therefore paid a very symbolic alimony for multiple years, in spite of come court orders) started backing out on religious grounds, pressuring me to cancel my wedding altogether... It's been a very crazy couple of weeks for me, personally and professionally, taking a toll on my health as well.
Additionally, I wanted to go back to the sleplet package, as the topic itself got me interested greatly. Simon informed me that the functionalities have already been verified, which is great news! Therefore, for me as a mathematician and physicist from different niches, "user experience" and clarity of code were priorities. The overall impression is very good: the code is mainly OOP, with clear organisation of files, their names, and the names of classes, methods and functions. Though I'm no expert in this particular field, I was able to find myself in all the folders and have at least a general idea of what a given file is supposed to be used for. However, I do have some minor remarks when it comes to documentation and comments in the code. I might be wrong, as this repository is dedicated to specialists in a given field - however, from my personal experience many skilled mathematicians, physicists, and others, have very basic programming skills. Therefore guides, documentation https://github.com/astro-informatics/sleplet/blob/main/documentation/DOCUMENTATION.mdand step-by-step examples are in my view an absolute necessity and need expanding. I'm not talking about such elaborate ones, like in the case of Tensorflow, scikit-learn or Transformers, but rather a simple website, such as for PyGAD https://pygad.readthedocs.io/en/latest/ or scipy.stats. Examples could be, at this stage, in the form of Jupyter notebooks, instead of scripts with defined functions which I have found so far. It's my recommendation to create such demonstration notebooks as soon as possible. Moreover, multiple files require additional comments - even snippets in the current documentation practically don't have any. I find it a very common practice, that there are few to no comments in the majority of code and I always suggest adding more. It minimises time required to understand a given file, especially during debugging, when one wants to understand a given part of code without reading the whole file, its dependencies, and so on.
In general, global and local variables are rarely explained - "SNR" in SlepianNoiseAfrica in slepian_noise_africa.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/functions/slepian_noise_africa.py is one of few variables (and acronyms!) that was commented on and defined for a layman, such as myself 😅 Also, _vars.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/_vars.py in slepian folder is a very short code, but it definitely needs some explanation. Even if there is an accepted practice to use "phi" symbol for a given angle, not everybody has to follow it necessarily. Or, if someone is relatively new to the topic, it makes understanding the code very cumbersome, due to time and attention required to find some context in which this or that variable is used, how it's used and what precisely may it represent. Of course these variables were explained in the current documentation, but it would be more convenient to copy-paste these into the proper script or create a link/reference. Even a comment "Explained in the documentation: >link<" would suffice, come to think of that.
In "tests" codes usually have one short comment for a whole function. Same goes for files like flm.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/functions/flm.py, fp.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/functions/fp.py and so on - in these cases even the files' names are quite secretive. At the same time files like noise.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/noise.py have very comprehensive comments in some places - generally I've noticed that files with "methods", "coefficients" and regarding plotting are the most clearly written and I'd suggest using them as a standard for the rest of the library. Including LaTeX formulas (e.g. in squashed_gaussian.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/functions/squashed_gaussian.py, harmonic_gaussian.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/functions/harmonic_gaussian.py, etc.) was a great idea and it'd be great to expand on this while creating documentation - or at least giving some references, perhaps 🤔 Of course, some files really are very clear with just two comments, like _bool_methods.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/_bool_methods.py, but they constitute a minority of the whole package. Some files have messages in them, such as slepian.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/functions/slepian.py, which increases clarity, but specific comments explaining methods like _validate_rank in Slepian class or _compute_angles in SlepianDiracDelta class are in my view preferable.
Finally, some concepts in programming might be very new and difficult to get used to even for semi-advanced programmers in Python, like multiprocessing. I've noticed the _parallel_methods.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/_parallel_methods.py using a fork of the multiprocessing library. As there are other libraries and methods to make parallel operations in Python, also depending on the hardware one uses (e.g. CUDA), it might be a good idea to significantly expand examples and explain this part of the library in depth so that if someone uses this part of the library, it is done quite purposefully.
Overall, the biggest challenge before contributors is to create a friendly documentation and embark on a boring and tiresome adventure to add comments. There already are examples of in-depth comments, so there is a clear and present standard to strive for. Apart from this, the library is well-organised, written beautifully in OOP in Python, with some functional parts. I find it intuitive and modern. Also, introducing multiprocessing shows the author's consideration for computing cost, which is a growing necessity in today's scientific world - especially for those who have the skills but cannot afford proper hardware. Should I ever meet someone looking for tools to work with wavelets
I'll make sure with Simon to put it in a proper place as my input to the review.
With kind regards, Jakub T. Gnyp
śr., 20 mar 2024 o 20:29 Simon @.***> napisał(a):
@magsol https://github.com/magsol Thank you for your review! It's great to hear that SLEPLET is a well-written package :)
@paddyroddy https://github.com/paddyroddy I heard from @gnypit https://github.com/gnypit, and he will review the package soon.
— Reply to this email directly, view it on GitHub https://github.com/pyOpenSci/software-submission/issues/149#issuecomment-2010457477, or unsubscribe https://github.com/notifications/unsubscribe-auth/AO4C3VRJ6FLI7UB4C5G2IDTYZHPQRAVCNFSM6AAAAAA7NMCAMWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMJQGQ2TONBXG4 . You are receiving this because you were mentioned.Message ID: @.***>
@gnypit I'm sorry to hear about the recent events in your life. I hope things are getting back in control.
Thank you for constructive feedback regarding comments, documentation and demos.
I have tried to automate the documentation construction as much as possible, and hence it is a bit light in some places. However, just for your information, the parts in which it is documentation-light (as you say) are the sections that are private (i.e. not exposed to the users) indicated by the leading _. This does beg the question of who @pyOpenSci is aiming to, is it users, developers or both? To me, as a user of a package, it is not often I'd be digging around the internals. As a developer, it is something I might do if the situation requires it. I believe all functions (public or private) are named fairly sensibly, and have docstrings.
Regarding comments, I feel this is personal style - I tend to put them when the step is non-intuitive (for me), but I feel it can clutter the code somewhat if it is everywhere.
As for demos, I have various demos in https://github.com/astro-informatics/sleplet/tree/main/examples. Would surrounding these with explanatory prose suffice your request? I'm not a big fan of notebooks personally, but could convert the files in the examples folder to notebooks.
Hi @gnypit , sorry to hear you have experienced such things; hopefully, your situation improves!
@gnypit Thank you for your review; just one more thing: could you copy & paste the review template into a comment in this thread and select boxes based on your review: https://www.pyopensci.org/software-peer-review/how-to/reviewer-guide.html#peer-review-template (You should be logged in to GitHub, it won't be possible from email).
@gnypit , @paddyroddy :
However, just for your information, the parts that are documentation-light (as you say) are the sections that are private (i.e., not exposed to the users) indicated by the leading _. This does beg the question of who https://github.com/pyOpenSci is aiming at: users, developers, or both?
Answering this question: docs in private functions are not required. @pyOpenSci focuses on both users and developers - as potential maintainers. Documenting private functions is nice to have but totally optional 😊
Okay @SimonMolinsky & @paddyroddy - just in case, I'm still dabbling in the functionalities on a Linux VM, but:
Package Review
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
- [x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).
Documentation
The package includes all the following forms of documentation:
- [x] A statement of need clearly stating problems the software is designed to solve and its target audience in README.
- [x] Installation instructions: for the development version of the package and any non-standard dependencies in README.
- [x] Vignette(s) demonstrating major functionality that runs successfully locally.
- [x] Function Documentation: for all user-facing functions.
- [x] Examples for all user-facing functions.
- [x] Community guidelines including contribution guidelines in the README or CONTRIBUTING.
- [x] Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a
pyproject.tomlfile or elsewhere.
Readme file requirements The package meets the readme requirements below:
- [x] Package has a README.md file in the root directory.
The README should include, from top to bottom:
- [x] The package name
- [x] Badges for:
- [x] Continuous integration and test coverage,
- [x] Docs building (if you have a documentation website),
- [x] A repostatus.org badge,
- [x] Python versions supported,
- [x] Current package version (on PyPI / Conda).
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)
- [x] Short description of package goals.
- [x] Package installation instructions
- [x] Any additional setup required to use the package (authentication tokens, etc.)
- [x] Descriptive links to all vignettes. If the package is small, there may only be a need for one vignette which could be placed in the README.md file.
- [x] Brief demonstration of package usage (as it makes sense - links to vignettes could also suffice here if package description is clear)
- [x] Link to your documentation website.
- [x] If applicable, how the package compares to other similar packages and/or how it relates to other packages in the scientific ecosystem.
- [x] Citation information
Usability
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:
- [x] Package documentation is clear and easy to find and use.
- [x] The need for the package is clear
- [x] All functions have documentation and associated examples for use
- [x] The package is easy to install
Functionality
- [x] Installation: Installation succeeds as documented.
- [x] Functionality: Any functional claims of the software been confirmed.
- [X] Performance: Any performance claims of the software been confirmed.
- [X] Automated tests:
- [X] All tests pass on the reviewer's local machine for the package version submitted by the author. Ideally this should be a tagged version making it easy for reviewers to install.
- [X] Tests cover essential functions of the package and a reasonable range of inputs and conditions.
- [x] Continuous Integration: Has continuous integration setup (We suggest using Github actions but any CI platform is acceptable for review)
- [X] Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.
A few notable highlights to look at:
- [X] Package supports modern versions of Python and not End of life versions.
- [x] Code format is standard throughout package and follows PEP 8 guidelines (CI tests for linting pass)
For packages also submitting to JOSS
- [x] The package has an obvious research application according to JOSS's definition in their submission requirements.
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.
The package contains a paper.md matching JOSS's requirements with:
- [x] A short summary describing the high-level functionality of the software
- [x] Authors: A list of authors with their affiliations
- [x] A statement of need clearly stating problems the software is designed to solve and its target audience.
- [x] References: With DOIs for all those that have one (e.g. papers, datasets, software).
Final approval (post-review)
- [x] The author has responded to my review and made changes to my satisfaction. I recommend approving this package.
Estimated hours spent reviewing: 30h by @gnypit + time spent by @magsol
Review Comments
The overall impression is very good: the code is mainly OOP, with clear organisation of files, their names, and the names of classes, methods and functions. The documentation and examples could use a little more prose around them or comments within them, as already discussed. However, all user-faced parts of code are very thoroughly written and the private functiones, methods, etc. are clearly indicated. Nevertheless, expanding on comments and guides/tutorials might make it easier for both users and contributors to faster and better orient themselves in the library. It doesn't change the fact, that for specialists in the field who are semi-advanced in Python the clarity of the code, intrinsic messages and equations put in LaTeX, together with the required documentation, seem sufficient.
The one concept in programming might be very new and difficult to get used to even for semi-advanced programmers in Python, like multiprocessing. It concernes the private parts of the code, as the _parallel_methods.py is using a fork of the multiprocessing library. As there are other libraries and methods to make parallel operations in Python, also depending on the hardware one uses (e.g. CUDA), it might be a good idea to significantly expand examples and explain this part of the library in depth so that if someone uses this part of the library, it is done quite purposefully. It is, of course, merely a suggestion on what could be done as a part of future development of the library.
Overall, the biggest challenge before contributors is to make the documentation a little more friendly and embark on a boring and tiresome adventure to add comments or alternatives, as already discussed with author during the review. There already are examples of in-depth comments, so there is a clear and present standard to strive for. Apart from this, the library is very well-organised, written beautifully in OOP in Python, with some functional parts. It is intuitive and modern. Also, introducing multiprocessing shows the author's consideration for computing cost, which is a growing necessity in today's scientific world - especially for those who have the skills but cannot afford proper hardware.
Finally, as a personal comment, though I'm still playing around with the functionalities, as I misunderstood the scope of JOSS review, I can already recommend approving the package and will simply let yopu know here in the comments, whenever and whatever else comes up.
The one concept in programming might be very new and difficult to get used to even for semi-advanced programmers in Python, like multiprocessing. It concernes the private parts of the code, as the _parallel_methods.py is using a fork of the multiprocessing library. As there are other libraries and methods to make parallel operations in Python, also depending on the hardware one uses (e.g. CUDA), it might be a good idea to significantly expand examples and explain this part of the library in depth so that if someone uses this part of the library, it is done quite purposefully. It is, of course, merely a suggestion on what could be done as a part of future development of the library.
multiprocess is a fork of multiprocessing as you say. The reason for using it is that it uses dill rather than pickle, which makes it easier to serialise data - and much easier to work with IMO. As mentioned previously, this is in the non-public facing part of the code, so shouldn't be too important for users. However, there is definite scope for speeding up further with CUDA etc., depending on what hardware one has available.
@gnypit thank you for your review! As I understand, we're waiting for the performance checks from your side. If you need any support with the point Packaging guidelines: The package conforms to the pyOpenSci [packaging guidelines](https://www.pyopensci.org/python-package-guide). then let me know!