[REVIEW]: DataAssimilationBenchmarks.jl: a data assimilation research framework

Open whedon opened this issue 3 years ago • 29 comments

Submitting author: @cgrudz (Colin Grudzien)
Repository: https://github.com/cgrudz/DataAssimilationBenchmarks.jl
Branch with paper.md (empty if default branch):
Version: v0.2.0
Editor: @diehlpk
Reviewers: @peanutfun, @tmigot
Archive: Pending

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/478dcc0b1608d2a4d8c930edebb58736"><img src="https://joss.theoj.org/papers/478dcc0b1608d2a4d8c930edebb58736/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/478dcc0b1608d2a4d8c930edebb58736/status.svg)](https://joss.theoj.org/papers/478dcc0b1608d2a4d8c930edebb58736)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@peanutfun & @tmigot, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @taless474 know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest.

Review checklist for @peanutfun

✨ Important: Please do not use the Convert to issue functionality when working through this checklist, instead, please open any new issues associated with your review in the software repository associated with the submission. ✨

Conflict of interest

  • [x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • [x] Repository: Is the source code for this software available at the repository url?
  • [x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • [x] Contribution and authorship: Has the submitting author (@cgrudz) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • [x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines

Functionality

  • [x] Installation: Does installation proceed as outlined in the documentation?
  • [x] Functionality: Have the functional claims of the software been confirmed?
  • [x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • [ ] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • [x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • [x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • [ ] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • [x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • [x] A statement of need: Does the paper have a section titled 'Statement of Need' that clearly states what problems the software is designed to solve and who the target audience is?
  • [x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • [x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • [ ] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

Review checklist for @tmigot

✨ Important: Please do not use the Convert to issue functionality when working through this checklist, instead, please open any new issues associated with your review in the software repository associated with the submission. ✨

Conflict of interest

  • [x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • [x] Repository: Is the source code for this software available at the repository url?
  • [x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • [x] Contribution and authorship: Has the submitting author (@cgrudz) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • [x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines

Functionality

  • [ ] Installation: Does installation proceed as outlined in the documentation?
  • [ ] Functionality: Have the functional claims of the software been confirmed?
  • [ ] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • [ ] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [ ] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • [ ] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • [ ] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • [ ] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • [ ] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • [ ] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • [ ] A statement of need: Does the paper have a section titled 'Statement of Need' that clearly states what problems the software is designed to solve and who the target audience is?
  • [ ] State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • [ ] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • [ ] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

whedon avatar Feb 03 '22 22:02 whedon

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @peanutfun, @tmigot it looks like you're currently assigned to review this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' at https://github.com/openjournals/joss-reviews
  2. You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf

whedon avatar Feb 03 '22 22:02 whedon

Wordcount for paper.md is 1301

whedon avatar Feb 03 '22 22:02 whedon

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.5281/zenodo.2029296 is OK

MISSING DOIs

- 10.1175/2009bams2618.1 may be a valid DOI for title: The data assimilation research testbed: A community facility
- 10.5194/gmd-2021-306 may be a valid DOI for title: A fast, single-iteration ensemble Kalman smoother for sequential data assimilation
- 10.1080/16000870.2018.1445364 may be a valid DOI for title: State-of-the-art stochastic data assimilation methods for high-dimensional non-Gaussian problems
- 10.5194/gmd-13-1903-2020 may be a valid DOI for title: On the numerical integration of the Lorenz-96 model, with scalar additive noise, for benchmark twin experiments
- 10.1002/wcc.535 may be a valid DOI for title: Data assimilation in the geosciences: An overview of methods, issues, and perspectives
- 10.5194/gmd-13-1903-2020 may be a valid DOI for title: On the numerical integration of the Lorenz-96 model, with scalar additive noise, for benchmark twin experiments

INVALID DOIs

- None

whedon avatar Feb 03 '22 22:02 whedon

Software report (experimental):

github.com/AlDanial/cloc v 1.88  T=0.24 s (261.9 files/s, 74008.9 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Julia                           24           1353           3628           6036
Python                          27            971            120           3984
Markdown                         3            119              0            583
TOML                             2            108              1            451
TeX                              2             11              0            107
YAML                             3              3              4             39
Bourne Shell                     1              1              0              2
-------------------------------------------------------------------------------
SUM:                            62           2566           3753          11202
-------------------------------------------------------------------------------


Statistical information for the repository '01771c2f9ef55a9d0f8f5c70' was
gathered on 2022/02/03.
The following historical commit information, by author, was found:

Author                     Commits    Insertions      Deletions    % of changes
Colin Grudzien                  18          1572            312           26.23
Colin J Grudzien                 4            32            124            2.17
cgrudz                           7          2460            448           40.48
plinx                           17          2065            170           31.12

Below are the number of rows from each author that have survived and are still
intact in the current revision:

Author                     Rows      Stability          Age       % in comments
Colin J Grudzien              7           21.9          8.4                0.00
plinx                      5068          245.4          4.4                2.35

whedon avatar Feb 03 '22 22:02 whedon

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

whedon avatar Feb 03 '22 22:02 whedon

Hi @cgrudz 👋 I'll review your software in the next couple of days, and open up issues in the repository as I go along. Some of them might only be suggestions and not crucial to the review. Once I'm finished, I'll report back here and summarize the issues with respect to the review criteria and the checklist above.

Happy coding! 🚀

peanutfun avatar Feb 04 '22 10:02 peanutfun

Hi @peanutfun, thanks so much, I'll be looking forward to hearing your recommendations for improvement.

Cheers, Colin

cgrudz avatar Feb 04 '22 22:02 cgrudz

@peanutfun, thanks so much for these useful and detailed suggestions for improvement. I'll start working through them slowly as I'm also in a somewhat reduced service mode with this project, as I just started a new position. Cheers!

cgrudz avatar Feb 11 '22 17:02 cgrudz

@cgrudz, no worries, take your time. Then I can also take a more relaxed pace with the review 😇

peanutfun avatar Feb 11 '22 17:02 peanutfun

:wave: @tmigot, please update us on how your review is going (this is an automated reminder).

whedon avatar Feb 17 '22 22:02 whedon

:wave: @peanutfun, please update us on how your review is going (this is an automated reminder).

whedon avatar Feb 17 '22 22:02 whedon

@cgrudz, I finished my review. Let me summarize:

DataAssimilationBenchmarks.jl is a well-working framework for researching data assimilation techniques. Installation is easy and I experienced no issues when executing functions of this package. The automated tests are working well and help in understanding the inner workings of the module, although code coverage is still a bit low. The README.md gives a good overview of the framework and extensively documents the public API. Finally, the paper is well written and does an especially good job of motivating the development of the framework.

My main issue with the current state of the package is that I, as a person unfamiliar with the source code, see no clear way of extending the framework with new models and DA methods. In that regard, I feel that DataAssimilationBenchmarks.jl fails its own premise because it is intended to compare DA methods against each other and include more filters and smoothers in the future. As an open-source project, it should encourage contributions by others, and therefore needs instructions on how to add such new methods to the framework.

The other points I want to address mostly concern documentation. The README.md should include an Example Usage section that illustrates the capabilities of the framework and the intended workflow, and supplies a useful set of default function arguments. As the module functions do not return data structures but write all data into files, the structure and content of these files should be documented as well. Additionally, the repository lacks community guidelines, which is more of a technicality. And finally, I think the paper can be shortened, and some of its references can be improved.
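To illustrate what documenting the output files could cover, here is a minimal sketch that writes a small JLD2 file and then lists its keys, value types, and array sizes. The test log further down the thread shows that the package saves `.jld2` files. Everything else here is an assumption made for illustration: the file name and the keys `obs`, `h`, and `seed` are hypothetical and are not the package's actual output schema.

```julia
using JLD2

# Hypothetical stand-in for an experiment output file; the keys below are
# invented for this sketch and do not reflect the package's real schema.
path = joinpath(mktempdir(), "example_time_series.jld2")
jldsave(path; obs = rand(4, 10), h = 0.01, seed = 0)

# List every top-level key together with the type (and size, for arrays)
# of the stored value. This is the kind of summary a file-format section
# in the documentation could provide.
jldopen(path, "r") do file
    for name in keys(file)
        value = file[name]
        shape = value isa AbstractArray ? " size=$(size(value))" : ""
        println(name, " :: ", typeof(value), shape)
    end
end
```

Running the same loop against a file produced by one of the experiments would give a quick inventory of its contents.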

I suggested a lot more in the issues I created during my review, which is mainly due to the fact that I see a lot of potential in DataAssimilationBenchmarks.jl. But most of them are well beyond the scope of this review. Two things I still would like to point out: First, you already documented the API extensively in the README.md. Moving this documentation over to docstrings in the code would yield a documentation that is formatted in a standardized and expected way, and should not be much work. And second, I think that reworking the analysis scripts to work on any system would be a huge selling point. Ideally, the framework would then cover everything from running a model, over benchmarking DA methods in parallel, to evaluating and analyzing their performance.
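For readers unfamiliar with the docstring suggestion, the following is a minimal sketch of the standard Julia docstring layout. The function `rmse` is hypothetical and is not part of DataAssimilationBenchmarks.jl; it only shows where the signature line, the description, and the argument list would go.

```julia
"""
    rmse(truth::AbstractVector, estimate::AbstractVector) -> Float64

Return the root-mean-square error between `truth` and `estimate`.

Hypothetical example function used only to show the docstring layout.

# Arguments
- `truth`: reference state vector.
- `estimate`: estimated state vector of the same length as `truth`.
"""
function rmse(truth::AbstractVector, estimate::AbstractVector)
    # Guard against mismatched inputs before computing the error.
    length(truth) == length(estimate) ||
        throw(DimensionMismatch("vectors must have equal length"))
    # Mean of squared differences, then the square root.
    return sqrt(sum(abs2, truth .- estimate) / length(truth))
end
```

A docstring written this way is shown by `?rmse` in the REPL and can be collected by documentation generators, which is what makes the format "standardized and expected".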

Issues for meeting review criteria

  • [x] cgrudz/DataAssimilationBenchmarks.jl#7
  • [x] cgrudz/DataAssimilationBenchmarks.jl#10
  • [x] cgrudz/DataAssimilationBenchmarks.jl#14
  • [x] cgrudz/DataAssimilationBenchmarks.jl#15
  • [x] Incorporate suggestions to JOSS paper (or conclusively reject them) cgrudz/DataAssimilationBenchmarks.jl#16

Possible improvements I strongly recommend

(These are personal suggestions to improve the package, and they are not required to meet the review criteria for publication in JOSS)

  • [x] Use docstrings to document API cgrudz/DataAssimilationBenchmarks.jl#8
  • [ ] Rework analysis scripts to be used with every experiment/benchmark cgrudz/DataAssimilationBenchmarks.jl#11

peanutfun avatar Feb 23 '22 12:02 peanutfun

@peanutfun thank you for your excellent suggestions and detailed review, I really appreciate the effort you took to go through the code as you did. I just wanted to write a quick message to let you know that I am intending to start on revisions and reading through the posted issues in about a month or two -- I am waiting in part to see if I can fund an undergraduate research assistant to help resolve the issues, as I think these will make good exercises for an RA again. However, if this funding does not go through, I do intend to resolve these issues myself. I'll keep the review process updated with my status shortly.

Cheers! Colin

cgrudz avatar Mar 01 '22 21:03 cgrudz

Hi @cgrudz ! Sorry, I am a bit late in the review. But, I saw @peanutfun already did a great job. I am checking the Functionality section.

  • Installation: Does installation proceed as outlined in the documentation?

Everything works following your suggestions. However, when adding your package the "usual" way in Julia, i.e.

add DataAssimilationBenchmarks

I get the following error when testing

     Testing Running tests...
Test Summary:               | Pass  Total
Calculate Order Convergence |    2      2
Test Summary: | Pass  Total
Lorenz-96     |    2      2
Runtime 0.0925 minutes
┌ Warning: Opening file with JLD2.MmapIO failed, falling back to IOStream
└ @ JLD2 ~/.julia/packages/JLD2/k9Gt0/src/JLD2.jl:233
Error encountered while save FileIO.File{FileIO.DataFormat{:JLD2}, String}("/home/tmigot/.julia/packages/DataAssimilationBenchmarks/ZFBxn/src/experiments/../data/time_series/IEEE39bus_time_series_seed_0000_diff_0.000_tanl_0.01_nanl_05000_spin_1500_h_0.010.jld2").

Fatal error:
Time Series Generation: Test Failed at /home/tmigot/.julia/packages/DataAssimilationBenchmarks/ZFBxn/test/runtests.jl:41
  Expression: TestTimeSeriesGeneration.testGenIEEE39bus()
Stacktrace:
 [1] macro expansion
   @ ~/packages/julias/julia-1.7.1/share/julia/stdlib/v1.7/Test/src/Test.jl:445 [inlined]
 [2] macro expansion
   @ ~/.julia/packages/DataAssimilationBenchmarks/ZFBxn/test/runtests.jl:41 [inlined]
 [3] macro expansion
   @ ~/packages/julias/julia-1.7.1/share/julia/stdlib/v1.7/Test/src/Test.jl:1283 [inlined]
 [4] top-level scope
   @ ~/.julia/packages/DataAssimilationBenchmarks/ZFBxn/test/runtests.jl:39
Test Summary:          | Pass  Fail  Total
Time Series Generation |    3     1      4
ERROR: LoadError: Some tests did not pass: 3 passed, 1 failed, 0 errored, 0 broken.
in expression starting at /home/tmigot/.julia/packages/DataAssimilationBenchmarks/ZFBxn/test/runtests.jl:2
ERROR: Package DataAssimilationBenchmarks errored during testing

Maybe the error comes from Julia 1.7, or is it an issue with the add?

In the latter case, this might be inconvenient, for instance if someone uses this package as a dependency. What do you think? Once we clarify this, I can open an issue on this topic.
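The steps described above (adding the registered package and running its tests) can be reproduced from a script with the `Pkg` standard library. This is a minimal sketch: the helper name `test_registered_package` is hypothetical, and the use of a temporary environment is an assumption made to keep the default environment untouched. It is not necessarily how the test run above was set up.

```julia
using Pkg

# Hypothetical helper: install a registered package into a throwaway
# environment and run its test suite, as a user of the registry would.
function test_registered_package(name::AbstractString)
    Pkg.activate(; temp = true)  # temporary environment (Julia 1.5 or later)
    Pkg.add(name)                # equivalent to `add <name>` in the Pkg REPL
    Pkg.test(name)               # equivalent to `test <name>`; throws if tests fail
end

# Example call (needs network access):
# test_registered_package("DataAssimilationBenchmarks")
```

Because the package is installed from the registry rather than from a development checkout, this path exercises the same read-only package directory that appears in the log above.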

  • Functionality: Have the functional claims of the software been confirmed?

As mentioned in https://github.com/cgrudz/DataAssimilationBenchmarks.jl/issues/7 the documentation and the paper lack an illustration of what the package is capable of doing.

tmigot avatar Mar 06 '22 18:03 tmigot

@cgrudz Have you thought about setting up proper documentation hosted with GitHub Pages and built with Documenter.jl?

See docs for Documenting in Julia: https://juliadocs.github.io/Documenter.jl/stable/

I have a package that has been recently published in JOSS as an example: https://github.com/JuliaSmoothOptimizers/DCISolver.jl (you can access the doc via the badges dev and stable in the readme).

This would allow clearer and more complete documentation. That could be very relevant, especially with a view to expanding your package.
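For orientation, a minimal Documenter.jl build script might look like the sketch below. The file layout (`docs/make.jl` with a single `docs/src/index.md` page) and the site name are assumptions chosen for illustration, not the repository's actual configuration. Only `makedocs` and `deploydocs` are Documenter's own entry points.

```julia
# docs/make.jl (assumed location; run with `julia --project=docs docs/make.jl`)
using Documenter
using DataAssimilationBenchmarks

# Build the HTML site from the Markdown sources in docs/src and from the
# docstrings of the listed modules.
makedocs(
    sitename = "DataAssimilationBenchmarks.jl",
    modules = [DataAssimilationBenchmarks],
    pages = ["Home" => "index.md"],  # assumes docs/src/index.md exists
)

# Publish the built site to the gh-pages branch when run in CI.
deploydocs(
    repo = "github.com/cgrudz/DataAssimilationBenchmarks.jl.git",
)
```

With docstrings in place, pages can pull in API entries automatically, so the README no longer has to carry the full reference.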

tmigot avatar Mar 06 '22 18:03 tmigot

@tmigot, no worries, thank you very much for your helpful suggestions and for catching the bug with the package add. As I mentioned above I will start to work on these revisions in earnest in one to two months, once I determine the availability of a research assistant to help with these tasks. I'll address these items as soon as I can. Cheers!

cgrudz avatar Mar 08 '22 00:03 cgrudz

@cgrudz how is the development process? Did you have the chance to complete what the reviewers asked for?

taless474 avatar May 13 '22 21:05 taless474

@taless474 , thanks for checking in. Things are currently in-progress where I am about ready to start clearing some of the issues (new doc pages!), and I am expecting to have a summer research assistant who will be working with the software and will help with the revision and new features for the next release. I am anticipating resolving the major items and submitting a revision after the research internship has completed around the end of summer, but I'll post updates as I feel like issues / tickets have been cleared. Cheers!

cgrudz avatar May 13 '22 22:05 cgrudz

@cgrudz Thank you for the update and good luck

taless474 avatar May 14 '22 18:05 taless474

@taless474 @cgrudz I am happy to see things are moving along here. Please note that due to the death of a close family member I will need to take some time off. I will continue the review on 19 September at the earliest. Thanks in advance for your understanding!

peanutfun avatar Sep 08 '22 17:09 peanutfun

@peanutfun, I'm so sorry to hear this, you have my condolences and full understanding. We will try to have this in re-submission in the coming few weeks, there is no rush to review this.

cgrudz avatar Sep 08 '22 17:09 cgrudz

@editorialbot assign @diehlpk as editor

:wave: folks – @diehlpk has kindly volunteered to step in as the handling editor here as @taless474 is not currently available to edit. Thanks Patrick!

arfon avatar Oct 16 '22 18:10 arfon

Assigned! @diehlpk is now the editor

editorialbot avatar Oct 16 '22 18:10 editorialbot

Hi @tmigot how is your review going?

diehlpk avatar Oct 16 '22 22:10 diehlpk

Hi @peanutfun, I hope things are going better for you. Could you please update us on your review progress?

diehlpk avatar Oct 16 '22 22:10 diehlpk

Hi @diehlpk, thanks for checking in. Yes, I am better! I understood the comment by @cgrudz that he will notify us once he thinks all issues related to this review are resolved. Therefore I put the review on hold. Did I get this wrong? I can resume working on this review at any time.

peanutfun avatar Oct 17 '22 13:10 peanutfun

> Hi @diehlpk, thanks for checking in. Yes, I am better! I understood the comment by @cgrudz that he will notify us once he thinks all issues related to this review are resolved. Therefore I put the review on hold. Did I get this wrong? I can resume working on this review at any time.

@cgrudz can you please comment on that?

diehlpk avatar Oct 17 '22 13:10 diehlpk

@diehlpk, thanks for reaching out, the timing is very good actually because we were just waiting to finish off some last test cases and push a new tagged version to the Julia General Registries. I think everything is now complete with our revision, and the new tag is in-progress with the Julia tagbot action pending:

https://github.com/JuliaRegistries/General/pull/70433

Please feel free to start the next round of review, we are now satisfied with how we've addressed the original issues and how we have improved the quality of the software and its documentation.

Cheers, Colin

cgrudz avatar Oct 17 '22 20:10 cgrudz

@peanutfun can you please have a look?

diehlpk avatar Oct 17 '22 20:10 diehlpk

@peanutfun can you please have a look?

diehlpk avatar Oct 22 '22 22:10 diehlpk