
[REVIEW]: specieshindex: How scientifically popular is a species?

whedon opened this issue 4 years ago · 31 comments

Submitting author: @jesstytam (Jessica Tam)
Repository: https://github.com/jessicatytam/specieshindex
Branch with paper.md (empty if default branch):
Version: 0.3.1
Editor: @Bisaloo
Reviewers: @kamapu, @tom-gu, @DrMattG
Archive: Pending

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

Status


Status badge code:

HTML: <a href="https://joss.theoj.org/papers/401f3e0de4da2e338c5903d8a96a910b"><img src="https://joss.theoj.org/papers/401f3e0de4da2e338c5903d8a96a910b/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/401f3e0de4da2e338c5903d8a96a910b/status.svg)](https://joss.theoj.org/papers/401f3e0de4da2e338c5903d8a96a910b)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@kamapu & @tom-gu & @DrMattG, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. If you have any questions or concerns, please let @Bisaloo know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks at the very latest.

Review checklist for @kamapu

✨ Important: Please do not use the Convert to issue functionality when working through this checklist. Instead, please open any new issues associated with your review in the software repository associated with the submission. ✨

Conflict of interest

  • [x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • [x] Repository: Is the source code for this software available at the repository url?
  • [x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • [x] Contribution and authorship: Has the submitting author (@jessicatytam) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • [x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines?

Functionality

  • [x] #3902
  • [ ] Functionality: Have the functional claims of the software been confirmed?
  • [ ] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • [ ] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [ ] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • [ ] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems)?
  • [ ] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • [ ] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • [ ] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • [ ] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • [ ] A statement of need: Does the paper have a section titled 'Statement of Need' that clearly states what problems the software is designed to solve and who the target audience is?
  • [ ] State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • [ ] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • [ ] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

Review checklist for @tom-gu

✨ Important: Please do not use the Convert to issue functionality when working through this checklist. Instead, please open any new issues associated with your review in the software repository associated with the submission. ✨

Conflict of interest

  • [x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • [x] Repository: Is the source code for this software available at the repository url?
  • [x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • [x] Contribution and authorship: Has the submitting author (@jessicatytam) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • [x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines?

Functionality

  • [x] Installation: Does installation proceed as outlined in the documentation?
  • [x] Functionality: Have the functional claims of the software been confirmed?
  • [x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • [x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems)?
  • [x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • [x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • [x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • [x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • [x] A statement of need: Does the paper have a section titled 'Statement of Need' that clearly states what problems the software is designed to solve and who the target audience is?
  • [x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • [x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • [x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

Review checklist for @DrMattG

✨ Important: Please do not use the Convert to issue functionality when working through this checklist. Instead, please open any new issues associated with your review in the software repository associated with the submission. ✨

Conflict of interest

  • [x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • [x] Repository: Is the source code for this software available at the repository url?
  • [x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • [x] Contribution and authorship: Has the submitting author (@jessicatytam) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • [x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines?

Functionality

  • [x] Installation: Does installation proceed as outlined in the documentation?
  • [ ] Functionality: Have the functional claims of the software been confirmed?
  • [x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • [x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems)?
  • [x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • [x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • [x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • [x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • [x] A statement of need: Does the paper have a section titled 'Statement of Need' that clearly states what problems the software is designed to solve and who the target audience is?
  • [x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • [x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • [x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

whedon avatar Sep 29 '21 13:09 whedon

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @kamapu, @tom-gu, @DrMattG, it looks like you're currently assigned to review this paper :tada:.

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository, which means that, with GitHub's default behaviour, you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

[screenshot: the 'Not watching' setting for the repository]

  2. You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

[screenshot: GitHub notification settings]

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf

whedon avatar Sep 29 '21 13:09 whedon

Wordcount for paper.md is 1478

whedon avatar Sep 29 '21 13:09 whedon

Software report (experimental):

github.com/AlDanial/cloc v 1.88  T=0.05 s (357.9 files/s, 141975.3 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Rmd                              2            271            520           2140
R                                9            175           1233           1355
TeX                              3             31              0            522
Markdown                         2            100              0            312
YAML                             1             15              1             68
-------------------------------------------------------------------------------
SUM:                            17            592           1754           4397
-------------------------------------------------------------------------------


Statistical information for the repository 'bd838b7115269e827ea9c7cb' was
gathered on 2021/09/29.
The following historical commit information, by author, was found:

Author                     Commits    Insertions      Deletions    % of changes
jessicatytam                     2            19             19          100.00

Below are the number of rows from each author that have survived and are still
intact in the current revision:

Author                     Rows      Stability          Age       % in comments

whedon avatar Sep 29 '21 13:09 whedon

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1073/pnas.0507655102 is OK
- 10.1111/tbed.12221 is OK
- 10.1111/mam.12066 is OK
- 10.1007/s11192-015-1798-9 is OK
- 10.1007/s11192-007-1859-9 is OK
- 10.1371/journal.pone.0131004 is OK
- 10.1111/mam.12038 is OK
- 10.1038/s41598-017-09084-6 is OK
- 10.1139/facets-2016-0011 is OK
- 10.1007/s11192-006-0147-4 is OK
- 10.1017/S1367943004001799 is OK
- 10.1371/journal.pone.0189577 is OK
- 10.1007/s11192-006-0146-5 is OK
- 10.1016/S0169-5347(01)02381-3 is OK
- 10.1038/s41467-018-07916-1 is OK
- 10.1111/acv.12586 is OK
- 10.1038/s41477-021-00912-2 is OK
- 10.1016/j.gecco.2018.e00423 is OK
- 10.1007/s11160-019-09556-0 is OK
- 10.1111/j.1523-1739.2006.00616.x is OK
- 10.1111/j.1523-1739.2010.01453.x is OK
- 10.1108/07378830610715473 is OK
- 10.1093/eurheartj/ehx446 is OK
- 10.1108/14684521211209581 is OK

MISSING DOIs

- None

INVALID DOIs

- None

whedon avatar Sep 29 '21 13:09 whedon

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

whedon avatar Sep 29 '21 13:09 whedon

:wave::wave::wave::wave: @jessicatytam @kamapu @tom-gu @DrMattG, this is the review thread for the paper. All of our communications will happen here from now on.

All reviewers have checklists at the top of this thread with the JOSS requirements. As you go over the submission, please check any items that you feel have been satisfied. There are also links to the JOSS reviewer guidelines.

The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues and pull requests on the software repository. When doing so, please mention openjournals/joss-reviews#3776 so that a link is created to this thread (and I can keep an eye on what is happening). Please also feel free to comment and ask questions on this thread. In my experience, it is better to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package.

We aim for reviews to be completed within about 2-4 weeks. Please let me know if any of you require more time. We can also use Whedon (our bot) to set automatic reminders if you'll be away for a known period of time.

Please feel free to ping me (@Bisaloo) if you have any questions/concerns.

Bisaloo avatar Sep 29 '21 14:09 Bisaloo

:wave: @tom-gu, please update us on how your review is going (this is an automated reminder).

whedon avatar Oct 13 '21 13:10 whedon

:wave: @DrMattG, please update us on how your review is going (this is an automated reminder).

whedon avatar Oct 13 '21 13:10 whedon

:wave: @kamapu, please update us on how your review is going (this is an automated reminder).

whedon avatar Oct 13 '21 13:10 whedon

Thanks for your comments, @DrMattG!

:wave: @kamapu, @tom-gu, could you please give us an update on your progress for the review of this package & paper? Please let me know if you encounter any issues.

Bisaloo avatar Oct 26 '21 08:10 Bisaloo

> Thanks for your comments, @DrMattG!
>
> 👋 @kamapu, @tom-gu, could you please give us an update on your progress for the review of this package & paper? Please let me know if you encounter any issues.

👋 @Bisaloo, I'll submit my review by tomorrow. Sorry for the delay; it's due to too many unexpected events :sweat_smile:

tom-gu avatar Oct 26 '21 15:10 tom-gu

Shame on me! Though too late, I'll provide my comments. Unfortunately, I don't have access to a copy of the manuscript right now.

kamapu avatar Nov 10 '21 10:11 kamapu

@whedon generate pdf

Bisaloo avatar Nov 10 '21 11:11 Bisaloo

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

whedon avatar Nov 10 '21 11:11 whedon

@kamapu, you should be able to access the manuscript via the links in the previous comment :slightly_smiling_face:

Bisaloo avatar Nov 10 '21 11:11 Bisaloo

> @kamapu, you should be able to access the manuscript via the links in the previous comment :slightly_smiling_face:

Thanks! The robot is closing my issues in the specieshindex repo. Am I doing something wrong, or is it because I'm too late?

kamapu avatar Nov 10 '21 12:11 kamapu

You need to manually open issues in https://github.com/jessicatytam/specieshindex - do not use the GitHub "convert to issue" feature, as it opens issues in the wrong repository.

danielskatz avatar Nov 10 '21 12:11 danielskatz

@jessicatytam @wcornwell @Bisaloo Here is my full review, so sorry for the delay!


Package overview:

The specieshindex package shows great potential to promote scientific merit. It is a delightfully user-friendly package with superb documentation. Nonetheless, I identified several areas that can be improved. I did not open specific GitHub issues for these, since they are more systemic and should be addressed as the package matures. Additionally, I had a few minor suggestions and some ideas for future development.

Package structure & development:

Testing & CI

I saw that you recently added several tests and CI (via GitHub Actions). It is great to know that you are adopting a more test-driven development (TDD) approach. Pivotal tests for this particular package are smoke tests that monitor the API connection and validate a proper output structure. As you know by now, constructing this for Scopus is easier than for WOS and BASE. However, please refer to this section to learn how to manage secrets (e.g., API keys) within the package.
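
As a minimal sketch of what such a smoke test could look like, assuming the key lives in an environment variable (the variable name and the Count() signature here are hypothetical, not the package's actual API):

library(testthat)

test_that("Scopus API returns a well-formed count", {
  api_key <- Sys.getenv("SPECIESHINDEX_SCOPUS_KEY")
  # Skip (rather than fail) on machines without credentials, e.g. CI forks
  skip_if(api_key == "", "No Scopus API key available")
  hits <- Count(db = "scopus", genus = "Bettongia",
                species = "penicillata", apikey = api_key)
  expect_true(is.numeric(hits) && hits >= 0)
})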

Code duplication

The level of duplication (code & function documentation) is off the charts. It partially stems from a segregated function strategy and from not fully utilizing sub-functions (internal, non-exported functions). This is problematic from the POV of a package developer since even the smallest changes can become a real hassle. When the small things become exhausting, the big things become impossible! I’m advocating for the consolidation of similar functions by adding a parameter or two. This might slightly affect the user’s experience, but it will greatly serve package maintenance and improve its overall structure. If you decide to make the changes, please avoid putting all the functions in the same R file.
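
To make the idea concrete, here is a rough sketch of such a consolidation. All names and helpers are purely illustrative, not the package's actual API:

count_records <- function(db = c("scopus", "wos", "base"),
                          genus, species = NULL,
                          field = c("title", "abstract", "keywords"),
                          additional_keywords = NULL) {
  # One exported function replaces a family of near-identical ones;
  # the database and search field become arguments instead of name suffixes
  db <- match.arg(db)
  field <- match.arg(field)
  query <- build_query(genus, species, field, additional_keywords)  # hypothetical internal helper
  submit_count(db, query)                                           # hypothetical internal helper
}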

Code refactoring (mostly coding efficiency)

Going forward, I encourage you to review your coding efficiency. A few examples:

  • Is it necessary to hit the API so many times? Can post-processing of one API call achieve the same results? What will happen when the current year is 2022 or 2023? Isn’t the raw data by itself valuable?
  • When plotting all indices, why not use facet_wrap() or facet_grid() for multi-panel plots? (See the sketch after this list.)
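
A minimal faceting sketch with toy data (the long-format data frame and its values are assumptions for illustration, not the package's real output):

library(ggplot2)

# Toy long-format data: one row per species x index combination
indices <- data.frame(
  species = rep(c("Bettongia penicillata", "Phascolarctos cinereus"), each = 2),
  index   = rep(c("h-index", "citations"), times = 2),
  value   = c(14, 320, 30, 1500)
)

ggplot(indices, aes(x = species, y = value)) +
  geom_col() +
  facet_wrap(~ index, scales = "free_y") +  # one panel per index
  theme(axis.text.x = element_text(angle = 45, hjust = 1))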

Defensive programming

Communicating with our users via errors, warnings, and messages significantly improves their user experience and can help mitigate knowledge gaps. Currently, the package has no implementation of defensive programming. I encourage you to embrace it. Here you can find helpful information. Side note: implementing defensive programming is much harder when you have high levels of code duplication.
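
For instance, checking inputs up front turns an obscure downstream failure into an actionable message. A sketch only; the function body is hypothetical:

Count <- function(db, genus, species = NULL) {
  # Fail early with an informative error instead of an opaque HTTP failure
  if (!db %in% c("scopus", "wos", "base")) {
    stop("`db` must be one of 'scopus', 'wos', or 'base', not '", db, "'.",
         call. = FALSE)
  }
  if (!is.character(genus) || length(genus) != 1L) {
    stop("`genus` must be a single character string.", call. = FALSE)
  }
  if (is.null(species)) {
    message("No `species` given; counting records for the whole genus ", genus, ".")
  }
  # ... proceed with the query ...
}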

Motivation for package maintenance

Package maintenance is crucial for establishing software sustainability. This aspect of package development is a demanding god. Adequately quantifying long-term maintenance is rather challenging, even more so for scientists. When trying to connect the dots, it seems that this package was mainly developed for the study of biases in mammalian research. If this is the case, I wonder if the Cornwell Lab has a long-term maintenance plan with sufficient resources to support it. Please forgive me if the query seems intrusive; I'm only trying to share insights from my own experience and pain. It might be beneficial to instigate a collaboration with an organization or additional PI that is going to rely heavily on this package. Perhaps @mjwestgate will have specific leads.

Data science >> Software development

I can classify this package under the 'data scientists developing a package' scenario. It's clear that you completely understand your user, but do not yet fully comprehend your development needs. To support your package development journey, I urge you to read the rOpenSci Packaging Guide and R Packages. In a way, now is the perfect time for that: you can now properly digest these comprehensive resources.

goodpractice output

See https://github.com/MangoTheCat/goodpractice. Please run goodpractice::gp() or see this RStudio Project for the goodpractice output.
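
For reference, running it locally only takes (assuming the package root is your working directory):

install.packages("goodpractice")
goodpractice::gp()  # checks the package in the current directory by default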

Minor suggestions:

Installation instructions (README; paper; vignette)

Why do you need all of this? https://github.com/jessicatytam/specieshindex/blob/9698952bf81d8d549b7c5bef3a4a557f342bf13d/README.rmd#L22:L31 It's somewhat redundant; why not just:

install.packages("remotes")
remotes::install_github("jessicatytam/specieshindex")

paper.md file

Code lines are too long in lines 76, 109, 144-147, and 150 of the proof: https://raw.githubusercontent.com/openjournals/joss-papers/joss.03776/joss.03776/10.21105.joss.03776.pdf

DESCRIPTION file

Package contributors might be missing (compared to paper authors).

git

I see that both Jessica and Will are committing directly to the master branch. After pkg release, it’s better to have a dev branch and to merge code via PR.

pkgdown

The overall level of package documentation is excellent. Creating a website via pkgdown will beautifully showcase it. Implementation is quite easy and totally worth it!

Ideas for future functionality:

  • Introducing a query ‘batch mode’ by passing a dataframe
  • Utilizing taxize to transparently generate a list of synonyms
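
For the synonyms idea, something along these lines might work — a sketch assuming ITIS as the backend:

library(taxize)

# Look up known synonyms for a species, then feed them into the queries
syns <- synonyms("Bettongia penicillata", db = "itis")
synonyms_df(syns)  # flatten the per-species result list into one data frame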

Final words

Sharing as many insights as I can is not just my duty but also a privilege, as it makes my own hardships along the way even more worthwhile. I hope you will find this review helpful and not too preachy 😃

From one Biodiversity Jedi to another, may the foRce be with you! Tomer

tom-gu avatar Nov 10 '21 19:11 tom-gu

Review by Miguel

Sorry for the late response. This is my first review of a package, and the way to proceed has been challenging for me. Also, some "spontaneous" activities were keeping me busy and I just managed to finish my review.

While this package seems promising for use in the context of my own work as well, I see the necessity of a major revision of both the structure of the functions and the documentation.

I didn't open issues for this review. I would suggest that the authors, when responding, refer every change to a commit in the repository.

I sum up my suggestions here (detailed comments below):

  • Implement a secure way to provide credentials in sessions.
  • Re-structure the documentation for the functions Count*() and Fetch*(). In the best case, define just two functions with all possible variants set through the arguments.
  • Examples in documentation and tutorials should follow the whole path from querying to assessing, discarding the use of pre-stored data sets.
  • Get rid of the pre-stored data. The user has to be able, first of all, to repeat the query and get the results, or to apply the queries to their own questions. Pre-stored data can still be hidden and used for building vignettes to avoid possible errors, but giving examples to the users on the basis of such pre-stored data is, in my opinion, rather confusing.
  • Not mandatory, but a function querying the metadata of retrieved publications (e.g., for export to BibTeX) would complete the whole story of specieshindex.

Comments and Suggestions

README

  • [ ] I also support the idea of simplifying the code, for instance using the following code (include the same in the vignette): remotes::install_github("jessicatytam/specieshindex", build_vignettes = TRUE, dependencies = TRUE)
  • [ ] In relation to the last point, I was wondering why the suggested installation from GitHub does not include build_vignettes = TRUE. If you fear that users may not have a LaTeX distribution, you can use HTML as the output format instead. Users may still need to install Rtools and pandoc, though.

Vignette

  • [ ] Call the vignette with a topic name, not just vignette.
  • [ ] Instructions for installing (same comment as for README).
  • [ ] In chapter Workflow a brief description of the databases is required, especially regarding BASE (I didn't know this one).
  • [ ] In chapter Other Arguments and Syntax you should invert the order of the sub-chapters Additional Keywords and Count, Fetch, ...* because you explain the functions and their differences in the second one, and this should be done at the beginning to avoid confusion. For instance, I didn't notice that there is a block using CountSpT() and CountSpTAK().
  • [ ] In FetchSpT(db = "scopus", genus = "Bettongia", species = "penicillata") I get an error:
Error in get_results(query, start = init_start, count = count, verbose = verbose,  : 
  Bad Request (HTTP 400).
  • [ ] I'm wondering why the results of FetchSp*() or CountSp*() cannot be assigned to an R object. This is very restrictive if you would like to use these functions in statistical assessments or reports.

Manual

  • [ ] My impression is that the manual can be arranged in a better way, and perhaps even the functions can be simplified. For instance, all FetchSP*() functions should share the same documentation page. The same applies to CountSP*(). In fact, I would even be more radical and just define two functions, count() and fetch() (names just as examples), where everything else (whether looking for species or genera; in title, keywords, or abstract; in which database; etc.) can be set through function arguments.
  • [ ] I failed to assign outputs of Count*() and Fetch*() to an object (e.g. by obj <- Fetch*()). I also notice that none of the examples in the manual do assignments, and all functions that are supposed to work on outputs of these functions actually work on data already installed by the package. I know that such queries may fail when building the package, but you have the option of wrapping your examples in a \dontrun{} environment (see the roxygen sketch after this list). I would prefer to provide examples to the users starting from the query.
  • [ ] If all previous issues are addressed, the installed data will become superfluous.
  • [ ] Is there an option to extract the metadata of the queried references, for instance to export it to a BibTeX file?
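
To illustrate the points above, a roxygen2 sketch combining a shared documentation page (@rdname) with a query-first example in \dontrun{}. Function and argument names are examples only:

#' Count publications mentioning a taxon
#'
#' @rdname count
#' @examples
#' \dontrun{
#' # Query first, then work on the returned object
#' hits <- count(db = "scopus", genus = "Bettongia", species = "penicillata")
#' hits
#' }
count <- function(db, genus, species = NULL) {
  # ... query the selected database ...
}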

Manuscript

  • [ ] The same comments for Vignette and Manual may also be applied to the manuscript.
  • [ ] I would prefer to have a concrete example of a question, applied from the query through to reporting the outcome, without using pre-stored data.

General comments

  • [ ] I find quite risky the use of objects for the storage of credentials, because by sharing a project you may mistakenly share a script or an object revealing your passwords and IDs. An alternative can be to use masks with tcltk and store the entries in the temporary folder (see this post), or to use the package keyring (see here and the sketch below).
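
With keyring, for example, the secret never has to appear in a script or a saved object (the service name is just an example):

library(keyring)

# Store the key once per machine; an interactive prompt asks for the secret
key_set("specieshindex_scopus")

# Retrieve it at run time without writing it into scripts or .RData files
apikey <- key_get("specieshindex_scopus")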

Checks

  • [ ] Current checks retrieve warnings about the use of ::: and ::, which should be avoided. The second case, as far as I know, may bypass the declaration of dependencies, which is not desirable.
  • [ ] Warnings regarding no visible binding for global variable .... In this case I recommend NULLing (see this example and the sketch after this list).
  • [ ] Check and solve all issues retrieved by goodpractice::gp()
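
A minimal NULLing sketch (the function and variable names are illustrative):

plot_indices <- function(data) {
  # Silence R CMD check's "no visible binding for global variable" notes
  # for columns used via non-standard evaluation in ggplot2
  species <- value <- NULL
  ggplot2::ggplot(data, ggplot2::aes(x = species, y = value)) +
    ggplot2::geom_col()
}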

kamapu avatar Nov 28 '21 07:11 kamapu

Thanks everyone for your helpful comments! I've since updated the documentation, reorganised the functions, and fixed some of the warnings that were showing up in checks. We're currently working on adding more tests for the API.

In regards to @kamapu's comments:

  • I've replaced all the Fetch*() and Count*() functions with Fetch() and Count() to make things simpler
  • the error Bad Request (HTTP 400) is probably a result of your institution not being a subscriber of Elsevier/Scopus (see https://github.com/jessicatytam/specieshindex/issues/46#issue-1029956107)
  • when you say obj <- Fetch() and obj <- Count() are not working, do you mean they are returning some sort of error? They are both working for me; do you want to provide a reprex?

jesstytam avatar Dec 13 '21 00:12 jesstytam

Great, @jessicatytam, please let us know when you have made progress on the various points raised in the review. Feel free to open an issue for each item if helpful :relaxed:

@kamapu, I think you missed this question :wink::

> when you say obj <- Fetch() and obj <- Count() are not working, do you mean they are returning some sort of error? They are both working for me; do you want to provide a reprex?

Bisaloo avatar Jan 21 '22 09:01 Bisaloo

@jessicatytam, do you have a timeline on when you think you could do the required changes? If you don't think this will be possible anytime soon, we can pause this submission or offer to resubmit later.

Bisaloo avatar Feb 14 '22 15:02 Bisaloo

Hi @Bisaloo, apologies for the holdup! I've been working on multiple projects and getting organised as I'm starting my PhD. We hope to finish addressing the comments by the end of March. Here's a list of tasks we will be working on:

  • [x] Adding API tests
  • [x] Reduce code duplication
  • [x] Use facet_wrap in plots
  • [x] Add more error messages
  • [x] Implement a secure way to provide credentials in sessions

jesstytam avatar Feb 22 '22 21:02 jesstytam

Hi @jessicatytam :wave:, could you update us on the current status of this submission please? If you need some more time, that's okay, but would you have a rough estimate of when we could expect your changes? If this is going to take a while, you could also withdraw this submission for the time being and resubmit later.

We are flexible. Please let me know what is the most reasonable and convenient option for you.

Bisaloo avatar May 09 '22 10:05 Bisaloo

We have completed all of the tasks and addressed the comments, with the exception of adding tests; coverage is currently at ~50%, so we will probably need until the end of May, if that's ok.

What would be the ideal % for test coverage?

jesstytam avatar May 10 '22 05:05 jesstytam

> We have completed all of the tasks and addressed the comments, with the exception of adding tests; coverage is currently at ~50%, so we will probably need until the end of May, if that's ok.

Sounds good :+1:

> What would be the ideal % for test coverage?

There is no hard and fast rule, especially for API packages, which are somewhat more difficult to write tests for. Most best-practice guides say something like "major functionality is covered by tests". In my experience, around 80-85% is usually a good number to ensure tests cover most sections of your code. Higher is also great, but please let me know if you have issues reaching this number.

For what it's worth, a good and lower-effort way to improve your test suite over time is to turn each bug (such as the ones discovered in this review) into a test, so you can make sure you don't break the same thing again.
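
For example, the HTTP 400 error reported earlier in this review could be pinned as a regression test. A sketch only, since the exact function signature and error message depend on the final implementation:

test_that("Fetch() fails informatively without valid Scopus credentials", {
  # Regression test for the HTTP 400 seen during review: an unauthorised
  # request should surface as a clear R error, not a bare curl failure
  expect_error(
    Fetch(db = "scopus", genus = "Bettongia", species = "penicillata",
          apikey = "invalid-key"),
    regexp = "400|unauthori[sz]ed|API key"
  )
})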

Bisaloo avatar May 10 '22 07:05 Bisaloo

We've gotten through adding lots of tests, and almost everything is now tested except for one big issue.

We're dealing with some issues with testing the functions that query an API multiple times. Since there are limitations in the Scopus API, we need >10 GET requests in a single function to retrieve all of the data while handling edge cases smoothly. We are currently using httptest::with_mock_api() to test the API functions, but with_mock_api() can only test 1 GET request per test. Testing everything in those functions is therefore difficult, given the combination of the constraints of the Scopus API and the mock API framework. So we're a bit stuck on that, unless you can see a clever solution?

Here's the link to the codecov report: https://app.codecov.io/gh/jessicatytam/specieshindex/blob/master/R/specieshindex.R . Check out the function FetchT_scopus, which is being tested but whose test exits (successfully) after the first GET call. Any help much appreciated.

jesstytam avatar May 26 '22 02:05 jesstytam

@jessicatytam, while investigating your problem, I had another look, and I want to mention right away that @DrMattG's point about code duplication is not yet properly addressed, in my opinion.

Most functions still include a lot of duplicated code. I have submitted a pull request (https://github.com/jessicatytam/specieshindex/pull/60) where I show how you could reduce code duplication. Please have a look and try to apply this kind of fix throughout the code. Let me know if it's not clear.

On the topic of duplication, I notice that the documentation of the example datasets is duplicated as well. You documented them in each individual file (Koala.R, Platypus.R, etc.) and again at the end of the main specieshindex.R file. Could you have a look please?

I would like to emphasize that this insistence on avoiding code duplication is not just to annoy you. Code with high duplication is very hard, if not impossible, to maintain in the long run. If you refactor the code to reduce duplication, future updates will require changes in a single place instead of throughout the codebase.

Bisaloo avatar May 30 '22 11:05 Bisaloo

Thanks for the comments! I'll have another look at the updated code and come up with a list of things to do and a timeline.

jesstytam avatar Jun 15 '22 01:06 jesstytam

  • [ ] unused arguments (https://github.com/jessicatytam/specieshindex/issues/61#issue-1253599082)
  • [ ] storing results w/in a loop (https://github.com/jessicatytam/specieshindex/issues/62#issue-1253657243)
  • [ ] reduce code duplication (https://github.com/jessicatytam/specieshindex/pull/60#issue-1252587022)
  • [x] fix dataset documentation duplication
  • [x] replace print() with message() (https://github.com/jessicatytam/specieshindex/issues/64#issue-1271823188)

jesstytam avatar Jun 15 '22 05:06 jesstytam