
[Re] Biodiversity of plankton by species oscillations and chaos

Open fbarraquand opened this issue 2 years ago • 42 comments

Thank you for your submission, we'll assign an editor soon.

rougier avatar Sep 21 '22 10:09 rougier

@tpoisot Can you edit this submission? (If not, please suggest an editor.)

rougier avatar Sep 21 '22 10:09 rougier

Sorry about the delay - yes I can edit this submission.

tpoisot avatar Oct 27 '22 14:10 tpoisot

EDITOR

@yoavram - can you REVIEW this submission? @opetchey - can you REVIEW this submission (as you worked on this), or suggest a reviewer?

tpoisot avatar Nov 07 '22 19:11 tpoisot

I'm game! (to review)

swmpkim avatar Nov 09 '22 21:11 swmpkim

REVIEWER 1 is @swmpkim -- thank you!

There is an overview of the review process here. You can post your review as a comment in this thread. If you want to comment on a specific place in the code, you can do it like so:

https://github.com/fbarraquand/supersaturated_coexistence/blob/main/Code/Script_Fig_2.R#L12

But also feel free to voice high-level comments about reproducibility, ask questions, and clarifications. We are expecting the review to be a discussion process aimed at improving the code of the paper!

tpoisot avatar Nov 10 '22 02:11 tpoisot

@fbarraquand @DoyenGuillaume @CoraliePicoche - can you suggest reviewers from the list or from outside the list?

tpoisot avatar Nov 25 '22 18:11 tpoisot

From the list I can suggest @pboesu and @ha0ye. Outside the list Zachary R Miller (@zacharyrmiller) and Jurg W. Spaak (@juergspaak) [to whom I mention that ReScience C is a computational science reproduction journal, if they see the message because of tagging] -- selection based on their recent works.

fbarraquand avatar Nov 26 '22 08:11 fbarraquand

I am familiar with the original Huisman & Weissing paper, but won't have time to do a review until after the next 2 weeks.

ha0ye avatar Nov 26 '22 18:11 ha0ye

@fbarraquand I've seen your message. However, I'm not sure how to respond to this. Is this equivalent to a request for a review? I do know the Huisman & Weissing paper, but R is not my primary coding language (though I could potentially review this as well).

juergspaak avatar Nov 28 '22 10:11 juergspaak

Hi @juergspaak, thanks for asking -- the review process at ReScience does indeed take place through GitHub issues. However, as it is our paper being reviewed, I can only suggest referees, and @tpoisot eventually decides as editor. So I'd suggest waiting for @tpoisot's input (think of my message as the usual email received when somebody enters your name in a database of potential reviewers).

(@ha0ye on our end waiting 2 more weeks is fine but again I defer to @tpoisot regarding the way forward)

fbarraquand avatar Nov 28 '22 15:11 fbarraquand

Unfortunately I am currently tied up with other reviews and editorial duties. Sorry.

pboesu avatar Nov 29 '22 10:11 pboesu

Hi y'all, I'm one of the reviewers and haven't had a chance to look at the paper yet (the new one or the 1999 one; next week, I promise!). But I did run code (R scripts only) and wanted to note a couple of things to update, maybe before the next reviewer comes on board.

I dove into the code without even trying to figure out the ecological context: purely a question of "can I run all this code". And the answer is "almost entirely, yes". The way the files were organized, named, and discussed in the README made it really easy to figure out what to run and when. I appreciated the warning about the code for Fig. 3 in particular taking a long time to run. And as best I can tell, the images I generated were the same as the authors'. So with a few updates, yes, the code was reproducible on my machine. Specifics follow.

Important code updates - things I had to change in the code to make it run. All notes are for Script_Experiment.

  • need a call to `library(xtable)`, or lines 143 and 144 won't run.
  • extra space in line 116, towards the end of the line. There's a `> =` that needs to be `>=`. Context: `(results_exp1[dim(results_exp1)[1], 2:13])> = 0),rep(T,12)`
  • similar extra spaces in lines 135 and 206: `" = = "` should be `"=="` in the `Probability = c(colSums......)` portion.

Additional but less critical notes

  • In the README, I believe you're referring to the save_image() function here, not save.image(): "Finally, saving the plotly plots into .pdf of .png files requires the save.image() function."

  • It would be helpful with that same line to note that the save_image() function requires an installation of python/miniconda, since that particular function isn't available directly in the R {plotly} package. (I didn't understand why I needed the provided installation instructions until running that line in Script_Fig_1 threw errors.)

  • Same function, again - need to make sure the {reticulate} package is recent. ~~The CRAN version that I'd installed had a bug and I had to install the github version in order to get the miniconda installation to work correctly.~~ See issue 1297 for details on that issue. ~~Unsure when a new version will be pushed to CRAN.~~ UPDATE: as of 1/8/2023, the version of {reticulate} on CRAN has this bug fixed (see again the linked issue).

  • I still never got that function to work but I don't think it's the authors' job to get me through the error I couldn't get past ("NameError: name 'sys' is not defined"). I tried to do some troubleshooting but figured actually saving the images was less important than generating them.

  • a couple minor typos in the README, Code/ .R files section: Code/Script_Exp.R should be Code/Script_Experiment.R; Code/FScript_Functions.R should be Code/Script_Functions.R

  • For experiment 1, I got a few of these warning messages; not sure if this matters: `DLSODA- At T (=R1), too much accuracy requested for precision of machine.. See TOLSF (=R2)` (in the above message, R1 = 8656.66, R2 = nan).

  • Also for experiment 1, I got a few warnings along these lines; again, not sure if they are concerning, but I want to point them out in case: 1. `In lsoda(y, times, func, parms, ...) : Excessive precision requested. scale up rtol' and atol' e.g by the factor 10`; 2. `In lsoda(y, times, func, parms, ...) : Returning early. Results are accurate, as far as they go`; 3. `In lsoda(y, times, func, parms, ...) : an excessive amount of work (> maxsteps) was done, but integration was not successful - increase maxsteps`.

  • Didn't get any warnings for experiment 2.
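
For reference, the save_image() setup mentioned above can be sketched as follows. This is my reading of the {plotly} documentation rather than the authors' exact instructions, and the figure itself is just a placeholder:

```r
library(plotly)

# One-time setup: save_image() relies on the Python 'kaleido' package,
# which {plotly} reaches through {reticulate} and a miniconda environment.
# reticulate::install_miniconda()
# reticulate::conda_install("r-reticulate", "python-kaleido")
# reticulate::conda_install("r-reticulate", "plotly", channel = "plotly")
# reticulate::use_miniconda("r-reticulate")

fig <- plot_ly(x = 1:10, y = rnorm(10), type = "scatter", mode = "lines")
save_image(fig, "figure.png")  # errors if the Python dependencies are missing
```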

I think that's all for now. I hope it's helpful. Thanks for letting me review this; I'm looking forward to exploring the paper and learning what all this code was for! More next week sometime.

swmpkim avatar Dec 01 '22 17:12 swmpkim

Thank you @swmpkim for this first part of the review. @tpoisot, just a reminder: the requested referee suggestions are listed above.

fbarraquand avatar Dec 13 '22 08:12 fbarraquand

Thanks all - @juergspaak I would be glad if you could have a look as reviewer 2.

tpoisot avatar Dec 13 '22 17:12 tpoisot

Took a little longer than I thought to review the paper itself. After doing so, and digging back into the code, I'm even happier with how well the code was organized and documented. Good job to the authors on that. Here are my additional comments; I hope they're helpful.

Additional code thoughts:

  • in Script_Fig_exp_1 - turns out I hadn't run this before submitting my first comment:
    • I ended up with 371 instead of 372, and 430 instead of 431. Other panels were the same as in the paper.
    • On the second row, my y-axis goes up to 60 rather than 40 (maybe because of the different outputs that were selected).
  • Fig_exp_2 came out exactly the same.
  • though I was able to reproduce all figures (with the minor exceptions above for figure 5), I would recommend structuring for loop outputs differently in the future. At least in R, data frames as output for a loop are known to make loops very slow and inefficient because they have to be copied with every iteration. A list would probably work better; each iteration adds to the list and you can rbind all the pieces at the end. I wouldn't go so far as to say the current code should be restructured, because it's complicated and it works (albeit slowly) as-is. Just something to keep in mind for future, happier work - avoid data frames as loop outputs in R.
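
A minimal sketch of the pattern suggested above, with a hypothetical run_one_sim() standing in for the actual simulation code:

```r
# Collect each iteration's results in a pre-allocated list, then bind once
# at the end; this avoids copying a growing data frame on every iteration.
n_sims <- 100
results <- vector("list", n_sims)
for (i in seq_len(n_sims)) {
  results[[i]] <- data.frame(sim = i, richness = run_one_sim(i))
}
all_results <- do.call(rbind, results)  # one bind instead of n_sims copies
```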

Notes on the paper:

In this article, the authors replicate Huisman and Weissing's (1999) simulations and figures of phytoplankton species abundances in a stable system with limited resources, using model parameters from the original paper. The authors then go further and perform simulations in which the input parameters vary, and the results vary as well: to the point where the authors suggest that supersaturation is NOT common. The replication is well done and the code is well-documented. I do have misgivings, though, about the conclusions drawn from the new simulations, and would like to see more discussion and justification to ensure the conclusions are appropriate based on the outputs. What I'm most concerned about is calling the perturbations to the original (HW1999) parameters 'small'. I'll separate my comments into the 'replication' portion of the paper and the 'experimental' portion.

Replication portion

The paper is short and to the point. Without having the original next to me while reading this one though, it would have been hard to understand. Therefore many of my suggestions involve adding some detail and exposition to enable the current paper to stand more on its own.

  • in the introduction, when discussing the different "parameter scenarios" from the original paper, it would be helpful to set up the context that these are parameters of the models for competition, saturation, etc. It's okay to refer to the original paper for specifics of the parameters used, but please give some examples of how this would play out (like in the original paper, when they listed things like N, P, Si, etc. as limiting resources).
  • what was $r_i$ in the original simulations? Looks like it starts at 1 for all species in figure 4 (this is based on the code - but I'd like to see it mentioned in the methods of the article, and then talk about what $r_i$ values were used in the experiments)
  • please add citations for the {deSolve} package and R itself
  • the tables in results, with probabilities of species numbers - this wasn't entirely clear to me. Is this how likely each outcome was from the simulation results? Should the probabilities add up to 100%? If yes, I'd suggest saying something like "proportion of simulated communities with X species after 10,000 days" in the caption. However the numbers do add up to more than 100.
  • same tables would be easier to read in vertical form, with two columns

Experimental portion

  • The paper's introduction refers to the experimental perturbations of growth rates as both "mild" and "small". Please provide some justification and citations (if possible) for this - to me, going from "all species are assumed to have the same max growth rate" (all $r_i$ values were 1 in HW1999's simulations) to "all species have differing growth rates, which can vary by more than 50%" (0.7 - 1.3) is far from small. It may indeed be more realistic, but is this really a small perturbation?
  • In the vein of these parameter values being more realistic - is variation in actual growth rates something that's known or has been estimated? if so, please add citations to justify these choices; if not, please explain why these numbers seemed reasonable to use (I'm not arguing; just want to see clear reasoning).
  • After looking at some of the results from experiment 1, I don't think "number of species present after 10,000 days" adequately represents what happened with the species communities. If this really is the only metric that is important, it needs some justification. It seemed that a fair number of scenarios had good stable oscillations for more than 5 species, sometimes even 8-10, and then everything crashed with the last species additions. I'd like to see some more detailed information about these results in the paper itself. The code was well-documented enough (again - kudos) that I could play around with it, but more information could be included in the results/discussion without the reader having to run code. A couple of ideas:
    • The simplest thing would be to include graphs from more random selections of the simulations, rather than two panels for each random selection. This could be personal preference, but I don't need to see a separate panel for the later-added species; just throw it all on one panel and I can get an idea of whether things are stably oscillating or not. And I think showing more outputs will be more helpful for the reader, rather than more detail for fewer outputs.
    • I'd love to see some information about number of species present just before each addition as well - e.g. in the tables, rather than only including proportions for the number at 10,000 days, add columns (if you take my suggestion above - otherwise it would be added rows) for number present at 1,000 days (out of 5); 3,000 days (out of 8); 5,000 days (out of 10); 10,000 days (out of 12). I realize this may be a lot of work to pull out of the results. But it would be incredibly helpful and enlightening to see. This may work better as a series of histograms or density plot(s) once all that info is added - see what works, if you tackle it.
  • description in the text (on page 7) of figure 5 says that "it is already noticeable that ..... is sometimes already unstable" - I'd recommend describing what the reader should look for regarding stability/instability. e.g. "in many scenarios, at least a few species oscillated around stable points until additional species were introduced (e.g. figure 5, 315 and 81); but sometimes these stable oscillations were never established in the first place (e.g. top row of figure 6)." Not sure I captured that completely - but I didn't know what you meant by stable or unstable until I really dove into more code, so a verbal description would be helpful.

swmpkim avatar Dec 13 '22 17:12 swmpkim

@tpoisot yes-ish: I can review this paper; however, I would only be able to provide feedback by the 15th of January. If that is good enough for you, I'll dig into it.

juergspaak avatar Dec 14 '22 08:12 juergspaak

@juergspaak thank you, this is perfect. Looking forward to your review.

tpoisot avatar Dec 16 '22 14:12 tpoisot

@fbarraquand and @tpoisot , I only just figured out that notifications for issues can be a little weird - so in case you didn't see it, I posted the other part of my review above. Cheers!

swmpkim avatar Dec 20 '22 15:12 swmpkim

In one of my review comments, I noted that the version of {reticulate} on CRAN had a bug and installation of miniconda didn't work. The CRAN version has been updated and should work now, and I've updated my comment to reflect this.

swmpkim avatar Jan 09 '23 14:01 swmpkim

@swmpkim Many thanks for checking (and more generally for your detailed comments).

fbarraquand avatar Jan 09 '23 14:01 fbarraquand

Hi

I finally got the time to run the code and review the paper

Saving figures: Like @swmpkim, I was not able to save the figures, and it appears to be the same issue ("NameError: name 'sys' is not defined"). From my understanding, all your code works in R and you only use {reticulate} to call Python code for the plotting, which to me seems a bit overkill, especially because reproduction now requires an installation of Python (and I was worried whether this installation might interfere with the other Python on my computer). If it is not too much work, I would recommend saving these figures with R code, as both reviewers were unable to save them. Additionally, the figures weren't shown in R itself for me, so I actually don't know whether they match the ones in the paper.

Running code: I tried to run Script_DataHW1999.R, but it appears to be stuck, currently trying to compute the code for Fig. 3a and 3b. I'm not sure how long this part of the code takes; however, if it takes longer than, say, 5 minutes on your machine, I would recommend adding a warning in the text stating how long it takes on your machine (well aware that this gives only limited information on how long it will take on mine). But currently I'm not sure whether it is stuck (it has certainly been running for 10+ minutes now). Even better, if possible, add an update stating how much of the progress is already done. Figures 3a and 3b appear to compute the same task for many different levels of K, so maybe add a progress bar showing how many K values have already been computed; this way I know that the code is not stuck somewhere.

The same applies to the experiments, where a progress indicator and an estimate of how long this takes on your computer might help ensure the code is not stuck.
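
As an illustration of the kind of progress indicator meant here, base R's txtProgressBar() would do; the loop and K_values below are hypothetical stand-ins for the actual simulation code:

```r
K_values <- seq(0.1, 1, by = 0.1)  # placeholder for the real half-saturation levels
pb <- txtProgressBar(min = 0, max = length(K_values), style = 3)
for (i in seq_along(K_values)) {
  # ... run the simulation for K_values[i] ...
  setTxtProgressBar(pb, i)  # shows how many K values are done
}
close(pb)
```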

Finally, you mention in the README.md that Matrix_C and Matrix_K contain info, but what info is contained? After looking a bit at the code, it appears that these are the parameter settings for the different figures. I recommend stating this explicitly in the README.md: which matrix is used for which simulation and what it contains (K contains the half-saturation constants, if I understood correctly). Additionally, I would add column and row names to these matrices.
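
For instance, naming the dimensions could look like this (the labels are hypothetical; the authors would know the correct ones):

```r
# Self-describing matrices: rows as resources, columns as species (assumed layout).
rownames(Matrix_K) <- paste0("resource_", seq_len(nrow(Matrix_K)))
colnames(Matrix_K) <- paste0("species_", seq_len(ncol(Matrix_K)))
```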

juergspaak avatar Jan 13 '23 10:01 juergspaak

For the article

Overall, I found the article to the point: it replicates the idea of oversaturated coexistence in phytoplankton communities. Interestingly, I find that the paper essentially replicates two works at the same time: Huisman & Weissing 1999 and Schippers et al. 2001 (which I do not know in detail). It in principle replicates the simulations of Huisman and Weissing, but then goes on to replicate the results of Schippers et al., showing that the Huisman and Weissing result, while technically reproducible, may not be very important, as it is very sensitive to perturbations.

I only have a few minor comments and one potentially large recommendation:

To me it is unclear how the original paper found the parameters for the super-saturated communities. If the authors have any insight on this, I would appreciate such information. However, independent of how these parameters were found (which may very well have been brute force), I wonder how realistic these parameters are for natural communities. Or, vice versa, how likely super-saturation is for realistic communities. To test the robustness of super-saturation, the authors perturbed the parameters from the paper; as reviewer 1 points out, it's unclear how realistic these values remain. As a potentially quite interesting extension, I would recommend using a database of realistic phytoplankton traits, e.g. https://esapubs.org/archive/ecol/E096/202/. I happen to have a paper where I do the same (but with a more complicated model including competition for light and zooplankton predation). I would be interested in whether super-saturation is even a thing under these circumstances. I have a strong suspicion that this experiment will simply show that it is not, so I would understand if the authors decline this potentially extensive additional work with potentially limited additional insight.

Besides this I only have minor comments. Second paragraph, first sentence: add that this is true at equilibrium only; otherwise species may coexist. That is, Huisman & Weissing did not find a loophole in the theory, but simply explored what happens if the assumptions are not met.

First paragraph: "an hypothesis" should be changed to "a hypothesis".

Numerical experiments: You take the abundance at the end of the experiment. But the idea is exactly that the densities fluctuate, so a more reasonable approach would be some sort of time average of densities above a certain threshold.
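
A rough sketch of what such a criterion could look like, with a hypothetical densities matrix (time in rows, species in columns) and arbitrary threshold and window values:

```r
# A species "persists" if its density, averaged over the final stretch of
# the simulation, stays above a small threshold.
persists <- function(densities, threshold = 1e-3, window = 1000) {
  colMeans(tail(densities, window)) > threshold
}
```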

Numerical experiments: I would find these tables more insightful in the form of graphs. Reviewer 1 pointed out that your perturbations are relatively large (with which I partially agree). A really cool addition would be showing how these percentages change as a function of the average perturbation.

juergspaak avatar Jan 13 '23 11:01 juergspaak

@fbarraquand -- REVIEWER 1 and REVIEWER 2 have made a series of comments on the manuscript, which are all reasonable. I think the issues with installations are worth spending some time on, although I do understand that some of it may be outside your control. The comments about the content of the manuscript are all very reasonable and can be addressed.

Please do submit a revision at your earliest convenience, and we'll run it past reviewers again for a decision.

tpoisot avatar Jan 19 '23 17:01 tpoisot

Hello @tpoisot

We have now revised the manuscript (pdf, revised files have been pushed to the main branch). The response letter is attached to this comment in pdf format (too long for github comments). We are grateful to the two reviewers for constructive criticism.

doyen_et_al_response_letter_rescience.pdf

fbarraquand avatar Dec 04 '23 18:12 fbarraquand

@tpoisot if you need help with the publication, just ask me.

rougier avatar Dec 04 '23 21:12 rougier

Thanks @rougier -- I'll have a look this week and report back if/when I run into trouble.

tpoisot avatar Dec 04 '23 21:12 tpoisot

This is an accept for me -- I tried to go through the publication process and failed at:

tpoisot@weewoo:~/Documents/articles> ./process.py --sandbox --metadata metadata.yaml --pdf article.pdf
Traceback (most recent call last):
  File "/home/tpoisot/Documents/articles/./process.py", line 60, in <module>
    article = Article(file.read())
              ^^^^^^^^^^^^^^^^^^^^
  File "/home/tpoisot/Documents/articles/article.py", line 127, in __init__
    self.parse(data)
  File "/home/tpoisot/Documents/articles/article.py", line 183, in parse
    self.date_received = Date(dates["received"] or "")
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tpoisot/Documents/articles/article.py", line 79, in __init__
    date = dateutil.parser.parse(date)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/dateutil/parser/_parser.py", line 1368, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/dateutil/parser/_parser.py", line 640, in parse
    res, skipped_tokens = self._parse(timestr, **kwargs)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/dateutil/parser/_parser.py", line 719, in _parse
    l = _timelex.split(timestr)         # Splits the timestr into tokens
        ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/dateutil/parser/_parser.py", line 201, in split
    return list(cls(s))
                ^^^^^^
  File "/usr/lib/python3.11/site-packages/dateutil/parser/_parser.py", line 69, in __init__
    raise TypeError('Parser must be a string or character stream, not '
TypeError: Parser must be a string or character stream, not date

tpoisot avatar Dec 05 '23 20:12 tpoisot

I can take care of the actual publication. @fbarraquand, where are your LaTeX sources?

rougier avatar Dec 05 '23 21:12 rougier

Hi @rougier

If by LaTeX sources you are referring to the .tex files used to generate article.pdf, you can find them in the Article folder of the repository.

BR

DoyenGuillaume avatar Dec 06 '23 07:12 DoyenGuillaume