The future of deep review
We resubmitted version 0.10 to the Journal of the Royal Society Interface and bioRxiv. Thanks everyone for all the great work on the revisions!
I'd like input on where we want to go from here. Should we continue to accept edits to the manuscript even after a version is accepted at a journal? Should we accept only errata corrections but lock the main content?
I don't want to dissolve this amazing group of authors. However, there isn't much precedent for living manuscripts that continue to change after publication, and realistically we are all very busy with other projects. The activity dropped off considerably between the first submission and the reviewer feedback.
I think that if we have a committed group of maintainers there is the opportunity to do something new here in the way of a living scientific manuscript that stays up to date with the field. However, we probably need more than me and @agitter to make it sustainable. Does anybody else have an interest in helping to contribute at the maintainer level?
One quick thought. We should probably be talking more about what we do after v1.0, which I imagine would be the accepted version at the journal. At this point I feel like we should push to that finish line. 😄
> We should probably be talking more about what we do after v1.0
Agreed. I think we should only take pull requests on obvious typos until v1.0.
Agreed - hit the (immediate) finish line and then worry about the future.
And when we get to that future, a few discussion points or ideas:
- Is github the right place?
- Is building on the paper the right place?
- Should it be a monolithic paper (or whatever) or broken into subjects?
- A confession: for this last stretch, I found it very hard to know what was going on. That's not a dig at anyone; the blizzard of issues and alerts was just hard to navigate or prioritize. Possibly more experienced github users have some insight or a solution?
I could be interested in this. It could help to define expectations for maintainer roles, but those obviously depend on a lot of other variables.
Similarly to @agapow , I felt that keeping up with the notifications was sometimes like drinking water from a fire hose. I think this was due to the intermixing of notifications related to content (i.e. tracking new references and writing) and infrastructure (e.g. administrative, repository code, formatting).
I also wonder if github is the best place for this sort of thing, or if such a platform exists.
Lastly, given the size of the paper, does shifting to a different format - one that is designed to organize information on a grander scale (e.g. a book) - make more sense long-term?
@cgreene @agitter I realize I'm a bit late to the conversation but fwiw, would definitely be interested in helping in a maintainer function or role.
@evancofer To your pt. about inundation, was the Projects feature (or something similar with GH integration like Trello) used to track progress? I wonder if that might be one way of making the different workstreams a little more manageable and organizing issues based on the topic or sub-topic.
@stephenra AFAIK no project management tools (e.g. Trello, Asana, GH Projects) were used. I would guess that @cgreene's lab had some internal tracking of general project status, but that probably isn't too useful for our purposes. Enabling contributors to easily subscribe to notifications for one or a few sections/topics could be useful.
@stephenra we used milestones within GitHub and labels (usually, not always) for some form of organization. We also ended up having a few master issues to track progress and link to related issues at various project stages (e.g. #188 and #678 ). @cgreene and I didn't really have any formal internal tracking beyond that, and I'd be open to better organization if other maintainers join in to keep this going.
@evancofer @agitter Thanks for clarifying! I'm tool-agnostic, but under the working assumption that this continues to grow in scope, it may be helpful to adopt one (my past experience with Trello and GH Projects has been positive overall, though admittedly Kanban board-style project management tools aren't for everyone). Is there an estimate of roughly how many maintainers would be needed to keep the project going?
@stephenra I'd say the number of maintainers depends on what exactly we want to sustain. Is it an up-to-date manuscript or book? A curated list of papers, tools, and data? Something else?
@agitter Makes sense. And apologies, I realized I'm getting ahead of the conversation given the immediate focus on v1.0.
@stephenra This is actually a good time to have the conversation while we still have contributors' attention after the recent round of revisions. Let us know if you have more thoughts about what form the future product or project should have.
I agree that now is a good time to figure this out. In terms of tooling, our lab has used waffle.io for other projects and found it useful. I think the same things that it has helped us organize could aid the maintainers in planning what to include.
I also think we'd be breaking new ground on authorship, but I like the idea of a "release" occurring either every 6 or 12 months (from our own experience, I think 12 months is more reasonable). If there were project participants who would like to lead each of those releases, the authorship categories could accommodate a reshuffling of the author list on each release (we could put "maintainers of previous versions" in a category that doesn't shuffle to the last positions, and reserve those for "maintainers of the current version"). Maybe JRSI would like to publish an annual update for a few years, or maybe we could talk with other journals about future releases (imagine a Jan 2019 release date for the next one...).
If any journals are interested, feel free to chime in :)
Anyway, these are just some thoughts.
If you want to move on to another collaborative paper in deep learning for medicine, try:
“DiversityNet: a collaborative benchmark for generative AI models in chemistry”
@mostafachatillon : thanks for raising that. It might be more appropriate to raise this as a new issue since your point doesn't relate directly to the future of this paper.
Also, note that your blog post has an inaccuracy. You say:
> But for writing the DiversityNet paper, GitHub is an inappropriate tool, because GitHub does not natively support math formulas.

and you link https://github.com/github/markup/issues/897#issuecomment-288580903.
That is related to GitHub's native system for displaying markdown. Deep-review doesn't use that. It may also be the case that Manubot, the build system for deep-review, doesn't yet support formulas. However, if that is the case, you should correct the inaccurate link in your blog post.
@agitter @cgreene Apologies for the lapse in response.
> I agree that now is a good time to figure this out. In terms of tooling, our lab has used waffle.io for other projects and found it useful. I think the same things that it has helped us organize could aid the maintainers in planning what to include.
Agreed on the tooling. I've heard good things about waffle.io and had some success with Asana and Trello, both of which integrate with GitHub as well. I'm not particularly opinionated on this, so I'd imagine the best way to go is whichever platform most contributors feel comfortable with or that has the lowest barrier to entry. I'd be happy to set up a survey if that helps.
Apart from GitHub issues, for tracking todos and PRs I've found it helpful and more manageable to batch issues by category (rather than just by tag). I'm not sure if the lab(s) adopted this approach, but, for example, the different application domains/sub-domains in the paper could be a natural way to structure these categories (e.g. gene expression vs. protein secondary and tertiary structure, etc.).
> I also think we'd be breaking new ground on authorship, but I like the idea of a "release" occurring either every 6 or 12 months (from our own experience, I think 12 months is more reasonable).
I favor the idea of a 12-month release as well. It allows for difficulties in scheduling and coordination among contributors and, given the speed of the field, it also provides time to take in a broader range of contributions and distinguish meaningful work from flag-planting.
@cgreene @agitter @stephenra A yearly release sounds feasible and reasonable.
I have used Asana and Trello in the past and am comfortable with both. Tentatively, I would lean towards Asana because it seemed (at least to me) more flexible and feature-rich than Trello. However, I am not particularly familiar with integrating either of them with GitHub. Is there a way to use any of these project management tools in an "open" manner that lets people view the current project status without signing up for an Asana/Trello/whatever account? At least with respect to content reviews and discussion, it is probably important to maintain this project's transparency.
Obviously, the immediate goal is to finish the initial publication. The next step is to identify and enumerate the specific maintenance tasks, especially those the current team needs the most help with. For long-term planning, it would also be useful to list any goals or problems that came up but were too ambitious or not pressing enough for the initial publication.
Thoughts?
@evancofer I believe you can make Asana projects 'public', but that only makes the project viewable to others in your organization who aren't necessarily team members, not to anyone in general.
Trello boards, on the other hand, can be made publicly viewable to anyone, and the project page will be indexed by Google. I do agree on the point about transparency -- to this end, I've worked on or seen projects that use some combination of GitHub, Trello, and Gitter: the code/repo is on GitHub, the (public) project management is handled in Trello, and community chat happens on Gitter. If that's too much added complexity, perhaps GitHub and Trello would be best.
@stephenra Trello and GitHub seem like a good solution without too much added complexity. I'm thinking we could use Trello to track maintenance tasks and keep discussion on GitHub (and continue to use issue labels and other features to track and organize).
@evancofer That sounds reasonable to me. :+1:
If you have not played around with http://waffle.io I would encourage you to give it a shot. I made a deep-review waffle. It is an overlay on github issues, so it's convenient to work with in this framework: https://waffle.io/greenelab/deep-review
At this stage, I think we really need 2-3 committed maintainers to develop a new plan, update the README with that plan, and then start to take over the project with the goal of cutting a new release at some point in 2019.
I went through all issues up to #100 and closed them if we had referenced the paper or if the discussion had concluded.
@stephenra @cgreene The waffle.io view on the project should work fine.
Like Casey said, we should probably find some more committed maintainers interested in long-term work if this is going to be successful. Contributors were obviously a good place to start, but I am unsure where to search next.
I'll get working on an update to the README and submit a PR sometime this evening. This will probably include a status update and a new section about the future of the project.
It might be nice to think about an authorship model where people "decay" towards the middle after a release. The current set of authors would become the "middlest set" of the next release (unless they contribute again), and new authors would bookend them. I'd imagine maintainers at the end, with the other author classes at the front.
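Purely as an illustration of that decay scheme, here is a minimal Python sketch (all names, categories, and the function itself are hypothetical placeholders, not anything the project actually uses):

```python
# Hypothetical sketch of the "decay toward the middle" author ordering.
# All names are placeholders; this is not the project's actual scheme.

def order_release(new_contributors, previous_authors, current_maintainers):
    """New contributors bookend the front, previous authors decay to the
    middle, and the current release's maintainers close the list."""
    return new_contributors + previous_authors + current_maintainers

v1_authors = ["alice", "bob", "carol"]  # full author list of the v1.0 release
v2_authors = order_release(
    new_contributors=["dave", "erin"],   # joined since v1.0
    previous_authors=v1_authors,         # the "middlest set" for v2.0
    current_maintainers=["frank"],       # maintainers of the current version
)
print(v2_authors)  # ['dave', 'erin', 'alice', 'bob', 'carol', 'frank']
```

Previous authors would stay on the record this way but drift toward the middle with each release they sit out.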
If people understand how these items will be handled, it might help to draw in new contributors. I'm also happy to promote the work towards a 2019 release, and I'll even commit to a bit of writing (though at this time I'd prefer not to be a maintainer 😄 ). It sounds like @evancofer and @stephenra might be interested. Maybe you could snag a third so that votes are resolved via tiebreak, although @agitter and I did survive the pairing.
It does seem prudent to get a third person. Most of the people that come to mind are in my current lab or department, so - out of fairness - I am somewhat hesitant to recommend any of them.
It may be best (in terms of ethics and effort) to, as you say, append them in a semi-randomized order. Perhaps we could do this at the end of every month (or some other period of time)? I imagine this could incentivize repeat contributions. Perhaps it would be useful to use semi-random hierarchical grouping again? Was manually determining author hierarchies time-consuming, or was it maintainable?
I agree that it is important to think about authorship, how new contributors will be recognized and incentivized, and what will happen to the existing contributors in a v2.0 release. We can break precedent with the v1.0 author ordering algorithm if that makes it easier to continue deep review in the long term. I wouldn't expect to be kept in my current position if new maintainers take over, and I do see myself more as a standard contributor than a core maintainer for the next release.
However, if you don't find a third maintainer, I'd be willing to help with tie-breaking in special circumstances.
> Was manually determining author hierarchies time-consuming, or was it maintainable?
We only did this twice, so it wasn't too onerous. We also kept the categories broad to help. It did require considerable manual effort because we reviewed commits as well as contributors' discussion in issues and pull requests. I was initially working toward fully automating the author randomization but stopped once Manubot became a separate project. The deep review author ordering was too specific to this collaborative process.
A fully automated ordering for Manubot should probably take some unordered author yaml with whatever extra metadata is needed for ordering, sort the authors, and pass the sorted list to Manubot as metadata.yaml.
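As a rough sketch of what that could look like (the `authors.yaml` schema, the `contribution` field, and the category names below are assumptions for illustration; Manubot's real metadata format and the project's ordering rules may differ):

```python
# Sketch: sort an unordered author yaml into an ordered metadata.yaml.
# The input schema ("name", "contribution") is hypothetical.
import random

import yaml  # PyYAML

with open("authors.yaml") as f:
    authors = yaml.safe_load(f)  # e.g. [{"name": ..., "contribution": ...}, ...]

# Group authors into broad contribution categories, then shuffle within
# each category using a fixed seed so the ordering is reproducible.
rng = random.Random(42)
category_order = ["writing", "analysis", "maintenance"]
ordered = []
for category in category_order:
    group = [a for a in authors if a.get("contribution") == category]
    rng.shuffle(group)
    ordered.extend(group)

# Anyone without a recognized category goes last, unshuffled.
ordered.extend(a for a in authors if a.get("contribution") not in category_order)

with open("metadata.yaml", "w") as f:
    yaml.safe_dump({"authors": ordered}, f, sort_keys=False)
```

Seeding the shuffle keeps the ordering deterministic across builds, which seems preferable for a versioned manuscript.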
I have made a new issue (#833) dedicated to the discussion of author ordering.
@cgreene Thanks for setting up the waffle.io board for the review. LGTM, as well. And yes, count me interested!
@evancofer @agitter I could reach out to some folks who have read the review but have yet to have any direct involvement to see if they're interested (looping in @austinvhuang in case he might have some suggestions for folks, too). Are there any hard 'requirements' as to who maintainers ought to be in terms of background or existing contributions to the project?
Ok - I did a bit more triage. If it's in the "Inbox" column, it hasn't been checked. If it's in "Backlog", it's a real item that has not been addressed (if it's a paper, it hasn't been cited). @evancofer and @stephenra: I think it makes sense for you two to feel free to close these issues at will if you don't immediately see a reason to cite them.
It may also be good to identify sections/themes that are not well covered (but that you wish were covered).
Finally, I've taken some issues and assigned them to myself to see if we can get those cleaned up.
@cgreene noted. It seems I cannot assign myself to issues (e.g. #598, #605). Anyhow, I will go through and find sections that were skimped on. I already know a few that I can add to significantly. I've opened #835 to cover this.