rootstock
Best way to get feedback from many people? Issues/PR vs Google Docs?
We tried out manubot for our manuscript and one part that required some manual work was receiving feedback from many people in a short time.
In our case we had finished a version of the manuscript and wanted to share it with people in the lab to get their feedback. It seemed easier (for them) to make a Google Doc version of the manuscript (through the docx output), get comments there, and then update the text in the repo manually. The comments ranged from fixing typos and rephrasing sentences to asking for clarifications. If each were its own GitHub issue, that would add up to a lot of issues. It would also make it difficult for a second reviewer to see the changes/comments of another person.
Do other manubot users use the pull request strategy even for these kinds of situations (many minor changes by many people at the same time)?
If so, it would be nice to add to the documentation a few links to some examples of these kinds of issues/PRs (e.g. from the deep review). For example, a PR where someone added a section that was reviewed by a few other people, an example where many people gave feedback on style/typos, an example of an issue that led to someone else improving a section, etc.
In #221 @slochower discusses the pros and cons of using Hypothesis for rapid, publicly visible, concurrent feedback on a draft. See especially his notes in https://github.com/manubot/rootstock/issues/221#issuecomment-494954571 about using groups or a versioned URL to help with stale comments.
Hypothesis may be the best choice for the workflow you described. I could imagine an alternative approach of collaborative pull requests for each section or subsection. Depending on the permission levels the readers/reviewers have on the manuscript repository, they could either all make small commits in the same pull request or suggest changes that a maintainer would accept. I can't think of an example where that type of pull request has been tried. In the deep review, our reviews were often asynchronous and completed over several days.
I'll keep thinking about other solutions as well.
Thanks, that looks promising. It still requires someone to go through them one by one but at least there's no need to convert the manuscript to docx.
I think next time we'll try a combination, sharing the HTML and repo, and having people use: Hypothesis for quick comments or clarifications, github issues for major points, and PRs when they feel they are going to change the text a lot.
Hi @jmonlong.
Do other manubot users use the pull request strategy even for these kinds of situations (many minor changes by many people at the same time)?
I've dealt with the combination of people providing marked up PDFs, filing issues, filing PRs, and highlighting via Hypothesis. I think I would recommend Hypothesis on a versioned URL of the manuscript as the easiest way for a group of people to comment on the manuscript. This way people can see each other's comments and your own replies, and I believe it also supports some rich text features. Handling PRs from people not familiar with GitHub requires that I ask permission to commit to their PRs before merging (or merge their PRs and do cleanup afterwards).
I think next time we'll try a combination, sharing the HTML and repo, and having people use: Hypothesis for quick comments or clarifications, github issues for major points, and PRs when they feel they are going to change the text a lot.
I think that's how I'd approach things, as well.
I could imagine an alternative approach of collaborative pull requests for each section or subsection
Maybe if we can use PR templates, this would be easier.
Handling PRs from people not familiar with GitHub requires that I ask permission to commit to their PRs before merging (or merge their PRs and do cleanup afterwards).
Could there be an alternative PR workflow? What if I as the main author create a new PR "Soliciting feedback on 02.main-text.md" and share the PR URL and manuscript with my collaborators. They make suggestions or comments anywhere in `02.main-text.md`. I can accept the suggestions or make revisions based on the comments as everyone edits. When the editing is done, I merge. If there are multiple Markdown files, create one PR per file.
One downside is that the rendered manuscript and Markdown would not remain synchronized. I'm also unsure how GitHub scopes the lines that are available for comments and suggestions in a PR. It looks like it is only the blocks of content that have been edited +/- a few.
A few of us could try out commenting and review ideas on https://github.com/manubot/try-manubot. We'd have to populate `02.main-text.md` first.
I think next time we'll try a combination, sharing the HTML and repo, and having people use: Hypothesis for quick comments or clarifications, github issues for major points, and PRs when they feel they are going to change the text a lot.
I think you are on the right track here. Using Hypothesis on a versioned URL (like this) allows everyone to comment on the same manuscript at the same point in time. I think this is similar to Google Docs with the difference that the comments are public and without "suggested changes". Note that currently the hypothesis comments on a versioned URL will only show up for that URL but this depends on https://github.com/manubot/rootstock/issues/203.
GitHub Issues are good for major points where substantive follow up discussion may be warranted.
PRs are good for major text changes but also for small text changes. This will likely depend on the user, but personally I prefer when my contributions are recorded in the commit history, regardless of how minor. Using the GitHub interface to make PRs is possible as shown in this video. I guess some users will want to make trivial PRs (ideal), while others may not want the hassle and would prefer to leave Hypothesis comments.
We have had many individuals propose PRs to the same manuscript concurrently in the PSB Manuscript. We minimized conflicts by creating a section in the markdown content for each potential contributor.
GitHub has bad behavior when users use the web GUI editor and commits to master are made while they're editing as described in https://github.com/dhimmel/psb-manuscript/pull/7#issuecomment-451666608:
I didn't merge any while we were going because of an issue with GitHub's pencil edit. People hadn't created their own fork/branch when they clicked pencil. Instead GitHub was taking the difference between their edited file and the current master branch when they saved those changes. So their PRs would end up deleting anything that had been committed in between when they initially clicked pencil and saved.
You may have tested this in https://github.com/jmonlong/manu-vgsv/pull/104. But in short, this often means it's best to wait until after everyone has submitted their edits via PRs to make any merges when having a concurrent editathon.
What if I as the main author create a new PR "Soliciting feedback on 02.main-text.md" and share the PR URL and manuscript with my collaborators.
This is another possibility but takes a bit more setup. Worth considering. Make a PR from `master` to a branch that has no content (possibly reset to an earlier commit). This will make every line of content reviewable with GitHub's PR interface (and new suggestions interface). Then suggestions can be accepted and revisions made, thereby updating the master branch. @agitter was that your idea?
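One way to set up such a comparison branch, sketched in Python as lists of git commands. The branch name `review-base` and both helper names are hypothetical. The sketch assumes the repository's first commit is an acceptable "earlier commit"; any file that has not changed since that commit would still not appear in the PR diff.

```python
def find_root_commit_command(branch="master"):
    """Command that prints the root commit(s) reachable from `branch`."""
    return ["git", "rev-list", "--max-parents=0", branch]


def review_branch_commands(base_commit, review_branch="review-base", remote="origin"):
    """Commands that create a branch at an early commit and publish it.

    `review_branch` is a hypothetical name. The PR itself is then opened on
    GitHub with `review_branch` as the base and master as the compare branch.
    """
    return [
        # Create the branch at the chosen early commit without checking it out.
        ["git", "branch", review_branch, base_commit],
        # Publish it so it can be selected as the PR base.
        ["git", "push", remote, review_branch],
    ]
```

Each command list can be passed to `subprocess.run(cmd, check=True)` inside a clone of the manuscript repository. Because master is the compare side of the PR, accepted suggestions would land on master directly, which may not work if the branch is protected.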
@dhimmel I didn't have any specific idea for how to make every line reviewable, yours is the first. I was thinking of adding comments with special characters between paragraphs (or at the end of every line) and then removing them before merging, but that's messy and error prone.
I think there is, as of yet, no one-size-fits-all approach. Generally, when I solicit feedback, I receive a mix of "here are my thoughts on such and such"-type comments and "here is an edit I want to make"-type comments. I don't think there is a good way to handle both with one single method. In the early round of my most recent manuscript, I used Hypothesis for comments, which are definitely good for the "here are my thoughts on such and such" pieces of feedback. But once I thought I had things nailed down, I asked for PRs. This works well, but then I ended up with PRs containing commentary like so:
[image: screenshot not preserved]
Obviously those comments are not things that I want to merge into master. I can start a discussion in the PR based on those comments -- which is good! -- but it would probably be even better if they didn't get incorporated into a commit first.
Thank you for sharing these insights about some people preferring Google Docs. Another disadvantage I see with Hypothesis is the friction of needing to sign-up for those who do not already have Hypothesis accounts. Tons of people have Google accounts. Few have Hypothesis accounts (relatively speaking).
Pandoc does impressive high-fidelity two-way conversion between markdown and docx. Useful would be a utility that manages the synchronization of a markdown file in git and a .docx file in Google drive.
I just did some quick tests and I can predict that two-way conversion will be very annoying for any non-trivial use of LaTeX. Simple inline math should be OK. It looks like basic text should work very well.
@agitter and @dhimmel, are you aware of any utilities that help manage the above synchronization?
Here are some resources that could be useful for anybody attempting such a utility:
For an easy partial solution: https://stackoverflow.com/a/63354000/2420027
For a fancier full solution: https://developers.google.com/drive/api/v3/reference/files/get https://developers.google.com/drive/api/v3/reference/files/update
I might attempt this.
I'm not aware of any such utilities, but I haven't looked. I'm tagging @rando2 because she has been interested in two-way conversion between markdown and docx for collaborative manuscripts that include less-technical writers.
@castedo do you know what docx comments look like when the docx is converted to markdown?
@castedo do you know what docx comments look like when the docx is converted to markdown?
In my test with pandoc, the comments were not converted and were simply dropped, and the suggested edits were automatically applied (to the output markdown).
In more words and detail: there were two kinds of changes I made in Google Docs into the .docx file: 1) a suggested edit and 2) just a comment to a highlighted span of text. Pandoc conversion to markdown dropped 2) and included 1) as though the suggested edit had been accepted.
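This matches pandoc's documented default for docx input. Pandoc's docx reader has a `--track-changes` option that takes `accept` (the default), `reject`, or `all`. With `all`, insertions, deletions, and comments are kept in the output as marked spans instead of being applied or dropped. A minimal sketch of building that command in Python follows; the function name and file names are placeholders, and this variant was not part of the test described above.

```python
def docx_to_markdown_command(docx_path, md_path, track_changes="all"):
    """Build a pandoc command that converts docx to markdown.

    track_changes: "accept" applies suggested edits (pandoc's default),
    "reject" discards them, "all" keeps edits and comments as marked spans.
    """
    if track_changes not in {"accept", "reject", "all"}:
        raise ValueError(f"unsupported track_changes value: {track_changes!r}")
    return [
        "pandoc", docx_path,
        "--from", "docx",
        "--to", "markdown",
        f"--track-changes={track_changes}",
        "--output", md_path,
    ]
```

The returned list can be handed to `subprocess.run(cmd, check=True)` on a machine with pandoc installed. The spans produced by `all` would still need to be stripped or resolved before the markdown is merged.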
Another possibility I think worth considering for more user-friendly feedback and participation mechanisms is some kind of integration with https://hedgedoc.org/. It looks like it supports multiple auth services: https://docs.hedgedoc.org/guides/auth/oauth/ including Google.
Thank you for tagging me @agitter! @castedo, this is a question I am very interested in. For the COVID-19 review (https://github.com/greenelab/covid19-review/), we have over 50 authors, mostly biologists and clinicians with limited GitHub experience. I've done a lot of copying edits over like you describe. We have identified two barriers for our audience:
- People need/want a more WYSIWYG style interface to write/edit in than a simple text editor
- The docx/track changes momentum is very hard to break
The former is probably fixable by getting people to adopt a markdown editor. The latter is more technically difficult. However, because docx stores everything in XML, it seems possible to automate the conversion of a docx file to a PR. For example:
is rendered in the document.xml file as:
Comments are also stored in comments.xml.
I think this *might* be possible for projects that use a one-sentence-per-line style (as we do on the COVID-19 project). I would really like to build a pipeline or at least experiment with converting these changes to markdown, but I haven't gotten to it yet.
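As a rough illustration of reading that XML directly, here is a minimal Python sketch using only the standard library. It is not the pipeline described above and does not reproduce the examples from the original post. It assumes tracked insertions and deletions appear as `w:ins` and `w:del` elements in `word/document.xml`, and comments as `w:comment` elements in `word/comments.xml`, as in WordprocessingML; the function names are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, used by document.xml and comments.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def _attr(elem, name):
    """Read a w:-namespaced attribute such as w:author."""
    return elem.get(f"{{{W}}}{name}")


def tracked_changes(docx):
    """Return (kind, author, text) for tracked insertions, then deletions.

    `docx` is a path or a binary file-like object. Results are grouped by
    kind, not returned in document order.
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    changes = []
    # Inserted text lives in w:t, deleted text in w:delText.
    for kind, tag, text_tag in (("insert", "ins", "t"), ("delete", "del", "delText")):
        for elem in root.iter(f"{{{W}}}{tag}"):
            text = "".join(t.text or "" for t in elem.iter(f"{{{W}}}{text_tag}"))
            if text:  # skip markers that carry no text, e.g. paragraph marks
                changes.append((kind, _attr(elem, "author"), text))
    return changes


def comments(docx):
    """Return (author, text) for each comment, or [] if there are none."""
    with zipfile.ZipFile(docx) as zf:
        if "word/comments.xml" not in zf.namelist():
            return []
        root = ET.fromstring(zf.read("word/comments.xml"))
    return [
        (_attr(c, "author"), "".join(t.text or "" for t in c.iter(f"{{{W}}}t")))
        for c in root.iter(f"{{{W}}}comment")
    ]
```

Mapping each change back to a line in the markdown source is the harder part and is not attempted here; a one-sentence-per-line style would presumably make that matching more tractable.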
@rando2 Thank you for sharing your experience here! I don't totally follow what you mean by docx/track change momentum being hard to break. Perhaps your thoughts on the following workflow will help me understand.
I am curious what pros and cons you see for the following workflow. This workflow could be completely automated, or mostly manual. Let's imagine that one of the files in a git repository is a markdown file called intro.md. It's the source for just one part of the overall document.
- `intro.md` is auto-formatted by pandoc for all commits so that it round-trips cleanly via pandoc (that is, `pandoc intro.md -t markdown` outputs the same file as `intro.md`)
- For each git commit on main that generates an online document, a new branch is created; let's imagine today it gets named `200502-docx-intro`.
- `intro.md` is converted via pandoc to a new .docx file, which is uploaded to Google Drive and then shared (as a new file).
- A "git hater" :-) can then make their edits via Google on that new Google Drive .docx file
- When the git hater is done, they request a pull request
- The Google .docx file is switched to read-only sharing, then converted back to markdown via pandoc; the result becomes a new commit on the `200502-docx-intro` branch, and a pull request is made for that branch.
Pandoc will treat all edits/comments as accepted when it converts back to markdown. The track changes feature of Google Docs is essentially not really used. Git is used for tracking changes.
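The local git and pandoc parts of this workflow could be sketched roughly as follows. This is an illustration under assumptions, not a tested tool: all function names are hypothetical, the branch name is assumed to follow a `yymmdd-docx-<file stem>` pattern based on the example above, and the Google Drive upload, sharing changes, and pull request creation are left out.

```python
import datetime
import subprocess
from pathlib import Path


def branch_name(day, stem):
    """Branch name in the assumed yymmdd-docx-<stem> pattern."""
    return f"{day:%y%m%d}-docx-{stem}"


def roundtrips_cleanly(md_path):
    """True if `pandoc FILE -t markdown` reproduces the file exactly.

    Requires pandoc to be installed and on the PATH.
    """
    result = subprocess.run(
        ["pandoc", str(md_path), "-t", "markdown"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout == Path(md_path).read_text()


def export_commands(md_path, day):
    """Commands to create the temporary branch and produce the .docx."""
    md = Path(md_path)
    return [
        ["git", "checkout", "-b", branch_name(day, md.stem)],
        ["pandoc", str(md), "-o", str(md.with_suffix(".docx"))],
    ]


def import_commands(md_path, day):
    """Commands to convert the edited .docx back and commit the result."""
    md = Path(md_path)
    docx = md.with_suffix(".docx")
    branch = branch_name(day, md.stem)
    return [
        ["git", "checkout", branch],
        # pandoc's default accepts tracked changes and drops comments.
        ["pandoc", str(docx), "-t", "markdown", "-o", str(md)],
        ["git", "add", str(md)],
        ["git", "commit", "-m", f"Import edits from {docx.name}"],
        ["git", "push", "-u", "origin", branch],
    ]
```

For example, `export_commands("intro.md", datetime.date(2020, 5, 2))` yields the commands for a branch named `200502-docx-intro`. Keeping the steps as plain command lists leaves room to run them by hand while the workflow is still being tried out.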
In your experience, what kind of pros and cons would this have?
Hi @castedo! My experience on this project has been that people feel most comfortable editing in track changes, so it's hard to convince them to adopt markdown-based approaches (or at least, it's a lot easier to engage them if you allow contributions via track changes!)
I am working on this on the side, and I'll update when I have a prototype to handle steps 4-5.
and I'll update when I have a prototype to handle steps 4-5.
Thanks! I'm very curious how parts of my theoretical suggestions hold up to reality. In my dreams they work great! :smile:
BTW, in hindsight, my statement
The track changes feature of Google Docs is essentially not really used. Git is used for tracking changes.
is too ambiguous. I realize now that "track changes" can be used: A) per new temporary branch or B) on an ongoing basis, across many merges and branches, with docx track change data getting synced with a git repository somehow.
B) sounds really really difficult, whereas I'm hoping A) is relatively easy.
Very much makes sense to use track changes ("Suggesting" in Google) in the above temp new branch with docx idea. Although the tracked changes only show up for that temp new branch. I hope git haters aren't terribly annoyed by tracked changes getting thrown away after one of these temp new docs branches is merged.