owid-grapher
Admin for managing Google Docs articles
(original description by @mlbrgl)
As part of the migration from Wordpress to google docs (#1597), we're replacing the Wordpress block editor with Google docs.
Beyond editing documents, Wordpress provides some non-editor features, such as publication workflows, media management, document management or category management.
Wordpress block editor ≡ Google docs
Wordpress admin ≡ ?
This issue aims at filling some of this gap, starting with document management. Other candidate CMS features of the current set-up can be found in the CMS glossary, which will serve as a basis for migration decisions around CMS features.
A possible approach has been developed in the https://github.com/owid/owid-show experiment and might be ported over to https://github.com/owid/owid-grapher. Some other approaches will be considered as well, in anticipation of the port of all remaining CMS features into this new foundation.
Cycle 2022.5 (Aug 29 -> Sep 16)
Wireframes, background work
Cycle 2022.6 (Sep 26 -> Oct 14)
Completed on branch, awaiting PR review
- [x] gdocs refresh flow
- [x] API routes
- [x] content checks
- [x] UI, dialogs
- [x] wiring with Matt's work
- [x] feedback rounds
- [x] integration of additional metadata (e.g. authors, date)
- [x] UX polishing, probably around document status and available actions
- [x] preview in new tab?
Merged and shipped ⏳
Additional scope
- [ ] integration with feed (and newsletter)
- [ ] integration with Algolia
Gsheet-based publication workflow
This is an exploration of using Gsheet as a replacement for Wordpress admin features, starting with the publication workflow.

An example spreadsheet for managing articles and their publication
sequenceDiagram
Author ->> Gdocs: write draft
Author ->> Gsheet: register article
Author ->> Gsheet: publish article
Gsheet ->> Gsheet: run appscript
Gsheet ->> Baker: send baking request w/ document ID
Baker ->> Gdocs: get document content
Gdocs -->> Baker: receive document content
Baker ->> Baker: process ArchieML
Baker ->> Grapher DB: save JSON
Baker ->> Baker: bake site
Authors write ArchieML documents in Gdocs, then list them with their metadata in the companion spreadsheet.
Conclusion
Gsheet as an admin for managing articles is not convincing. Favour tighter integration in a bespoke admin.
Rationale
With most of the article metadata pushed to the document frontmatter, there is little to no content left in this sheet. This makes the value of spreadsheet functions and scripting features equally limited. Versioning and commenting are probably among the most salient features spreadsheets offer (incidentally not specific to the format), but they fail to outweigh the benefit of tighter relationships between charts and articles (including referential integrity), whose absence is a visible and well-known limitation of the current system.
Pros and cons of Gsheet as an admin interface for articles
Roughly by order of importance
Pros
- versioning / rollback
- contextual (per cell) commenting + notifications
- little code / maintenance effort to benefit from spreadsheet features
- sheets can be enhanced through scripting (e.g. bulk operations)
Cons
- lack of native referential integrity for manual, non content graph, article <-> chart relationships (e.g. originURL, categories (if kept))
- wiring required between independent systems (gsheet and baker)
- filtering / searching article content and frontmatter not in the sheet requires API calls (to Algolia or Grapher)
- appscript DX?
- preferable to export to grapher DB in case of API unavailability
Admin screen for managing google docs based articles (lo-fi wireframe)
👉 Interactive prototype


Done with Lo-Fi Wireframe Kit
Wireframe update:
- replace "Published" toggle by button: not appropriate for tasks with delays or dialogs
- add latest publication date
- create a "danger zone" by colocating "unpublish" and "delete"
- swap "publish" for "update" when document already published
- remove inactive / mutually exclusive buttons, see decision diagram below:
graph LR
A{Is published} -->|Yes| D
A -->|No| B[fa:fa-cloud-upload-alt Publish]
D{Has updates} -->|Yes| C[fa:fa-redo-alt Update]
D{Has updates} --> F[Unpublish]
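
The decision diagram above can also be read as a small pure function. This is a minimal sketch, assuming the unlabelled edge from "Has updates" means that "Unpublish" stays available for any published document. `DocStatus`, `Action` and `availableActions` are hypothetical names, not taken from the codebase.

```typescript
// Hypothetical types for illustration; not the owid-grapher API.
type Action = "publish" | "update" | "unpublish"

interface DocStatus {
    isPublished: boolean
    hasUpdates: boolean
}

// Returns only the actions that make sense for the current status,
// so the UI never has to render inactive or mutually exclusive buttons.
function availableActions(status: DocStatus): Action[] {
    if (!status.isPublished) return ["publish"]
    // Assumption: a published document can always be unpublished,
    // and can be updated only when the gdoc has unpublished changes.
    return status.hasUpdates ? ["update", "unpublish"] : ["unpublish"]
}
```

With this shape, the "danger zone" can be populated from the same list by filtering for `"unpublish"`.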

Wireframe update:
- add confirmation dialog with content checks on publication and update actions
Wireframe update:
- move "danger zone" actions (unpublish, delete) behind a dropdown. Frequency of use doesn't require them to be always visible, making space for other more recurring actions.

- add publication settings screen. Will be triggered before initial publish, and when pressing "Settings". Not shown on updates (assuming unchanged settings always validate)

An argument for the metadata source of truth to be the admin DB
In order to simplify the publishing workflow for authors, we want the slug to be automatically derived from the title upon publishing
- When the title changes later on, the slug should remain unchanged to avoid creating broken links
- The slug can be manually changed if necessary
- When a slug is manually changed, we should create a redirect between the old and the new slug
Source of truth
With the slug edited in the admin, we can...
- check for uniqueness
- implement character validation
- regenerate it from the current title
... in a more straightforward way than if the slug was typed in as text in the document frontmatter. It also promotes a tighter feedback loop for errors or suggestions.
With the slug edited in the admin, authors can carry out all publication workflow related tasks in the same place, instead of going back and forth between the document (for editing metadata) and the admin (for validation).
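
The slug rules above could be sketched as follows. This is an illustration under stated assumptions, not the implementation: `slugify`, `validateSlug`, `publish`, `changeSlug`, `ArticleMeta` and `Redirect` are hypothetical names, the allowed character set (lowercase ASCII letters, digits, single dashes) is an assumption, and `isTaken` stands in for whatever uniqueness lookup the admin DB would provide.

```typescript
// Hypothetical shapes for illustration; not the owid-grapher API.
interface ArticleMeta {
    title: string
    slug?: string
}

interface Redirect {
    from: string
    to: string
}

// Derive a slug from a title: lowercase, runs of other characters become one dash.
function slugify(title: string): string {
    return title
        .toLowerCase()
        .replace(/[^a-z0-9]+/g, "-")
        .replace(/^-+|-+$/g, "")
}

// Character validation and uniqueness check, returning human-readable errors.
function validateSlug(slug: string, isTaken: (slug: string) => boolean): string[] {
    const errors: string[] = []
    if (!/^[a-z0-9]+(-[a-z0-9]+)*$/.test(slug)) errors.push("invalid characters")
    if (isTaken(slug)) errors.push("slug already in use")
    return errors
}

// On publish, the slug is derived from the title only if none exists yet,
// so later title changes never alter an existing slug.
function publish(article: ArticleMeta): ArticleMeta {
    return { ...article, slug: article.slug ?? slugify(article.title) }
}

// A manual change validates the new slug and records a redirect from the old one.
function changeSlug(
    article: ArticleMeta,
    newSlug: string,
    isTaken: (slug: string) => boolean
): { article: ArticleMeta; redirect?: Redirect } {
    const errors = validateSlug(newSlug, isTaken)
    if (errors.length > 0) throw new Error(errors.join("; "))
    const redirect =
        article.slug && article.slug !== newSlug
            ? { from: article.slug, to: newSlug }
            : undefined
    return { article: { ...article, slug: newSlug }, redirect }
}
```

Regenerating the slug from the current title would then be `changeSlug(article, slugify(article.title), isTaken)`, which reuses the same validation and redirect handling.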
Status update (Tues 13th)
With the biggest part of the UI / UX work done on Admin for managing Google Docs articles, I'm moving more confidently on the implementation of the basic publication flow this week, including:
- gdocs refresh flow
- API routes
- content checks
- UI, dialogs
- wiring with Matt's work
I'm aiming to get the happy path handled by the end of the week, to run through a first feedback cycle during cooldown. The rest will most likely span over 2/3 of the next cycle, giving one (maybe two) additional opportunities for feedback. Some of the "source of truth" discussions haven't fully landed yet. This brings a layer of uncertainty which we should account for.
Status update (Sep 14)
A demo of the (now abandoned) early implementation of the wireframes above (commit from Sep 14)
https://user-images.githubusercontent.com/13406362/191674624-f9c8c201-a728-4690-baf5-ef98a13428f6.mp4

Feedback from individual sessions with @larsyencken (Sept 20) and @eoo-owid (Sept 22):
- in draft mode, the "Publish" button might be too prominent for the frequency at which it will be used. In that sense, it is not a primary action, but "Save draft" is. That prominence might also attract accidental clicks. This suggests that it should be managed, either by making the button look more "dangerous" or by adding a confirmation dialog.
- given how prominent the title is, it might be desirable to invite people to edit it as they preview the document for the first time
- it is confusing that the slug error disappears when opening the settings, without any changes being made (the slug is automatically filled from the title if none present, but only when opening the settings)
- comments and suggestions are not visible in preview: that might be important information to surface when deciding to go ahead with publishing. A more restrictive version of the workflow could prevent publishing when the source document has unresolved comments or suggestions
- if A edits and previews at the top of a document and doesn't notice B is also editing the document, A might go ahead and publish B's draft changes without knowing
- surfacing the changes that are about to be published would help with the problem above, but is also probably a valuable sanity check. Content diffs are already available in google docs, but there is no indication of which version is published. Named versions could help but are manual
- additive suggestions are being considered part of the text and transparently integrated: this is probably not desired. Subtractive ones are being ignored, which is less of an issue, but should still be surfaced (see above)
- title of the google docs file vs title of the article: should these be synced? In case the title of the document changes later on, the difference might create confusion. To note, the google docs title might currently be used for metadata, e.g. with prefixes ("-final", "-v3").
- we are creating two independent organization systems: google drive (team and/or private), and the CMS. How will that scale? What is the reference?
- status update following demo
- demo slides
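
On the feedback about suggestions being silently merged into the text: the Google Docs API marks suggested text runs with `suggestedInsertionIds` and `suggestedDeletionIds`. A minimal sketch of how pending suggestions could be excluded and counted, assuming the runs have already been extracted from the document; `TextRunLike` and `splitSuggestions` are hypothetical names, and the interface only mirrors the fields used here.

```typescript
// Minimal local mirror of the fields this sketch reads from a Google Docs text run.
interface TextRunLike {
    content: string
    suggestedInsertionIds?: string[]
    suggestedDeletionIds?: string[]
}

interface SuggestionSummary {
    text: string // content without pending insertions
    pendingInsertions: number
    pendingDeletions: number
}

// Keeps the text as it stands without accepting any suggestion:
// suggested insertions are dropped, suggested deletions are kept,
// and both are counted so the admin can surface them before publishing.
function splitSuggestions(runs: TextRunLike[]): SuggestionSummary {
    const kept: string[] = []
    let pendingInsertions = 0
    let pendingDeletions = 0
    for (const run of runs) {
        const isInsertion = (run.suggestedInsertionIds ?? []).length > 0
        const isDeletion = (run.suggestedDeletionIds ?? []).length > 0
        if (isInsertion) pendingInsertions++
        if (isDeletion) pendingDeletions++
        if (!isInsertion) kept.push(run.content)
    }
    return { text: kept.join(""), pendingInsertions, pendingDeletions }
}
```

A stricter workflow could refuse to publish whenever either count is non-zero.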
Status update Thu 29

Content metadata vs publication metadata
After experimenting with editing the title in the admin, I'm bringing it back to the google doc. The split is roughly:
- content metadata (e.g. title, subtitle) --> google doc
- publication metadata (e.g. slug, publication context) --> admin
By keeping the content metadata in the google doc, authors get to save more complete drafts of published articles, including the content metadata. Otherwise, when the content metadata is saved in the admin, authors have to publish straightaway or discard their changes - as changes to published articles cannot be saved. Keeping content metadata in the google doc might also make it easier to leave comments related to these fields, although this has been deemed a minor concern thanks to the :skip section.
The slug will remain in the admin.
Another dimension for these source of truth considerations is the complexity of the data being entered. Titles are simple strings but authors or categories might benefit from a tighter integration in the admin, to provide a better editing UX as well as enforcing referential integrity between e.g. articles and their categories. As an example, this would prevent deleting a category used by an article and also allow for renaming categories across all articles.
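
The category example can be made concrete with a small sketch. It assumes articles reference categories by id rather than by name; `Category`, `ArticleRef`, `deleteCategory` and `renameCategory` are hypothetical names, and in practice a foreign key constraint in the DB would likely enforce the same rule.

```typescript
// Hypothetical shapes for illustration; not the owid-grapher schema.
interface Category {
    id: number
    name: string
}

interface ArticleRef {
    slug: string
    categoryIds: number[]
}

// Refuses to delete a category that is still referenced by an article.
function deleteCategory(
    categories: Category[],
    articles: ArticleRef[],
    id: number
): Category[] {
    const users = articles.filter((a) => a.categoryIds.indexOf(id) !== -1)
    if (users.length > 0) {
        const slugs = users.map((a) => a.slug).join(", ")
        throw new Error(`Category ${id} is still used by: ${slugs}`)
    }
    return categories.filter((c) => c.id !== id)
}

// Because articles hold ids, a rename is a single change that applies everywhere.
function renameCategory(categories: Category[], id: number, name: string): Category[] {
    return categories.map((c) => (c.id === id ? { ...c, name } : c))
}
```

With category names typed as free text in the document frontmatter, neither guarantee would hold without scanning every document.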
Merging https://github.com/owid/doc-to-archieml with codebase
- easier to debug
- in-line with codebase coding style + typescript
- as it was shipping with OWID-specific code (e.g. {ref} handling), it looked less like a reusable library
Status update on current cycle scope - Oct 7
For @larsyencken, @mathisonian
- [x] gdocs refresh flow
- [x] API routes
- [x] content checks
- [x] UI, dialogs
nice to have
Notify when leaving the page with unsaved edits in non-gdocs metadata. Gdocs metadata is saved either way in gdocs, so nothing is lost if the preview is closed with unpublished changes - which is not the case for non-gdocs metadata.
- [ ] wiring with Matt's work Baker and Wordpress integration operational, but no publish on change yet
- [x] feedback rounds 2 formal meetings to date with Joe and Mallika (and Hannah for one of them), in addition to informal feedback from other team members (see above). Recurring meetings set up on Tuesday PM.
- [ ] integration of additional metadata (e.g. authors, date) Dateline, publication date, byline, excerpt added. Missing publication context.
- [x] UX polishing, probably around document status and available actions
- [x] preview in new tab?
Thanks for the update Matthieu, looks like it's coming together.
Can you please update the issue summary at the top, and perhaps reorganise the checklist into things that are completed on-branch vs things that have passed review and shipped?
For this project, it would probably look like this:
- [ ] Prepared PR for review
- [ ] Feature 1
- [ ] Feature 2
- [ ] ...
- [ ] Merged and shipped
Status update - Oct 13
- [x] wiring with Matt's work (see lightning update announcement)

This was the last remaining item in scope. PR opened: https://github.com/owid/owid-grapher/pull/1703