
tools to facilitate editorial tasks + discussion about yaml fields & naming images

Open rivaquiroga opened this issue 2 years ago • 3 comments

Hi, everyone!

The Spanish team is currently working on creating tools to facilitate our editorial tasks. While developing these tools, we realized that there are some parts of our workflow that we need to standardize. The aim of this ticket is to share what we have been working on, and to start (or continue) the discussion around some parts of our workflow.

The tools

They are all in their first iteration, so any suggestions on how we can improve them are very welcome.

  • @nivaca wrote a script to rename images of original lessons when they do not comply with our standards (lesson-slug01, lesson-slug02, etc.): https://github.com/nivaca/phimages.
  • I developed an R package that does the same: https://github.com/rivaquiroga/phtools. It also has a function to check if there are broken links in the lesson.
  • With @nivaca we are currently working on a YAML validator: https://github.com/rivaquiroga/phyaml.
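For readers who want a feel for what the renaming step involves, here is a minimal sketch in Python. It is not the code of phimages or phtools. The function name `plan_image_renames` and the `pad` parameter are made up for illustration, and it assumes the filenames are passed in the order in which the images appear in the lesson. It only builds a rename plan, so nothing on disk is touched.

```python
from pathlib import PurePosixPath


def plan_image_renames(filenames, lesson_slug, pad=2):
    """Map old image names to <lesson_slug><number><ext>.

    Hypothetical helper: `filenames` must already be in order of
    appearance in the lesson. `pad` is the zero-padding width
    (2 gives lesson-slug01, 1 gives lesson-slug1).
    """
    plan = {}
    for index, name in enumerate(filenames, start=1):
        ext = PurePosixPath(name).suffix.lower()
        new_name = f"{lesson_slug}{index:0{pad}d}{ext}"
        if new_name != name:  # skip files that already comply
            plan[name] = new_name
    return plan


if __name__ == "__main__":
    for old, new in plan_image_renames(["fig_a.PNG", "chart.jpg"], "my-lesson").items():
        print(f"{old} -> {new}")
```

Returning a plan instead of renaming directly leaves room for an editor to review the mapping, and to update the image references in the lesson file, before any file is moved.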

Things we need to discuss

1. Older lessons are not using our current naming pattern for images

The naming convention for images (lesson-slug1.png, lesson-slug2.png, lesson-slug3.png, etc.) became our standard when the EN author guidelines were updated. This means that all the lessons published before that moment have images named in very different ways. This is an issue for translations (and is confusing for translators), because you have to deal with two different naming conventions. It might be a good idea to update the image names of all the lessons :grimacing:.

2. We haven't officially agreed on how to name translated images

When translating a lesson, we sometimes change some of the images of the original lesson (e.g., when the tool UI has a version in the translated language, when we remake plots with a new/translated dataset, etc.).

There is no official convention on how to name the translated images. In the ES team, we have followed the convention used for original lessons, so a translated lesson can contain two naming patterns: original-lesson-slug1.png and translated-lesson-slug1.png. We store the translated images in the same folder as the ones from the original lesson. What have the FR and PT teams done in similar cases? It would be great if we could agree on the best way to solve this (also because it would allow us to use the tools to check how images are named when editing translated lessons).

3. Required YAML fields

The YAML validator seems to be useful at two points: before uploading the file to ph-submissions, and before moving the file to jekyll.

YAML fields of lessons that are starting the editorial workflow:

This is a question for @programminghistorian/technical-team: what are the required fields for a lesson to be rendered properly on ph-submissions? We are not sure which are all the fields we need the validator to check in this initial step. Our lesson templates have a lot of stuff, but not everything is needed at the beginning.

Lessons ready to be published:

It looks like we have 5 possible scenarios regarding YAML headers:

  • Original lesson
  • Translated lesson
  • Original lesson that is part of a sequence
  • Translated lesson that is part of a sequence
  • Any of the above that uses formulas.

Those are all the possibilities, right?
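One way to picture how a validator could handle these scenarios is as a base set of fields plus one extra set per scenario. The sketch below is an illustration only, not the code of phyaml. Every field name in it is a placeholder assumption, because the actual list of required fields is the open question of this ticket. It also assumes the YAML header has already been parsed into a Python dict.

```python
# All field names below are placeholders for illustration,
# NOT a confirmed list of required fields.
BASE_FIELDS = {"title", "layout", "authors", "editors", "reviewers", "date", "abstract"}
TRANSLATION_FIELDS = {"translator", "translation_date", "original"}
SEQUENCE_FIELDS = {"sequence", "series_total"}
FORMULA_FIELDS = {"mathjax"}


def required_fields(translated=False, in_sequence=False, uses_formulas=False):
    """Combine the base fields with the extra fields of each scenario."""
    fields = set(BASE_FIELDS)
    if translated:
        fields |= TRANSLATION_FIELDS
    if in_sequence:
        fields |= SEQUENCE_FIELDS
    if uses_formulas:
        fields |= FORMULA_FIELDS
    return fields


def missing_fields(header, **scenario):
    """Return the required fields absent from an already-parsed header dict."""
    return sorted(required_fields(**scenario) - set(header))


if __name__ == "__main__":
    header = {"title": "Example", "layout": "lesson", "authors": ["A. Author"]}
    print(missing_fields(header, translated=True))
```

Treating the scenarios as independent flags covers the four lesson types and the formulas case without listing every combination by hand.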

rivaquiroga avatar Nov 06 '21 16:11 rivaquiroga

Hi Riva, on point 2: here is some information on the FR experience. In the very first case, which remains unique, we followed the naming logic already used in the file for the EN and ES images, so that ours would be consistent with the rest. You can see the file here and the folder of the lesson here.

In the following lessons we just kept the EN file name and added -fr at the end. You can see how this looks in the file of the lesson on Jupyter notebooks or on Twitter bots.

Regarding the pattern you propose for images of translated lessons, just one observation: since often only some of the original images are remade for the translated lesson, care should be taken to keep the original sequence number in the file name. So if, say, only image 2 of the original is provided in a translated version, that image should be named translated-lesson-slug2 even if it is the first translated image in the lesson (so that we can easily see that it corresponds to image 2 of the original).
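The rule of keeping the original number can be sketched as a small Python function. This is an illustration under assumptions: the function name `translated_image_name` is made up, and it presumes the original image already follows the slug-plus-number pattern.

```python
import re


def translated_image_name(original_name, original_slug, translated_slug):
    """Swap the slug but keep the original sequence number and extension.

    Hypothetical helper: expects names shaped like <original_slug><number>.<ext>.
    """
    pattern = re.escape(original_slug) + r"(\d+)(\.\w+)"
    match = re.fullmatch(pattern, original_name)
    if match is None:
        raise ValueError(f"{original_name!r} does not follow the {original_slug}<number> pattern")
    number, ext = match.groups()
    return f"{translated_slug}{number}{ext}"


if __name__ == "__main__":
    print(translated_image_name("original-lesson-slug2.png",
                                "original-lesson-slug",
                                "translated-lesson-slug"))
```

Because the number is copied as-is, any zero-padding in the original name is preserved.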

On point 1: We do not have a standard currently (we have the file names the authors gave), but we can apply the instructions of the new guidelines (we just translated them) to the new original lessons we will receive. Otherwise, I would not be in favor of renaming image files of the lessons preceding the new guidelines. This could be tricky and have perhaps unexpected consequences to deal with.

spapastamkou avatar Nov 15 '21 22:11 spapastamkou

Hello @rivaquiroga. Thank you for making time to share these ideas with me yesterday!

As we discussed, it would be interesting to consider an extension of these tools to support typesetting across all four languages. The aspects of typesetting work which I think could be automated are:

  • Numbers: ensuring that numbers one to nine are written as words, and that numerals are used for 10 and above
  • Headings: ensuring that header 1 is not used (because this is reserved for the title). We could also make a decision to ensure consistency at other heading levels, for example, header 2 for sections, header 3 for sub-sections, and header 4 not used
  • Lists: ensuring that bulleted or numbered list items each start with a capital letter, and that they are not followed by a full stop/period or comma
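The heading and list checks lend themselves to a simple line-by-line pass. Below is a minimal Python sketch, assuming the lesson is Markdown and that fenced code blocks are delimited by triple backticks. The function name `typesetting_warnings` and the message wording are made up for illustration. The numbers rule is left out, since telling prose numbers apart from dates, versions, and quantities probably needs human judgement.

```python
import re

# A bulleted ("-", "*", "+") or numbered ("1.") list item; group 1 is its text.
LIST_ITEM = re.compile(r"^\s*(?:[-*+]|\d+\.)\s+(.*\S)\s*$")


def typesetting_warnings(markdown_text):
    """Return (line_number, message) pairs for heading and list issues."""
    warnings = []
    in_code = False
    for number, line in enumerate(markdown_text.splitlines(), start=1):
        if line.lstrip().startswith("```"):
            in_code = not in_code  # entering or leaving a fenced code block
            continue
        if in_code:
            continue  # "# comment" inside code is not a heading
        if re.match(r"^#\s", line):
            warnings.append((number, "header 1 is reserved for the title"))
        item = LIST_ITEM.match(line)
        if item:
            text = item.group(1)
            if text[0].isalpha() and not text[0].isupper():
                warnings.append((number, "list item should start with a capital letter"))
            if text.endswith((".", ",")):
                warnings.append((number, "list item should not end with a full stop or comma"))
    return warnings


if __name__ == "__main__":
    sample = "# Title\n\n- first item.\n- Second item\n"
    for number, message in typesetting_warnings(sample):
        print(f"line {number}: {message}")
```

The sketch only reports problems. Whether to fix them automatically is a separate decision, since a list item that ends in a full stop may be a complete sentence that an editor wants to keep.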

We also spoke about a further use-case for a similar tool to support the generation of archival links. You explained that one of the tools you are working on includes a function to check if there are broken links in the lesson. This could automate the first step of the Perma.cc workflow I have been working on. In addition, the script could:

  • output a list of links* (one URL per line) to facilitate the batch generation of Perma links
  • help us to replace the original links in the .md file with the new Perma links

*I think this would need to exclude links that are included within code. There are also specific instances where we do not want an archival link, e.g., if we are directing readers to download some software, so some human decisions/checking may still be needed.
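As a rough illustration of the first step, the Python sketch below collects the links in a Markdown lesson, one URL per line, while skipping fenced code blocks and inline code. It is not taken from phtools. The name `extract_links` is made up, and the URL pattern is a deliberate simplification: it stops at brackets and parentheses, so a URL that itself contains parentheses would be cut short.

```python
import re

FENCE = re.compile(r"^\s*(```|~~~)")          # start or end of a fenced code block
INLINE_CODE = re.compile(r"`[^`]*`")          # `inline code` spans
URL = re.compile(r"https?://[^\s<>()\[\]\"'`]+")


def extract_links(markdown_text):
    """Return the unique URLs found outside code, in order of appearance."""
    links = []
    in_fence = False
    for line in markdown_text.splitlines():
        if FENCE.match(line):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        prose = INLINE_CODE.sub("", line)      # drop inline code before searching
        for url in URL.findall(prose):
            url = url.rstrip(".,;:!?")         # trailing sentence punctuation
            if url not in links:
                links.append(url)
    return links


if __name__ == "__main__":
    sample = "See [docs](https://example.org/a) and https://example.org/b.\n"
    print("\n".join(extract_links(sample)))
```

The printed list still needs a human pass to remove links that should not be archived, such as software download pages.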

@ZoeLeBlanc expressed interest in your & @nivaca's work at yesterday's Tech Team meeting. She wonders if eventually the tools you've created could be integrated into checks that the submissions repository could run on new uploads.

anisa-hawes avatar Dec 16 '21 12:12 anisa-hawes

Thanks for tagging me @anisa-hawes and thanks @rivaquiroga and @nivaca for all your hard work with these tools!

If you have time in the new year, it would be good to chat about long-term plans for these tools (should we migrate them to PH repos, for example?) and whether we could incorporate them more directly into the ph-submissions build pipeline so editors wouldn't have to download them separately (similar to our Jekyll site, and likely with some sort of Ruby make script to run them). Let me know your availability and thanks again for all your hard work on these!

ZoeLeBlanc avatar Dec 16 '21 15:12 ZoeLeBlanc

  • The questions raised in this Issue on the subject of naming images are resolved by an Image Naming Convention (#2578). This page represents part of a series of guideline documents I am developing to support our publishing team with technical aspects of lesson set-up and file management.

  • The difficulties rendering Lesson Previews can be lessened/resolved by a proposed standardisation and simplification of our editorial workflow (#2760), in which I/the publishing team will take care of Phase 1 Submission (from uploading new lesson materials, renaming files and images, to setting up the lesson file, and creating YAML metadata).

--

Meanwhile:

  • The opportunity to collaborate on an automated tool or script to support the generation of archival links could still be welcome, and I propose that Riva and I discuss this again at our forthcoming meeting to decide whether to prioritise it now or review it again at a later date. The script could:

    • output a list of links* (one URL per line) to facilitate the batch generation of Perma links
    • automate the replacement of original links in the .md file with the new Perma links

As noted above, this would need to exclude links that are included within code blocks. There are also specific instances where we do not want an archival link, e.g., if we are directing readers to download software or access interactive content, so some human decisions/checking will still be necessary.
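To make the replacement step concrete, here is a minimal Python sketch under stated assumptions. It takes a mapping from original URLs to archival URLs that has been prepared beforehand (the mapping, the function name `replace_links`, and the example Perma-style addresses are all made up for illustration, and no Perma.cc API is called). Links inside fenced code blocks and inline code are left untouched, and any link deliberately left out of the mapping stays as it is.

```python
import re

FENCE = re.compile(r"^\s*(```|~~~)")            # start or end of a fenced code block
INLINE_CODE_SPLIT = re.compile(r"(`[^`]*`)")    # capture group keeps code spans when splitting
URL = re.compile(r"https?://[^\s<>()\[\]\"'`]+")


def replace_links(markdown_text, archival_links):
    """Swap URLs found in `archival_links` (original -> archival) outside code."""

    def swap(match):
        raw = match.group(0)
        url = raw.rstrip(".,;:!?")               # keep sentence punctuation out of the lookup
        tail = raw[len(url):]
        return archival_links.get(url, url) + tail

    out = []
    in_fence = False
    for line in markdown_text.splitlines(keepends=True):
        if FENCE.match(line):
            in_fence = not in_fence
            out.append(line)
            continue
        if in_fence:
            out.append(line)
            continue
        parts = INLINE_CODE_SPLIT.split(line)    # even indexes are prose, odd are inline code
        for i in range(0, len(parts), 2):
            parts[i] = URL.sub(swap, parts[i])
        out.append("".join(parts))
    return "".join(out)


if __name__ == "__main__":
    mapping = {"https://example.org/a": "https://perma.cc/AAAA-0001"}  # made-up values
    print(replace_links("Read [this](https://example.org/a).\n", mapping))
```

Looking up whole URLs, instead of doing a plain text search and replace, avoids corrupting a longer link that happens to begin with a shorter one.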

anisa-hawes avatar Apr 28 '23 09:04 anisa-hawes