Technical Infrastructure Renewal - update and next steps
I write as Chair of ProgHist Ltd with an update on the work to develop our infrastructure. Apologies for the long message. I've tried to be as brief as I can but I hope you appreciate this is a complex and important topic.
As you will be aware, last year we secured funds from Jisc and the Corporation for Digital Scholarship to gather community needs, to turn those into requirements, and to use that work as a basis for making a decision about how best to transition to a new publishing infrastructure. These funds were secured on the basis that our current infrastructure, whilst well loved by many in our community, is no longer fit for purpose: it is built on a static-site generator (Jekyll) that is no longer in active development and the complexity of our site has resulted in build times (7-8 mins) that both discourage us from using it as a platform to develop DH skills (which it was earlier in the project) and slow down our editorial work.
Many of you participated in requirements gathering. Thank you for doing so, and to Anisa for co-coordinating both the requirements gathering and synthesis. The resulting report and recommendations based on this work can be found here 2025_01-ph-tech-phase-1-requirements-recommendations.pdf.
Over the last month or so this has been boiled down to the following requirements:
1. Our infrastructure needs simplified navigation, crisper typography, advanced search, the ability to facilitate offline use, and structured article metadata for harvesting by aggregators.
2. Our Publishing Manager should coordinate future site development (drawing on recommendations in the Technical Infrastructure Renewal Phase 1 Report) in the form of community development sprints.
3. A new technical infrastructure is needed for our site.
On the third of these (a new technical infrastructure), two options have emerged. The first is building a new custom site using a modern static-site generator, such as Hugo. This gives us maximum flexibility for development and customisation, and will mean retaining our existing publishing workflows (submission, peer-review, etc), but it also introduces a risk to continuity of service unless we mutualise development expertise across our editorial community. The second is moving to an existing publishing platform, such as Janeway. This gives us less flexibility for development and customisation (though much more than Janeway did when last evaluated 4-5 years ago) and would shift our workflows closer to a 'normal' journal, but with the benefit of uptime, plugins, security etc being taken care of for us and a secure foundation on which to build our infrastructure as digital research/humanities evolves.
Anisa and I have evaluated the options and are leaning towards recommending Janeway. The costs here would be small compared with our running costs (publishing staff, specialist translators/copyeditors, services such as Netlify, etc). Janeway would not, of course, replace our investment in publishing staff and specialist translators/copyeditors: we would still need them. In return we'd get access to support transitioning our back catalogue to a new infrastructure, a partner committed to championing open infrastructure and diamond OA publishing, and a development platform (built in Django/Python) that provides our community with opportunities to contribute to a shared open infrastructure, both for the benefit of our community and for the benefit of all Janeway users.
Based on benchmarking against the way we operate right now, Janeway can more or less accommodate all our publishing needs (e.g. quirks we have around article difficulty and types of contributor role), with a few areas needing more discussion (principally, modification of articles). Moving to Janeway would also significantly change our publishing workflows, but it would not require a change to the values that underpin those workflows: open, efficient, documented, scalable, accessible, and sustainable publishing in and for the digital humanities.
After so many years finessing how we work, any change would be a big change. In turn this would require significant investment in time and energy to manage and make happen (the remaining Jisc and Corporation for Digital Scholarship funds would be invested here to mitigate impact on editorial teams). But I hope you all agree that something has to be done, that stasis isn't a viable option. Whilst Anisa and I are leaning towards Janeway, no decision has been made and your views are vital. What I would like to know from you now is how this proposal lands with you, the questions you have, and the questions you anticipate will arise. From there, Anisa and I can build a picture of community sentiment and use that as a basis for putting forward a formal recommendation.
FAQ
Based on initial feedback and questions, I provide some additional notes and points of detail below:
Do we have an estimated cost for importing the back-catalogue?
Based on an initial conversation with Janeway, support for that would be included. We should also be able to arrange for a test instance to be created with a handful of articles, so we can see the import before we make any decision.
Has anyone reviewed their current capabilities against PH requirements?
Yes. I have a list of PH requirements that I went through with a member of the Janeway team. Only versioning/modification came up as a slight mismatch between our way of working and Janeway as an architecture. That said, they have versioning in their pre-print server architecture, so we should be able to reuse that. I got the strong sense from speaking to the Janeway team that if partners have development needs that broader communities will benefit from, they get added to the roadmap. For example, Janeway are implementing CRediT at the moment in a way that means contributor roles not in that taxonomy, which we’d need, can be implemented. And, to reiterate the point, because Janeway is a development platform (built in Django/Python), as things come up that we need, it provides opportunities for our community to contribute to an open community infrastructure, both for our benefit and the benefit of all Janeway users (rather than just directing that effort at our own infrastructure).
Will Janeway count PH as a single title or one per language? Will a cost-per-title change PH future planning?
Subject to negotiation, we will be one publisher with four journals. We have aligned values with Janeway (some of you may have seen Open Journals Collective launch this week, which is spearheaded by the same people), share some supporters with OLH, and have a professional publishing staff that would handle author, editor, and peer reviewer relations (which is in contrast to how Janeway typically work with 'just' journals), which taken together means I'm hopeful we can have a sensible conversation about price. I will be clear in those conversations that we have ambitions to host more language journals that publish under the PH banner, but that that has been limited by our existing infrastructure rather than ambition.
Have we investigated Janeway's financial sustainability?
We would of course ask and investigate this before any decision is made or contract signed. My understanding is that Janeway is intimately connected to Open Library of Humanities. OLH has over 300 geographically dispersed supporters, who invest in their work in a way comparable to our IPPs.
How would moving to a publishing platform (rather than building our own) change PH's identity as a "brand"? Will this mean that we are just a publisher of journals rather than a project?
We’d need to pay attention to this if we moved to a journal-centric technical infrastructure like Janeway. We are a project and a community not 'just' a publisher of journals. Moving to a service like Janeway will not mean removing and/or replacing our wonderful staff. They will still have plenty to do and we will continue to employ them so long as we have the IPP/financial backing to do so. Freeing our staff time from technical troubleshooting will, I hope, give us more capacity to build, maintain, and support our community of editors, authors, peer reviewers, translators, copyeditors, maintainers, etc. This - IMO - is a better investment in the DH community than time spent on technical troubleshooting.
What would happen to our publications if/when Programming Historian is no longer active? What happens if we run out of money to pay Janeway for a hosted service?
I will ask, though note that Janeway is open source so can be self-hosted if needed. In this scenario we revert to an entirely volunteer-run organisation. At present, this is a worst-case scenario. It is, of course, something that the Trustees of ProgHist Ltd - the publisher that underpins PH - are working hard to avoid: nobody wants to lose our wonderful staff. And we welcome ideas and suggestions from all members of the community on income sources and partnerships that can sustain our work.
In an important sense, this infrastructure proposal is also about sustaining our future: the scenario that keeps me up at night is that the site breaks in some way, it goes down or we can't publish any edits/changes/lessons, we don't have the technical capability to get it back up (because we've largely stopped developing the site we built for reasons described above), our supporters stop investing in us because we are no longer able to publish, and then we are no longer able to pay our staff. Janeway (it seems) takes that risk away, even if it adds the risk of having some bills to pay for a hosted service. But if we can’t pay that bill, then a lot would have already gone wrong.
Are there examples of Janeway websites we could look at?
Yes. The best way to find examples is to browse their news feed, which is hosted on Open Library of Humanities. For example, UCL and Leiden have made substantial investments in moving their journals to Janeway.
PH basic requirements:
- PH publishes four journals.
- PH operates in four languages: Spanish, French, Portuguese, and English (with the ambition to expand to right-to-left languages).
- PH journal front-ends are in the language that matches their publications, as - ideally - should the publishing workflow be.
- PH's current publishing estate is over 250 articles: this includes both articles published in the language submitted by the author and translated/localised articles.
- PH interface has pathways to move between articles translated/localised between publications (e.g. https://programminghistorian.org/en/lessons/preserving-your-research-data)
- PH uses open peer review.
- PH has a DOI supplier.
- PH uses ORCID.
- PH uses perma.cc for links.
- PH articles are available in an open/FAIR format (e.g. markdown)
- PH has a repository that hosts data used in articles.
- PH uses a taxonomy of author roles and credit (e.g. editors, reviewers, translators) that expands on the Contributor Role Taxonomy (CRediT).
- PH maintains contributor lists
- PH articles include difficulty levels in their metadata.
- PH articles are updated when links break or processes need revision. Last modified dates are included in article metadata. We have a policy on what constitutes an edit that triggers a new DOI.
- PH articles are retired when they are no longer functional. They remain published with a banner explaining the decision. See Lesson Retirement Policy.
- PH articles include code blocks. None of that code is runnable within the article.
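To make the metadata points above a little more concrete, here is a minimal Python sketch of how difficulty, last-modified date, and retirement status might be read from the front matter of a Markdown article. The field names (`title`, `difficulty`, `modified`, `retired`) and the `LessonMetadata` class are assumptions for illustration only. They are not a description of our actual files or of anything Janeway provides, and the parser only handles flat `key: value` lines rather than full YAML.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class LessonMetadata:
    # Hypothetical fields, chosen to mirror the requirements list above.
    title: str
    difficulty: int
    modified: Optional[date]
    retired: bool


def parse_front_matter(markdown_text: str) -> dict:
    """Return flat `key: value` pairs found between the leading `---` fences."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter block at the top of the file
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing fence: stop before the article body
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def load_metadata(markdown_text: str) -> LessonMetadata:
    """Build a LessonMetadata record, falling back to defaults for missing fields."""
    fields = parse_front_matter(markdown_text)
    modified = fields.get("modified")
    return LessonMetadata(
        title=fields.get("title", ""),
        difficulty=int(fields.get("difficulty", 0)),
        modified=date.fromisoformat(modified) if modified else None,
        retired=fields.get("retired", "false").lower() == "true",
    )


# Made-up example article, not a real PH lesson.
example = """---
title: Example lesson
difficulty: 2
modified: 2024-05-01
retired: false
---
Lesson body in Markdown.
"""

metadata = load_metadata(example)
print(metadata.difficulty, metadata.modified, metadata.retired)
```

Keeping this kind of information in plain-text front matter is one reason Markdown suits the open/FAIR requirement: it can be read with a few lines of standard-library code, whatever platform serves the site.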
Dear James, dear Anisa, dear all,
Thank you for your work and the efforts you put in this project.
Sorry if this is not the place to ask, and sorry if it's been asked already, but I have a small question: if PH switches to Janeway, what will happen to the source files of each lesson? What will be the workflow?
I like the idea that lessons can be studied easily (using distant reading techniques, for example). Markdown is ideal for this, and easier than scraping a webpage.
And the second question has to do with the idea that PH is something more than a journal. In particular (I have some obsessions), I dream of a multilingual glossary of DH terms and I think Programming Historian can be the right place to publish it. I think it could align with the goal to make the site fully searchable, if we can link these terms in all the translations of a lesson for instance (something we started exploring last year with some of you and it was fun!)
Would it (or any other idea that can materialize in the form of a publishable web resource) still be possible with Janeway?
Best
Matthias
@matgille thanks for the questions.
On your first point, I agree that our articles should be accessible in an open/FAIR format, and that .md is ideal for this. I've added this to the list of requirements above.
On your second point, Janeway is a software platform for article publishing but that shouldn't preclude PH publishing other types of content that fit our goals and meet the needs of our community. Janeway is flexible, open, and can be developed, so if we have the time, energy, and money to pursue things like a multilingual glossary and to embed that into our discovery systems, then I don't see why it couldn't go onto a development pipeline.
Note that whether or not we move to Janeway, our Publishing Manager (so, @anisa-hawes) will coordinate future site development, with the initial idea being to have development sprints coordinated with editors and our wider community. Hopefully that'll provide ways to follow your dreams :)
Does that reassure you?
Dear James,
I apologize for this very late reply. Thank you for your answer. I think I was more curious than afraid!
Actually I'm not sure I have an opinion on the matter, in the absence of any real knowledge/experience of the issues at stake in terms of project management.
All the best,
Matthias
I have an update on this for everyone, but especially for our MEs (@hawc2 @jenniferisasi @marie-flesch @ericbrasiln).
After lots of discussion we are edging towards a consensus on next steps, which at a high level is:
- Move our website and publishing estate over to a site built on Janeway (e.g. all DOIs resolve to a new site)
- Continue to run peer review, copyediting etc parts of our publishing workflow as we do now.
- In the publishing workflow, only use the Janeway backend for the final publishing stage.
- If the workflow is determined by our Publishing Manager to meet our values and if a publication wants to (and here of course publications have full editorial control), test the full Janeway publishing workflow for a handful of articles at a later date (once site migration is complete).
This version of moving to Janeway strikes me as the pragmatic option. It gives us a stable web front end that we can develop (and in so doing contribute to open infrastructure), whilst retaining what I know most of our editorial community love about our submission, article development, and peer review workflow.
I'm keen to start moving forward with this (that is, going back to Janeway to start discussing practicalities rather than actually signing contracts or moving our website and publishing estate!) within the next two weeks, so I'd be keen to know your views.
Thank you to those who have contributed to this process. I will start exploring the practicalities of moving to Janeway.