ph-submissions
Scraping the UK Web Archive with Boilerpipe
The Programming Historian has received the following tutorial on 'Scraping the UK Web Archive with Boilerpipe' by Caio Mello @caiocmello and Martin Steer @martysteer. This lesson is now under review and can be read at:
http://programminghistorian.github.io/ph-submissions/en/drafts/originals/scraping-the-uk-web-archive-with-boilerpipe
Please feel free to use the line numbers provided on the preview if that helps with anchoring your comments, although you can structure your review as you see fit.
I will act as editor for the review process. My role is to solicit two reviews from the community and to manage the discussions, which should be held here on this forum. I have already read through the lesson and provided feedback, to which the author has responded.
Members of the wider community are also invited to offer constructive feedback, which should be posted to this message thread, but they are asked to first read our Reviewer Guidelines (http://programminghistorian.org/reviewer-guidelines) and to adhere to our anti-harassment policy (below). We ask that all reviews stop after the second formal review has been submitted so that the author can focus on any revisions. I will make an announcement on this thread when that has occurred.
I will endeavor to keep the conversation open here on GitHub. If anyone feels the need to discuss anything privately, you are welcome to email me.
Our dedicated Ombudsperson is (Ian Milligan - http://programminghistorian.org/en/project-team). Please feel free to contact him at any time if you have concerns that you would like addressed by an impartial observer. Contacting the ombudsperson will have no impact on the outcome of any peer review.
Anti-Harassment Policy
This is a statement of the Programming Historian's principles and sets expectations for the tone and style of all correspondence between reviewers, authors, editors, and contributors to our public forums.
The Programming Historian is dedicated to providing an open scholarly environment that offers community participants the freedom to thoroughly scrutinize ideas, to ask questions, make suggestions, or to request clarification, but also provides a harassment-free space for all contributors to the project, regardless of gender, gender identity and expression, sexual orientation, disability, physical appearance, body size, race, age or religion, or technical experience. We do not tolerate harassment or ad hominem attacks of community participants in any form. Participants violating these rules may be expelled from the community at the discretion of the editorial board. Thank you for helping us to create a safe space.
Hello @lizfischer. Just to let you know that I've sent @martysteer an invitation to join ph-submissions as an outside collaborator for the duration of this lesson's review. (@caiocmello already has write access because they're working on other lessons at the moment).
This means that both authors will be able to make direct edits to this lesson within our work-in-progress repo, without using the PR system 🙂
Hello again @lizfischer,
I've made a couple of adjustments to the YAML header, and added in the liquid syntax we require to display images on our site (example): {% include figure.html filename="file-name.png" alt="Visual description of figure image" caption="Caption text to display" %}. I've plotted in the minimum, and we can return to add descriptive alt-text during the review process.
I've also made a couple of small typesetting tweaks, and removed the author bios + suggested citation (these are generated automatically).
I'll paste the author bios below (and Alex will slot them into ph_authors.yml when we reach publication):
- name: Caio Mello
  team: false
  orcid: 0000-0000-1111-1111
  bio:
    en: |
      Caio Mello is a PhD student in Digital Humanities at the School of Advanced Study, University of London. His main research interests lie in the field of digital methods, Natural Language Processing techniques, data visualisation, media studies, urban studies and digital activism.
- name: Martin Steer
  team: false
  orcid: 0000-0000-1111-1111
  bio:
    en: |
      Martin Steer is Technical Lead in the Digital Humanities Research Hub at the School of Advanced Study, University of London. He works with humanities and social media data, data infrastructure, web archives and fiction.
A live preview of the lesson is now available: http://programminghistorian.github.io/ph-submissions/en/drafts/originals/scraping-the-uk-web-archive-with-boilerpipe
Thank you @caiocmello. I actually made a small change to Liz's initial comment, as we've recently replaced the in-thread Permission to Publish statement with an authorial copyright declaration form. So that step of our workflow has changed.
At the end of the review, when I've copyedited the lesson and made any final typesetting adjustments, I'll reach out and ask you to fill in the form which clarifies that you retain unrestricted copyright of your work, while granting us first permission to publish it under a CC-BY 4.0 License.
Thank you, @caiocmello and @martysteer. Our workflow has recently changed, to replace these Permission to Publish statements with a more formalised authorial copyright declaration form. At the very end of the workflow, when I've copyedited the lesson and made any final typesetting adjustments, I'll reach out and ask you to fill in the form.
Hi @anisa-hawes. Thanks very much. Is there any update on this lesson? Is there anything we (authors) have to do at this stage? Best wishes, Caio
Thank you for getting in touch, @caiocmello. There's nothing further we need from you until the reviews are complete.
Hello @lizfischer and @hawc2, Are you able to post an update to this Issue? It would be great to hear if the reviews are underway.
Hi @caiocmello, so sorry for the delay! I am working on getting reviewers currently-- it may take a little while for me to hear back from folks given the time of year. I will update here as soon as those are confirmed.
Hi folks, just a quick update-- I'm still on the hunt for reviewers! Hopefully we will have more luck in the post-holiday season :)
Hello @lizfischer, are there any updates on the reviewing process?
Hi @caiocmello, unfortunately Liz has had to step away from editing this lesson, so I'm currently in the process of finding you a new editor. I believe the first set of reviewers Liz reached out to did not respond, so we will have to try a new set. If you have any recommendations, let me know, and please feel free to email me at [email protected] if you have additional questions while we identify a new editor for you from the English team.
Hi @hawc2, thanks very much for your answer. I am going to email you with two recommendations of editors soon.
Hi @caiocmello, I haven't heard from you since an email I sent a few months ago noting some concerns about the sustainability of this lesson with such an outdated library. Programming Historian in English is preparing to reorganize how we accept submissions, and I would like to invite you to resubmit a revised version of this proposal when we do a call for papers in September. The lesson will need to change from the version here to address some of those concerns, and we can aim to more adequately review this proposal and edit it in a timely fashion next year. Please feel free to email me with any remaining questions about this proposal, but for now I am going to close this issue ticket. Thanks for your consideration, and apologies for the winding road this particular proposal has taken. I hope we can still find a way to publish a version of this more attuned to our current and future needs as a journal.