Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification, Part 2

Open hawc2 opened this issue 3 years ago • 52 comments

The Programming Historian has received the following proposal for a lesson on 'Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification, Part 2' by @davanstrien. This lesson, which is in two separate parts, is now under review. This ticket is only for Part 2, which can be read here:

http://programminghistorian.github.io/ph-submissions/en/drafts/originals/computer-vision-deep-learning-pt2

@nabsiddiqui is reviewing Part I. This review of Part II will take into consideration the feedback provided on the Issue for Part I, available here: https://github.com/programminghistorian/ph-submissions/issues/342

Please feel free to use the line numbers provided on the preview if that helps with anchoring your comments, although you can structure your review as you see fit.

I will act as editor for the review process. I will work with @nabsiddiqui, editor of Part I, to synchronize the review feedback for the two parts. My role is to solicit two reviews from the community and to manage the discussions, which should be held here on this forum. I have already read through the lesson and provided feedback, to which the author has responded.

Members of the wider community are also invited to offer constructive feedback, which should be posted to this message thread, but they are asked to first read our Reviewer Guidelines (http://programminghistorian.org/reviewer-guidelines) and to adhere to our anti-harassment policy (below). We ask that all reviews stop after the second formal review has been submitted so that the author can focus on any revisions. I will make an announcement on this thread when that has occurred.

I will endeavor to keep the conversation open here on GitHub. If anyone feels the need to discuss anything privately, you are welcome to email me.

Our dedicated Ombudsperson is Ian Milligan (http://programminghistorian.org/en/project-team). Please feel free to contact him at any time if you have concerns that you would like addressed by an impartial observer. Contacting the ombudsperson will have no impact on the outcome of any peer review.

Anti-Harassment Policy

This is a statement of the Programming Historian's principles and sets expectations for the tone and style of all correspondence between reviewers, authors, editors, and contributors to our public forums.

The Programming Historian is dedicated to providing an open scholarly environment that offers community participants the freedom to thoroughly scrutinize ideas, to ask questions, to make suggestions, or to request clarification, but also provides a harassment-free space for all contributors to the project, regardless of gender, gender identity and expression, sexual orientation, disability, physical appearance, body size, race, age, religion, or technical experience. We do not tolerate harassment or ad hominem attacks on community participants in any form. Participants violating these rules may be expelled from the community at the discretion of the editorial board. Thank you for helping us to create a safe space.

hawc2 avatar Jan 20 '21 19:01 hawc2

@hawc2 thanks for editing this lesson 😀

Let me know if you or the reviewers need anything from me. The Kaggle notebooks are currently private but I can add anyone who needs access - I'd just need a username.

davanstrien avatar Jan 22 '21 14:01 davanstrien

@davanstrien I’m providing preliminary feedback on Part 2 of your Computer Vision tutorial, in light of feedback @nabsiddiqui gave you in Issue #342 for Part I: https://programminghistorian.github.io/ph-submissions/lessons/computer-vision-deep-learning-pt1.

I agree with Nabeel, this is a solid tutorial, and your explanations of complex machine learning methods are quite excellent, usually very readable and clear. For revisions on your two-part tutorial, you should start by integrating Nabeel's feedback for Part 1 before turning to my comments for Part 2: https://programminghistorian.github.io/ph-submissions/lessons/computer-vision-deep-learning-pt2.

In line with what Nabeel says about Part 1's introduction, my first thought is that during your overview of Part 1's structure you should foreshadow what Part 2 will do. A broader overview of the two parts and how they interconnect will help the reader walk through each part. Part 1 should overview it all, and Part 2 should briefly rehearse (and link to) what was discussed in Part 1, before outlining what will be covered in Part 2. Make sure to link Part 2 in the intro to Part 1, and again in its conclusion. On this note, the conclusion to Part 1 could include a few more sentences clarifying the transition between the two parts. Part 2 provides a useful introduction reciting what was covered in Part 1; I don't think the intro to Part 2 needs much more elaboration, but a little more explanation at the start of Part 2, revisiting the basic terms established in Part 1 and describing why someone would undertake this very long tutorial and method, would help situate it for the reader.

I would also say that what Nabeel suggests about explaining concepts before introducing relevant code throughout your tutorial applies equally well to Part 2, although maybe less often.

For Part 2, I have detailed my revision suggestions below by section and paragraph:

  • [x] In the Looking at the Data section, the sentence “This step of trying to understand the data you will be working before training a model is often referred to as ‘exploratory data analysis’ (EDA)” needs a “with” between “working” and “before”. Ideally sentences like this (which appear in multiple places) would be reworked to not have dangling participles, in this way: “This step of trying to understand the data with which you will be working before training . . .”
    Also note, immediately after this sentence, you have a note to the reader that is set apart in different formatting - this separate formatting doesn't seem necessary. The point you make could be integrated more naturally into your discussion.
  • [x] The section, Using a Model to Assign Labels, could be renamed to Comparing Classification versus Assigning Labels or something more apt to the lengthy discussion here. This discussion seems central to your lesson, but it could be clarified a bit. Perhaps you could explain a little more clearly why you decide to use a model to assign labels, and what advantages this offers over an unsupervised classification approach, if I am understanding this right? It would be helpful if the discussion here was more clearly referenced later in the lesson - with sign posts to when we will move on to a classification stage. I can’t quite say yet the best way to clarify this section, but it does strike me as needing more delineation in terms of classic terminologies around machine learning. Since this lesson is already too long, I’d say keeping it condensed will only help the reader follow along.
  • [x] In paragraph 19, you state: “We will tell Matplotlib to use a different style using the style.use method.” Can you explain why you pick the “style” you pick, namely, “seaborn”?
  • [x] Paragraph 20, you can simplify the first sentence to simply say: “Let’s now take a look at the dataframe.”
  • [x] Paragraph 22: can you explain why the maximum size for training datasets is 2002?
  • [x] Paragraph 22, minor syntax change, add a colon after “We can also see three columns”
  • [x] Paragraph 26 - let's start a new section before this paragraph, called something like "Wrangling the Data". Your explanation of list comprehension is very good (for anyone following this thread, see the sketch just after this list), but it would be helpful to delineate this as a separate step from "Looking at the Data". You could use different sizes/levels of headings to delineate Wrangling the Data as a subsection.
  • [x] Paragraph 29 - add a period to end of this sentence.
  • [x] Paragraph 31 - would be useful to start a new subsection of Looking at the Data here focused on Counting the Labels, following the sentence “We now have a single list of individual labels.”
  • [x] Paragraph 34 - make sure Python is capitalized everywhere. It isn't in this sentence.
  • [ ] Paragraph 34 - for the following chunk of code, is it possible to condense this code to include the titling in the definition of the plot function?
  • [x] Paragraph 35 - again, it doesn’t seem necessary to format this note separately from the standard formatting for text explication
  • [x] Paragraph 36 - the heading above this paragraph seems unnecessary.
  • [x] Paragraph 45 - like the other notes you have warning the reader, I think this one could be integrated into the narrative to more clearly say what Metrics do for us. It’s not really necessary to say it doesn’t directly impact the training process, as in some sense it may impact the human’s choice on using a model or not. More generally, I would recommend integrating all such notes into your narrative discussion or cutting them.
  • [x] Paragraph 48 - your diagrams are excellent. The start of this paragraph has fastai in lowercase. Could you start with another word so it looks like a grammatically correct sentence? If fastai is lowercase because it’s a package, it should be formatted as such.
  • [x] The section on Image Augmentations is a little hard to understand. Can you try to explain in more colloquial terms what it means (see the sketch at the end of this list for the sort of thing I have in mind), or at least link to another resource with a more thorough explication? I found the following part clearest, but it wasn't obvious if this was an actual definition of Image Augmentation: "Image transforms or augmentations are useful because they allow us to artificially increase the size of our training data." Are image transforms and augmentations the same thing?
  • [x] Paragraph 80 - you say "Again, we have seen this at a high level before, and most things will remain the same as in our previous advert example." What are you referring to? Can you cite that instance more specifically? Especially since this is a two-part lesson, it might seem like you're referring to Part 1. These lessons will be a lot for readers to take in, so being explicit with all internal citations is necessary.
  • [x] Paragraph 82 - explain in more detail the differences and advantages of different ImageNets
  • [x] Paragraph 86 - remove the phrase “As a reminder”. Can you also explain “loss” more explicitly, or link to a secondary source?
  • [x] Paragraph 88 - cite where numerically on the x-axis the loss shifts in significant ways, such as starts to go up radically
  • [x] Paragraph 92 - be more explicit about the ‘ads’ example you’re citing, and since this is the first time you mention ImageNet, explain what it is and provide a link.
  • [x] Paragraph 102 - the word “network” kinda comes out of nowhere here. I see you’ve used it a couple other places, but never in conjunction with ‘neural’ - its conceptual significance and definition should be made explicit for the reader. How is this different from, say, a network analyzed with a network analysis tool? What is a ‘neural network’/how does it relate to computer vision?
  • [x] Paragraph 103 - looks like your wikipedia citation has some missing Markdown syntax.
  • [x] Paragraph 105 - the heading above this paragraph is centered unlike all the others
  • [x] Paragraph 112 - let’s start a new subsection here entitled Exploring Predictions Using Scikit-learn or something similar
  • [x] Paragraph 114 - a couple minor typos here. New sentence for “This can give us . . .” A space before “which tells us. . .”. And make sure quotation marks come after the period if they appear at the end of a sentence, as in the last sentence in this paragraph.
  • [x] Paragraph 119 - in the last sentence of this paragraph, start a new sentence at “however in our particular dataset”
  • [x] Paragraph 120 - missing word here: “In comparison to the label ‘animal’, which mostly an easy for the human annotator of this dataset to identify,”
  • [x] Paragraph 122 - the figure doesn't appear - I can't remember for sure, but I think it was missing from what was originally submitted, and I noticed this when cleaning up the lesson in the fall. I'd also say the heading above this figure is a little unclear - what do you mean by "realistic" here? The following heading appears in the center again, and seems too large. More importantly, "Sucking Pigs and Sirens" just seems out of place, too weird, in comparison to the language for every other section heading. The rest of the heading is fine.
  • [ ] Paragraph 124 - this is a tough paragraph; the last sentence could be less convoluted. It is confusing who you are citing - Borges, Foucault, Wilkins? Part of the confusion is that "The Analytical Language of John Wilkins" needs quotation marks around it. Generally I think you can more slowly bring the reader into this theoretical discussion, perhaps first rehearsing a bit of what you've covered over this two-part lesson, before diving into Borges and Foucault. I like you bringing in Foucault here, but it is a little out of nowhere. I wonder if you can cite some of this at the beginning of Part 2, and for symmetry's sake, you should engage with similar theoretical references in Part 1 - I say this having not gone back and looked through Part 1 in detail for this. With that said, bear in mind you're already over the standard 8K word limit PH sets, which is okay, but your lessons shouldn't grow too much longer.
  • [ ] Paragraph 125 - I’m also not sure it is true to say Borges is implying all categorization is arbitrary - I see what you are getting at, but it seems more complicated to me than that. Another useful reference for you is his essay, “The Total Library.”
  • [ ] Paragraph 126 - add quotation marks for the essay cited here as well. You should also provide the date of publication in parentheses for each of these published works. This citation is also confusing in the same way as the last, in that you are citing Borges citing someone else - maybe an unavoidable peril when engaging with Borges, but nevertheless this is going to be confusing for readers who have, up until now, been closely following technical discussion of steps in using code for machine learning. To help address this problem, consider also breaking these paragraphs up into shorter paragraphs to help guide the reader through the steps of your discussion
  • [x] Paragraph 127 - the use of “not only” here would be easier to read if, instead of a period, you had a comma, and “but also” following immediately thereafter
  • [x] The next section, "Next Steps," should be titled something different, like "Further Reading and Resources." It's important at this point that you clarify for the reader that Parts 1 and 2 of the tutorial are concluded. Again, summarizing everything you've covered will help wrap it up. The further resources are helpful, but it's a lot, and some of it could be cut. Some of the concepts you cite could be introduced earlier in the tutorial and discussed as part of machine learning - bias in image analysis in particular seems worth mentioning as a precaution, especially considering there are famous cases of these models mistaking humans for animals and vice versa, in particular in relation to ethnicity. Ditto for the GPU needs - that seems worth stating up front. Perhaps save this last part for just a few examples of Further Readings and Other Tools worth investigating.
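
To make the list-comprehension point above concrete for anyone following this thread, here is a minimal sketch of the kind of flattening involved. The column name and the "|" delimiter are my assumptions based on the dataset, not the lesson's actual code:

```python
import pandas as pd

# toy dataframe standing in for the lesson's metadata
# (the column name "label" is an assumption)
df = pd.DataFrame({"label": ["human", "human|human-structure", "", "animal"]})

# a list comprehension that splits each multi-label string and flattens
# the result into one list of individual labels, skipping blank entries
labels = [label for row in df["label"] for label in row.split("|") if label]
print(labels)  # ['human', 'human', 'human-structure', 'animal']
```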
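
And on augmentations: a rough, illustrative sketch of what they do in practice with fastai - again, the file names and column names are assumptions, not the lesson's code:

```python
from fastai.vision.all import *
import pandas as pd

# toy metadata frame; the file names and labels are made up for illustration
df = pd.DataFrame({
    "fname": ["ad_001.jpg", "ad_002.jpg"],
    "label": ["human", "human|human-structure"],
})

# aug_transforms returns a list of transforms (random flips, rotations,
# zooms, lighting changes) applied to each training batch, so the model
# sees slightly different versions of the same images every epoch
dls = ImageDataLoaders.from_df(
    df,
    path="images",           # assumed folder containing the image files
    fn_col="fname",
    label_col="label",
    label_delim="|",         # split multi-label strings on '|'
    item_tfms=Resize(224),   # resize each image on the CPU
    batch_tfms=aug_transforms(do_flip=True, max_rotate=10.0, max_zoom=1.1),
)
```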

Thanks for this great two-part lesson. Once you've revised both Part 1 and Part 2, we'll send it out for external peer review.

hawc2 avatar Mar 08 '21 16:03 hawc2

Thank you for this super thorough feedback, it's really appreciated 😀 I have time blocked out later this week to work on this so I'm hoping I will be able to integrate the current feedback from part 1 and part 2 by the end of the week (I have been known to be overly optimistic with estimating these things...)

davanstrien avatar Mar 09 '21 10:03 davanstrien

I was very optimistic with this time estimate... I have hopefully addressed the majority of the comments now. I am waiting for some of my co-authors to respond to the questions relating to the Foucault section since they wrote that. Hopefully, I can respond to those sections fairly soon.

I have included more sign-posting between the two lessons but I suspect that this will need a final review following any other suggestions from the reviewers.

Let me know if there is anything else I need to do in the meantime and thank you again for the detailed suggestions for this lesson (I realise it's a long one...)

davanstrien avatar Apr 09 '21 10:04 davanstrien

@hawc2 we have updated the conclusion section with the aim of improving the clarity. Hopefully, this addresses all of the editorial suggestions you've made. Please let me know if I have missed anything that needs looking at before this goes out for peer-review.

davanstrien avatar Apr 15 '21 09:04 davanstrien

@davanstrien, do you have a link to the Kaggle notebook for the second lesson?

cderose avatar Feb 06 '22 18:02 cderose

Hello all,

Please note that this lesson's .md file has been moved to a new location within our Submissions Repository. It is now found here: https://github.com/programminghistorian/ph-submissions/tree/gh-pages/en/drafts/originals

A consequence is that this lesson's preview link has changed. It is now: http://programminghistorian.github.io/ph-submissions/en/drafts/originals/computer-vision-deep-learning-pt2

Please let me know if you encounter any difficulties or have any questions.

Very best, Anisa

anisa-hawes avatar Feb 06 '22 19:02 anisa-hawes

Hi @anisa-hawes, thank you for reaching out. I have the link to the lesson, but I don't see a link to the Kaggle notebook that has the data and code. Or is there not a notebook that goes along with this lesson like there was for part 1?

cderose avatar Feb 06 '22 21:02 cderose

Hello @cderose. I have only moved the .md file in this case.

I understand that @davanstrien's Kaggle notebooks are hosted externally and they are set to private. Please connect with @davanstrien so they can grant you access!

@hawc2 is best placed to help with this aspect if you have any further questions.

anisa-hawes avatar Feb 06 '22 22:02 anisa-hawes

The notebook should be available here: https://www.kaggle.com/davanstrien/02-programming-historian-deep-learning-pt2-ipynb. Let me know if there are any issues getting access.

davanstrien avatar Feb 07 '22 11:02 davanstrien

Thank you @davanstrien! Hello @cderose, here's the link ^^

anisa-hawes avatar Feb 07 '22 11:02 anisa-hawes

@davanstrien and @anisa-hawes, thanks very much! I was able to copy the notebook over successfully and will share my notes this weekend.

cderose avatar Feb 08 '22 02:02 cderose

Many thanks! Shout if there's anything I can help with at my end.

davanstrien avatar Feb 08 '22 11:02 davanstrien

Another great lesson, @davanstrien! This one is, appropriately, the more technical of the two lessons. It very effectively highlights the many different steps and choices that are involved in preparing a dataset, training a model, and evaluating the model. Like with lesson one, I appreciate that you intersperse different examples of when and why you might do x versus y, and you link to a number of great resources for people who want to dive further into the decisions that have to be made.

My main thoughts are summarized below, but I would be happy to chat further or clarify any of my notes.

Main suggestions

  1. As you rightly emphasize, the decisions we make when preparing a dataset and training a model should be closely tied to our end goals, but there isn't an explicit goal that we're training toward in this lesson (unlike in part one). Given the findings at the end, it seems like the use case for this lesson could be a meta one, where we're training a model with an eye toward studying how an imbalanced dataset (more images of humans than animals) might affect our model's learning. Stating a goal like that early on in the lesson can help situate the work we go on to do.

  2. The dataset we use in the lesson has a lot of unlabeled images in it, which is very realistic. It would be helpful at a few points in the lesson if you would clarify how the unlabeled images are (or aren't) impacting the results of our model. Are we removing the unlabeled images? If we're not, how are they being validated? I tried to signal below a few spots where some of that discussion might happen (see also the filtering sketch just after this list).

  3. Regarding Kaggle - in order to create the model, users will need to have enabled internet access in the notebook. To enable such access, they have to give Kaggle their telephone number. For this reason, it could be worth considering switching to a Colab notebook so that people don't have to share personal information. If internet access hasn't been enabled, users will see an error for the code in p86. Here's the StackOverflow page that helped me when I got the error:
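
To make point 2 concrete, here is one way the handling of unlabeled images could be made explicit, assuming the labels live in a "label" column of a pandas dataframe with blanks or NaN marking unlabeled images (names are my assumptions, not the lesson's code):

```python
import pandas as pd

# toy stand-in for the lesson's metadata (the column name is an assumption)
df = pd.DataFrame({"label": ["human", "", None, "animal|human"]})

# keep only rows that actually carry a label, treating both blank strings
# and NaN as "unlabeled"; the lesson could state whether it does this
labeled_df = df[df["label"].notna() & (df["label"] != "")]
print(labeled_df)
```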

Minor edits/thoughts

  • [x] p1 - include a link to the Kaggle notebook, along with a reminder of how to switch over to the GPU
  • [x] p3 - no comma between "data" and "to training"
  • [x] p5 - hyphenate "high-level" in the caption
  • [x] p7 - add a comma after "machine learning model"
  • [x] p10 - are the labels we'll be using ones that humans applied to the images or a computer?
  • [x] p12 or p15 - you might add a sentence that touches on the limitations of labeling models, too
  • [ ] p20 - the Kaggle notebook provides a little more information about the format of the dataset right before the line of code; include that in the lesson on the Programming Historian, too
  • [x] p25 - does the double label (separated with |) indicate that some images have two labels applied to them in this dataset? Will that matter for our purposes?
  • [x] p27 - typo in the comment to the code, should be: "labels". I would also put it in quotes for clarification: 'create a variable "labels" to store the list'
  • [x] p28 - add a "but" after the first comma: "Now we have the labels in a list, but we still..."
  • [x] - add an "a" before list comprehension: "If you haven't come across a list comprehension before"
  • [x] p33 - you might add a parenthetical here or in the next paragraph to say that the '' represents images that weren't given a label - you do that in p38, but might also mention it here or in p24, which has the first output that hints at the presence of unlabeled images.
  • [x] p34 - I think this count refers to the total number of images in the dataset rather than the number of labels? If so, update the first sentence accordingly.
  • [x] p38 - Since we're in the data exploration phase, you might mention that depending on the goals of the project, researchers may want to remove unlabeled images, or they might want to try to assign labels to those images before using them. They also might want to remove some of the human images to have more of an even balance across labeled images.
  • [x] p40 - The example you offer in p41 is excellent for demonstrating why you shouldn't rely on the metric if you take the dataset as is. For curious readers, if researchers trimmed down the dataset with an eye toward having a more even distribution across the labels, would accuracy potentially be an appropriate metric to use again? (As you show later on, even in such a case it likely shouldn't be the only metric that's used; see the toy example after this list.)
  • [x] p45 - this is a really helpful distinction between precision and recall. I would split these sentences up and put them next to where you first define the terms in p43 and p44 since they are more concrete and make the difference between the two clearer.
  • [x] p46 - you might give a quick example of a situation where you'd prefer recall at the expense of precision or vice versa
  • [x] p48 - really great reminder
  • [x] p57 - You might start the paragraph by saying "In Part 1, we saw an example..." since we haven't seen an example yet in this lesson.
  • [x] - Also, for the labels with semicolons (such as human;human-structure), is that how fastai is handling the labels that had a pipe separating them? In other words, is fastai preserving the double labels? You might state how that will or won't impact the model's training. (See the sketch after this list for what I assume is going on.)
  • [x] p60 - did we remove the unlabeled images, then, or why isn't there a fifth label for them?
  • [x] p62 - use parentheses instead to separate the two parts for clarity: "Since our data is made up of two parts (the input images and the labels), one_batch() will return two things."
  • [ ] p70 - does this mean that there are still unlabeled images in the dataset? If there are, how is that going to impact the model and the precision/recall metric?
  • [ ] p73 - the code in the notebook is different from the code in the Programming Historian lesson (functionally, it looks the same, but the code should match)
  • [x] p81 - add "that" after "Now"
  • [ ] p81 - we haven't said much about what we're training for - what's our example end goal in this case?
  • [x] p84 - add "use" before "an existing model"
  • [x] p85 - change to "fewer compute resources"
  • [x] p86 - you might say F1ScoreMulti has round brackets or parentheses for clarification
  • [ ] p86 - this line of code will throw a not immediately decipherable error if the internet hasn't been enabled for the Kaggle notebook. Include a quick note here about how users can do that (Settings -> Internet slider) - first, though, they'll have to create accounts with Kaggle and provide their phone number, if they haven't done so already.
  • [x] p87 - "Now that" and typo: "let's look"
  • [ ] p88 - might include a screenshot of the output since you helpfully do that for the other steps
  • [x] p96 - great example that circles back to why it's good to look at a few measures
  • [x] p102 comment - you can remove the two commas in the sentence that begins: "You will be able to see..."
  • [x] p104 - change to "an unfrozen model"
  • [x] p105 - add a period after the parentheses and before "When"
  • [x] p109 - rephrase the first sentence: "Our model is not performing super well yet."
  • [x] p115 - "Now that"
  • [x] p117 - slightly rephrase: "We also pass in an average, which determines how our labels are averaged, to give us more control..."
  • [x] p121 - this paragraph is another really great callback that reinforces some of the cautions earlier in the lesson around the uneven distribution of examples in the dataset
  • [x] p123 - fantastic example that raises questions about the labels we use in machine learning and the impact that decision has on the training process and how much data may be required for the model to perform well
  • [x] p124 - it looks like there's a footnote or paragraph link that's not working: [^7]
  • [ ] p128 - typo: "used to asking fundamental questions" - but actually, I would remove this sentence. The lesson is technical, but you also raise important fundamental questions along the way. Plus, the lesson isn't unusually technical in The Programming Historian context.
  • [x] p131 - I would remove the parenthetical "especially humanists" and leave it at classification being a concern for anyone.
  • [x] p135 - "how the deep learning model 'learns' from the data"
  • [ ] p136 - add a colon: "These steps included:"
  • [x] - add a comma: "that some of our labels performed better than others, showing"
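
To tie together the notes above on the pipe-separated labels and F1ScoreMulti, here is roughly how I assume fastai is handling the double labels - a sketch with made-up column names, not the lesson's actual code:

```python
from fastai.vision.all import *

# a multi-label DataBlock sketch; the column names and the '|' delimiter
# are assumptions based on the dataset described in the lesson
dblock = DataBlock(
    blocks=(ImageBlock, MultiCategoryBlock),      # multi-label targets
    get_x=ColReader("fname", pref="images/"),     # image file column
    get_y=ColReader("label", label_delim="|"),    # split 'a|b' into ['a', 'b']
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    item_tfms=Resize(224),
)

# F1ScoreMulti is a class, so it is called with round brackets/parentheses
# to create a metric instance (the p86 point above)
f1_macro = F1ScoreMulti(average="macro")
```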
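
And to illustrate the p40 point about accuracy on an imbalanced dataset, a toy example with entirely made-up numbers:

```python
from sklearn.metrics import classification_report

# 90 humans, 10 animals: a "model" that always predicts the majority
# class scores 90% accuracy while never finding a single animal
y_true = ["human"] * 90 + ["animal"] * 10
y_pred = ["human"] * 100

print(classification_report(y_true, y_pred, zero_division=0))
# accuracy comes out at 0.90, but recall for "animal" is 0.00
```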

cderose avatar Feb 14 '22 02:02 cderose

@cderose, thanks so much for this review. I plan to work through the listed suggestions either this week or early next week. For some of the other overarching points:

> As you rightly emphasize, the decisions we make when preparing a dataset and training a model should be closely tied to our end goals, but there isn't an explicit goal that we're training toward in this lesson (unlike in part one). Given the findings at the end, it seems like the use case for this lesson could be a meta one, where we're training a model with an eye toward studying how an imbalanced dataset (more images of humans than animals) might affect our model's learning. Stating a goal like that early on in the lesson can help situate the work we go on to do.

I will add some more contextual information at the start of this lesson to point to these aims. One of the reasons we chose this example was to try and prepare people for the typical outcome of ML models not working as expected. Many teaching materials focus only on the 'happy path', where the model trains well and the dataset is perfect. We wanted to bring this up in the lesson since that happy path is even less likely when applying ML to humanities data. I will make this a bit more explicit upfront, though, so the fact that the model doesn't perform well in some cases isn't so abrupt in the lesson.

> The dataset we use in the lesson has a lot of unlabeled images in it, which is very realistic. It would be helpful at a few points in the lesson if you would clarify how the unlabeled images are (or aren't) impacting the results of our model. Are we removing the unlabeled images? If we're not, how are they being validated? I tried to signal below a few spots where some of that discussion might happen.

I will add some more discussion and integrate this into the places you suggest.

> Regarding Kaggle - in order to create the model, users will need to have enabled internet access in the notebook. To enable such access, they have to give Kaggle their telephone number. For this reason, it could be worth considering switching to a Colab notebook so that people don't have to share personal information. If internet access hasn't been enabled, users will see an error for the code in p86. Here's the StackOverflow page that helped me when I got the error

Thanks for pointing this out. I think I may be able to get around the need for an internet connection by bundling the ResNet model weights as a dataset in the Kaggle space. I think it makes sense also to have a Colab option. This shouldn't be too hard to add, but I'll hold off until the content is (almost) final before doing this.

Thanks again for this review

davanstrien avatar Feb 16 '22 17:02 davanstrien

Things for @davanstrien to check:

  • [x] LaTeX rendering working correctly, i.e. that $a$ is rendered correctly.
  • [ ] ~brief explanation/description of negative sampling~
  • [ ] sync notebook and lesson code

davanstrien avatar Feb 17 '22 13:02 davanstrien

Hi @cderose, thanks again for your comments. I have made changes to fix most of the minor comments. I will hold off on a few suggestions until I've got comments from @mblack884. I also plan to try and make the Kaggle process more straightforward and to add a Colab-hosted version of the notebook. I will do that towards the end so it's easier to keep all of the content between the notebooks and the lesson in sync.

davanstrien avatar Feb 22 '22 11:02 davanstrien

I really enjoyed reading and working through this lesson. The discussion of methodological decisions was generally clear, and the lesson did well in integrating short explanations of the more technical/theoretical aspects of the work. The many links throughout work well to offer users deeper reading on those topics while keeping the focus of the lesson on implementing its method.

  1. A good overview of lesson objectives or learning outcomes at the start of the lesson, like the list found at the top of Part 1, would be helpful. I try to provide them for longer, complex units in the classes I teach to emphasize the relationship between theory & practice. They would also have the added benefit of making the specific contributions of the lesson more visible to people who are browsing through the PH's collections.

  2. The Kaggle notebook loaded and ran correctly with this lesson. I could step through the process as I read and follow the output. In the interest of usability, you should add some reminder about setup (sign in, copy/edit the notebook, enable Internet under settings) to ensure that the notebook will proceed past p31. I'm not familiar enough with Kaggle to know if it's possible to pre-download and store the data in the notebook itself to simplify setup and/or avoid phone verification for those who don't want to give out their number.

  3. I like the conclusion as it helps to pivot the lesson's methods towards answering big research questions. But I think it might be more effective if you were to foreground the question of human/animal/landscape boundaries earlier. I'd imagine that most readers would have a project in mind when seeking out this lesson, but having a short paragraph noting that you're going to explore the fuzziness of cultural labels or boundaries between concepts using computer vision at the outset may help readers better understand the trajectory of your decisions at each stage of the process.

Other minor suggestions:

p38: Briefly mention what should be improved here. No need to get into how or why here, as that would distract too much from the lesson. If you want to include some recommendation of how to handle the images without labels, I'd suggest a short note or appendix at the end.

p45: Short, conceptual definition of F-beta would be helpful here. There's a more technical definition provided in p51 (which is fine as is), but I could see some potential confusion given that the topic indicated by the section heading isn't directly acknowledged until several paragraphs into the section.
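
For what it's worth, the standard definition in terms of precision $P$ and recall $R$ is $F_\beta = (1 + \beta^2) \cdot \frac{P \cdot R}{\beta^2 P + R}$: $\beta = 1$ gives the familiar F1 score (the harmonic mean of precision and recall), while $\beta > 1$ weights recall more heavily and $\beta < 1$ favors precision. Even a one-sentence version of that would help readers.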

p68: You start using the yellow textbox here to mark off clarifications. I was initially confused because there were similar remarks above that didn't use it. It may be a good way to include a short explanation of what needs to be improved in the figure near p38.

mblack884 avatar Mar 04 '22 16:03 mblack884

Thanks @mblack884. @davanstrien, let me know what you are thinking, timeline-wise, in terms of revision.

nabsiddiqui avatar Mar 04 '22 16:03 nabsiddiqui

@mblack884 thanks so much for doing this review 🙂 @nabsiddiqui I should be able to incorporate these changes early next week. I'll let you know if there are any issues with that.

davanstrien avatar Mar 04 '22 17:03 davanstrien

@nabsiddiqui I have incorporated the suggestions into the lesson content and made a few formatting fixes. The next step is to sync the lesson and notebooks on Kaggle and create a Colab version. I will do that tomorrow unless you think any other changes need to happen to the structure of the lessons.

davanstrien avatar Mar 07 '22 16:03 davanstrien

Chiming in as co-editor here, we should have a discussion about what role the Kaggle and Colab notebooks will play. We've discussed as an editorial board how this is relatively new terrain for Programming Historian, and our current thinking is that the Kaggle and Colab notebook versions shouldn't replicate the PH lesson. Most of the commentary doesn't need to be included, just the basic section guides, steps, and code. Does that make sense?

hawc2 avatar Mar 07 '22 16:03 hawc2

@hawc2 my main reason for including the full text is to help avoid people having to switch between windows, but I'm happy to defer to the editorial board on this. If you prefer to keep things separate, I will make a new version of the notebooks that strips out most of the prose. I will leave some in to provide a bit of signposting at least. Let me know if that sounds okay to you?

davanstrien avatar Mar 08 '22 11:03 davanstrien

That sounds good. Feel free to make a separate copy of the pared-down notebooks, and we can present a near-final copy of your lesson with the notebooks to the PH team. We may use it as an example in the future of how authors can balance between the two options.

I see why you want to make a Colab notebook, but I wonder if it's necessary, or if there's at least a way to link all extraneous resources to the tutorial in one place like the Kaggle environment, just so readers don't get decision fatigue . . .

hawc2 avatar Mar 08 '22 13:03 hawc2

@hawc2 I have made a version of the lesson one notebook with the prose removed (https://www.kaggle.com/davanstrien/cleaned-01-progamming-historian-deep-learning-pt1). I have kept the headings to signpost where the code fits in the lesson. If this seems like a good approach to you/PH team I can do the same for part 2 and update the link in the lessons to point to that version of the notebook.

davanstrien avatar Mar 08 '22 17:03 davanstrien

Yeah, this looks good. We can do a final assessment of it once your revisions on both lessons are complete. It's possible a small amount of commentary on the code could be reinserted.

Does it make sense to combine Part 1 and 2 of the lessons into a single Kaggle notebook?

One concern I have with Kaggle as the host for the data, and the notebook, is sustainability. For the code, do you foresee separate maintenance issues for your notebooks, besides the lesson itself? For the data, is it possible to also store the data through either GitHub's large file storage option, or Zenodo? A single GitHub repo that serves as a hub for these secondary resources may also be useful.

hawc2 avatar Mar 08 '22 19:03 hawc2

> Does it make sense to combine Part 1 and 2 of the lessons into a single Kaggle notebook?

Happy to combine both parts into a single Kaggle notebook.

> One concern I have with Kaggle as the host for the data, and the notebook, is sustainability. For the code, do you foresee separate maintenance issues for your notebooks, besides the lesson itself?

For the Kaggle side, everything should be fairly stable. The Kaggle kernel will have a pinned Docker image, so all the underlying dependencies for the lesson will remain fixed. There is a chance that the fastai API will change, but since the lesson mainly uses high-level APIs, this isn't very likely.

More generally, the risk is that Kaggle as a platform no longer exists. I think this is fairly unlikely in the short to medium term but it is a possibility. My preference for using Kaggle is that it gets people up and running for this type of work fairly quickly. Working with some type of cloud is often a prerequisite for doing deep learning work. Obviously, we don't get into it in much detail here but I think it's setting people up with the right expectations about what is going to be required if they pursue deep learning further.

> For the data, is it possible to also store the data through either GitHub's large file storage option, or Zenodo?

Yes, these datasets are both on Zenodo (https://doi.org/10.5281/zenodo.5838410 / https://doi.org/10.5281/zenodo.4487141).

> A single GitHub repo that serves as a hub for these secondary resources may also be useful

We started something like that here: https://github.com/davanstrien/Programming-Historian-Computer-Vision-Lessons-submission. I will add the Zenodo links there.

From my side, I think I have addressed all of the major reviewer comments/suggestions, and the remaining tasks are on the practical setup side. If you are happy to suggest Kaggle as a first option, I can get all of that in place and update the Git repository with relevant Zenodo links. I anticipate this only taking half a day or so; hopefully, we'd be ready to publish after that.

davanstrien avatar Mar 09 '22 13:03 davanstrien

This all sounds great, Daniel. I'll chat with @nabsiddiqui and we'll get back to you with final steps.

hawc2 avatar Mar 09 '22 14:03 hawc2

@davanstrien are your Zenodo datasets linked in both parts of the lesson tutorials? Just want to make sure everything is synced up

hawc2 avatar Apr 06 '22 18:04 hawc2

The datasets are hosted in Kaggle so the Kaggle notebooks load the data directly from Kaggle. For the Colab version, I will grab the data from Zenodo.

davanstrien avatar Apr 06 '22 18:04 davanstrien