
feat: kick off the Python course

Open honzajavorek opened this issue 1 year ago • 4 comments

Kick-off of the Python course, yay! 🐍

Important notes

  • I'm strategically skipping devtools lessons. For now they'll be there only as stubs, sending people to read the original lessons from the JS course, as the contents will be the same, perhaps just refurbished.
  • The aim of this PR is to prepare an initial bulk of content for a start. After this gets merged, I plan to do continuous delivery with small iterative additions. This has been discussed in https://github.com/apify/apify-docs/issues/954
  • The link check fails. It seems to check the PR content against the live website, so it concludes that my links to Python lessons are broken, because at docs.apify.com they (obviously) result in 404s. If we want the link check to work under PRs like these, it would need to start a local instance of the docs and rewrite local links on the fly so they hit that instance.
  • I'm following the model of having a flat structure of lessons. They work with a pretty much isolated learning environment. At the end of each lesson there is a set of exercises which motivate the student to touch the real world. They should also be challenging, allowing those students who do the exercises to learn more beyond the main line of the tutorial. If these exercises break because the underlying real-world website breaks, they should be pretty easy to fix or replace.
  • I'm very open to any criticism, corrections, suggestions! Tear this apart!
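
The link-check idea from the notes above (start a local instance of the docs and rewrite local links so they hit it) could be sketched roughly like this in Python. This is only an illustration under assumptions: `rewrite_link`, `LIVE_HOST`, and `LOCAL_BASE` are hypothetical names, and `http://localhost:3000` is an assumed address for the local instance. None of this is taken from the link checker the repository actually uses.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed values for illustration only, not taken from the apify-docs setup.
LIVE_HOST = "docs.apify.com"
LOCAL_BASE = "http://localhost:3000"


def rewrite_link(url: str, live_host: str = LIVE_HOST, local_base: str = LOCAL_BASE) -> str:
    """Point links to the live docs at a locally running docs instance."""
    parts = urlsplit(url)
    # A link is "local" if it targets the live docs host,
    # or if it is a site-relative path such as "/academy/...".
    targets_live_docs = parts.netloc == live_host
    is_site_relative = not parts.scheme and not parts.netloc and parts.path.startswith("/")
    if not (targets_live_docs or is_site_relative):
        return url  # external link, leave it untouched
    local = urlsplit(local_base)
    # Keep path, query, and fragment; swap only scheme and host.
    return urlunsplit((local.scheme, local.netloc, parts.path, parts.query, parts.fragment))
```

With a rewrite step like this, a checker would request the pages from the PR build instead of the live site, so lessons that are not published yet would no longer be reported as 404s. External links pass through unchanged.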

Progress

  • [x] intro
  • [ ] ~~devtools~~ (stub)
  • [ ] ~~devtools locating elements~~ (stub)
  • [ ] ~~devtools extracting~~ (stub)
  • [x] python downloading html
  • [x] python parsing html
  • [x] python locating elements
  • [x] python extracting
  • [ ] ~~python saving data~~ (out of scope of this PR)
  • [ ] ~~crawling... etc.~~ (out of scope of this PR)

Related PRs

  • [x] https://github.com/apify/apify-docs/pull/1076

honzajavorek May 21 '24 16:05

Thank you all for the thorough reviews! 🙏 It's a big chunk and that's never easy to go through. You found many things I didn't notice. There are quite a few improvements I can make.

There's one repeating concern and that's whether we should focus purely on scraping, or whether we should add a bit more context. And also, if we decide to add this context, how it should be done.

The Czech PyLadies chapter has this open source course on Python basics. Specifically, this one for self-learners is extremely popular among many beginners here. I believe other courses, such as Codecademy and similar ones, have a similar scope. As someone who has been around PyLadies and beginners for a few years, I thought it would be a good idea to position the scraping course as an advanced course for people who have the basics, where "the basics" means the scope of naucse.python.cz. As mentioned in the requirements of the scraping course intro:

Requirements

  • macOS, Linux or Windows machine with a web browser and Python installed
  • Familiar with Python basics: variables, conditions, loops, functions, strings, lists, dictionaries, files, classes, exceptions
  • Comfortable importing from the Python standard library, using virtual environments, and installing dependencies with pip
  • Familiar with how to run commands in Terminal or Command Prompt

Question: I'm writing the course with a typical Python beginner in mind. Is that approach okay with you? Or do you want to add more to the requirements and make the course more advanced (perhaps easier to write the lessons, but less accessible)?

I chose that audience deliberately, and I think it's a good choice, because it still caters to a lot of beginners. That way it can be a follow-up to any 'basics of Python' course, but it's a higher bar than what the current JS scraping course has, so we won't need to explain things like for loops or installing an editor 😅

That said, I want to stress that the text I produced can be made less explanatory even with a typical beginner in Python as an audience. I wasn't sure myself about the exceptions or parsing-as-text sections. I like the build-up aspect, but as you point out, maybe it's too much? I don't know. Instead of thinking about it too much, I decided to wait for your second, third, and fourth pair of eyes. I'll revise the chapters, edit certain parts. So thanks for the feedback, it's much appreciated.

Also, we shouldn't be afraid to remove text, even whole sections or lessons. If it doesn't fit, let's just trash it. You have no idea how much text I've already trashed when writing the lessons 😄

honzajavorek Jul 16 '24 11:07

I like the idea of writing it with beginners in mind, and I like having that explanatory content. 👍 My only concern is that we now have only one "Python Scraping Basics" tutorial, which will be consumed by both Python beginners and Python experts who just need a quick intro to web-scraping-specific libs and thought processes.

So I think we should either:

  • make the course for beginner devs and let the experts conveniently identify and skip the beginner sections

or

  • make it for experts, and add the beginner stuff as links to other content where the topics are explained in more detail

The only strong opinion I have is that we should choose one, and not a mixture of both.

Btw, I think we should apply the chosen strategy to the JS course as well.

mnmkng Jul 16 '24 15:07

@mnmkng I agree this is what we need to fine-tune now. I don't think I'm able to decide or suggest what I like more at this moment though. I'll roll up my sleeves, play with the sections according to the feedback under this PR, and see what comes out and what seems to work best, whether it's closer to one way or the other.

Once it ships, I can also send a few beginners and a few experts (with no scraping knowledge) to go through the materials, collect feedback on what annoys them in the course, and fix it.

honzajavorek Jul 17 '24 08:07

@mnmkng @metalwarrior665 I think I've addressed everything. Can I ask you for a second (quick?) scan of the PR? I tried to identify and cut out as many unnecessary explanations as possible. I tried to focus the main content on scraping, and then:

  • Use TIP admonitions for stuff which is for beginners. I tried to be brief. The attitude of the tips is something like "you may want to recap this..." or "if you happen to do this for the first time...", so although it's for beginners, it shouldn't annoy more experienced people - it's more like "let's level the knowledge, we don't know where your knowledge gaps are". Senior people can just skip those.
  • Use INFO admonitions for advanced stuff, which isn't necessary at all and could be deleted from the course, but which I thought was interesting and might serve as something to explore for curious minds. Junior people can just skip those. Well, anyone can just skip those.

In the end I didn't move anything to a separate page under concepts; I didn't feel that any of those things deserved it. I thought they were better removed or shortened to admonitions.

Do you think such an approach could work?

honzajavorek Jul 18 '24 15:07

I ran the docs locally and read the course in its natural habitat and IMO it looks pretty good! (I didn't test the code)

mnmkng Aug 30 '24 13:08

TODO for myself:

  • [x] rebase
  • [x] changes in the patch file should now go to the config https://github.com/apify/apify-docs/pull/1176/files#diff-a038096cbdea434999e1dce5ab497212f1fe18204dde1a027ce3bdd663261a2a

honzajavorek Sep 09 '24 08:09

This PR will continue here: https://github.com/apify/apify-docs/pull/1197 I'll document all additional changes in the description.

honzajavorek Sep 10 '24 08:09