Create/publish the user documentation
At this stage it seems necessary to publish user documentation. Read the Docs (https://readthedocs.com) seems a good candidate.
Thanks to @benoit74 for reporting the problem
I'm not sure that we need to transform the existing documentation. Adding a paragraph to every scraper README with a link pointing to the existing one would be sufficient from my PoV.
Probably a good first step
@benoit74 Your link is not the scraperlib documentation. Explaining the goals, the paradigm of the scraperlib, and its dev-facing API is the purpose of this ticket.
I don't think this is the correct way to address what I believe the underlying issue is. What we need is a general "make / contribute to a scraper" documentation. This would fit with the other documents referenced above and would be actually useful, while a standalone scraperlib doc won't be of much help for that.
Sure, we could generate the auto-doc off the source code, but the value versus just browsing the code on GitHub would be minimal. I'm just afraid this would close this ticket without helping onboarding.
@rgaudin Not 100% sure I get you right, but I can only disagree with a comment which, in a nutshell, says "no need to document the library API, devs who want to use it should read the code".
For the rest, general (cross-repo) documentation can have a place in the "overview" repo. But how to use the scraperlib does not match this definition IMO.
Let's discuss this in detail at the next meeting.
This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.
As someone who just created a new scraper, here are my thoughts:
- It was difficult for me to understand what capabilities existed in the scraperlib versus what I was expected to build.
- It wasn't clear to me whether the documents/code here, the GitHub wiki, or the openZIM wiki was preferred.
I like automatically updating docs based on current source code (e.g., GitHub Pages triggered on release/merge) because it's a faster way for me to understand the shape/intent of a library without diving into the details.
> I like automatically updating docs based on current source code
We all do, I believe, and I think that scraperlib would be a good candidate! Do you want to take a look at it? I think we mostly need this set up, with the appropriate actions and pointers to the right docstring convention.
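For illustration, here is roughly what a Google-style docstring (a common convention that doc generators like mkdocstrings support) might look like. The function below is hypothetical, not part of the actual scraperlib API:

```python
from pathlib import Path

import requests  # assumption: plain requests, not scraperlib internals


def download_file(url: str, dest: Path, timeout: int = 30) -> int:
    """Download a remote file to a local path.

    Args:
        url: Full URL of the resource to fetch.
        dest: Local path to write the downloaded bytes to.
        timeout: Per-request timeout, in seconds.

    Returns:
        The number of bytes written to ``dest``.

    Raises:
        requests.HTTPError: If the server responds with an error status.
    """
    resp = requests.get(url, timeout=timeout)
    resp.raise_for_status()
    dest.write_bytes(resp.content)
    return len(resp.content)
```

Whichever convention is picked, the structured sections (Args/Returns/Raises) are what the doc generator turns into readable API pages.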
We'd host it at readthedocs I think, as we do for other projects.
I believe libzim uses Sphinx, but I have also used mkdocs in the past and I find it easy to configure, flexible, readable, and good-looking. It's an active and popular tool.
Here's a good example of config for it: https://github.com/pawamoy/aria2p/blob/main/mkdocs.yml
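For a rough idea, a minimal mkdocs.yml in that spirit might look like the sketch below; the theme, plugin options, and page names are assumptions, not the actual scraperlib configuration:

```yaml
site_name: python-scraperlib
theme:
  name: material        # assumes mkdocs-material is installed

plugins:
  - search
  - mkdocstrings:       # renders API pages from the Python docstrings
      handlers:
        python:
          options:
            docstring_style: google

nav:
  - Home: index.md
  - API reference: api.md
```

mkdocstrings is the piece that turns docstrings into browsable API pages, which is what makes the "docs generated from source" workflow possible.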
@rgaudin Sure, this is something I can look at, probably in four-ish weeks, assuming the issue is still open.
I've got a bit more to do with the DevDocs scraper to improve the UX after the release; once that's done, I can scoot over to this, assuming no other show-stoppers.
I've published the documentation at https://python-scraperlib.readthedocs.io/en/latest/
Obviously, only the current version is published, since the proper mkdocs code/configuration to build docs for former versions is missing (a sketch of a typical build configuration follows below). I don't think it is worth it / possible to invest time in publishing docs for previous versions. Once a new version is published, we should start to pin its documentation (in addition to "latest", which we already have).
Can you please review the online website and advise on what is left to do from your PoV?
Aside from fixing docstrings so that everything fits nicely, I don't see many "needed" changes. Probably many things can (and hopefully will) be fine-tuned, but nothing is mandatory so far.
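For context, a Read the Docs build of an mkdocs site is usually driven by a .readthedocs.yaml at the repository root, roughly along these lines (a sketch under assumed paths and versions, not the actual file used here):

```yaml
version: 2

build:
  os: ubuntu-22.04
  tools:
    python: "3.11"      # assumed Python version

mkdocs:
  configuration: mkdocs.yml

python:
  install:
    - requirements: docs/requirements.txt   # assumed location of doc dependencies
```

Once release tags are pushed, the corresponding versions can be activated from the Read the Docs dashboard so they are published alongside "latest".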
This is great 🚀 I've looked at it briefly and I think the overall doc is OK. Content is beyond the scope of this and will be improved over time.
I suppose this is enough for now then. Since there is no more feedback, let's close this issue as done.