[WIP] Add GitLab CI workflow for incremental HTML build

Open picnixz opened this issue 1 year ago • 16 comments

Closes #11556.

@dbitouze could you check whether the workflow works in production using some fresh project (I don't have any way to test it now).

picnixz avatar Aug 15 '23 09:08 picnixz

Closes #11556.

@dbitouze could you check whether the workflow works in production using some fresh project (I don't have any way to test it now).

Yes, I'll do it (see below).

Since I encountered some issues when I followed the tutorial, I'll write them here:

  • [ ] At this step, it would be nice for the sphinx-build's version to be updated.

  • [ ] At this step, there are currently (sphinx 7.1.2) some daunting warnings:

    /home/bitouze/.local/lib/python3.8/site-packages/sphinxcontrib/applehelp/__init__.py:24: RemovedInSphinx80Warning: The alias 'sphinx.util.SkipProgressMessage' is deprecated, use 'sphinx.util.display.SkipProgressMessage' instead. Check CHANGES for Sphinx API modifications.
      from sphinx.util import SkipProgressMessage, progress_message
    /home/bitouze/.local/lib/python3.8/site-packages/sphinxcontrib/applehelp/__init__.py:24: RemovedInSphinx80Warning: The alias 'sphinx.util.progress_message' is deprecated, use 'sphinx.http_date.epoch_to_rfc1123' instead. Check CHANGES for Sphinx API modifications.
      from sphinx.util import SkipProgressMessage, progress_message
    /home/bitouze/.local/lib/python3.8/site-packages/sphinxcontrib/htmlhelp/__init__.py:26: RemovedInSphinx80Warning: The alias 'sphinx.util.progress_message' is deprecated, use 'sphinx.http_date.epoch_to_rfc1123' instead. Check CHANGES for Sphinx API modifications.
      from sphinx.util import progress_message

  • [ ] At this step, it's too bad that make + TAB completes only to make help.

  • [ ] At this step, we're facing the following warning:

    the "version" configuration parameter cannot be empty for EPUB3.

  • [ ] At this step, it is not clear whether this code is supposed to be the whole content of the file (here docs/source/usage.rst) or to be added to it (and, in that case, where, since this isn't specified). The same remark applies to the next code snippets.

  • [ ] This line has different colors than what I see e.g. on Firefox.

  • [ ] At this step, it is not clear where the file lumache.py is supposed to be created. If it is at the root of the docs directory, this step returns:

    Failed example:
        lumache.get_random_ingredients()
    Exception raised:
        Traceback (most recent call last):
          File "/usr/lib64/python3.8/doctest.py", line 1336, in __run
            exec(compile(example.source, filename, "single",
          File "<doctest default[1]>", line 1, in <module>
            lumache.get_random_ingredients()
        NameError: name 'lumache' is not defined

    Hence, the rest of this section cannot be tested, and I didn't test the “Automatic documentation generation from code” chapter either.

  • [ ] At this step, the README.rst file is mentioned as if it were already known, but it wasn't introduced before. And, once again, it is not clear where the file lumache.py is supposed to be located (ah, OK, understood: it should be in the parent directory of docs; this should be specified). In fact, an archive of the whole project would be welcome at this stage.
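
For the NameError above, one likely cause (an assumption, not something confirmed in this thread) is that the project root holding lumache.py is not on sys.path when Sphinx runs the doctests, so the preceding `import lumache` example fails first. A minimal sketch of the usual conf.py adjustment, assuming the layout described above (lumache.py in the parent directory of docs, and the configuration in docs/source/conf.py); the helper name `project_root` is hypothetical:

```python
import sys
from pathlib import Path


def project_root(conf_file: str) -> Path:
    """Return the directory two levels above the directory holding conf.py.

    For <root>/docs/source/conf.py: parents[0] is docs/source,
    parents[1] is docs, and parents[2] is <root> (where lumache.py lives).
    """
    return Path(conf_file).resolve().parents[2]


def add_project_root(conf_file: str) -> None:
    """Put the project root first on sys.path so `import lumache` works."""
    root = str(project_root(conf_file))
    if root not in sys.path:
        sys.path.insert(0, root)


# In docs/source/conf.py one would call:
#     add_project_root(__file__)
```

If the directory layout differs, the index passed to `parents` has to change accordingly.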

Now, about the new “GitLab Pages (incremental build)” section.

  • [ ] The code of the .gitlab-ci.yml file introduced at line 293 cannot be seen here.
  • [ ] The job failed because of the missing furo module.
  • [ ] To see which files are rebuilt, I changed make html into sphinx-build source build/html -j auto -vv (the -j auto option could help to speed up the build when many source files have changed).
  • [ ] It took me a lot of time to understand why your .gitlab-ci.yml didn't work. The culprit is the official sphinxdoc/sphinx Docker image: I don't know why, but the mgasphinx/sphinx-html image works nicely. You can see this with the following two pipelines triggered after just a change in the .gitlab-ci.yml which:

Here is a working .gitlab-ci.yml file:

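stages:
  - deploy

pages:
  # Keep the build directory between pipelines so previous output can be reused.
  cache:
    paths:
      - docs/build
  stage: deploy
  # The official image did not give incremental builds in my tests:
  # image: sphinxdoc/sphinx
  image: mgasphinx/sphinx-html
  before_script:
    - apt-get update
    - apt-get install --no-install-recommends -y make git-restore-mtime
    - pip3 install --upgrade pip
    # Theme required by the documentation being built.
    - pip3 install --upgrade furo
  script:
    # Reset file mtimes to their last commit time instead of the checkout time.
    - git restore-mtime
    - cd docs && sphinx-build source build/html -j auto -vv
  after_script:
    # GitLab Pages publishes the content of the public directory.
    - cp -r docs/build/html/ ./public/
  artifacts:
    paths:
      - public
  rules:
    # Deploy only from the default branch.
    - if: $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH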

dbitouze avatar Aug 15 '23 15:08 dbitouze

Thank you for your feedback.

@AA-Turner Do you have some ideas why our official docker image may not work for this case but another one (which only seems to remove /tmp/* files in addition) seems to work?

picnixz avatar Aug 16 '23 07:08 picnixz

Closes #11556.

IMHO, this doesn't close this issue which is about “Document how sphinx's change detection works” since, AFAICS, this remains unclear.

dbitouze avatar Aug 17 '23 06:08 dbitouze

Ah, actually the issue should be renamed to something like "make gitlab CI workflow incremental" (since that turned out to be your main concern).

Concerning the way changes are detected, it's probably another task that should get its own issue imo.

picnixz avatar Aug 17 '23 07:08 picnixz

Ah, actually the issue should be renamed to something like "make gitlab CI workflow incremental" (since that turned out to be your main concern).

Concerning the way changes are detected, it's probably another task that should get its own issue imo.

Agreed, but the current fix for this issue is pretty weak: changing the Docker image embedding sphinx can easily break it. And, AFAIU, this is related to (misunderstanding of) how sphinx change detection works.

dbitouze avatar Aug 17 '23 14:08 dbitouze

Sorry for the late reply.

  1. I still want to keep the original Sphinx image (at least it won't add an extra dependency or rely on an external repository that may be closed one day).
  2. The change detection algorithm is something that belongs more on the developer side, I think. A normal user doesn't really care about it, and it's quite an advanced topic. I'm willing to write down what I explained on the issue, but I think that you wouldn't have asked how changes are detected if the original CI/CD script had worked, right? That's why I said that this PR should have fixed your original issue.
  3. I won't be available until late September / mid-October, so there won't be any progress soon.

picnixz avatar Aug 25 '23 10:08 picnixz

Sorry for the late reply.

You're welcome!

1. I still want to keep the original Sphinx image (at least it won't add an extra dependency or rely on an external repository that may be closed one day).

You asked @AA-Turner:

Do you have some ideas why our official docker image may not work for this case but another one [...]?

Is there any progress on this subject?

2. The change detection algorithm is something that belongs more on the developer side, I think. A normal user doesn't really care about it, and it's quite an advanced topic.

Indeed. BTW, why rely on dates rather than on an md5sum fingerprint?

I'm willing to write down what I explained on the issue, but I think that you wouldn't have asked how changes are detected if the original CI/CD script had worked, right?

Right! :smile:

That's why I said that this PR should have fixed your original issue.

Okay.

3. I won't be available until late September / mid-October, so there won't be any progress soon.

No pressure! :wink:

dbitouze avatar Aug 25 '23 10:08 dbitouze

Indeed. BTW, why rely on dates rather than on an md5sum fingerprint?

Because there are projects with 70k+ RST files and maybe even more HTML files. The time needed to compute an md5 also depends on the size of each file, so this would slow the whole process down a lot. The date approach is faster (but a bit tricky to get right, as we can see).
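
To make the trade-off concrete, here is a minimal sketch of the two strategies. It is not Sphinx's actual implementation; the function names are hypothetical and the output file simply stands in for whatever the builder records. It shows why a fresh CI checkout defeats the date approach even though the content is unchanged:

```python
import hashlib
import os
import tempfile
from pathlib import Path


def changed_by_mtime(source: Path, output: Path) -> bool:
    """Date approach: one stat() call per file, no content is read."""
    if not output.exists():
        return True
    return source.stat().st_mtime > output.stat().st_mtime


def changed_by_hash(source: Path, recorded_digest: str) -> bool:
    """Fingerprint approach: the whole file is read, so cost grows with size."""
    digest = hashlib.md5(source.read_bytes()).hexdigest()
    return digest != recorded_digest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp, "index.rst")
        out = Path(tmp, "index.html")
        src.write_text("Title\n=====\n")
        out.write_text("<h1>Title</h1>\n")
        recorded = hashlib.md5(src.read_bytes()).hexdigest()

        # Simulate a cached output and a freshly checked-out source:
        # the content is identical, but the source now looks newer.
        os.utime(out, (1_000_000_000, 1_000_000_000))
        os.utime(src, (2_000_000_000, 2_000_000_000))

        print(changed_by_mtime(src, out))       # True
        print(changed_by_hash(src, recorded))   # False
```

This is why the workflow above restores mtimes before building: with the date approach, the timestamps have to be meaningful for unchanged files to be skipped.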

picnixz avatar Aug 25 '23 10:08 picnixz

Just FYI, I also wrote a workflow for incremental HTML builds (but for GitHub); it is based on git-restore-mtime too. I am not sure whether it can be added to Sphinx's documentation.

Besides, not only do the timestamps of the rst files need to be restored, the timestamps of the HTML theme files need to be restored too: https://github.com/sphinx-notes/pages/blob/1ef210dab7429dfcbdb06346d279b746573d147a/main.sh#L64

SilverRainZ avatar Dec 25 '23 04:12 SilverRainZ

@SilverRainZ Thank you for what you linked. Do you think it's possible for you to take over my PR and use your approach since it appears to work?

picnixz avatar Dec 30 '23 12:12 picnixz

I am glad to help with this. But I still have some questions:

  • As #11556 said, we should document how the incremental HTML build works. Once it is documented, I think Sphinx should provide a compatibility guarantee at some level. Does this meet the Sphinx team’s expectations?
  • Do we need to provide an out-of-the-box GitLab workflow, or only a description of the mechanism? I am not so familiar with GitLab workflows, so it may take some time.

SilverRainZ avatar Jan 01 '24 15:01 SilverRainZ

By compatibility guarantee, do you mean in terms of OS, sphinx versions, VMs, servers etc?

Also, I think what users want first is a working script they can just use if they use GitLab (but we should probably tell them which GitLab version is supported; if possible I want it to be version-agnostic).

As for documenting, it should be a separate task. As I said, people don't bother about how it works until it fails. The issue is titled "document" but I think it should be more "make it work".

Don't worry about the time because, while it's a reasonable feature, I don't know whether there are many people asking for it. I picked it up at that time since I had some time but I wouldn't tell you to lose time if you don't have much.


Now, IIRC the main issue was with the Docker image. I still don't know whether the official image is synced or not, and I don't know whether Adam is actually willing to use an external one.

And if we choose one image, we should stick to it, I guess. Note that I am no expert in containers and the like, so I don't really know why our image fails.

Finally, you could indicate in the workflow "hey, put some additional commands here if you want to specify your own requirements" (in my example the furo module was missing because I assumed that the docs being built have no deps).

picnixz avatar Jan 01 '24 15:01 picnixz

As for documenting, it should be a separate task. As I said, people don't bother about how it works until it fails. The issue is titled "document" but I think it should be more "make it work".

@picnixz Feel free to retitle it :smiley:

dbitouze avatar Jan 01 '24 16:01 dbitouze

By compatibility guarantee, do you mean in terms of OS, sphinx versions, VMs, servers etc?

I mean the compatibility of the code implementing "Sphinx's change detection": once it changes, the related workflows will break.

The issue is titled "document" but I think it should be more "make it work".

I get it.


How about writing a small Python script to do this instead of writing shell scripts in .yml files? Two advantages:

  • We need to restore the mtime of Sphinx HTML theme files (themes may be newly installed from PyPI, so they have fresh mtimes), and Python can find these files easily.
  • We can reuse it for both GitHub Actions and GitLab workflows, and even more CI/CD systems.
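
As a rough illustration of what such a script could look like (a sketch under assumptions, not the script from this PR or from sphinx-notes/pages): tracked files get the time of their last commit from `git log`, while files with no Git history, such as a freshly installed theme, are pinned to a fixed timestamp. The function names and the fixed-timestamp strategy for themes are hypothetical.

```python
import os
import subprocess
from pathlib import Path
from typing import Iterable, Optional


def last_commit_time(repo: Path, path: Path) -> Optional[int]:
    """Unix time of the last commit that touched `path`, or None if untracked."""
    result = subprocess.run(
        ["git", "log", "-1", "--format=%ct", "--", str(path)],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    stamp = result.stdout.strip()
    return int(stamp) if stamp else None


def restore_tracked_mtimes(repo: Path, files: Iterable[Path]) -> int:
    """Set the mtime of each file (given relative to `repo`) to its last commit time."""
    restored = 0
    for path in files:
        stamp = last_commit_time(repo, path)
        if stamp is not None:
            os.utime(repo / path, (stamp, stamp))
            restored += 1
    return restored


def pin_mtimes(root: Path, timestamp: int) -> int:
    """Give every file under `root` the same fixed mtime (e.g. a theme directory)."""
    pinned = 0
    for path in root.rglob("*"):
        if path.is_file():
            os.utime(path, (timestamp, timestamp))
            pinned += 1
    return pinned
```

Note that `git log` only sees the history that was actually fetched, so in a shallow clone the reported commit times can be wrong; the CI job would need enough history for this to be reliable.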

SilverRainZ avatar Jan 02 '24 03:01 SilverRainZ

Just FYI, I will try to work on this next week :D

SilverRainZ avatar Feb 02 '24 14:02 SilverRainZ

Ah thank you. Don't worry about deadlines because as I said, I'm only passing by today. I'll be back to business in 2 weeks or so.

picnixz avatar Feb 02 '24 15:02 picnixz