Doc builds failing due to space issue

Open whoami730 opened this issue 2 months ago • 6 comments

Steps To Reproduce

https://github.com/sagemath/sage/actions/runs/18127830359/job/51587089717?pr=40923

[plotting ] Traceback (most recent call last):
[plotting ]   File "/usr/share/miniconda/envs/sage-dev/lib/python3.11/site-packages/matplotlib/sphinxext/plot_directive.py", line 552, in _run_code
[plotting ]     exec(code, ns)
[plotting ]   File "<string>", line 1, in <module>
[plotting ]   File "<string>", line 47, in sphinx_plot
[plotting ]   File "sage/plot/plot3d/base.pyx", line 1938, in sage.plot.plot3d.base.Graphics3d.save
[plotting ]     self.save_image(filename, **kwds)
[plotting ]   File "sage/plot/plot3d/base.pyx", line 1860, in sage.plot.plot3d.base.Graphics3d.save_image
[plotting ]     self._save_image_png(filename, **kwds)
[plotting ]   File "sage/plot/plot3d/base.pyx", line 1823, in sage.plot.plot3d.base.Graphics3d._save_image_png
[plotting ]     scene.preview_png.save_as(filename)
[plotting ]   File "/usr/share/miniconda/envs/sage-dev/lib/python3.11/site-packages/sage/repl/rich_output/buffer.py", line 311, in save_as
[plotting ]     f.write(self.get())
[plotting ] OSError: [Errno 28] No space left on device [docutils]
[plotting ] The HTML pages are in src/doc/html/en/reference/plotting.
Error building the documentation.
Traceback (most recent call last):
  File "/home/runner/work/sage/sage/src/build-docs.py", line 11, in <module>
    main()
  File "/home/runner/work/sage/sage/src/sage_docbuild/__main__.py", line 548, in main
    build()
  File "/home/runner/work/sage/sage/src/sage_docbuild/builders.py", line 669, in _wrapper
    getattr(DocBuilder, build_type)(self, *args, **kwds)
  File "/home/runner/work/sage/sage/src/sage_docbuild/builders.py", line 142, in f
    runsphinx()
  File "/home/runner/work/sage/sage/src/sage_docbuild/sphinxbuild.py", line 324, in runsphinx
    sys.stderr.raise_errors()
  File "/home/runner/work/sage/sage/src/sage_docbuild/sphinxbuild.py", line 255, in raise_errors
    raise OSError(self._error)
OSError: /usr/share/miniconda/envs/sage-dev/lib/python3.11/site-packages/sage/plot/point.py:docstring of sage.plot.point.point:47: WARNING: Exception occurred in plotting point-8
ninja: build stopped: subcommand failed.
INFO: autodetecting backend as ninja
INFO: calculating backend command to run: /usr/share/miniconda/envs/sage-dev/bin/ninja -C /home/runner/work/sage/sage/builddir doc-html
Error: Process completed with exit code 1.
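
The `OSError: [Errno 28]` above means the filesystem had no free space when the PNG preview was written. A minimal sketch, assuming one wanted to log free space at points during the doc build to see where it runs out: the helper name `free_space_report` and the 2 GiB threshold are hypothetical and not part of the Sage build scripts.

```python
import shutil


def free_space_report(path=".", min_free_gib=2.0):
    """Return (free_gib, ok) for the filesystem that contains `path`.

    `min_free_gib` is an arbitrary illustrative threshold, not a value
    taken from the Sage CI configuration.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    free_gib = usage.free / 2**30
    return free_gib, free_gib >= min_free_gib


if __name__ == "__main__":
    free_gib, ok = free_space_report(".")
    print(f"free: {free_gib:.2f} GiB, enough: {ok}")
```

Calling this before and after the doc-html step would show how much space the step itself consumes on the runner.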

Checklist

  • [x] I have searched the existing issues for a bug report that matches the one I want to file, without success.
  • [x] I have read the documentation and troubleshooting guide.

whoami730 avatar Sep 30 '25 12:09 whoami730

I can't reproduce this (https://github.com/sagemath/sage/pull/40950), so I'm not sure what can be done about it.

user202729 avatar Oct 03 '25 16:10 user202729

The issue happens more often now.

user202729 avatar Nov 15 '25 06:11 user202729

It looks like every build on every PR creates new cache entries, and the cache key is timestamped. https://github.com/sagemath/sage/actions/caches

I've experienced out-of-space issues on my own Sage fork, where the only person using the CI is myself.

Is there any reason not to do the following:

  1. Have 1 cache per system (or per system/python combination) on develop, depending on space maybe only for the most important systems (i.e. Ubuntu, Fedora, Mac, Windows)
  2. PRs restore the cache from develop, and have to recompile anything that has changed. For PRs that only change Python code, restoring the cache from develop should be fine. For PRs that change small amounts of Cython code (unless it is in a heavily imported file like element.pyx) the recompilation time should be minimal as most of the cache will work. For PRs that make major changes to Cython code or update important files, this would slow down subsequent builds.
  3. PRs do not save their own cache

This would slow down builds for PRs that make major changes to Cython code (or make minor changes to important files like element.pyx), or that update dependencies, but those are fairly infrequent. This would greatly reduce our space usage. I guess hypothetically if something went wrong with the build on develop then all PR builds would be slow because they can't use the cache until develop is fixed.
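
The three-point proposal above can be sketched as a small decision function. This is only an illustration of the policy, not the actual workflow configuration: the key format and the names `CachePlan` and `plan_cache` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachePlan:
    restore_key: str  # cache entry to restore before building
    save: bool        # whether this run may write a cache entry


def plan_cache(branch: str, system: str, python: str) -> CachePlan:
    """Model the proposed policy: one cache per system/Python pair,
    written only by develop and restored by every run.

    The key format is made up for illustration; note that it carries
    no timestamp, so each system/Python pair maps to a single entry.
    """
    key = f"build-{system}-py{python}-develop"
    return CachePlan(restore_key=key, save=(branch == "develop"))


if __name__ == "__main__":
    print(plan_cache("develop", "ubuntu", "3.11"))
    print(plan_cache("some-pr-branch", "ubuntu", "3.11"))
```

Under this model a PR run and a develop run on the same system restore the same entry, and only the develop run is allowed to replace it.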

vincentmacri avatar Nov 17 '25 21:11 vincentmacri

Least recently used caches will be automatically evicted to limit the total cache storage to 10 GB.

I don't think it really causes an issue. Each build only fetches about one cache item (roughly 60 MB) anyway.

Note that there's an inherent "race condition" where you may have two actions fetch cache X, test for 30 minutes, then concurrently write to cache X.
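
To make the eviction rule quoted above concrete, here is a toy model of least-recently-used eviction under a size cap. It is a sketch under simplifying assumptions (sizes in MB, usage order given explicitly) and says nothing about how GitHub implements eviction internally; `evict_lru` is a hypothetical name.

```python
from collections import OrderedDict


def evict_lru(entries, limit_mb):
    """Drop least recently used entries until the total fits the limit.

    `entries` maps cache key -> size in MB and is ordered from least
    to most recently used. Returns (kept, evicted_keys).
    """
    kept = OrderedDict(entries)
    evicted = []
    while kept and sum(kept.values()) > limit_mb:
        key, _size = kept.popitem(last=False)  # pop the oldest entry
        evicted.append(key)
    return kept, evicted


if __name__ == "__main__":
    caches = OrderedDict([("run-1", 60), ("run-2", 60), ("run-3", 60)])
    kept, evicted = evict_lru(caches, limit_mb=120)
    print(list(kept), evicted)
```

With the figures from this thread (a 10 GB cap and roughly 60 MB per entry), about 170 entries would fit before the oldest ones start to be dropped.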

user202729 avatar Nov 18 '25 00:11 user202729

@tobiasdiez Any idea whether recent changes to doc-html could be causing this?

user202729 avatar Nov 18 '25 00:11 user202729

We have had out-of-space issues for a very long time. My guess (and it really is no more than a guess) is that they are due to the symlinks and the static files being generated multiple times. In that case, https://github.com/sagemath/sage/pull/41156 would solve this issue (and a couple of other problems as well).
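
One way to test this guess would be to measure how many bytes in the built documentation tree are taken up by identical copies of the same file. The following is a rough sketch, not part of `sage_docbuild`; the function name `duplicated_bytes` is hypothetical, and the directory to scan (for example `src/doc/html`, as in the log above) is an assumption.

```python
import hashlib
import os
from collections import defaultdict


def duplicated_bytes(root):
    """Return the number of bytes used by redundant copies under `root`.

    Files are grouped by SHA-256 of their content; every copy beyond
    the first in a group counts as redundant. Symlinks are skipped
    because they take no extra space. Hard links are not detected and
    would be overcounted.
    """
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            by_digest[digest].append(path)
    wasted = 0
    for paths in by_digest.values():
        if len(paths) > 1:
            wasted += os.path.getsize(paths[0]) * (len(paths) - 1)
    return wasted
```

A large result relative to the total size of the tree would support the idea that repeated static files account for much of the disk usage.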

tobiasdiez avatar Nov 18 '25 04:11 tobiasdiez

#41248 needs review.

kwankyu avatar Dec 02 '25 13:12 kwankyu