sphinx icon indicating copy to clipboard operation
sphinx copied to clipboard

Regression: restore extraction of data-URI images from source for builders whose output formats do not support them natively.

Open jayaddison opened this issue 1 year ago • 2 comments

Feature or Bugfix

  • Bugfix

Purpose

  • Restore extraction of images from data: URIs even when the builder has the supported_remote_images flag set to False (the default).

Ambiguity

I think the change could seem counter-intuitive. If I saw this changeset I might ask: if remote images are disabled, then why should we parse and extract an image from a URI on an image node?

The way I've resolved that question while developing it is:

These data-URI images can be considered 'local' to the source -- they're found in the input reStructuredText content, so we don't need to check the builder's supported_remote_images attribute at all during the DataURIExtractor.match method; these are not remote images.

Detail

  • Removes a conditional check that caused a regression when it was fixed to compare against a compatible Python type in 0a76199f994ea07ad8429fab3e1798fe01cdde80.

Expected behaviour description

  • When post-transforming source image nodes using the DataURIExtractor:
    • If the builder's file formats supports data: URIs natively, leave the URI reference on the image unmodified, and allow the builder to process it normally.
    • Otherwise, if the URI begins with data:, and the contents are declared as having a supported MIME type, then decode the contents and write them to a hash-based filename with the corresponding filetype extension suffixed.

Relates

  • Resolves #12331.

jayaddison avatar May 02 '24 21:05 jayaddison

I'll remove the shoutiness from the pull request description; that was written at a moment where I'd emerged from confusion about what the supported_data_uri_images flag on a builder means.

jayaddison avatar May 08 '24 22:05 jayaddison

@chrisjsewell if+when you have time, I'd appreciate review on this changeset:

  • It resolves what seems like a regression: some output images began not to appear in (LaTeX builder) output.
  • The images were specified in source documents using data: URIs (image encoded inline in-source).
  • The fix is essentially to allow reading those inline images regardless of what the target output format is.
  • The config setting supported_data_uri_images is interpreted as "output format natively supports data: URIs" -- so in those cases we don't extract the inline image and write it to file, instead we simply pass the data: URI as-is into the written output format.

jayaddison avatar Jun 28 '24 11:06 jayaddison

Thanks @AA-Turner!

jayaddison avatar Jul 03 '24 09:07 jayaddison