quire icon indicating copy to clipboard operation
quire copied to clipboard

Image Directory through CDN: src URL is being converted to a relative URL

Open Erin-Cecele opened this issue 1 year ago • 7 comments

Discussed in https://github.com/thegetty/quire/discussions/861

Originally posted by rrolonNelson October 10, 2023 We are delivering our images from a CDN with images stored in a blob storage, but have encountered an issue where the src URL is being converted to a relative URL.

config.yaml imageDir: 'https://blobcdn.nelson-atkins.org/wpmediablob/Starr'

figures.yaml src: figs/5224_c1.png

desired outcome https://blobcdn.nelson-atkins.org/wpmediablob/Starr/figs/5224_c1.png

actual outcome https:/blobcdn.nelson-atkins.org/wpmediablob/Starr/figs/5224_c1.png which resolves to {URL of current page}/https:/blobcdn.nelson-atkins.org/wpmediablob/Starr/figs/5224_c1.png

In build and preview, we have the same issue when you look at the image src URL but in preview, the images will be visible. (We are using the fix for subdirectories in a build where we had to change the file in the node_modules)

Preview: preview-path-src

Build: build-path-src

I think the issue comes from using path.join() which will remove any '//'. Based on the code in image-tag.js it looks like there is a conditional that checks if the src from figures.yaml starts with an HTTP. To resolve the issue in the light box and inline images we opted to have all the image sources in figures.yaml to have the absolute URL however this seems to break the images on the table of content cards since in table-of-contents/item/image.js the check for if the src starts with HTTP is not there so it will join the imageDir and src regardless of if it starts with an HTTP and make it a path also.
thumbnail_Screen Shot 2023-10-04 at 3 50 07 PM

thumbnail_Screen Shot 2023-10-04 at 3 50 42 PM

config.yaml imageDir: 'https://blobcdn.nelson-atkins.org/wpmediablob/Starr'

figures.yaml src: figs/5224_c1.png

desired outcome https://blobcdn.nelson-atkins.org/wpmediablob/Starr/figs/5224_c1.png

actual outcome on table of contents https:/blobcdn.nelson-atkins.org/wpmediablob/Starr/https:/blobcdn.nelson-atkins.org/wpmediablob/Starr/figs/5224_c1.png which resolves to {URL of current page}https:/blobcdn.nelson-atkins.org/wpmediablob/Starr/https:/blobcdn.nelson-atkins.org/wpmediablob/Starr/figs/5224_c1.png Build: toc-image-with-absurl-src

Is there a correct way to use images from a blob/cdn or should we endeavor to add the HTTP check to all the places where image src are generated or add a package to join an absolute URL with a path without turning them into relative URLs?

Erin-Cecele avatar Oct 11 '23 17:10 Erin-Cecele

Thank you @rrolonNelson, for starting this discussion #861. I have now moved it to an official issue, which we will review with our software engineers.

Thank you also to @cbutcosk for your contribution to helping us understand this issue better:

+1, that section of image-tag.js also causes issues if you're pointing to an internal IIIF resource like a staticInlineImage, which also won't be in the asset-images directory and so joining the path produces image 404s

https://github.com/thegetty/quire/discussions/861#discussioncomment-7253418

Erin-Cecele avatar Oct 11 '23 17:10 Erin-Cecele

As a straw man to capture requirements I wrote the following example test cases

const assert = require('node:assert')
const path = require('node:path')
const test = require('node:test')

// "mock" Quire configuration
const config = {
  baseURL: 'https://nelson-atkins.org',
  cdnURL: 'https://blobcdn.nelson-atkins.org/wpmediablob/',
  imageDir: '/assets/images/'
}

/**
 * Resolve the full path for an image
 * Any valid src URI, including using a 'file:' scheme, is passed through unchanged.
 * Relative src paths are qualified with a base URL: either the baseURL or cdnURL.
 *
 * Nota bene: this code is a "strawman" for handling image path use cases,
 * the pattern for integrating this into the current Quire code is TBD...
 */
function resolveImageSrc (src) {
  const cdn = src.startsWith('cdn:')
  src = cdn ? src.slice(4) : src
  let url;
  try {
    url = new URL(src)
  } catch (TypeError) {
    const base = cdn 
      ? config.cdnURL
      : path.join(config.baseURL, config.imageDir);
    url = new URL(src, base)
  } finally {
    return url
  }
}

test('fully qualified URI', () => {
  const src = resolveImageSrc('https://blobcdn.nelson-atkins.org/wpmediablob/Starr/figs/5224_c1.png')
  assert.equal(src.href, 'https://blobcdn.nelson-atkins.org/wpmediablob/Starr/figs/5224_c1.png')  
})

test('path using file: scheme', () => {
  const src = resolveImageSrc('file:Starr/figs/5224_c1.png')
  assert.equal(src.href, 'file:///Starr/figs/5224_c1.png')  
})

test('relative path', () => {
  const src = resolveImageSrc('Starr/figs/5224_c1.png')
  assert.equal(src.href, 'https://nelson-atkins.org/assets/images/Starr/figs/5224_c1.png')  
})

test('path using a pseudo-scheme', () => {
  const src = resolveImageSrc('cdn:Starr/figs/5224_c1.png')
  assert.equal(src.href, 'https://blobcdn.nelson-atkins.org/wpmediablob/Starr/figs/5224_c1.png')  
})

mphstudios avatar Oct 12 '23 01:10 mphstudios

@mphstudios Thanks for a peak into a possible solution. That said from my perspective this seems too much complexity for an img tag.

Amending the existing but buggy check in the image tag to only prepend the image asset path for relative file paths would have solved this issue for me when I encountered it (which coincidentally also involved a possibility of emitting actual file:// URLs in img tag src attributes, so please don't overload that URL protocol!).

cbutcosk avatar Oct 27 '23 14:10 cbutcosk

I am working on this today. I added the check from image-tag.js to table-of-contents/item/image.js and am now able to have the images visible in the table of contents images. I will get a PR up later today for review. However, I am unsure if there are other areas where the img src is being built like this that might cause issues in other places.

rrolonNelson avatar Oct 27 '23 15:10 rrolonNelson

I made a pull request and am open to feedback. I tried to also get the alt text to be added in but it doesn't seem to want to work https://github.com/thegetty/quire/pull/874

rrolonNelson avatar Oct 27 '23 20:10 rrolonNelson

@rrolonNelson #874 looks good, thank you for submitting the fix; I opened #878 to set the figure image alt text.

mphstudios avatar Nov 15 '23 02:11 mphstudios

Citing @mphstudios's comment in PR #874

@rrolonNelson reviewing this solution locally we discovered that the change will break images for the epub output; until we can revisit the solution, if you do not need epub output, the change in this pull-request can be made directly to a local Quire publication.

This issue has not been resolved for all formats and will remain on backlog for the time being.

Erin-Cecele avatar Nov 30 '23 00:11 Erin-Cecele