"make publish" fails when the plugin attempts to process the first file with images

Open netllama opened this issue 2 years ago • 11 comments

I'm trying to set up my first Pelican-based blog on a Fedora Linux system using pelican-4.8.0 with pelican-image-process-3.0.3.

If I run make html or make regenerate, everything works great. However, once I'm ready to generate the final production content with make publish, it blows up on the first file that has image references, with IsADirectoryError: [Errno 21] Is a directory: '/home/netllama/stuff/llamaland/content/'. That's literally the content directory with all of my markdown files, so I'm not at all sure why this is an issue.

I re-ran with --debug:

pelican --debug /home/netllama/stuff/llamaland/content -o /home/netllama/stuff/llamaland/output -s /home/netllama/stuff/llamaland/publishconf.py

and got the following output, but I don't understand why it's failing and expecting the content directory to be anything other than a directory:

INFO     Writing /home/netllama/stuff/llamaland/output/western-usa.html    writers.py:212
DEBUG    [image_process] harvesting '/home/netllama/stuff/llamaland/output/western-usa.html'    image_process.py:266
CRITICAL IsADirectoryError: [Errno 21] Is a directory: '/home/netllama/stuff/llamaland/content/'    __init__.py:566
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/netllama/stuff/llamaland_virtenv/lib64/python3.10/site-packages/pelican/__init__.py:562 in │
│ main                                                                                             │
│                                                                                                  │
│   559 │   │   │   watcher = FileSystemWatcher(args.settings, Readers, settings)                  │
│   560 │   │   │   watcher.check()                                                                │
│   561 │   │   │   with console.status("Generating..."):                                          │
│ ❱ 562 │   │   │   │   pelican.run()                                                              │
│   563 │   except KeyboardInterrupt:                                                              │
│   564 │   │   logger.warning('Keyboard interrupt received. Exiting.')                            │
│   565 │   except Exception as e:                                                                 │
│                                                                                                  │
│ /home/netllama/stuff/llamaland_virtenv/lib64/python3.10/site-packages/pelican/__init__.py:127 in │
│ run                                                                                              │
│                                                                                                  │
│   124 │   │                                                                                      │
│   125 │   │   for p in generators:                                                               │
│   126 │   │   │   if hasattr(p, 'generate_output'):                                              │
│ ❱ 127 │   │   │   │   p.generate_output(writer)                                                  │
│   128 │   │                                                                                      │
│   129 │   │   signals.finalized.send(self)                                                       │
│   130                                                                                            │
│                                                                                                  │
│ /home/netllama/stuff/llamaland_virtenv/lib64/python3.10/site-packages/pelican/generators.py:703  │
│ in generate_output                                                                               │
│                                                                                                  │
│   700 │                                                                                          │
│   701 │   def generate_output(self, writer):                                                     │
│   702 │   │   self.generate_feeds(writer)                                                        │
│ ❱ 703 │   │   self.generate_pages(writer)                                                        │
│   704 │   │   signals.article_writer_finalized.send(self, writer=writer)                         │
│   705 │                                                                                          │
│   706 │   def refresh_metadata_intersite_links(self):                                            │
│                                                                                                  │
│ /home/netllama/stuff/llamaland_virtenv/lib64/python3.10/site-packages/pelican/generators.py:605  │
│ in generate_pages                                                                                │
│                                                                                                  │
│   602 │   │                                                                                      │
│   603 │   │   # to minimize the number of relative path stuff modification                       │
│   604 │   │   # in writer, articles pass first                                                   │
│ ❱ 605 │   │   self.generate_articles(write)                                                      │
│   606 │   │   self.generate_period_archives(write)                                               │
│   607 │   │   self.generate_direct_templates(write)                                              │
│   608                                                                                            │
│                                                                                                  │
│ /home/netllama/stuff/llamaland_virtenv/lib64/python3.10/site-packages/pelican/generators.py:474  │
│ in generate_articles                                                                             │
│                                                                                                  │
│   471 │   │   │   self.hidden_translations, self.hidden_articles                                 │
│   472 │   │   ):                                                                                 │
│   473 │   │   │   signals.article_generator_write_article.send(self, content=article)            │
│ ❱ 474 │   │   │   write(article.save_as, self.get_template(article.template),                    │
│   475 │   │   │   │     self.context, article=article, category=article.category,                │
│   476 │   │   │   │     override_output=hasattr(article, 'override_save_as'),                    │
│   477 │   │   │   │     url=article.url, blog=True)                                              │
│                                                                                                  │
│ /home/netllama/stuff/llamaland_virtenv/lib64/python3.10/site-packages/pelican/writers.py:269 in  │
│ write_file                                                                                       │
│                                                                                                  │
│   266 │   │   │   # no pagination                                                                │
│   267 │   │   │   localcontext = _get_localcontext(                                              │
│   268 │   │   │   │   context, name, kwargs, relative_urls)                                      │
│ ❱ 269 │   │   │   _write_file(template, localcontext, self.output_path, name,                    │
│   270 │   │   │   │   │   │   override_output)                                                   │
│   271                                                                                            │
│                                                                                                  │
│ /home/netllama/stuff/llamaland_virtenv/lib64/python3.10/site-packages/pelican/writers.py:216 in  │
│ _write_file                                                                                      │
│                                                                                                  │
│   213 │   │   │                                                                                  │
│   214 │   │   │   # Send a signal to say we're writing a file with some specific                 │
│   215 │   │   │   # local context.                                                               │
│ ❱ 216 │   │   │   signals.content_written.send(path, context=localcontext)                       │
│   217 │   │                                                                                      │
│   218 │   │   def _get_localcontext(context, name, kwargs, relative_urls):                       │
│   219 │   │   │   localcontext = context.copy()                                                  │
│                                                                                                  │
│ /home/netllama/stuff/llamaland_virtenv/lib64/python3.10/site-packages/blinker/base.py:263 in     │
│ send                                                                                             │
│                                                                                                  │
│   260 │   │   │   │   │   │   │   '%s given' % len(sender))                                      │
│   261 │   │   else:                                                                              │
│   262 │   │   │   sender = sender[0]                                                             │
│ ❱ 263 │   │   return [(receiver, receiver(sender, **kwargs))                                     │
│   264 │   │   │   │   for receiver in self.receivers_for(sender)]                                │
│   265 │                                                                                          │
│   266 │   def has_receivers_for(self, sender):                                                   │
│                                                                                                  │
│ /home/netllama/stuff/llamaland_virtenv/lib64/python3.10/site-packages/blinker/base.py:263 in     │
│ <listcomp>                                                                                       │
│                                                                                                  │
│   260 │   │   │   │   │   │   │   '%s given' % len(sender))                                      │
│   261 │   │   else:                                                                              │
│   262 │   │   │   sender = sender[0]                                                             │
│ ❱ 263 │   │   return [(receiver, receiver(sender, **kwargs))                                     │
│   264 │   │   │   │   for receiver in self.receivers_for(sender)]                                │
│   265 │                                                                                          │
│   266 │   def has_receivers_for(self, sender):                                                   │
│                                                                                                  │
│ /home/netllama/stuff/llamaland_virtenv/lib64/python3.10/site-packages/pelican/plugins/image_proc │
│ ess/image_process.py:280 in harvest_images                                                       │
│                                                                                                  │
│   277 │   │   context["IMAGE_PROCESS_COPY_EXIF_TAGS"] = False                                    │
│   278 │                                                                                          │
│   279 │   with open(path, "r+", encoding=context["IMAGE_PROCESS_ENCODING"]) as f:                │
│ ❱ 280 │   │   res = harvest_images_in_fragment(f, context)                                       │
│   281 │   │   f.seek(0)                                                                          │
│   282 │   │   f.truncate()                                                                       │
│   283 │   │   f.write(res)                                                                       │
│                                                                                                  │
│ /home/netllama/stuff/llamaland_virtenv/lib64/python3.10/site-packages/pelican/plugins/image_proc │
│ ess/image_process.py:354 in harvest_images_in_fragment                                           │
│                                                                                                  │
│   351 │   │                                                                                      │
│   352 │   │   elif d["type"] == "responsive-image" and "srcset" not in img.attrs:                │
│   353 │   │   │   # srcset image specification.                                                  │
│ ❱ 354 │   │   │   build_srcset(img, settings, derivative)                                        │
│   355 │   │                                                                                      │
│   356 │   │   elif d["type"] == "picture":                                                       │
│   357 │   │   │   # Multiple source (picture) specification.                                     │
│                                                                                                  │
│ /home/netllama/stuff/llamaland_virtenv/lib64/python3.10/site-packages/pelican/plugins/image_proc │
│ ess/image_process.py:446 in build_srcset                                                         │
│                                                                                                  │
│   443                                                                                            │
│   444 def build_srcset(img, settings, derivative):                                               │
│   445 │   path = compute_paths(img, settings, derivative)                                        │
│ ❱ 446 │   if not is_img_identifiable(path.source):                                               │
│   447 │   │   logger.warn(                                                                       │
│   448 │   │   │   "%s Skipping image %s that could not be identified by Pillow",                 │
│   449 │   │   │   LOG_PREFIX,                                                                    │
│                                                                                                  │
│ /home/netllama/stuff/llamaland_virtenv/lib64/python3.10/site-packages/pelican/plugins/image_proc │
│ ess/image_process.py:438 in is_img_identifiable                                                  │
│                                                                                                  │
│   435                                                                                            │
│   436 def is_img_identifiable(img_filepath):                                                     │
│   437 │   try:                                                                                   │
│ ❱ 438 │   │   Image.open(img_filepath)                                                           │
│   439 │   │   return True                                                                        │
│   440 │   except (FileNotFoundError, UnidentifiedImageError):                                    │
│   441 │   │   return False                                                                       │
│                                                                                                  │
│ /home/netllama/stuff/llamaland_virtenv/lib64/python3.10/site-packages/PIL/Image.py:3092 in open  │
│                                                                                                  │
│   3089 │   │   filename = fp                                                                     │
│   3090 │                                                                                         │
│   3091 │   if filename:                                                                          │
│ ❱ 3092 │   │   fp = builtins.open(filename, "rb")                                                │
│   3093 │   │   exclusive_fp = True                                                               │
│   3094 │                                                                                         │
│   3095 │   try:                                                                                  │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
IsADirectoryError: [Errno 21] Is a directory: '/home/netllama/stuff/llamaland/content/'

netllama avatar Oct 16 '22 18:10 netllama

ImageProcess is trying to open an image file, but for some reason, the path it was given points to a directory. The faulty path.source is computed inside compute_paths(), which is called from build_srcset():

│ /home/netllama/stuff/llamaland_virtenv/lib64/python3.10/site-packages/pelican/plugins/image_proc │
│ ess/image_process.py:446 in build_srcset                                                         │
│                                                                                                  │
│   443                                                                                            │
│   444 def build_srcset(img, settings, derivative):                                               │
│   445 │   path = compute_paths(img, settings, derivative)                                        │
│ ❱ 446 │   if not is_img_identifiable(path.source):                                               │
│   447 │   │   logger.warn(                                                                       │
│   448 │   │   │   "%s Skipping image %s that could not be identified by Pillow",                 │
│   449 │   │   │   LOG_PREFIX,                                                                    │

Since you only have problems with make publish, could it be that your SITEURL in Pelican settings is improperly set?

patrickfournier avatar Oct 17 '22 18:10 patrickfournier

@patrickfournier thanks for the reply. I'm not sure what you mean by SITEURL is improperly set. It looks correct, and if I disable the image-process plugin, then make publish completes without any errors.

How is image-process using/consuming the value of SITEURL?

netllama avatar Oct 17 '22 20:10 netllama

I was looking at the code of compute_paths(). My impression is that the problem lies there. It uses SITEURL in some cases, so I was wondering if this could be the problem.

Another possibility is that one of your image src attributes is wrong.

If you are familiar enough with Python, you could try adding some logger.debug() calls in compute_paths() to see how the source value is computed.

patrickfournier avatar Oct 17 '22 21:10 patrickfournier

Thanks. My SITEURL is: SITEURL = 'https://netllama.linux-sxs.org/ll'

I inserted logger.debug(f'site_url = {site_url}\tsite_url_path = {site_url_path}\tsrc_path = {src_path}') as line 406, which gave me:

site_url = ParseResult(scheme='https', netloc='netllama.linux-sxs.org', path='/ll', params='', query='', fragment='')    site_url_path = ll    src_path =     image_process.py:406

So src_path appears to be empty. That got me thinking that perhaps the problem is the image src attribute. In my Markdown, I have all images referenced as follows:

![](pix/trips/2020-07_us-west/slides/slide_IMG_9856_-_IMG_9879.jpg){: .image-process-large-photo}
![](pix/trips/2020-07_us-west/slides/slide_IMG_9689.JPG){: .image-process-large-photo}

When pelican creates the html, it looks like this:

<img alt="" class="image-process-large-photo" src="pix/trips/2020-07_us-west/slides/slide_IMG_20200702_143131.jpg">
<img alt="" class="image-process-large-photo" src="pix/trips/2020-07_us-west/slides/slide_IMG_9692.JPG">

Is that relative path to the images what is causing this to blow up?
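
A minimal sketch of why an empty src_path could lead to exactly the path in the error message. It assumes the plugin ends up joining the content directory with src_path; that is an assumption, and the real logic in compute_paths() may differ. join_source is a hypothetical helper used only for illustration, not part of image-process.

```python
import os
import tempfile


def join_source(content_dir, src_path):
    """Hypothetical stand-in for how a source path might be assembled."""
    # Joining with an empty component only appends a trailing separator,
    # so the result still names the directory itself.
    return os.path.join(content_dir, src_path)


# Usage: a temporary directory plays the role of the content directory.
with tempfile.TemporaryDirectory() as content_dir:
    source = join_source(content_dir, "")
    print(source)                 # the directory path plus a trailing separator
    print(os.path.isdir(source))  # whether the "image" path is really a directory
```

The trailing slash in '/home/netllama/stuff/llamaland/content/' is consistent with this kind of join, which would explain why Pillow was handed a directory rather than a file.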

netllama avatar Oct 18 '22 00:10 netllama

Most probably. Try putting a / at the start of your image paths:

![](/pix/trips/2020-07_us-west/slides/slide_IMG_9856_-_IMG_9879.jpg){: .image-process-large-photo}
![](/pix/trips/2020-07_us-west/slides/slide_IMG_9689.JPG){: .image-process-large-photo}

patrickfournier avatar Oct 18 '22 01:10 patrickfournier

That change breaks all the image links, as they don't exist via that path. pix is a subdirectory inside of content. The change that you're suggesting implies that pix is a directory hanging off of / (but it is not).

netllama avatar Oct 18 '22 01:10 netllama

I use Pelican for my site, but unfortunately I am not in a state where I can test things. However, I saw in my pelicanconf.py that I added the directory containing my pictures to STATIC_PATHS. You may try adding this to your pelicanconf.py:

STATIC_PATHS = ['pix']

Of course, if you already have STATIC_PATHS defined, just add pix to it.
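
For the case where STATIC_PATHS is already defined, a sketch of the extended setting might look like the fragment below. The 'images' entry is only an assumed pre-existing value (it is Pelican's default); replace it with whatever the list already contains.

```python
# pelicanconf.py (fragment)
# Paths are relative to the content directory.
STATIC_PATHS = [
    'images',  # assumed existing entry; keep whatever is already listed
    'pix',     # directory holding the pictures referenced in the articles
]
```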

patrickfournier avatar Oct 18 '22 01:10 patrickfournier

Setting STATIC_PATHS = ['pix'] does fix the original problem.

However, it also results in copying over everything in the pix directory, which is undesirable (there are hundreds of GB of images there that are not used or referenced by any blog).

netllama avatar Oct 18 '22 03:10 netllama

Does setting STATIC_PATHS = ['pix/derivatives'] (where derivatives is the value of IMAGE_PROCESS_DIR) work? Probably not, but it is worth trying.

The way image_process works is that it adds a subdirectory (derivatives) to the directory that contains the pictures. It assumes that the directory containing the original pictures will be part of the website. This is something that could be improved.

patrickfournier avatar Oct 18 '22 22:10 patrickfournier

Unfortunately, that doesn't work

netllama avatar Oct 19 '22 00:10 netllama

I was able to work around this locally by catching IsADirectoryError exceptions on line 440 of image_process.py:

except (FileNotFoundError, UnidentifiedImageError, IsADirectoryError):
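
The traceback shows the exception coming from the builtins.open(filename, "rb") call inside Pillow, so the effect of the broader except clause can be sketched without Pillow at all. can_open_as_file is a hypothetical helper for illustration, not part of image-process, and the behaviour shown is for Linux (other platforms may raise a different exception when a directory is opened).

```python
import os
import tempfile


def can_open_as_file(path):
    """Return True if path can be opened for binary reading."""
    try:
        with open(path, "rb"):
            return True
    except (FileNotFoundError, IsADirectoryError):
        # A missing file and a directory are both treated as "not an image",
        # mirroring the widened except clause in the workaround.
        return False


# Usage: probe a directory, a missing file, and a real file.
with tempfile.TemporaryDirectory() as d:
    real = os.path.join(d, "a.jpg")
    open(real, "wb").close()
    print(can_open_as_file(d))
    print(can_open_as_file(os.path.join(d, "missing.jpg")))
    print(can_open_as_file(real))
```

Note that this only hides the symptom: the image is skipped with a warning, and the source path is still computed as the content directory.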

netllama avatar Oct 21 '22 00:10 netllama