python-markdownify Handle relative image URLs

Add the ability to process relative image URLs, such as 'path/to/img.png' or '/path/to/img.png'.

Jun 23 '24 14:06 kaichen

Hey, thanks for your contribution! Any reason why the base_url gets cut into host and protocol, instead of using it as-is as prefix? Maybe the user wants to prefix their URLs with a full locator.

Nov 24 '24 16:11 AlexVonB

@kaichen - could you provide an example use case for this feature? I don't fully understand it from the pull request description.

Jan 01 '25 18:01 chrispy-snps

> could you provide an example use case for this feature? I don't fully understand it from the pull request description.

Some webpages use relative paths for their image URLs. When using this library to download HTML and convert it to Markdown, we need the full image URLs to ensure the images render correctly.

Jan 02 '25 01:01 kaichen

> Hey, thanks for your contribution! Any reason why the base_url gets cut into host and protocol, instead of using it as-is as prefix? Maybe the user wants to prefix their URLs with a full locator.

I just wanted to make sure that base_url joins with relative paths correctly.
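
For illustration only, here is a minimal sketch of the difference between joining and plain prefixing, using the standard library's `urllib.parse.urljoin`. The base URL and image paths are made-up examples, and this is not necessarily how the pull request implements it.

```python
from urllib.parse import urljoin

# Made-up URL of the page the HTML was downloaded from (an assumption).
base_url = "https://example.com/blog/post.html"

# A path-relative source resolves against the directory of the page.
relative = urljoin(base_url, "path/to/img.png")

# A root-relative source resolves against the scheme and host only.
root_relative = urljoin(base_url, "/path/to/img.png")

# An already absolute source is returned unchanged.
absolute = urljoin(base_url, "https://cdn.example.org/a.png")

# Plain string prefixing glues the path onto the page name instead.
naive_prefix = base_url + "path/to/img.png"

print(relative)       # https://example.com/blog/path/to/img.png
print(root_relative)  # https://example.com/path/to/img.png
print(absolute)       # https://cdn.example.org/a.png
print(naive_prefix)   # https://example.com/blog/post.htmlpath/to/img.png
```

Root-relative paths are the case where only the protocol and host of the base URL matter, which may be why a plain prefix is not enough here.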

Jan 02 '25 01:01 kaichen

I have mixed feelings about this.

On one hand, I always appreciate a pull request contribution. And on the surface, this provides a nice convenience for this use case.

But on the other hand, Markdownify's job is to render the provided HTML to Markdown, and as the Unix mantra says, "do one thing and do it well." Modifying link content is content modification, not content rendering, which feels more like source preprocessing before Markdownify is called.
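
As a rough sketch of what that preprocessing could look like, the following rewrites `href` and `src` values to absolute URLs before the HTML is handed to Markdownify. It uses only the standard library; `LinkAbsolutizer` and `absolutize_links` are hypothetical names and not part of Markdownify. In practice an HTML library such as BeautifulSoup would probably be more robust.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin

# Which attribute carries the URL for each tag we care about.
URL_ATTRS = {"a": "href", "img": "src"}


class LinkAbsolutizer(HTMLParser):
    """Hypothetical helper: re-emits HTML with absolute link targets.

    Comments and doctype declarations are dropped in this sketch.
    """

    def __init__(self, base_url):
        # Keep character references untouched so they can be re-emitted.
        super().__init__(convert_charrefs=False)
        self.base_url = base_url
        self.out = []

    def _render(self, tag, attrs, closing):
        parts = [tag]
        for name, value in attrs:
            if value is None:  # attribute without a value, e.g. "hidden"
                parts.append(name)
                continue
            if URL_ATTRS.get(tag) == name:
                value = urljoin(self.base_url, value)
            parts.append(f'{name}="{escape(value, quote=True)}"')
        return "<" + " ".join(parts) + closing

    def handle_starttag(self, tag, attrs):
        self.out.append(self._render(tag, attrs, ">"))

    def handle_startendtag(self, tag, attrs):
        self.out.append(self._render(tag, attrs, " />"))

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def absolutize_links(html, base_url):
    """Return html with <a href> and <img src> resolved against base_url."""
    parser = LinkAbsolutizer(base_url)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


page = '<p><a href="/docs">Docs</a> <img src="img/logo.png" alt="Logo"></p>'
fixed = absolutize_links(page, "https://example.com/blog/post.html")
print(fixed)
```

The result could then be passed to `markdownify()` unchanged, which keeps the link rewriting out of the renderer.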

Two more random thoughts:

<a> links should be given similar consideration.
Another approach is to use a link-formatting function in process_img() and process_a():
```
def format_link(link_text):
    # Default: return the link target unchanged
    return link_text
```
then allow the user to override this, either by an option that takes a callback function, or by a subclassed function override.
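
To make the two override styles concrete, here is a small self-contained sketch. `TinyConverter` is a hypothetical stand-in and not Markdownify's real converter class; the `process_img()` and `process_a()` names follow the wording above, and the base URL is a made-up example.

```python
from functools import partial
from urllib.parse import urljoin


class TinyConverter:
    """Hypothetical stand-in for a converter, for illustration only."""

    def __init__(self, format_link=None):
        # Option-style override: an optional callback supplied by the caller.
        self._format_link_option = format_link

    def format_link(self, link_text):
        # Default is to use the link text as-is unless a callback was given.
        if self._format_link_option is not None:
            return self._format_link_option(link_text)
        return link_text

    def process_img(self, src, alt=""):
        return f"![{alt}]({self.format_link(src)})"

    def process_a(self, href, text):
        return f"[{text}]({self.format_link(href)})"


# Style 1: pass a callback as an option.
with_option = TinyConverter(
    format_link=partial(urljoin, "https://example.com/blog/")
)


# Style 2: override the method in a subclass.
class AbsoluteConverter(TinyConverter):
    base_url = "https://example.com/blog/"

    def format_link(self, link_text):
        return urljoin(self.base_url, link_text)


print(TinyConverter().process_a("/x", "X"))        # [X](/x)
print(with_option.process_img("img/a.png", "A"))   # ![A](https://example.com/blog/img/a.png)
print(AbsoluteConverter().process_a("/x", "X"))    # [X](https://example.com/x)
```

Either style keeps the default behavior unchanged and leaves the URL policy entirely to the caller.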

Or maybe I am overthinking it, and this is simply a nice convenience that we should implement. :)

@AlexVonB, what are your thoughts on this?

Jan 02 '25 11:01 chrispy-snps
