django-markdownfield
Integrating internal links with relative file path in .md files
I picked up this package for my demo site and was really struggling to include internal links in my markdown files, at least in my development environment. I want to be able to use relative URL paths, e.g. /djangoapp/view/, in my .md files, rather than absolute URLs, e.g. http://localhost:3000/djangoapp/view.
My settings are divided into local.py and production.py. In my local.py, I have SITE_URL = "http://localhost:3000", but when I put a link such as [link](/djangoapp/view/) in my .md files, the HTML was saved as an external link: <a href="/djangoapp/view/" target="_blank" class=" external" rel="nofollow noopener noreferrer">djangomodel</a>. If I put the absolute URL in the .md file, [link](http://localhost:3000/djangoapp/view/), the link was rendered as a local link. However, I want to use the same .md files in development and production without rewriting all the absolute URLs.
I first tried a solution where I put Django url tags ({% url "djangoapp:view" %}) in my .md files, rendered the resulting HTML as Template objects in my view, and passed the output to the context. This seemed messy. Then I worked out the following approach, which is pretty hacky and not that performant, but does what I want.
- I copied a Stack Overflow answer to get all of my app's URLs.
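The reason the relative link gets marked as external can be seen with urlparse alone: a path-only href has an empty netloc, which never equals the netloc of SITE_URL. A minimal sketch using the values from this post; the helper name is_external is hypothetical and only mirrors the netloc comparison that format_link performs:

```python
from urllib.parse import urlparse

SITE_URL = "http://localhost:3000"  # value from my local.py


def is_external(href: str, site_url: str = SITE_URL) -> bool:
    """Hypothetical helper mirroring the netloc comparison in format_link."""
    return urlparse(href).netloc != urlparse(site_url).netloc


print(is_external("/djangoapp/view/"))                       # True: netloc is ""
print(is_external("http://localhost:3000/djangoapp/view/"))  # False: netlocs match
```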
from django.urls import URLPattern, URLResolver

def list_urls(lis, acc=None):
    # https://stackoverflow.com/questions/1275486/how-can-i-list-urlpatterns-endpoints-on-django#answer-54531546
    if acc is None:
        acc = []
    if not lis:
        return
    l = lis[0]
    if isinstance(l, URLPattern):
        yield acc + [str(l.pattern)]
    elif isinstance(l, URLResolver):
        yield from list_urls(l.url_patterns, acc + [str(l.pattern)])
    yield from list_urls(lis[1:], acc)
- I overwrote the django-markdownfield format_link function and modified it to check all links registered as external for membership in a list of internal links.
from typing import Dict
from urllib.parse import urlparse

from django.conf import settings

def format_link(attrs: Dict[tuple, str], new: bool = False):
    """
    This is really weird and ugly, but that's how bleach linkify filters work.
    """
    try:
        p = urlparse(attrs[(None, 'href')])
    except KeyError:
        # no href, probably an anchor
        return attrs
    if not any([p.scheme, p.netloc, p.path]) and p.fragment:
        # the link isn't going anywhere, probably a fragment link
        return attrs
    if hasattr(settings, 'SITE_URL'):
        c = urlparse(settings.SITE_URL)
        link_is_external = p.netloc != c.netloc
    else:
        # Assume true for safety
        link_is_external = True
    if link_is_external:
        # create a list of all the app's URLs and check if the hyperlink path is in that list
        urlconf = __import__(settings.ROOT_URLCONF, {}, {}, [''])
        app_urls = ["/" + ''.join(url_part_list) for url_part_list in list_urls(urlconf.urlpatterns)]
        if p.path not in app_urls:
            # link is external - secure and mark
            attrs[(None, 'target')] = '_blank'
            attrs[(None, 'class')] = attrs.get((None, 'class'), '') + ' external'
            attrs[(None, 'rel')] = 'nofollow noopener noreferrer'
    return attrs
- I overwrote the django-markdownfield MarkdownField to substitute the new format_link function.
from functools import partial

import bleach
from bleach.linkifier import LinkifyFilter
from markdown import markdown
# MarkdownField, EXTENSIONS, EXTENSION_CONFIGS and blacklist_link come from
# django-markdownfield; the exact import paths depend on the installed version.

class OverwrittenMarkdownField(MarkdownField):
    def pre_save(self, model_instance, add):
        # super() here is MarkdownField, so the package renders once first;
        # the rendered field is then replaced below using the new format_link.
        value = super().pre_save(model_instance, add)
        if not self.rendered_field:
            return value
        dirty = markdown(
            text=value,
            extensions=EXTENSIONS,
            extension_configs=EXTENSION_CONFIGS
        )
        if self.validator.sanitize:
            if self.validator.linkify:
                cleaner = bleach.Cleaner(tags=self.validator.allowed_tags,
                                         attributes=self.validator.allowed_attrs,
                                         css_sanitizer=self.validator.css_sanitizer,
                                         filters=[partial(LinkifyFilter,
                                                          callbacks=[format_link, blacklist_link])])
            else:
                cleaner = bleach.Cleaner(tags=self.validator.allowed_tags,
                                         attributes=self.validator.allowed_attrs,
                                         css_sanitizer=self.validator.css_sanitizer)
            clean = cleaner.clean(dirty)
            setattr(model_instance, self.rendered_field, clean)
        else:
            # danger!
            setattr(model_instance, self.rendered_field, dirty)
        return value
- I used the OverwrittenMarkdownField in my model field definitions, instead of the MarkdownField provided by the package.
This gives the desired behavior: relative internal links are saved and rendered with the HTML for an internal link (no target="_blank").
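One caveat with the exact-match check, sketched here with plain strings: list_urls yields the pattern text, so a route with parameters will not match a concrete path. The two routes below are hypothetical examples, not taken from my project:

```python
# Pattern parts as list_urls would yield them (hypothetical routes).
url_part_lists = [
    ["djangoapp/", "view/"],
    ["djangoapp/", "view/<int:pk>/"],
]
app_urls = ["/" + "".join(parts) for parts in url_part_lists]

print("/djangoapp/view/" in app_urls)    # True: the literal route matches
print("/djangoapp/view/5/" in app_urls)  # False: pattern text is not the concrete path
```

So with this version, a link like /djangoapp/view/5/ would still be marked as external.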
Appreciate any feedback.
Brief update for anyone who finds their way here. I rewrote the format_link function to use Django's URL resolver. It now checks whether the link path resolves against the project's URL configuration; if it does, the link is treated as internal. If a Resolver404 is raised, it retries with a trailing slash appended. If that resolves, it raises a ValueError to flag the missing slash; if that also raises Resolver404, the link is treated as external and marked accordingly. This allows URLs that take parameters to be put into the .md files and still be treated as internal if they resolve.
from urllib.parse import urlparse

from django.conf import settings
from django.urls import Resolver404, resolve  # type: ignore

def format_link(attrs: dict[tuple, str], new: bool = False):
    """
    This is really weird and ugly, but that's how bleach linkify filters work.
    """
    try:
        p = urlparse(attrs[(None, "href")])
    except KeyError:
        # no href, probably an anchor
        return attrs
    if not any([p.scheme, p.netloc, p.path]) and p.fragment:
        # the link isn't going anywhere, probably a fragment link
        return attrs
    if hasattr(settings, "SITE_URL"):
        c = urlparse(settings.SITE_URL)
        link_is_external = p.netloc != c.netloc
    else:
        # Assume true for safety
        link_is_external = True
    if link_is_external:
        # I have overwritten this to allow for internal links to be written into markdown
        # agnostic to my development and production environment. This is a hacky solution
        # but it works for now. Internal urls must follow the pattern app_name/url_stuff/
        # Try to resolve the link
        try:
            resolve(p.path)
        # If it fails, try adding a trailing slash
        except Resolver404:
            slash_path = p.path + "/"
            # If adding a slash resolves it as an internal link, raise a ValueError
            # to alert the user that they need to add a trailing slash to their link
            try:
                resolve(slash_path)
                raise ValueError(f"Link {p.path} is missing a trailing slash")
            # If adding a slash doesn't resolve it, it's an external link
            except Resolver404:
                pass
            # link is external - secure and mark
            attrs[(None, "target")] = "_blank"
            attrs[(None, "class")] = attrs.get((None, "class"), "") + " external"
            attrs[(None, "rel")] = "nofollow noopener noreferrer"
    return attrs
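For anyone who wants to test the decision logic without a Django project, here is a rough sketch of the same three outcomes as a pure function. The names classify_path and can_resolve are hypothetical; can_resolve stands in for a wrapper around django.urls.resolve that returns False on Resolver404:

```python
from typing import Callable


def classify_path(path: str, can_resolve: Callable[[str], bool]) -> str:
    """Return "internal" or "external"; raise if only the trailing slash is missing.

    can_resolve is a hypothetical stand-in for calling django.urls.resolve
    and catching Resolver404.
    """
    if can_resolve(path):
        return "internal"
    if can_resolve(path + "/"):
        raise ValueError(f"Link {path} is missing a trailing slash")
    return "external"


# Toy resolver for illustration: only these two paths "exist".
known = {"/djangoapp/view/", "/djangoapp/view/5/"}

print(classify_path("/djangoapp/view/5/", known.__contains__))  # internal
print(classify_path("/elsewhere/", known.__contains__))         # external
```

Calling classify_path("/djangoapp/view", known.__contains__) raises the ValueError, which is the case that catches a forgotten trailing slash in the .md file.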