Improvements to URL matching in guessMediaType()

Open: Crissov opened this issue 7 years ago • 1 comment

function guessMediaType(url) {
  const extensionMatch = url.match(/\.([^/.]+)$/);
  if (extensionMatch === null)
    return 'image';
  const extension = extensionMatch[1];
  // ...
}

That means the plugin currently looks for a dot . followed by one or more characters other than a dot or slash, anchored at the end of the URL. If I'm not mistaken, this yields unwanted results in these example cases:

  • http://example.mov -- match is mov, which could be a recognized video file extension and is a valid TLD
  • video.mp4?t=1m30s -- match is mp4?t=1m30s, not mp4 as intended
  • audio.mp3#chapter4 -- match is mp3#chapter4, not mp3 as intended
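
The three cases above can be checked directly against the pattern from guessMediaType(). This is only a small sketch: matchedExtension is a hypothetical helper written for this illustration, not a function of the plugin.

```javascript
// Same pattern as in guessMediaType(): a dot, then one or more characters
// that are neither "/" nor ".", anchored at the end of the string.
const extensionPattern = /\.([^/.]+)$/;

// Hypothetical helper for this illustration only (not part of the plugin).
function matchedExtension(url) {
  const match = url.match(extensionPattern);
  return match === null ? null : match[1];
}

const examples = ['http://example.mov', 'video.mp4?t=1m30s', 'audio.mp3#chapter4'];
for (const url of examples) {
  console.log(url, '->', matchedExtension(url));
}
// http://example.mov -> mov
// video.mp4?t=1m30s -> mp4?t=1m30s
// audio.mp3#chapter4 -> mp3#chapter4
```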

I'm not sure if it's usable yet, but URL.pathname should strip out host, query and hash automatically. Otherwise you could also handle them manually in the regex:

  const extensionMatch = url.match(/\/.*\.([^/.]+)(?:\?[^?]*)?(?:#[^#]*)?$/);
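
One caveat when testing this pattern: the class [^/.] still accepts ? and #, so the capture group can swallow the query and fragment before the optional groups get a chance to match, and the leading \/ requires a slash somewhere before the extension. A minimal sketch of a manual alternative follows. The function name extensionOf and the strip-then-match approach are assumptions made for illustration, not the plugin's code.

```javascript
// The pattern suggested above, unchanged, applied to a path with a query.
const suggested = /\/.*\.([^/.]+)(?:\?[^?]*)?(?:#[^#]*)?$/;
console.log('/video.mp4?t=1m30s'.match(suggested)[1]); // mp4?t=1m30s

// Hypothetical alternative: strip query/hash and scheme/host first,
// then apply the original end-anchored extension pattern.
function extensionOf(url) {
  // Everything from the first "?" or "#" onwards is query or fragment.
  const withoutQueryOrHash = url.split(/[?#]/, 1)[0];
  // Drop an optional "scheme:" plus "//host" prefix, if present.
  const pathOnly = withoutQueryOrHash.replace(/^(?:[a-z][a-z0-9+.-]*:)?\/\/[^/]*/i, '');
  const match = pathOnly.match(/\.([^/.]+)$/);
  return match === null ? null : match[1];
}

console.log(extensionOf('http://example.mov'));  // null
console.log(extensionOf('video.mp4?t=1m30s'));   // mp4
console.log(extensionOf('audio.mp3#chapter4'));  // mp3
```

Stripping in two steps keeps each regex short, at the cost of not handling every URL form a full parser would.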

Crissov, Sep 28 '18 11:09

Thanks for the report, @Crissov! I agree that the matching should be improved. I'm reluctant to add the URL dependency, both because I want to keep it simple to run the same code in Node and in the browser, and because URL isn't supported by IE without a polyfill. But I'll do some testing with your suggested regex and may use that, or a version of it.
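
For comparison, one way to reconcile the two positions would be to use URL only where a global constructor exists and fall back to manual stripping elsewhere. This is a rough sketch, not what the plugin does: pathnameOf, manualPathname, guessExtension and the placeholder base URL are all hypothetical names chosen for this example.

```javascript
// Hypothetical fallback: strip query/hash, then an optional scheme and host.
function manualPathname(url) {
  return url
    .split(/[?#]/, 1)[0]
    .replace(/^(?:[a-z][a-z0-9+.-]*:)?\/\/[^/]*/i, '');
}

// Use the URL constructor when the environment provides one.
// The base is an arbitrary placeholder so that relative references
// such as "video.mp4" can be parsed as well.
function pathnameOf(url) {
  if (typeof URL === 'function') {
    try {
      return new URL(url, 'http://placeholder.invalid/').pathname;
    } catch (error) {
      // Unparseable input: fall through to the manual path.
    }
  }
  return manualPathname(url);
}

// Hypothetical wrapper: the original extension pattern, applied to the path only.
function guessExtension(url) {
  const match = pathnameOf(url).match(/\.([^/.]+)$/);
  return match === null ? null : match[1];
}

console.log(guessExtension('http://example.mov')); // null
console.log(guessExtension('video.mp4?t=1m30s'));  // mp4
console.log(guessExtension('audio.mp3#chapter4')); // mp3
```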

eloquence, Oct 09 '18 07:10