
Fetch WMs for aliased/equivalent URLs

Open keithjgrant opened this issue 8 years ago • 4 comments

For any given page on my site, I have to fetch webmentions for four different equivalent URLs. Consider the page at http://keithjgrant.com/posts/2017/09/code-not-clojure/

If I want to check for WMs, it's possible the trailing slash was left off the WM (I think my site currently adds the slash, but a previous iteration didn't). So to get all WMs, I have to fetch for these two URLs:

http://keithjgrant.com/posts/2017/09/code-not-clojure
http://keithjgrant.com/posts/2017/09/code-not-clojure/

This is a little annoying, but workable. But recently, I added SSL. WMs could have been sent before I made this switch (that is, without https in the target URL) or after (with https). So now I have to fetch WMs for four URLs:

http://keithjgrant.com/posts/2017/09/code-not-clojure
http://keithjgrant.com/posts/2017/09/code-not-clojure/
https://keithjgrant.com/posts/2017/09/code-not-clojure
https://keithjgrant.com/posts/2017/09/code-not-clojure/

It seems silly (and a bit wasteful) to have to include all four variants of the same URL when fetching WMs from the API. I would like a way to provide the target URL once, and get back results for all four versions.
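
For illustration, the client-side workaround described above could be sketched roughly like this. The function name `equivalent_urls` is made up for this example, and it assumes the only differences that matter are http vs. https and the presence of a trailing slash.

```python
from urllib.parse import urlsplit, urlunsplit


def equivalent_urls(url):
    """Return the http/https and slash/no-slash variants of a page URL.

    Hypothetical helper: it only covers the two kinds of aliasing
    discussed in this issue (scheme and trailing slash).
    """
    parts = urlsplit(url)
    # Strip any trailing slash so both forms can be rebuilt from one base path.
    base_path = parts.path.rstrip("/")
    variants = []
    for scheme in ("http", "https"):
        for suffix in ("", "/"):
            variants.append(
                urlunsplit(
                    (scheme, parts.netloc, base_path + suffix,
                     parts.query, parts.fragment)
                )
            )
    return variants


if __name__ == "__main__":
    page = "https://keithjgrant.com/posts/2017/09/code-not-clojure/"
    for variant in equivalent_urls(page):
        print(variant)
```

Every request for a page's WMs would then have to cover all four returned URLs, which is the overhead this issue is about.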

keithjgrant avatar Nov 09 '17 16:11 keithjgrant

This is tricky. At least with the API you can provide all the URLs in the same API call.
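
As a rough sketch of what "all the URLs in the same API call" might look like: the endpoint path and the repeated `target[]` parameter below are assumptions based on my reading of the webmention.io API and should be checked against its documentation. `mentions_query_url` is a hypothetical helper that only builds the request URL and does not send anything.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the webmention.io documentation.
API_ENDPOINT = "https://webmention.io/api/mentions.jf2"


def mentions_query_url(targets):
    """Build one request URL that asks for mentions of several targets.

    Assumes the API accepts a repeated "target[]" query parameter.
    """
    params = [("target[]", target) for target in targets]
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(mentions_query_url([
        "http://keithjgrant.com/posts/2017/09/code-not-clojure",
        "https://keithjgrant.com/posts/2017/09/code-not-clojure/",
    ]))
```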

I'm not a huge fan of the idea of having to update previous webmentions when your URLs change. Keeping track of that sounds like a really hard problem. I also can't assume that a trailing slash is the same page as without, since other websites may serve different content at the two URLs.

Do you have any suggestions for how this could work?

aaronpk avatar Nov 09 '17 16:11 aaronpk

I would probably suggest a flag that I could pass to the API, or possibly two flags: one for the trailing slash, another for http/s.

When the "include trailing slash" flag is set, it returns webmentions whose target URLs match both with and without a trailing slash. When the "include https" flag is set, it returns webmentions whose target URLs match under both http and https.
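
To make the two flags concrete, here is a minimal sketch of the matching rule they imply. Nothing like this exists in webmention.io as far as this thread shows; the names `ignore_trailing_slash` and `ignore_scheme` are invented for the example, and the comparison is a guess at the intended behavior.

```python
from urllib.parse import urlsplit


def normalize(url, ignore_trailing_slash=False, ignore_scheme=False):
    """Reduce a URL to a comparison key under the (hypothetical) flags."""
    parts = urlsplit(url)
    scheme = parts.scheme
    if ignore_scheme and scheme in ("http", "https"):
        scheme = ""  # treat http and https as the same
    path = parts.path
    if ignore_trailing_slash:
        path = path.rstrip("/")  # treat /post and /post/ as the same
    return (scheme, parts.netloc, path, parts.query)


def target_matches(stored_target, requested_target, **flags):
    """True if a stored webmention target matches the requested URL."""
    return normalize(stored_target, **flags) == normalize(requested_target, **flags)


if __name__ == "__main__":
    stored = "http://keithjgrant.com/posts/2017/09/code-not-clojure"
    wanted = "https://keithjgrant.com/posts/2017/09/code-not-clojure/"
    print(target_matches(stored, wanted))
    print(target_matches(stored, wanted,
                         ignore_trailing_slash=True, ignore_scheme=True))
```

With both flags off this is exact matching, so sites that serve different content with and without the slash would be unaffected unless they opt in.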

keithjgrant avatar Nov 09 '17 16:11 keithjgrant

On second thought, I think I like the idea of hard-coding rules for "trailing slash" and http/https less than being able to update the webmentions received in webmention.io.

Here's a proposal: webmention.io can provide an API (and probably also a UI) where you'd give it an old URL that now redirects to your new URL. It would fetch that URL and see the redirect, then update any webmentions that were at the old URL to be associated with the new URL.
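
A very rough sketch of that flow, using an in-memory dict in place of webmention.io's real storage (which is not described in this thread). `migrate_mentions` and the `resolve_redirect` callback are hypothetical; a real resolver could, for example, request the old URL and read the final URL after redirects.

```python
def migrate_mentions(store, old_url, resolve_redirect):
    """Re-associate mentions of old_url with the URL it redirects to.

    store: dict mapping target URL -> list of mentions (stand-in storage).
    resolve_redirect: callable returning the final URL for old_url,
        or None if the URL could not be fetched.
    Returns the number of mentions moved.
    """
    new_url = resolve_redirect(old_url)
    # Only act when the old URL really redirects somewhere else.
    if new_url is None or new_url == old_url:
        return 0
    moved = store.pop(old_url, [])
    if not moved:
        return 0
    store.setdefault(new_url, []).extend(moved)
    return len(moved)


if __name__ == "__main__":
    old = "http://keithjgrant.com/posts/2017/09/code-not-clojure"
    new = "https://keithjgrant.com/posts/2017/09/code-not-clojure/"
    store = {old: ["mention-1", "mention-2"], new: ["mention-3"]}
    # Fake resolver standing in for an actual HTTP request.
    count = migrate_mentions(store, old, lambda url: new)
    print(count, store)
```

Requiring an observed redirect means the service never has to guess that two URLs are equivalent; the site owner's own redirect is the evidence.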

aaronpk avatar Nov 09 '17 17:11 aaronpk

Sounds good to me

keithjgrant avatar Nov 09 '17 17:11 keithjgrant