rehype-autolink-headings

feat: add linkPrefix option

Open pReya opened this issue 1 year ago • 18 comments

Initial checklist

  • [x] I read the support docs
  • [x] I read the contributing guide
  • [x] I agree to follow the code of conduct
  • [x] I searched issues and couldn’t find anything (or linked relevant results below)
  • [x] If applicable, I’ve added docs and tests

Description of changes

I've added a linkPrefix option which adds a static prefix to every generated link. One possible use case for this is to convert relative anchor links (#my-anchor) to absolute anchor links (https://mypage.com/content/blogpost#my-anchor).

This feature/wish has been mentioned before in this discussion: https://github.com/rehypejs/rehype/discussions/104
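
To make the proposal concrete, here is a minimal sketch of what a static prefix would do to a generated heading link. The helper name buildHeadingHref is hypothetical and this is not the code from the pull request; it only illustrates the described behavior.

```javascript
// Hypothetical helper, not the implementation from this pull request.
// Builds the href of a heading link from the heading id and an optional static prefix.
function buildHeadingHref(id, linkPrefix = '') {
  return linkPrefix + '#' + id
}

// Without a prefix the link stays a plain fragment link.
console.log(buildHeadingHref('my-anchor'))
// With a prefix every generated link points at one fixed page.
console.log(buildHeadingHref('my-anchor', 'https://mypage.com/content/blogpost'))
```

Because the prefix is static, every file handled by the same processor would receive the same page URL in its links.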

pReya, Oct 15 '22 12:10

Codecov Report

Base: 100.00% // Head: 100.00% // No change to project coverage :thumbsup:

Coverage data is based on head (2af3dbe) compared to base (2a6b77f). Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff            @@
##              main       #16   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            2         2           
  Lines          176       183    +7     
=========================================
+ Hits           176       183    +7     
Impacted Files Coverage Δ
lib/index.js 100.00% <100.00%> (ø)

codecov-commenter, Oct 15 '22 12:10

I don’t quite understand why. What <base> use requires this? How does base affect these hashes?

wooorm, Oct 17 '22 10:10

I don’t quite understand why. What <base> use requires this? How does base affect these hashes?

This is a known limitation of the base tag, see for example this description: https://rogerkeays.com/blog/using-base-href-with-anchors. Relative links will use the href attribute from the base tag, but anchor links will not.

Let's say you have a <base href="/my/base/path/">. A normal, relative link like <a href="mypage"> will lead you to /my/base/path/mypage. However if you want to do the same with an anchor tag <a href="#mysection">, it will not consider/include the base tag. Relative anchor links will only work on the current location. So if you want to make this work, you need to manually prefix them with the desired "base path".

pReya, Oct 17 '22 11:10

but anchor links will not.

it will not consider/include the base tag.

Why is this a limitation? This behavior makes sense to me. I don’t think it should apply to anchors?

Why do you have a page that is hosted in location X, so all anchors point to location X + anchor, but you want those links to go to location Y + anchor?

wooorm, Oct 17 '22 12:10

I have also encountered this problem, but I don’t think it can be fixed by adding a prefix like this. Often multiple files will be processed using the same processor. These files would then need different prefixes.

Such a prefix probably needs to be calculated based on the vfile location and custom logic. I would suggest writing a custom plugin to handle paths, for example something like this. Although personally I recommend trying to get rid of the <base /> tag.
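
A minimal sketch of such a custom plugin, assuming the prefix can be derived from the file path. The names rehypePrefixHashLinks and getPrefix are made up for illustration, and the tree is walked by hand so the example has no dependencies; a real plugin would more likely use a tree-visiting utility.

```javascript
// Hypothetical plugin; `rehypePrefixHashLinks` and `getPrefix` are made-up names.
function rehypePrefixHashLinks(options = {}) {
  // `getPrefix` receives the vfile and returns the prefix for that file.
  const getPrefix = options.getPrefix || (() => '')

  // unified calls the returned transformer with the tree and the vfile.
  return function (tree, file) {
    const prefix = getPrefix(file)
    walk(tree)

    function walk(node) {
      const isLink = node.type === 'element' && node.tagName === 'a' && node.properties
      const href = isLink ? node.properties.href : undefined
      // Only fragment links such as `#intro` are rewritten.
      if (typeof href === 'string' && href.startsWith('#')) {
        node.properties.href = prefix + href
      }
      for (const child of node.children || []) walk(child)
    }
  }
}

// Usage on a hand-built hast tree; the file object stands in for a vfile.
const tree = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'h2',
      properties: {id: 'intro'},
      children: [
        {
          type: 'element',
          tagName: 'a',
          properties: {href: '#intro'},
          children: [{type: 'text', value: 'Intro'}]
        }
      ]
    }
  ]
}
const file = {path: 'wiki/some/page.md'}
const transform = rehypePrefixHashLinks({
  // Assumed mapping from source path to URL path; adjust to the real site layout.
  getPrefix: (file) => '/' + file.path.replace(/\.md$/, '')
})
transform(tree, file)
console.log(tree.children[0].children[0].properties.href)
```

Since the prefix is computed per file, this avoids the problem of one static prefix being applied to every file the processor handles.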

remcohaszing, Oct 17 '22 12:10

but anchor links will not.

it will not consider/include the base tag.

Why is this a limitation? This behavior makes sense to me. I don’t think it should apply to anchors?

Why do you have a page that is hosted in location X, so all anchors point to location X + anchor, but you want those links to go to location Y + anchor?

In my case I am using ephemeral preview deployments for my Astro SSG setup. E.g. I deploy every branch to a different base path (https://dev.mypage.com/branch-1 or https://dev.mypage.com/branch-2). And I want content editors to just use relative links, that should work under every domain/base path.

I understand your reluctance, since this is a niche use case – however I think it's more common than you think. I'd still appreciate a way to modify the link. If you don't like the narrow prefix approach, we could also use a function that receives the link as a parameter, so users can prefix, postfix, split, modify however they see fit. Giving users a way to modify the link seems like a sensible configuration option, don't you agree?

pReya, Oct 17 '22 13:10

And I want content editors to just use relative links, that should work under every domain/base path.

That’s a fine use case but I don’t understand what base or anchors have to do with it. You can use relative links, and have them point to the current site. You can use anchors and have them point to the current page. That’s how relative links and anchors already work? 🤔 Rewriting URLs is only needed if you don’t want relative links to go to the current site, and don’t want anchor links to go to the current page.

niche use case

The thing for me is: I don’t understand your use case at all. And I don’t think a prefix option will solve your use case. As Remco mentioned: plugins operate on many files. You are prefixing one page to many files. It doesn’t work.

Giving users a way to modify the link seems like a sensible configuration option, don't you agree?

Users have all the options they can imagine already: they can write plugins to exactly match their preferred behavior. This for me is about options that help most people reach common goals.

wooorm, Oct 17 '22 13:10

You can use relative links, and have them point to the current site. You can use anchors and have them point to the current page. That’s how relative links and anchors already work? :thinking:

If a base is set, then the browser makes anchors relative to that. I’ve run into that issue before as well.

The following are equivalent in that sense:

<!doctype html>
<html>
  <head>
    <base href="/pokemon" />
  </head>
  <body>
    <a href="#bulbasaur">Bulbasaur</a>
  </body>
</html>

<!doctype html>
<html>
  <head>
  </head>
  <body>
    <a href="/pokemon#bulbasaur">Bulbasaur</a>
  </body>
</html>
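
The same resolution can be checked outside a browser with the standard URL class. This is a small sketch, assuming a made-up page URL of https://example.com/some/page; the document base URL is modelled as the base href resolved against the page URL.

```javascript
// Assumption: the page is served from this made-up URL.
const pageUrl = 'https://example.com/some/page'

// With <base href="/pokemon">, the base href is resolved against the page URL first,
// and the fragment link is then resolved against that document base URL.
const withBase = new URL('#bulbasaur', new URL('/pokemon', pageUrl)).href

// Without <base>, the link is resolved against the page URL itself.
const withoutBase = new URL('/pokemon#bulbasaur', pageUrl).href

console.log(withBase)
console.log(withoutBase)
console.log(withBase === withoutBase)
```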

remcohaszing, Oct 17 '22 13:10

Oh, I was thinking of the inverse, due to the above:

Relative links will use the href attribute from the base tag, but anchor links will not.

Now it seems like, instead, anchor links will use the href attribute of the base element as a prefix.

wooorm, Oct 17 '22 13:10

This still seems like, well, what base elements are for: to rewrite all relative (including paths, searches, anchors) URLs. Why have a base element to rewrite some of those, but not others? Why not drop the base element? Then authors can still write relative URLs?

wooorm, Oct 17 '22 13:10

The following are equivalent in that sense:

<!doctype html>
<html>
  <head>
    <base href="/pokemon" />
  </head>
  <body>
    <a href="#bulbasaur">Bulbasaur</a>
  </body>
</html>

<!doctype html>
<html>
  <head>
  </head>
  <body>
    <a href="/pokemon#bulbasaur">Bulbasaur</a>
  </body>
</html>

No, these are NOT equivalent. This is exactly the problem. If this worked, there would be no problem.

pReya, Oct 17 '22 14:10

I checked that in a browser and it's correct?

wooorm, Oct 17 '22 14:10

You are right. I'm sorry, I have been wrong about that. I mixed up the motivation for the prefix option with another problem I am facing.

So, allow me to restate my motivation for adding a link prefix:

I have a content page that's being deployed to two different environments (each having their own domain and basePath). E.g.

  • Environment 1 (Live) is at https://www.mypage.com/wiki/some/page and uses no base tag.
  • Environment 2 (Preview) is at https://dev.mypage.com/branch-1/wiki/some/page and uses a <base href="/branch-1/" /> base tag.

In both environments I want to use the same autolinked headline on the page, e.g. <h1 id="myheadline"> with a link to #myheadline. The link in Environment 1 will work fine, but the link in Environment 2 will not work, because it will lead to https://dev.mypage.com/branch-1/#myheadline.

Unfortunately, getting rid of the base tag is not possible without JS, if you want to have this kind of setup with a different base URL/path per environment but still use relative links.

pReya, Oct 17 '22 14:10

I think you’re using the base element wrong. The base should in the second example be /branch-1/wiki/some/page to support truly relative values.

This also affects all your relative links. Say those two pages are otherwise the same, and you have a relative link to ./whatever/, then they go to different places, too: /wiki/some/page/whatever/ on prod, /branch-1/whatever/ in dev. As I understand it, this prevents your authors from using relative links rather than helping them?
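
The divergence can be reproduced with the standard URL class. This sketch assumes both pages are served with a trailing slash (an assumption, since the URLs above are written without one) and compares the site-level base with a page-level base.

```javascript
// Assumption: both pages are served with a trailing slash.
const prodPage = 'https://www.mypage.com/wiki/some/page/'
const devPage = 'https://dev.mypage.com/branch-1/wiki/some/page/'

// Resolve a link against <base href> if one is given, else against the page URL,
// and return only the resulting path.
function resolve(href, pageUrl, baseHref) {
  const documentBase = baseHref ? new URL(baseHref, pageUrl) : new URL(pageUrl)
  return new URL(href, documentBase).pathname
}

// Production has no <base>, so the link stays under the page.
const prod = resolve('./whatever/', prodPage)
// Preview with the site-level base: the link jumps to the branch root.
const devSiteBase = resolve('./whatever/', devPage, '/branch-1/')
// Preview with a page-level base: the link stays under the page again.
const devPageBase = resolve('./whatever/', devPage, '/branch-1/wiki/some/page/')

console.log(prod)
console.log(devSiteBase)
console.log(devPageBase)
```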

How are you setting that base?

if you want to have this kind of setup with a different base URL/path per environment but still use relative links.

I don’t understand what relative links you have. If they’re relative, you don’t need a base.

wooorm, Oct 17 '22 14:10

I would recommend deploying each branch to its own subdomain, for example https://branch-1.dev.mypage.com, rather than to a different path. Using different paths in production and preview will probably lead to more discrepancies. Or you could use a dedicated TLD such as https://branch-1.mypage.review, so the previews won’t show up when users google site:mypage.com.

remcohaszing, Oct 17 '22 14:10

Perhaps you are talking about absolute paths btw: href="/xxx" is an absolute path, and yyy, ./zzz, ../aaa, and #bbb are all really relative.

Absolute values are resolved from new URL(base.href, window.location.href).origin, which is for example https://example.com. Relative values are resolved from new URL(base.href, window.location.href).href, which is for example https://example.com/alpha/bravo.

This is particularly noticeable if you are on a file:// page, and point base to some other domain. First, let’s say we have index.html with this:

<!doctype html>
<html>
  <head>
    <base href="https://example.com/alpha/">
  </head>
  <body>
    <a href="#bravo">b</a>
    <a href="/charlie">c</a>
    <a href="./delta">d</a>
  </body>
</html>

Hovering over/clicking on the links:

  • b -> https://example.com/alpha/#bravo
  • c -> https://example.com/charlie
  • d -> https://example.com/alpha/delta

Now, if the base itself was also relative to the current page:

<!doctype html>
<html>
  <head>
    <base href="/alpha/">
  </head>
  <body>
    <a href="#bravo">b</a>
    <a href="/charlie">c</a>
    <a href="./delta">d</a>
  </body>
</html>

Hovering over/clicking on the links:

  • b -> file:///alpha/#bravo
  • c -> file:///charlie
  • d -> file:///alpha/delta
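
Both lists can be reproduced with the standard URL class. This is a sketch, assuming index.html lives at the made-up location file:///Users/me/index.html; resolveLinks is a hypothetical helper that first resolves the base href against the page URL and then resolves each link against the result.

```javascript
// Hypothetical helper: mimics how a browser resolves links when <base href> is set.
function resolveLinks(baseHref, pageUrl, hrefs) {
  // The document base URL is the base href resolved against the page URL.
  const documentBase = new URL(baseHref, pageUrl)
  return hrefs.map((href) => new URL(href, documentBase).href)
}

// Assumption: made-up location of index.html on disk.
const page = 'file:///Users/me/index.html'
const hrefs = ['#bravo', '/charlie', './delta']

// Base pointing at another domain.
const absoluteBase = resolveLinks('https://example.com/alpha/', page, hrefs)
// Base that is itself relative to the current page.
const relativeBase = resolveLinks('/alpha/', page, hrefs)

console.log(absoluteBase)
console.log(relativeBase)
```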

wooorm, Oct 17 '22 14:10

@wooorm @remcohaszing I appreciate all the feedback regarding my setup. I totally agree with you, that it is not optimal. I would love to use separate subdomains – however I am limited by a certain hosting setup in this specific project. I need to deal with these folder-based environments for now.

Aside from my specific problem use case, I'd still like to know if you are open to allow any kind of modification/transformation of the generated link. My current "prefix" PR might be too narrow minded. But I think adding a transform function, that allows to prefix/suffix/replace/modify the link however a user might like would still be a valuable addition to this plugin, don't you agree?

pReya, Oct 23 '22 14:10

however I am limited by a certain hosting setup in this specific project. I need to deal with these folder-based environments for now.

Makes sense. As I understand your problem, I don’t think any feature added to this project, or a <base> element, solves it though. Your problem isn’t only about headings.

I think you need to create your own plugin to rewrite all URLs. rehype-minify-url might do the trick, but I think that it probably can serve as inspiration at best.
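
As a rough sketch of what such a URL-rewriting plugin could look like for folder-based environments: the plugin below prepends a base path to root-relative href and src values. The names rehypePrefixRootRelativeUrls and basePath are made up, the tree is walked by hand to keep the example dependency-free, and attributes such as srcset are deliberately not handled.

```javascript
// Hypothetical plugin; `rehypePrefixRootRelativeUrls` and `basePath` are made-up names.
function rehypePrefixRootRelativeUrls(options = {}) {
  // Drop trailing slashes so `/branch-1/` and `/branch-1` behave the same.
  const basePath = (options.basePath || '').replace(/\/+$/, '')

  return function (tree) {
    walk(tree)

    function walk(node) {
      if (node.type === 'element' && node.properties) {
        for (const name of ['href', 'src']) {
          const value = node.properties[name]
          // Root-relative means one leading slash; `//host/path` is protocol-relative.
          if (typeof value === 'string' && value.startsWith('/') && !value.startsWith('//')) {
            node.properties[name] = basePath + value
          }
        }
      }
      for (const child of node.children || []) walk(child)
    }
  }
}

// Usage on a hand-built hast tree.
const link = (href) => ({type: 'element', tagName: 'a', properties: {href}, children: []})
const exampleTree = {
  type: 'root',
  children: [
    link('/wiki/some/page'),
    {type: 'element', tagName: 'img', properties: {src: '/logo.png'}, children: []},
    link('#myheadline'),
    link('https://example.com/'),
    link('//cdn.example.com/x.js')
  ]
}
rehypePrefixRootRelativeUrls({basePath: '/branch-1/'})(exampleTree)
const rewritten = exampleTree.children.map((node) => node.properties.href || node.properties.src)
console.log(rewritten)
```

Fragment links, absolute URLs, and protocol-relative URLs are left untouched, so anchors keep pointing at the current page without a base element.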

But I think adding a transform function, that allows to prefix/suffix/replace/modify the link however a user might like would still be a valuable addition to this plugin, don't you agree?

I definitely understand folks would ask for it. But I try to add features only if there are good use cases for them. There’s often a very good alternative: create a custom plugin. Typically they are quite small. Only if a custom plugin wouldn’t work well, or if a lot of people need something, and if there is a clear use case and it is the right abstraction, do I think it’s worth maintaining it in “core” projects.

As it stands, I don’t yet understand a reason for why anchor links to headings themselves, on page A, should go to page B.

wooorm, Oct 24 '22 08:10

Hi! This was closed. Team: If this was merged, please describe when this is likely to be released. Otherwise, please add one of the no/* labels.

github-actions[bot], Aug 26 '23 10:08

Hi team! Could you describe why this has been marked as wontfix?

Thanks, — bb

github-actions[bot], Aug 26 '23 10:08

Closing as there’s no practical, concrete, use case that this solves yet

wooorm, Aug 26 '23 10:08