File / Folder Names that contain parentheses do not resolve

Open lukejgaskell opened this issue 4 years ago • 16 comments

🐛 Bug Report

Prerequisites

  • [x] I'm using the latest version of Docusaurus.
  • [X] I have tried the npm run clear or yarn clear command.
  • [X] I have tried rm -rf node_modules yarn.lock package-lock.json and re-installing packages.
  • [X] I have tried creating a repro with https://new.docusaurus.io
  • [X] I have read the console error message carefully (if applicable)

Description

Parentheses in a file or folder name break page resolution. A file with parentheses in its name still shows up in the sidebar, but clicking it shows "page not found".

Steps to reproduce

  1. Create a file or folder that has ( or ) in the name.
  2. Go to that file in the sidebar.

CodeSandbox doesn't allow the special character, but StackBlitz does. Demo:

https://stackblitz.com/edit/github-eranrw?file=docs/another(1).md

Expected behavior

I would expect it to be able to resolve the special character.

Actual behavior

The page did not resolve and seems to be considered a broken markdown link.

lukejgaskell avatar Aug 24 '21 21:08 lukejgaskell

Agreed, something is wrong here and this should work.

We use React Router, which uses https://github.com/pillarjs/path-to-regexp

The slugification process that transforms filenames to uris/routes should probably remove or escape parentheses.

I'll have to check but is /docs/another(1) a valid path segment?

At what url do you expect this doc to be served?

slorber avatar Aug 26 '21 12:08 slorber

I don't believe there's anything wrong with having /docs/another(1) as the path segment. Azure DevOps Wiki seems to be using it that way. Do you have other cases of urls being changed from their original path and if so, how do you map those back to the repo with the edit this page button?

lukejgaskell avatar Aug 27 '21 15:08 lukejgaskell

I seem to get inconsistent results between dev and production builds for this, with spaces and periods in the names. Not sure if it's related.

runonthespot avatar Sep 06 '21 12:09 runonthespot

Hi, if this is an issue I could work on then I would gladly give it a shot. I'm still pretty new to contributing so some hints on where to start and what I can/should do would be very much appreciated. Thanks

mqnguyen5 avatar Oct 25 '21 05:10 mqnguyen5

Indeed, it seems to be because (paren) is interpreted as a regexp group by React Router... I think we should just remove the parentheses when computing the slug? @lukejgaskell are parentheses important to you?

Josh-Cena avatar Jan 29 '22 16:01 Josh-Cena

Hey @Josh-Cena, I don't have the need currently, but I will say it would be a nice fix for porting docs over from other locations. A lot of people end up using parentheses in their file names, and it would be nice if this tool could handle that.

lukejgaskell avatar Jan 29 '22 16:01 lukejgaskell

Yes, absolutely. I think we will just remove parentheses from the slug automatically. Sounds good?

Josh-Cena avatar Jan 29 '22 16:01 Josh-Cena

@Josh-Cena hey, sorry for the slow reply. On the question of removing them: would the link at the bottom of the page, "edit this page", still include them? Because if not, that would make it hard to link back to the source.

lukejgaskell avatar Feb 13 '22 06:02 lukejgaskell

would the link at the bottom of the page, "edit this page", still include them?

I suppose yes, because the edit URLs are generated through file paths, not URL paths.

Josh-Cena avatar Feb 13 '22 06:02 Josh-Cena

@Josh-Cena sounds like it would work! Although you might have an issue with URL collisions if the only difference between two file names is the special characters.
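To make the collision concern concrete, here is a minimal sketch in TypeScript. `stripParens` and `findSlugCollisions` are hypothetical helpers written for illustration, not Docusaurus functions, and the sketch assumes the slug is simply the file path with parentheses dropped.

```typescript
// Hypothetical helper: drop parentheses from a doc path to form its slug.
function stripParens(filePath: string): string {
  return filePath.replace(/[()]/g, "");
}

// Group source files by their resulting slug and keep only the slugs
// that more than one file would map to.
function findSlugCollisions(filePaths: string[]): Record<string, string[]> {
  const bySlug: Record<string, string[]> = {};
  for (const filePath of filePaths) {
    const slug = stripParens(filePath);
    if (!bySlug[slug]) {
      bySlug[slug] = [];
    }
    bySlug[slug].push(filePath);
  }

  const collisions: Record<string, string[]> = {};
  for (const slug of Object.keys(bySlug)) {
    if (bySlug[slug].length > 1) {
      collisions[slug] = bySlug[slug];
    }
  }
  return collisions;
}

// "docs/another(1)" and "docs/another1" would both be served at "docs/another1".
const collisions = findSlugCollisions([
  "docs/another(1)",
  "docs/another1",
  "docs/intro",
]);
console.log(collisions);
```

A check like this could run at build time and fail loudly rather than letting one page silently shadow another, though where it would belong in the build is an open question.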

lukejgaskell avatar Feb 13 '22 17:02 lukejgaskell

@Josh-Cena / @lukejgaskell is there a timeline for fixing this? This is currently blocking us from incorporating some docs into our Docusaurus site that are auto-generated by another Markdown-generating tool.

madelson avatar Aug 04 '22 14:08 madelson

@madelson External Markdown-generating tools are likely to break Docusaurus in one way or another (e.g. HTML tags). Some kind of postprocessing is almost always necessary, and removing characters from the file path is the easiest of all.

Josh-Cena avatar Aug 04 '22 14:08 Josh-Cena

@Josh-Cena it's a bit trickier than that, since I'd also have to clean up all the cross-links between the generated pages, but I agree that it is doable. Just curious, is there a reason why it is desirable for Docusaurus not to support such URLs if they are otherwise valid?

madelson avatar Aug 05 '22 22:08 madelson

Uh, it's not our fault, strictly speaking. It's because React-router processes them as regexps instead of literal characters. If you look at https://github.com/facebook/docusaurus/pull/6510 you'll see we want to align our behavior with other site generators, but so far I haven't had time to look into this. If you'd like to collect that information for us, we'd greatly appreciate it.

Josh-Cena avatar Aug 06 '22 01:08 Josh-Cena

It's because React-router processes them as regexps instead of literal characters.

If this is the source of the issue and (I presume) we want them to be treated as literals, would the fix be as simple as escaping (rather than replacing) all regexp special characters (e.g. with something like https://stackoverflow.com/questions/3446170/escape-string-for-use-in-javascript-regex)?
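For reference, a minimal sketch of that escaping approach in TypeScript. It uses a plain `RegExp` as a stand-in for the router's path matching; whether the router accepts the same escaping in route paths is an assumption that would need checking. `escapeRegExp` is the usual helper from the linked answer, not a Docusaurus API.

```typescript
// Escape every character that has special meaning in a JavaScript
// regular expression so the text is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const docPath = "/docs/another(1)";

// Unescaped, "(1)" is read as a capture group containing "1", so the
// pattern matches "/docs/another1" and not the path it was built from.
const unescapedPattern = new RegExp("^" + docPath + "$");

// Escaped, the parentheses are literal characters.
const escapedPattern = new RegExp("^" + escapeRegExp(docPath) + "$");

console.log(unescapedPattern.test(docPath)); // false
console.log(unescapedPattern.test("/docs/another1")); // true
console.log(escapedPattern.test(docPath)); // true
```

The first two results mirror the reported symptom: the route exists, but the URL containing the parentheses never matches it.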

If you'd like to collect that information for us

To be clear, you're looking for information on what characters are supported in the URLs of other site generators like Jekyll? Or markdown-based ones specifically?

madelson avatar Aug 06 '22 14:08 madelson

what characters are supported in the URLs of other site generators like Jekyll? Or markdown-based ones specifically?

See https://github.com/facebook/docusaurus/pull/6510#issuecomment-1028030028. Yep: site generators that have file-based routing, like Next.js or Remix. I'm curious whether they (a) make ( appear literally in the slug, (b) remove it, or (c) turn it into - or _. (I know Remix treats ( as a literal character.)
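To compare the three options side by side, here is a rough TypeScript sketch. `slugifySegment` and the strategy names are made up for illustration; none of this reflects what Docusaurus, Next.js, or Remix actually implement.

```typescript
// The three candidate behaviors: keep parentheses, drop them, or
// replace them with a dash.
type ParenStrategy = "literal" | "remove" | "dash";

// Hypothetical slug function for a single path segment.
function slugifySegment(name: string, strategy: ParenStrategy): string {
  if (strategy === "literal") {
    return name;
  }
  if (strategy === "remove") {
    return name.replace(/[()]/g, "");
  }
  // "dash": collapse each run of parentheses into one dash, then trim
  // any dash left at the start or end of the segment.
  return name.replace(/[()]+/g, "-").replace(/^-+|-+$/g, "");
}

console.log(slugifySegment("another(1)", "literal")); // another(1)
console.log(slugifySegment("another(1)", "remove")); // another1
console.log(slugifySegment("another(1)", "dash")); // another-1
```

Only the "literal" strategy keeps the URL identical to the file name; the other two change it and can produce duplicate slugs for distinct files.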

would the fix be as simple as escaping (rather than replacing) all regexp special characters

Yes, if we decide they should be literal characters (to align with the behavior of other site generators). However, ( is the only "peculiar" one, because ?, [, and other such characters would already be percent-encoded in URLs.
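A quick way to see the difference, using the standard `encodeURIComponent` function (whether Docusaurus encodes path segments with this exact function is an assumption here; the sample names are invented):

```typescript
// encodeURIComponent leaves ( and ) untouched, since they are in its
// unreserved set, but percent-encodes ?, [ and ].
const sampleNames = ["another(1)", "what?", "a[0]"];

const encodedNames = sampleNames.map((name) => encodeURIComponent(name));

for (let i = 0; i < sampleNames.length; i++) {
  console.log(sampleNames[i] + " -> " + encodedNames[i]);
}
// another(1) -> another(1)
// what? -> what%3F
// a[0] -> a%5B0%5D
```

So parentheses reach the route matcher as raw characters, while the other regexp-special characters arrive already encoded.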

Josh-Cena avatar Aug 06 '22 14:08 Josh-Cena