path-to-regexp
Exact Match by non capturing group but allow wildcard for subsegments
I'm trying to rewrite non-localized paths (e.g. /page1) to localized ones (/de/page1), and I'm looking for help getting the following to work:
- "/de" or "/en" should NOT be matched
- "/de/anything" or "/en/anything" should NOT be matched
- "/test" or "/test/anything" should be matched
- "/deno" or "/end" should be matched
My current implementation is based on: https://github.com/pillarjs/path-to-regexp/blob/0c466b1b0944e8d0022b5b15069364a8483bf9c5/src/index.spec.ts#L2498
With it I can meet the first 3 of the 4 requirements, but I can't prevent "de" from also excluding "deno". It seems that instead of allowing any characters other than "/" after the lang segment, I need a solution that excludes either an exact match or a wildcard that must be preceded by another "/" (so that both /de and /de/anything are covered).
Segmenting these values any further is not a requirement, as I'm taking req.url as a whole and only prepending the locale.
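To make the difference concrete, here is a minimal sketch in plain JavaScript RegExp (not path-to-regexp syntax) that contrasts a lookahead on the bare locale with one that also requires a segment boundary. The `locales` array and the names `naive` and `bounded` are assumptions made for illustration only.

```javascript
// Hypothetical locale list; in a real project it would be imported.
const locales = ["de", "en"];
const alternation = locales.join("|");

// Naive lookahead: rejects any first segment that merely STARTS with a locale,
// so "/deno" and "/end" are wrongly rejected.
const naive = new RegExp(`^/(?!${alternation})[^/]+(?:/.*)?$`);

// Boundary-aware lookahead: rejects a locale only when it is the whole first
// segment, i.e. followed by "/" or by the end of the string.
const bounded = new RegExp(`^/(?!(?:${alternation})(?:/|$))[^/]+(?:/.*)?$`);

const paths = [
  "/de", "/en", "/de/anything", "/en/anything",
  "/test", "/test/anything", "/deno", "/end",
];

// One row per path, showing how each pattern classifies it.
const results = paths.map((path) => ({
  path,
  naive: naive.test(path),
  bounded: bounded.test(path),
}));

console.table(results);
```

Only the `bounded` pattern satisfies all four requirements from the list above; the `naive` one fails on "/deno" and "/end".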
This sandbox should probably make it easy to get what I mean: https://codesandbox.io/s/path-to-regexp-demo-c9efk?file=/src/index.js (Result is in the console)
I got a bit stuck here, and any help is appreciated, thank you! :)
This is pretty complex and a little outside of how path-to-regexp is intended to be used. My suggestion would be to split this into separate routes and have them ordered. Do you know what framework you're using this with? Depending on where, you could just match /:lang first, and if it didn't match your expected language pass it to child routes.
@blakeembrey Thanks for your help. I'm using nextjs, which allows for an ordered list of matches, and pass those that don't match to the next one. (Functionality is described here: https://github.com/zeit/next.js/issues/9081)
Regarding separate routes: yes, I also had that idea, but I thought that having one expression to match all cases would be cleaner, and I'm just too unskilled with regexes to get this right. But I'm happy to put this into separate routes if that is easier.
So, with separate routes, would I have
- one that does exact matches like /abc or /test or /deno, but does not match /de or /en
- one that does exact matches but allows for a wildcard after a "/" has occurred, so it matches /abc/[...] and /test/[...], but not /de[...] and /en[...]?
The general background: I'm creating an i18n example project and I have all names available of the routes that should NOT match as I can import the array of used locales by name at the point where I define my rewrites, so I can have something like en|de, but I don't have access to all currently existing paths like /test or /abc at that point, and want to avoid duplication by manually having to define all of them there.
I can't quickly translate it into next.js, but normally I'd do something like this:
const firstRoute = pathToRegexp('/:lang(en|de|...)', { end: false }) // The `end: false` lets you match everything under these paths, e.g. `/en`, `/en/test`, but not `/end` (because it's not a separate "segment").
const anyOtherRoute = '/...'
The key is typically to design a route that matches on what you're trying to do, then build something for the fallback routes. This makes it a lot easier to maintain than trying to do negative route matches.
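As a rough illustration of that ordering, the sketch below uses a plain JavaScript RegExp as a stand-in for the `/:lang(...)` route with `end: false` (it is not the output of `pathToRegexp`). The `locales` array, the `defaultLocale` value, and the `resolve` function are hypothetical names introduced for the example.

```javascript
// Hypothetical configuration, for illustration only.
const locales = ["en", "de"];
const defaultLocale = "en";

// Positive match: a known locale as the whole first segment, optionally
// followed by further segments. "/en" and "/en/test" match, "/end" does not.
const localizedRoute = new RegExp(`^/(${locales.join("|")})(?:/|$)`);

function resolve(url) {
  // First route: the URL is already localized, keep it unchanged.
  const match = localizedRoute.exec(url);
  if (match) {
    return { route: "localized", lang: match[1], url };
  }
  // Fallback route: everything else gets the default locale prepended.
  return { route: "fallback", lang: defaultLocale, url: `/${defaultLocale}${url}` };
}

console.log(resolve("/en/test")); // stays "/en/test"
console.log(resolve("/end"));     // rewritten to "/en/end"
```

Because the first route is a positive match, the fallback needs no negative lookahead at all; it simply receives whatever the first route did not claim.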
@blakeembrey I'm running into the exact same issue.
Here is how Next.js allows us to customise this behaviour (RFC specs): see the rewrites section of https://github.com/zeit/next.js/issues/9081
Here is how we actually use it, inside /next.config.js:
const rewrites = [
  {
    // XXX Doesn't work locally (maybe because of rewrites), but works online
    source: '/',
    destination: '/api/autoRedirectToLocalisedPage',
  },
  {
    source: `/:locale((?!${allowedLocales.join('|')})[^/]+)(.*)`,
    destination: '/api/autoRedirectToLocalisedPage',
  },
];
Full /next.config.js code source:
https://github.com/UnlyEd/next-right-now/pull/42/files/3cde65a68dc2bb90dcb777f19ffb036bb967c607#diff-5d0c276360a637d1b787a57760665fbeR34-R48
How would you advise tackling this issue? I'm really not familiar with path-to-regexp (or regexes in general). Thank you!
Just a note that if this were possible (it isn't supported by path-to-regexp yet), I think it would solve some issues (e.g. /end would redirect to /en/end properly):
> source = '/((?!en(\\/|$)|fr(\\/|$))[^/]+))(.*)'
'/((?!en(\\/|$)|fr(\\/|$))[^/]+))(.*)'
> options
{ strict: true, sensitive: false, delimiter: '/' }
> regexp = pathToRegexp(source, [], options);
Uncaught TypeError: Capturing groups are not allowed at 7
at lexer (/home/nchiang/repos/covid-tutoring/node_modules/path-to-regexp/dist/index.js:74:31)
at parse (/home/nchiang/repos/covid-tutoring/node_modules/path-to-regexp/dist/index.js:97:18)
at stringToRegexp (/home/nchiang/repos/covid-tutoring/node_modules/path-to-regexp/dist/index.js:329:27)
at pathToRegexp (/home/nchiang/repos/covid-tutoring/node_modules/path-to-regexp/dist/index.js:403:12)
Note that I added a capture group at the end of each locale (e.g. en(\\/|$)) that ensures we only treat it as a locale if it is:
- At the end of the URL (denoted by the $).
- Directly followed by a slash (denoted by the \\/).
@nicholaschiang we can avoid the nested group by splitting it into separate alternatives, e.g. rewriting (?!en($|/)) as (?!en/|en$).
Working example:
const languagesMask = ['zh-cn', 'en', 'es', 'fr', 'it', 'ru', 'th']
  .map((lang) => [`${lang}/`, `${lang}$`])
  .flat()
  .join('|');
const source = `/((?!${languagesMask})[^/]+)/:path*`;
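To see why the flattened mask is equivalent to the grouped form, here is a small check using plain JavaScript RegExp. It is only a sketch: the `(?:/.*)?$` tail is an assumed stand-in for `/:path*`, not what path-to-regexp actually compiles, and the names `flat`, `grouped`, and `samples` are made up for the example.

```javascript
const languages = ["zh-cn", "en", "es", "fr", "it", "ru", "th"];

// Flattened alternatives: "zh-cn/|zh-cn$|en/|en$|...", with no nested group.
const flatMask = languages
  .flatMap((lang) => [`${lang}/`, `${lang}$`])
  .join("|");

// Flattened lookahead, as in the working example above.
const flat = new RegExp(`^/((?!${flatMask})[^/]+)(?:/.*)?$`);

// Grouped lookahead: locale followed by "/" or end of string.
const grouped = new RegExp(
  `^/((?!(?:${languages.join("|")})(?:/|$))[^/]+)(?:/.*)?$`
);

const samples = ["/en", "/en/page", "/end", "/zh-cn", "/zh-cn/a", "/the", "/item/1", "/about"];

// Both patterns should classify every sample the same way.
for (const path of samples) {
  console.log(path, flat.test(path), grouped.test(path));
}
```

Paths such as "/end", "/the", and "/item/1" match even though they start with a locale code, while "/en", "/zh-cn/a", and the other localized paths do not.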