Regular Expression finds
Acknowledgement
- [X] I acknowledge that issues using this template may be closed without further explanation at the maintainer's discretion.
Comment
Note: I eventually gave up on capturing "Not available unless target is ESXXXX" errors since they're not really interesting to look at
Via #58275
This character cannot be escaped in a regular expression.
const image_path_escape = image_path.replace(/\o/g, '/o') //escape string "\o" in "\output"
Named capturing groups are only available when targeting 'ES2018' or later
/^((?<negative>-)|\+)?P((?<years>\d*)Y)?((?<months>\d*)M)?((?<weeks>\d*)W)?((?<days>\d*)D)?((?<time>T)((?<hours>\d*[.,]?\d{1,9})H)?((?<minutes>\d*[.,]?\d{1,9})M)?((?<seconds>\d*[.,]?\d{1,9})S)?)?$/
Named capturing groups are only available when targeting 'ES2018' or later.
const IMPORT_REGEX = /(?<key>import|export)\s+(?:(?<alias>[\w,{}\s*]+)\s+from)?\s*(?:(?<quote>["'])?(?<ref>[@\w\s\\/.-]+)\3?)\s*(?<term>[;\n])/g
Named capturing groups are only available when targeting 'ES2018' or later
const match = text.match(/^(?<description>(.|\n)*)```(?<language>[^\n]+)\n(?<code>(.|\n)+)\n```$/m);
This regular expression flag is only available when targeting 'es2018' or later
return fileContent.replace(/<!--.*?-->/gs, '');
This character cannot be escaped in a regular expression
const fixedId = listItem.id.replace(/\_/g, "/").replace(/\-/g, "+");
Named capturing groups are only available when targeting 'ES2018' or later
const INPUT_EXTENSION_IMPORT_REGEX = /\.(svelte|(lite(\.tsx|\.jsx)?))(?<quote>['"])/g;
Octal escape sequences are not allowed. Use the syntax '\x04'
const propsRegex = /props\s*\.\s*([a-zA-Z0-9_\4]+)\(/;
Named capturing groups are only available when targeting 'ES2018' or later
private static SSH_PATH_RE = new RegExp(
[
/^\s*/,
/(?:(?<proto>[a-z]+):\/\/)?/,
/(?:(?<user>[a-z_][a-z0-9_-]+)@)?/,
/(?<domain>[^\s\/\?#:]+)/,
/(?::(?<port>[0-9]{1,5}))?/,
/(?:[\/:](?<owner>[^\s\/\?#:]+))?/,
/(?:[\/:](?<repo>(?:[^\s\?#:.]|\.(?!git\/?\s*$))+))/,
/(?:.git)?\/?\s*$/,
]
Named capturing groups are only available when targeting 'ES2018' or later
const regexp = /\[(?<link>http:\/\/[^\]]+)\]/g
A character class range must not be bounded by another character class
this.relocDataSymNameRe = /^(?<symname>[^\d-+][\w.]*)?\s*(?<addend_or_value>.*)$/;
filepath.replace(/^C:\/Users\/[\w\d-.]*\/AppData\/Local\/Temp\/compiler-explorer-compiler[\w\d-.]*\//, '/app/')
const ATFILELINE_RE = /\s*at ([\w-/.]+):(\d+)/;
const selectedPassRe = /[0-9]*(i|t|r)\.([\w-_]*)/;
Octal escape sequences are not allowed. Use the syntax '\x02'.
const shellChars = /[\002-\011\013-\032\\#?`(){}[\]^*<=>~|; "!$&'\202-\377]/;
This character cannot be escaped in a regular expression.
private readonly nameWithOwner = /(?<owner>-?[a-z0-9][a-z0-9\-\_]*)\/(?<name>(?:\w|\.|\-)+)/;
const isURlCustomFormat = /\.[a-z]+\z/.test(anchor.href);
Octal escape sequences are not allowed. Use the syntax '\x02'
const regexp = /([^\s'"]+(['"])([^\2]*?)\2)|[^\s'"]+|(['"])([^\4]*?)\4/gi;
A character class range must not be bounded by another character class
// Source: https://stackoverflow.com/a/8234912/2013580
const urlRegExp = new RegExp(
/((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=+$,\w]+@)?[A-Za-z0-9.-]+|(?:www.|[-;:&=+$,\w]+@)[A-Za-z0-9.-]+)((?:\/[+~%/.\w-_]*)?\??(?:[-+=&;%@.\w_]*)#?(?:[\w]*))?)/,
);
This regular expression flag is only available when targeting 'es2022' or later
// this regex is different from HASHTAG_REGEX in that it does not look for a
// #+character. It uses a negative look-ahead for `# `
const HASH_REGEX =
/(?<=^|\s)#(?![ \t#])([0-9]*[\p{L}\p{Emoji_Presentation}\p{N}/_-]*)/dgu;
This regular expression flag is only available when targeting 'es2018' or later
return message.replace(/([{}](?:.*[{}])?)/su, `'$1'`)
This regular expression flag is only available when targeting 'es6' or later
return message.replace(/([{}](?:.*[{}])?)/su, `'$1'`)
Octal escape sequences are not allowed. Use the syntax '\x00'
// Since negative lookbehind isn't supported in all browsers, this leaves out the negative lookbehind condition `(?<!\.lock)` to ensure the branch name doesn't end with `.lock`
const validBranchOrTagRegex = /^[^/](?!.*\/\.)(?!.*\.\.)(?!.*\/\/)(?!.*@\{)[^\000-\037\177 ~^:?*[\\]+[^./]$/;
// Since negative lookbehind isn't supported in all browsers, leave out the negative lookbehind condition `(?<!\.lock)` to ensure the branch name doesn't end with `.lock`
const refRegexShared = /\b((?!.*\/\.)(?!.*\.\.)(?!.*\/\/)(?!.*@\{)[^\000-\037\177 ,~^:?*[\\]+[^ ./])\b/gi;
This regular expression flag is only available when targeting 'es2018' or later
if (!/(\{\{.+?\}\})|(\{#.+?#\})|(\{%.+?%\})/s.test(str)) {
A character class range must not be bounded by another character class
const rUrl = /((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=+$,\w]+@)?[A-Za-z0-9.-]+|(?:www.|[-;:&=+$,\w]+@)[A-Za-z0-9.-]+)((?:\/[+~%/.\w-_]*)?\??(?:[-+=&;%@.\w_]*)#?(?:[.!/\\w]*))?)/;
This character cannot be escaped in a regular expression
expect(data['message']).toMatch(
/Malformed FormData request. \_*Response.formData: Could not parse content as FormData./
)
This regular expression flag is only available when targeting 'es6' or later
const validBundleID = /^([a-zA-Z]([a-zA-Z0-9_])*\.)+[a-zA-Z]([a-zA-Z0-9_])*$/u
There is nothing available for repetition
const regExp: RegExp = /const foo *= *{0x1: *'bar'};/;
'}' expected
/tag`foo *\${0x1 *\+ *0x1} *bar`;/
A character class range must not be bounded by another character class
if (!/^([\w-.]*)$/.test(name)) {
A character class range must not be bounded by another character class.
return str.replace(/^(\w)|[\s-_:]+(\w)/g, function (match, p1, p2) {
A character class range must not be bounded by another character class
const urlGithubRE = /^(?:https:\/\/(?:github\.com|api\.github\.com\/repos)|(?:\/)?(?:\/)?repos)([\w-.?!=&%*+:@\/]*)/g;
This character cannot be escaped in a regular expression.
const H_REGEX = /(?<tag>[\w\-]+)?(?:#(?<id>[\w\-]+))?(?<class>(?:\.(?:[\w\-]+))*)(?:@(?<name>(?:[\w\_])+))?/;
This regular expression flag is only available when targeting 'es2022' or later.
const markRegex = /\bMARK:\s*(.*)$/d;
Octal escape sequences are not allowed. Use the syntax '\x09'.
function cssEscape(str: string): string {
return str.replace(/[\11\12\14\15\40]/g, '/'); // HTML class names can not contain certain whitespace characters, use / instead, which doesn't exist in file names.
}
A character class range must not be bounded by another character class.
const fileRegex = /(file:\/\/)?([a-zA-Z]:(\\\\|\\|\/)|(\\\\|\\|\/))?([\w-\._]+(\\\\|\\|\/))+[\w-\._]*/g;
A character class range must not be bounded by another character class.
/^\w([\w-.]*\w)?$/.test(x.preferredUsername)
Named capturing groups are only available when targeting 'ES2018' or later
const deprecation = (propDescriptor.description || '').match(/@deprecated(\s+(?<info>.*))?/);
A character class range must not be bounded by another character class.
let isText = /^[\w-\s.,\t\n]+$/.test(detail)
This character cannot be escaped in a regular expression
return tag.match(/^(?![\.\-])([a-zA-Z0-9\_\.\-])+$/g);
A character class range must not be bounded by another character class
const urlRegex = () =>
/((?:https?(?::\/\/))(?:www\.)?(?:[a-zA-Z\d-_.]+(?:(?:\.|@)[a-zA-Z\d]{2,})|localhost)(?:(?:[-a-zA-Z\d:%_+.~#!?&//=@]*)(?:[,](?![\s]))*)*)/g;
A character class range must not be bounded by another character class
export function expandDefaultServerVariables(url: string, variables: object = {}) {
return url.replace(
/(?:{)([\w-.]+)(?:})/g,
(match, name) => (variables[name] && variables[name].default) || match,
);
}
dozens of these in this file, see https://github.com/microsoft/TypeScript/issues/58275#issuecomment-2068174097
/([^a-zA-Z0-9\s{(\[<])(?:(?!\2)[^\\]|\\[\s\S])*\2(?:(?!\2)[^\\]|\\[\s\S])*\2/
A character class range must not be bounded by another character class
// eslint-disable-next-line @typescript-eslint/prefer-regexp-exec
const githubMatch = location.match(/https:\/\/github.com\/([\w-_]+\/[\w-_]+)/i);
slug: ['', unicodePatternValidator(/^[\p{Letter}0-9._-]+$/)],
A character class range must not be bounded by another character class
export const wordPattern = /(#?-?\d*\.\d\w*%?)|([$@#!.:]?[\w-?]+%?)|[$@#!.]/g;
return stream.advanceIfRegExp(/^[_:\w][_:\w-.\d]*/).toLowerCase();
I expect all of the errors not related to --target are a result of regular expressions that are allowed per Annex B.
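As a rough illustration of the Annex B point, here is a minimal sketch using the "This character cannot be escaped" cases such as `\_` from the list above. The helper name `compiles` is made up for this example, and the patterns are built with the `RegExp` constructor from strings so that only the runtime engine's behavior is observed, not the compiler's check on literals. It assumes an engine that implements Annex B (browsers and Node.js do).

```typescript
// Hypothetical helper: reports whether the engine accepts a pattern with the given flags.
function compiles(source: string, flags: string): boolean {
  try {
    new RegExp(source, flags);
    return true;
  } catch {
    return false;
  }
}

// Without the `u` flag, Annex B treats `\_` as an identity escape for "_".
const legacyOk = compiles("\\_", "");

// With the `u` flag the stricter grammar applies and the same escape is rejected.
const unicodeOk = compiles("\\_", "u");

// In legacy mode the escaped form simply matches the underscore itself.
const matchesUnderscore = new RegExp("\\_").test("_");

console.log({ legacyOk, unicodeOk, matchesUnderscore });
```

So the escape is harmless at runtime on such engines, but it stops being valid as soon as someone adds the `u` flag to the same pattern.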
IMO, all of the "Octal escape sequences are not allowed" and "A decimal escape must refer to an existent capturing group" are probably indications of actual errors in user code. They're allowed in Annex B, but the user likely intended to use them as a backreference to a capture group and that's not how Annex B would treat them.
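To make the octal point concrete, here is a small sketch modeled loosely on the `[^\2]` finding above. The variable names are made up for illustration, and the patterns are built from strings so the compiler's literal check is not involved. It assumes an Annex B engine: inside a character class, `\1` is read as the octal escape for U+0001, not as a backreference to group 1.

```typescript
// Looks like "anything except the opening quote", but `[^\1]` really means "anything except U+0001".
const looksLikeBackref = new RegExp("^(['\"])([^\\1]*?)\\1$");

// One way to express the probable intent: a negative lookahead on the captured quote.
const intended = new RegExp("^(['\"])((?:(?!\\1).)*)\\1$");

const input = "'a'b'";

// The class does not exclude the quote, so the body is free to run across the inner quote.
const legacyMatch = looksLikeBackref.exec(input);
const legacyBody = legacyMatch ? legacyMatch[2] : null;

// The lookahead version refuses to let the body contain the quote, so this input does not match.
const intendedMatch = intended.exec(input);

// Direct check of what the class contains.
const excludesControlChar = !new RegExp("[^\\1]").test("\u0001");
const allowsQuote = new RegExp("[^\\1]").test("'");

console.log({ legacyBody, intendedMatch, excludesControlChar, allowsQuote });
```

The legacy pattern still "works" on simple inputs, which is probably why this kind of mistake survives in real code.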
All of the "A character class range must not be bounded by another character class" errors are probably fine and shouldn't be reported. Annex B allows them and most users wrote something like [\w-.] or the like thinking it meant "word characters, -, and .", which is how Annex B treats them.
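The `[\w-.]` reading described here can be checked with a short sketch. The names below are made up for the example, the patterns are built from strings to sidestep the compiler's literal check, and an Annex B engine is assumed.

```typescript
// Annex B: a "range" bounded by a class escape is treated as the class plus a literal "-" and ".".
const legacyClass = new RegExp("^[\\w-.]+$");
const acceptsDashAndDot = legacyClass.test("foo-bar.baz");
const rejectsPlus = !legacyClass.test("foo+bar");

// The same class is a syntax error once the `u` flag is present.
let unicodeModeThrows = false;
try {
  new RegExp("^[\\w-.]+$", "u");
} catch {
  unicodeModeThrows = true;
}

// Moving the hyphen to the end keeps the meaning and is accepted in both modes.
const portableClass = new RegExp("^[\\w.-]+$", "u");
const portableAccepts = portableClass.test("foo-bar.baz");

console.log({ acceptsDashAndDot, rejectsPlus, unicodeModeThrows, portableAccepts });
```

Placing the hyphen last (or first) in the class is a small rewrite that avoids depending on Annex B at all.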
Ah, I did once mention this on my PR and thought it was fine since Ryan reacted to my comment: https://github.com/microsoft/TypeScript/pull/55600#issuecomment-1735102411 I am fine with weakening the grammar; however, keep in mind that we can’t guarantee everything runs on engines with Annex B support, though I understand that this is mostly the case. IMHO another compiler option is the only realistic way to solve this, unfortunately.
IMO, all of the "Octal escape sequences are not allowed" and "A decimal escape must refer to an existent capturing group" are probably indications of actual errors in user code.
Yes, I actually thought that there is a consensus on not allowing any octal escapes anywhere per #53198 😅
Great work on adding validation for regexp!
We came across another regression on 5.5 for character class escapes with script extensions that I did not see listed above:
const regexpNonLatin = /\P{Script_Extensions=Latin}+/gu;
- OK on 5.4.5
- KO on 5.5.0-beta and KO on 5.5.0-dev.240240510 :
Unknown Unicode property value.
The issue seems specific to Script_Extensions and scx - Script is working fine. Same behavior is observed for \p and \P.
"٢".match(/\p{Script=Thaana}/u); // OK on 5.5
"٢".match(/\p{Script_Extensions=Thaana}/u); // KO on 5.5
// @ts-ignore can be used to work around the error, as hinted on https://github.com/microsoft/TypeScript/pull/58295
Those regexps are part of the samples on https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Unicode_character_class_escape ; we use something similar in our codebase and faced this when pretesting our typescript upgrade.
Would it be possible to support script extension values in 5.5?
Related links:
- https://github.com/microsoft/TypeScript/pull/55600
- https://github.com/microsoft/TypeScript/blob/main/src/compiler/scanner.ts#L3983 : Script_Extensions is "// Currently empty"
- https://unicode.org/reports/tr24/#Script_Extensions_Def
@nostalic OMG, that’s totally my fault, I am very bad. I made it empty because the Script_Extensions section in PropertyValueAliases.txt shows nothing, without thinking much.
However, I don’t think the Team will have time to review PRs related to regular expressions in the immediate future; they haven’t even reviewed my short follow-up PRs yet 😅
@graphemecluster This is a great improvement, and the regex validation helps to catch some issues we had, so thanks for implementing it!
The issue can be worked around and as such this is not a blocker for us, though it would be great to have it fixed in 5.5 🙂
@nostalic OMG, that’s totally my fault, I am very bad. I made it empty because the Script_Extensions section in PropertyValueAliases.txt shows nothing, without thinking much. However, I don’t think the Team will have time to review PRs related to regular expressions in the immediate future; they haven’t even reviewed my short follow-up PRs yet 😅
Please do send things if you have them; I do think we want to get things looked at before 5.5 is branched off.
Thoughts on this: since we already do regex group checking (as per the release notes), shouldn't the resulting match groups be typed?
(tried on playground with 5.5-beta)
No, the type system does not special case regexes like this. (yet?)
Enabling further implementation of regex type checking is the most vital reason why I implemented regex syntax checking, and it’s gonna be the most exciting part 😆
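Until something like that exists, group names can only be typed by hand. Below is a minimal sketch of one way to do it; `NamedGroups` and `execNamed` are hypothetical names, not part of TypeScript or its standard library, and the caller asserts the group names without the compiler verifying them against the pattern. The pattern is adapted from the `@deprecated` finding above and built from a string so the sketch does not depend on the compile target.

```typescript
// Hypothetical helper type: every listed group name maps to an optional string.
type NamedGroups<K extends string> = { [P in K]: string | undefined };

// Hypothetical helper: runs the regex and returns its named groups under a caller-supplied type.
function execNamed<K extends string>(re: RegExp, text: string): NamedGroups<K> | null {
  const m = re.exec(text);
  if (m === null) return null;
  // Cast through unknown so the sketch does not depend on the ES2018 lib typings for `groups`.
  const groups = (m as unknown as { groups?: NamedGroups<K> }).groups;
  return groups ? groups : null;
}

const deprecatedRe = new RegExp("@deprecated(\\s+(?<info>.*))?");
const hit = execNamed<"info">(deprecatedRe, "@deprecated use newApi instead");

// `hit.info` is typed as string | undefined; a misspelled name such as `hit.infoo` would not compile.
const info: string | undefined = hit ? hit.info : undefined;

console.log(info);
```

Nothing ties the `"info"` type argument to the pattern itself, which is exactly the gap that regex-aware type checking would close.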