Can character-set wildcards be added to the regex?

Open lenny20 opened this issue 2 years ago • 11 comments

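(Apart from the few that overlap with existing syntax) supporting even some of these would be nice and would make regex more powerful.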

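TextPro is optimized for handling full-width and half-width characters and Simplified and Traditional Chinese, and supports the following wildcards (case-sensitive):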

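Mostly half-width:
\a   matches any English letter;
\~a  matches any half-width or full-width character other than a letter;
\d   matches any digit;
\~d  matches any half-width or full-width character other than a digit;
\h   matches any word-initial character (letters and underscore);
\~h  matches any character other than letters and underscore;
\l   matches any lowercase letter;
\~l  matches any half-width or full-width character other than a lowercase letter;
\o   matches any octal digit (0-7);
\~o  matches any character other than an octal digit;
\p   matches any half-width punctuation mark (printable ASCII characters other than space, letters, and digits);
\~p  matches any half-width or full-width character other than half-width punctuation;
\s   matches any whitespace character (half-width space, TAB);
\~s  matches any non-whitespace half-width or full-width character;
\u   matches any uppercase letter;
\~u  matches any half-width or full-width character other than an uppercase letter;
\w   matches any word character (letters, digits, and underscore);
\~w  matches any half-width or full-width character other than word characters (letters, digits, and underscore);
\x   matches any hexadecimal digit (0-9, a-f, A-F);
\~x  matches any half-width or full-width character other than a hexadecimal digit;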

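Mostly full-width:
\f   matches any full-width character other than ASCII characters;
\~f  matches any ASCII character;
\A   matches any full-width ASCII character;
\b   matches any character included in the BIG5 character set;
\~b  matches any character not included in the BIG5 character set;
\c   matches any Chinese character (excluding symbols);
\~c  matches any full-width character other than Chinese characters;
\D   matches Earthly Branch characters (子丑寅卯……);
\g   matches any character included in the GBK character set;
\~g  matches any character not included in the GBK character set;
\G   matches uppercase Greek letters;
\j   matches Japanese katakana;
\J   matches Japanese hiragana;
\k   matches lowercase Greek letters;
\m   matches mathematical symbols;
\n   matches Chinese numerals (一二三四……);
\N   matches formal (uppercase) Chinese numerals (壹贰叁肆……);
\P   matches full-width punctuation;
\r   matches lowercase Russian letters;
\R   matches uppercase Russian letters;
\S   matches Roman numerals and ordinals with a dot, parentheses, or a circle (⒈⒉⒊⒋……);
\T   matches Heavenly Stem characters (甲乙丙丁……);
\V   matches vertical-layout punctuation;
\y   matches pinyin characters;
\Y   matches zhuyin (bopomofo) characters;
\Z   matches box-drawing characters;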
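
Most of the half-width wildcards in the list correspond to ordinary bracket expressions that regex engines already accept. A minimal sketch, assuming Python's `re` module as a stand-in (not TextPro or Notepad2's engine); the `ASCII_EQUIVALENTS` table and the `negate` helper are hypothetical names introduced only for illustration, and TextPro's exact behaviour (for example around line breaks) is not specified here:

```python
import re

# Hypothetical mapping: TextPro-style wildcard -> plain bracket expression.
# Only the ASCII side is covered; ranges follow the descriptions in the list.
ASCII_EQUIVALENTS = {
    r"\a": "[A-Za-z]",            # any English letter
    r"\d": "[0-9]",               # any digit
    r"\h": "[A-Za-z_]",           # word-initial character
    r"\l": "[a-z]",               # lowercase letter
    r"\o": "[0-7]",               # octal digit
    r"\p": r"[!-/:-@\[-`{-~]",    # printable ASCII that is not space, letter, or digit
    r"\s": r"[ \t]",              # half-width space or TAB
    r"\u": "[A-Z]",               # uppercase letter
    r"\w": "[A-Za-z0-9_]",        # letters, digits, underscore
    r"\x": "[0-9A-Fa-f]",         # hexadecimal digit
}


def negate(bracket: str) -> str:
    """Turn "[...]" into "[^...]", roughly what the "~" forms describe."""
    return "[^" + bracket[1:]


# An identifier: one word-initial character followed by word characters.
identifier = ASCII_EQUIVALENTS[r"\h"] + ASCII_EQUIVALENTS[r"\w"] + "*"
print(re.findall(identifier, "x1 = _tmp + 9y"))
# Everything that is not an ASCII digit, full-width characters included.
print(re.findall(negate(ASCII_EQUIVALENTS[r"\d"]), "a1字2"))
```

On Python's Unicode strings a negated class such as `[^0-9]` also matches full-width characters, which is close to the "half-width or full-width" wording of the `\~` entries; a byte-oriented engine may behave differently.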

Nov 16 '21 11:11 lenny20

Unicode regex is currently not supported. Edit -> Convert -> Other Conversions has some utilities for CJK.

Nov 16 '21 13:11 zufuliu

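Converting between Simplified and Traditional Chinese with “Other Conversions” is not fully accurate; some words need a second pass. Are there any character sets that could be supported right now? The main goal is to make search and replace more efficient. Would you consider switching to the Oniguruma regex engine in the future? Do you see any drawbacks to it? Notepad3 has used it for a long time, so with existing source code available the workload should not be large, right?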

Nov 16 '21 15:11 lenny20

“Converting between Simplified and Traditional Chinese with ‘Other Conversions’ is not fully accurate; some words need a second pass.”

The result depends on the system and is similar to Bing Translator when Extended Linguistic Services is used; see https://docs.microsoft.com/en-us/windows/win32/intl/transliteration-services

Nov 16 '21 15:11 zufuliu

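If I need more powerful regex features, I can use the feature-rich Notepad3, so that is fine. I only hope a few character-set wildcards can be added to make search and replace more capable.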

Nov 17 '21 17:11 lenny20

For binary size reasons, an external regex library will not be used for a long time; we will stick with the builtin 8-bit POSIX regex. For people who need powerful regex functions for text analysis or the like, I think a plugin is more useful. Official Scintilla has a ctypes-based Python binding (used for unit tests); it can be ported to any Scintilla-based editor, but it absolutely needs someone to work on it.

Nov 20 '21 01:11 zufuliu

Plugin support would be great, but a Python-based plugin could be too heavy. Lua or something similar may be a decent choice.

Nov 20 '21 08:11 PNBRQK

Python is just an example (it's simple and popular). I don't have ideas on how to make a plugin; I just thought that allowing users to manipulate text in Notepad2 with an external script would solve some complex tasks. For example, add a menu File -> Launch -> Script Panel to open a script panel, where the user can write or load a script and run it against the text in the current editor.
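
The "run a script against the text in the current editor" idea can be sketched in a few lines. This is only an illustration under stated assumptions: Notepad2 has no such panel or API, `run_script_on_text` is a hypothetical helper, and the convention that a script defines `transform(text)` is invented here.

```python
import re


def run_script_on_text(script_source: str, text: str) -> str:
    """Execute a user script and apply its transform() to the editor text.

    Hypothetical convention (not a Notepad2 API): the script must define
    a function transform(text) that returns the new text as a str.
    """
    namespace = {"re": re}           # names made available to the script
    exec(script_source, namespace)   # runs arbitrary code: trusted scripts only
    transform = namespace.get("transform")
    if not callable(transform):
        raise ValueError("script must define transform(text)")
    result = transform(text)
    if not isinstance(result, str):
        raise TypeError("transform(text) must return a str")
    return result


# A script the user might write or load in the panel.
USER_SCRIPT = r'''
def transform(text):
    # Collapse runs of spaces and tabs into a single space.
    return re.sub(r"[ \t]+", " ", text)
'''

print(run_script_on_text(USER_SCRIPT, "a  \t b\nc   d"))
```

In an editor, the `text` argument would come from the current buffer and the return value would replace it; how that wiring would look in Notepad2 is left open in the comment above.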

Nov 20 '21 08:11 zufuliu

If it is that much trouble, then never mind. I support Notepad2 keeping its current compact, elegant form.

Nov 20 '21 13:11 lenny20

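Rather than adding extra program code, it would be enough to list the commonly used range expressions in the help panel (copy or insert buttons could be added).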

Dec 08 '21 14:12 Mapaler

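Could Notepad2's regex engine be replaced? It has always been an engine that is incompatible with .NET regex. With VS Code you can simply refer to Microsoft's documentation. https://docs.microsoft.com/zh-cn/dotnet/standard/base-types/regular-expression-language-quick-reference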

Apr 12 '22 03:04 Mapaler

The regex engine will change to Boost (also used by Notepad++ and other editors) in v4.24.01; please comment on #725 for regex features. Implementing the wildcards listed by @lenny20 is complex (some even conflict with regex syntax; see https://www.boost.org/doc/libs/1_83_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html).
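
To illustrate the kind of conflict meant here, several of the listed letters already have a meaning in Perl-style syntax. A small sketch, assuming Python's `re` module as a stand-in for a Perl-style engine; the linked Boost page remains the authority on Boost's own behaviour:

```python
import re

# \D already means "any non-digit", not "Earthly Branch characters".
non_digits = re.findall(r"\D", "子1丑 a2")

# \S already means "any non-whitespace", not "Roman numerals and ordinals".
non_space = re.findall(r"\S", "⒈ a")

# \b is a zero-width word boundary, not "characters in BIG5".
whole_words = re.findall(r"\bcat\b", "cat concat cat")

# \A anchors at the start of the text, not "full-width ASCII".
anchored = re.findall(r"\Acat", "cat cat")

# \n is a line feed, not "Chinese numerals".
line_feeds = re.findall(r"\n", "一\n二")

print(non_digits, non_space, whole_words, anchored, line_feeds)
```

Adopting the TextPro letters unchanged would therefore alter the meaning of patterns that users already write.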

Oct 13 '23 23:10 zufuliu