Regex \x{0000}+ sometimes doesn't work
Description of the Issue
In a file I receive from an external source, updated monthly, there are a few occurrences of NUL, i.e. character codepoint 0 (zero). I need to clean them out, so I do a regex search for \x{0000}+. When I do that search from the top of the file, NP++ doesn't find anything and also shows an error in the status bar at the bottom of the search dialog, saying it's an invalid regex.

When I search for \x{0000} without the + at the end, it does find the NUL characters. It also finds them if I search for \x{0000}+ but start the search further down in the file. The file is ANSI encoded, about 460000 lines long and about 55-60 MB in size. The NUL characters appear around line 425000. The search for \x{0000}+ starts working from about line 23450 (and below). Above that, it never works. I can't see anything special with the file content at lines around 23450.
I may be able to provide a copy of the file for testing, but I'll have to ask a partner company for permission.
I've tried with e.g. \x{0030}+ (find any sequence of zeroes, character "0"). That seems to work everywhere.
Steps to Reproduce the Issue
- Open the affected file, go to top of file and search for regex \x{0000}+
- The search will not find anything and say the regex is invalid.
- Go to line 50000 or below and try again.
- The search will find the NUL characters.
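Since the original file can't be shared yet, a synthetic file of similar shape might help others try the steps above. This is a minimal Python sketch, not a confirmed reproduction: the function name `write_test_file`, the filler text and the default output path are made up for illustration, and it is unknown whether a file generated this way triggers the same failure.

```python
import os
import tempfile


def write_test_file(path=None, total_lines=460_000, nul_line=425_000, nul_count=3):
    """Write a cp1252 ("ANSI") text file with one run of NUL bytes.

    All defaults are assumptions loosely based on the numbers in this report.
    """
    if path is None:
        # Hypothetical default location; pass an explicit path to override.
        path = os.path.join(tempfile.gettempdir(), "nul_repro.txt")
    with open(path, "wb") as f:
        for n in range(1, total_lines + 1):
            text = f"line {n} some filler text to pad the row".encode("cp1252")
            if n == nul_line:
                # Append the run of NUL bytes to the end of this one line.
                text += b"\x00" * nul_count
            f.write(text + b"\r\n")
    return path
```

Calling `write_test_file()` with no arguments writes the full-size file to the temp directory and returns its path, which can then be opened in Notepad++.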
Expected Behavior
The search should find the NUL characters regardless of where the search is started.
Actual Behavior
The search doesn't find the NUL characters if search is started at the top of the file (and down to about line 23450).
Debug Information
Notepad++ v8.2.1 (64-bit)
Build time : Jan 19 2022 - 18:43:05
Path : C:\Program Files\Notepad++\notepad++.exe
Command Line :
Admin mode : OFF
Local Conf mode : OFF
Cloud Config : OFF
OS Name : Windows 10 Pro (64-bit)
OS Version : 2009
OS Build : 19044.1566
Current ANSI codepage : 1252
Plugins : mimeTools.dll NppConverter.dll NppExport.dll
What does the hover-bubble indicate when you hover over it in this?:

A misc. note is that regex searching for null characters has always had problems, so this isn't anything new (and probably foretells how likely this is to get fixed).
Notepad++ is a text editor, not a general-purpose hex editor; perhaps a hex editor program is a better choice when dealing with nulls.
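If the goal is only to clean the monthly file, a small script that works on raw bytes might be another way around the editor. The following is a rough Python sketch, not something tested against the actual file: the helper names `find_nul_runs`, `strip_nuls` and `clean_file` are invented here, and it assumes the file fits in memory (about 60 MB should).

```python
import re

# One or more consecutive NUL bytes, matched on raw bytes so the
# file's ANSI encoding does not matter.
NUL_RUN = re.compile(rb"\x00+")


def find_nul_runs(data):
    """Return (line_number, run_length) for each run of NUL bytes (1-based lines)."""
    runs = []
    for m in NUL_RUN.finditer(data):
        # Count the newlines before the match to get its line number.
        line = data.count(b"\n", 0, m.start()) + 1
        runs.append((line, len(m.group())))
    return runs


def strip_nuls(data):
    """Return a copy of data with every NUL byte removed."""
    return NUL_RUN.sub(b"", data)


def clean_file(src, dst):
    """Read src, write a NUL-free copy to dst, and return the runs that were found."""
    with open(src, "rb") as f:
        data = f.read()
    runs = find_nul_runs(data)
    with open(dst, "wb") as f:
        f.write(strip_nuls(data))
    return runs
```

Writing to a separate `dst` keeps the original untouched, and the returned list shows on which lines the NUL runs were found.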

So you understand from that info that the regular expression engine abandoned trying to do your search, right?
Apparently, yes. But it shouldn't really... Any similar search for something not to do with NUL works fine, so it's not a general limitation re. file size, time to perform the search or any such thing. It's a bug related to NUL. Understood that it may not be likely to get fixed, but a bug it is. At least now it's known.
For sure there are bugs here. It's funny but sometimes when I add info just for discussion purposes people think I am trying to deny bug existence -- not true.
I have a regex that generates the same message, but only the second time it is run in a file. Also, the problem occurs only in certain files. I've so far been unable to get an example of the bug under 182 lines. I don't mind posting the entire file, but I don't want to spam anything or hijack a thread. The regex I'm using is this: (ge)( 2:1)|.*\K(?1).*(?2)|.*\K(?2).*(?1). Do you want more details here, or should I start a new thread?
@cjbarth If you're inclined to, you should open a new issue as yours isn't null-related like the title of this one. However, if the regex engine can tell you that your expression plus your data is problematic, there isn't a lot that can be done.