nuclei File protocol regex search improvement

Open ehsandeep opened this issue 3 years ago • 3 comments

Please describe your feature request:

Currently, we read everything into memory on the assumption that the data being processed is small. That is not always the case, and processing slows down as the number of input items increases: https://github.com/projectdiscovery/nuclei/blob/e383449fb32696fed7ed8ed9ff4b40a96eb311c5/v2/pkg/protocols/file/request.go#L54

Reference:

https://github.com/golang/go/issues/26623
https://github.com/BurntSushi/rure-go

shared by @yabeow

Feb 10 '22 09:02 ehsandeep

Potential options to consider:

split large files into chunks and process them on separate threads
look into the feasibility of an interchangeable solution, controlled by a flag (default would remain the same, the flag would control the use of a shared library for more advanced users/use-cases)
look into Google's RE2?

Feb 10 '22 11:02 forgedhallpass

After investigation, the following implementations would be needed:

[ ] Currently, the matcher works on strings/byte slices only, so we need to implement a regex-based engine that accepts an io.Reader and can handle matches that straddle chunk boundaries
[ ] rure-go provides a 2x to 4x performance increase on large chunks of data => for better portability, the library should optionally be available statically linked into the GH-generated binary.
[ ] Hyperscan is another very good option => the bindings are not up to date, so we would need to fork and refactor them
[ ] Create bindings for https://github.com/google/re2

Feb 16 '22 12:02 Mzack9999

Blocked by #1634

Mar 02 '22 07:03 Mzack9999
