searchcode-server

Issues Using Special Characters in Regex

Open sigilwig44 opened this issue 3 years ago • 2 comments

Hi there! I set up searchcode with one of my repos and have been testing it out. I like it, and I like how easy it is to use, so I want to make it work. I started with some test searches and was unable to find some strings that the tool definitely should be able to find, so I'm wondering if anyone has ideas about what I need to do to make it recognize these strings.

First, I was using an API endpoint in this repo that I wanted to find. This is the specific string I was hoping to match:

../../api/studentAdmin/transportation/${personID},${calendarID}

After escaping a bunch of characters, I verified (using regex101.com) that the following regular expression matches the previous string:

/../../api/studentAdmin/transportation/${personID},${calendarID}/

I tried searching this exact term using searchcode, and came up with no results.
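For reference, here is a minimal sketch of how the escaping can be checked outside of searchcode. It uses Python's `re` module as a stand-in, which is an assumption: regex101 and searchcode may use a different regex flavor, and a pattern that matches here says nothing about how searchcode handles the query.

```python
import re

# The literal string from this post that the search should find.
target = "../../api/studentAdmin/transportation/${personID},${calendarID}"

# re.escape backslash-escapes regex metacharacters such as '.', '$', '{' and '}'.
pattern = re.escape(target)
print(pattern)

# The escaped pattern matches the literal text.
print(re.search(pattern, target) is not None)

# The unescaped text is a different pattern: '$' acts as an end-of-string
# anchor, so it cannot match the literal '$' in the target.
print(re.search(target, target) is None)
```

Both checks should print `True`, which is why the characters need escaping before the string is used as a regular expression.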

Second, I was hoping to find all files containing the increment operator (++). I searched for ++ and found no results. Knowing these are both special characters, I tried escaping them, \+\+, and still had no results. I also tried double escaping, \\+\\+, with no results. I tried searching using a regular expression, /\+\+/ and /++/, and still had no results. When I search for ++ in VS Code in this repo, I find 37 results, so the operator is definitely present in my code.

I also had trouble searching for strings that end with a period, or with a parenthesis, for example, student.instructor. or student.getStudent(

Let me know if anyone has any ideas how I should be searching for strings. Escaping characters does not seem to make any difference, and searching using a regular expression also does not seem to make any difference. I just want to use this tool to find any exact matches to strings that I am searching for, even if they contain special characters. Thank you!
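One possible explanation, which is only an assumption and not confirmed anywhere in this thread: if the index is built by a text analyzer that discards punctuation, then strings like ++ never reach the index, and no amount of escaping in the query can bring them back. The sketch below uses a hypothetical `toy_tokenize` function to illustrate the idea. It is not the analyzer searchcode actually uses.

```python
import re

def toy_tokenize(text):
    """Hypothetical analyzer: lowercase, keep runs of letters/digits, drop the rest."""
    return re.findall(r"[a-z0-9]+", text.lower())

source_line = "for (let i = 0; i < n; i++) { student.getStudent(i); }"
indexed_terms = toy_tokenize(source_line)
print(indexed_terms)

# Punctuation never becomes a term, so a query for it has nothing to match.
print(toy_tokenize("++"))

# A trailing '(' or '.' is dropped in the same way.
print(toy_tokenize("student.getStudent("))
```

Under this assumption, a search for student.getStudent( could at best match the terms student and getstudent, and a search for ++ would match nothing.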

sigilwig44, Oct 11 '22 19:10

Hi, sorry about the late response. I have been offline for some surgery.

This is a limitation in the way the search is constructed. As you say, it's picking it up as a regex, but I am not familiar with how that operates inside Lucene inside the application. Something for me to look at.

boyter, Oct 21 '22 00:10

Thank you! Let me know if you make any progress. I'm going to put this on the back burner for now, but if I have some time I will also take a look and see if I can figure it out. Here are a couple of Stack Overflow questions I found that look like they would be helpful:

https://stackoverflow.com/questions/42005525/how-to-extend-lucenes-standardanalyzer-for-custom-special-character-treatment
https://stackoverflow.com/questions/6107875/how-to-search-special-characters-in-lucene
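As a rough sketch of the escaping approach discussed in the second link: Lucene's classic query parser treats a fixed set of characters as query syntax, and they can be backslash-escaped before the query is parsed. The helper below, `escape_lucene_query`, is a hypothetical Python port of that idea. Whether searchcode passes queries through that parser is an assumption, and escaping only changes how the query is parsed, not what was stored in the index.

```python
# Characters that Lucene's classic query parser treats as query syntax.
LUCENE_SPECIAL = set('\\+-!():^[]"{}~*?|&/')

def escape_lucene_query(text):
    """Hypothetical helper: backslash-escape query-syntax characters."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in text)

print(escape_lucene_query("++"))
print(escape_lucene_query("student.getStudent("))
```

Note that the period is not in the special set, so only the parenthesis gets escaped in the second example.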

sigilwig44, Oct 27 '22 15:10