[Analyzer] Add search for non-ASCII strings in the Auto-Analyzer
Is your feature request related to a problem? Please describe.
I mostly work with non-ASCII/Unicode-encoded files (Japanese Shift-JIS in my case). Using Ghidra's Auto-Analyze function, I can find all ASCII-encoded strings with the "ASCII Strings" analyzer, but as far as I can see there is currently no function to search for non-ASCII strings. I have to use a hex editor, which displays the strings with the right encoding, and then manually change the "broken" strings (which Ghidra displays incorrectly as ASCII) to the correct encoding so they are actually usable. As far as I can see, this has to be done manually, one by one, for every string.
Describe the solution you'd like
Using the "ASCII Strings" Auto-Analyzer as an example, the Auto-Analyzer could use an additional option that lets the user specify which encoding to search for.
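To make the request concrete, here is a rough standalone sketch of an encoding-aware string scan. It is not Ghidra code and does not use any Ghidra API. The function name `find_encoded_strings` and its parameters are made up for illustration. It assumes strings are NUL-terminated and that the encoding never uses a 0x00 byte inside a character, which holds for Shift-JIS and EUC-JP but not for UTF-16.

```python
def find_encoded_strings(data, encoding="shift_jis", min_chars=4):
    """Return (offset, text) pairs for NUL-delimited runs that decode cleanly.

    Hypothetical helper for illustration only; assumes NUL-terminated strings
    in an encoding that never places a 0x00 byte inside a character.
    """
    results = []
    start = 0
    while start < len(data):
        # A run ends at the next NUL byte, or at the end of the buffer.
        end = data.find(b"\x00", start)
        if end == -1:
            end = len(data)
        chunk = data[start:end]
        if chunk:
            try:
                # Strict decoding rejects byte runs that are invalid in the encoding.
                text = chunk.decode(encoding)
            except UnicodeDecodeError:
                text = None
            # Keep printable runs that are long enough and not plain ASCII
            # (plain ASCII is already covered by an ASCII string search).
            if (text is not None and len(text) >= min_chars
                    and text.isprintable() and not text.isascii()):
                results.append((start, text))
        start = end + 1
    return results
```

Calling `find_encoded_strings(blob, "euc_jp")` on the raw bytes of a file would run the same scan with a different encoding, which is the kind of option requested above. A real analyzer would also need to handle strings that are not preceded by a NUL byte, which this sketch does not.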
Describe alternatives you've considered
There are some other solutions/scripts available for Ghidra, for example StringSearcher, PascalStringSearcher, and AbstractStringSearcher.
But the Auto-Analyzer doesn't have a built-in string searcher in which you can tell Ghidra exactly what to search for.
Additional context
Auto-Analyzer "ASCII Strings" function, which I mentioned above: https://i.imgur.com/BGwloHP.png
"Search for Strings" tab, which can only search for Pascal-encoded strings: https://i.imgur.com/XK7fkUI.png
StringSearcher: https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Features/Base/src/main/java/ghidra/program/util/string/StringSearcher.java
PascalStringSearcher: https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Features/Base/src/main/java/ghidra/program/util/string/PascalStringSearcher.java
AbstractStringSearcher: https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Features/Base/src/main/java/ghidra/program/util/string/AbstractStringSearcher.java
This would be really useful on Windows, where you'll often have UTF-16/UCS-2 strings.
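UTF-16 needs slightly different handling from byte-oriented encodings, because 0x00 bytes occur inside ordinary characters and the terminator is a full 16-bit zero. A minimal sketch, again outside of Ghidra, with a made-up function name; it assumes little-endian, 2-byte-aligned strings, and it treats surrogates as run breaks, so it effectively covers UCS-2 only:

```python
def find_utf16le_strings(data, min_chars=4):
    """Return (offset, text) pairs for aligned UTF-16LE runs of printable characters.

    Hypothetical helper for illustration only; assumes 2-byte alignment and
    stops a run at a zero code unit, a surrogate, or a non-printable character.
    """
    results = []
    i = 0
    n = len(data)
    while i + 1 < n:
        j = i
        chars = []
        while j + 1 < n:
            # Assemble one little-endian 16-bit code unit.
            unit = data[j] | (data[j + 1] << 8)
            if unit == 0 or 0xD800 <= unit <= 0xDFFF:
                break
            ch = chr(unit)
            if not ch.isprintable():
                break
            chars.append(ch)
            j += 2
        if len(chars) >= min_chars:
            results.append((i, "".join(chars)))
        # Skip the run and the code unit that ended it.
        i = j + 2
    return results
```

Note that almost any pair of bytes forms some valid UTF-16 code unit, so a scan like this tends to produce false positives on arbitrary binary data; a real analyzer would need extra filtering.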
Wishful thinking.
Would be nice to get this back on the radar; I'm working on a blob at the moment that's very heavy on EUC-JP strings and it's slow going. This would be a definite time saver.
This is in my work queue and I have been playing around with this already. (example binaries welcome)
That's excellent to hear, many thanks 😺 Do you have a method for me to send binaries privately?
[email protected] works
Let me know if you need help debugging this feature!
Thank you @ryanmkurtz and @dev747368!