sleigh
Write a script to check for new files
When new files are added to Sleigh, our weekly sync with Ghidra sometimes keeps working fine and we don't notice the additions. This leaves us with missing headers that we have to fix later (https://github.com/lifting-bits/sleigh/pull/107).
We should write a script as part of our weekly sync that identifies new files and either adds them to the PR or fails loudly so we can manually fix it.
Partial improvement made in this commit https://github.com/lifting-bits/sleigh/commit/b80fefc0018bb5dc94738bf40e2a15decbebeb70
Hopefully, it lists the changed (including new and removed) files in the PR and commit message.
@ekilmer
Had a think about this today. I don't think it's realistic for us to automatically update the CMake configuration since there's no way for us to know what sub-library (`libsla`, `libdecomp`, etc.) the file belongs to. However, I'd like it to be a bit more obvious than it is now, since the additions in the `git diff` output can easily get lost in a sea of modifications.
I want to use a regex to parse the `git diff` output to figure out what files are added and, if there are new sources, have the PR bot leave a comment saying something like: "Manual intervention required. This update contains the following new C++ sources."
Does that seem ok to you?
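For illustration, the regex idea above could look roughly like the following Python sketch. It is only a sketch under assumptions: the function names, the list of C++ extensions, and the sample paths are hypothetical, and the pattern relies on the `new file mode` header that `git diff` prints for added files. Paths containing spaces (which git quotes) are not handled.

```python
import re
from typing import List, Optional

# Matches the two-line header git prints for a newly added file:
#   diff --git a/<path> b/<path>
#   new file mode 100644
NEW_FILE_RE = re.compile(
    r"^diff --git a/(?P<path>\S+) b/(?P=path)\nnew file mode \d+$",
    re.MULTILINE,
)

# Assumption: extensions we treat as C++ sources/headers; adjust as needed.
CPP_EXTENSIONS = (".cc", ".cpp", ".hh", ".h", ".hpp")


def find_new_cpp_sources(diff_text: str) -> List[str]:
    """Return paths of newly added C++ files found in `git diff` output."""
    paths = [m.group("path") for m in NEW_FILE_RE.finditer(diff_text)]
    return [p for p in paths if p.endswith(CPP_EXTENSIONS)]


def build_comment(new_sources: List[str]) -> Optional[str]:
    """Build the PR comment body, or None when there is nothing to report."""
    if not new_sources:
        return None
    lines = [
        "Manual intervention required. "
        "This update contains the following new C++ sources:"
    ]
    lines += [f"- {path}" for path in new_sources]
    return "\n".join(lines)
```

The PR bot would post the result of `build_comment(...)` only when it is not `None`; how the comment is actually posted is left out here.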
@tetsuo-cpp
> Had a think about this today. I don't think it's realistic for us to automatically update the CMake configuration since there's no way for us to know what sub-library (`libsla`, `libdecomp`, etc.) the file belongs to. However, I'd like it to be a bit more obvious than it is now, since the additions in the `git diff` output can easily get lost in a sea of modifications.
Very good point, and I agree with both statements.
> I want to use a regex to parse the `git diff` output to figure out what files are added and, if there are new sources, have the PR bot leave a comment saying something like: "Manual intervention required. This update contains the following new C++ sources."
>
> Does that seem ok to you?
A regex would work, but there's also a native way to filter for added, modified, and deleted files that I just learned about recently: `--diff-filter`
git diff --diff-filter=M
and we could probably run it three times, once for each of `M` (modified), `A` (added), and `D` (deleted), where `A` and `D` would likely require manual intervention.
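As a rough sketch of running the filter once per status, something like the Python below might work. The function names and the `old`/`new` revision arguments are hypothetical; the only git features assumed are `git diff --name-only` and `--diff-filter`. Depending on git's rename detection settings, renamed files may be reported under `R` rather than `A`/`D`, which this sketch ignores.

```python
import subprocess
from typing import Callable, Dict, List

# Status letters passed to `git diff --diff-filter`, with readable labels.
STATUSES = {"A": "added", "M": "modified", "D": "deleted"}


def git_diff_names(old: str, new: str, status: str) -> List[str]:
    """List files with the given status between two revisions."""
    result = subprocess.run(
        ["git", "diff", "--name-only", f"--diff-filter={status}", old, new],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()


def collect_changes(
    old: str,
    new: str,
    lister: Callable[[str, str, str], List[str]] = git_diff_names,
) -> Dict[str, List[str]]:
    """Run the lister once per status and group the results by label."""
    return {label: lister(old, new, status) for status, label in STATUSES.items()}


def needs_manual_intervention(changes: Dict[str, List[str]]) -> bool:
    """Added or deleted files likely need a manual CMake update."""
    return bool(changes["added"] or changes["deleted"])
```

The sync job could then exit with a non-zero status (or leave a PR comment) whenever `needs_manual_intervention(...)` returns `True`. The `lister` parameter exists only so the grouping logic can be exercised without a real repository.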
Moreover, I think there are some additional improvements to be made for the sleighspec directory. We should do one (or a combination) of the following, or some equivalent:
- Ignore Java, manifest, etc. files
- Iterate over all of the extensions for Sleigh specifications
- Be more precise about the directories we look for changes in (e.g. `Ghidra/Processors/*/data/languages`, but there might be other paths we want too)
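To make the options above concrete, here is a minimal Python sketch that combines the extension and directory ideas. The extension set is an assumption (not an authoritative list of Sleigh specification file types), the function names are hypothetical, and the directory pattern is just the one mentioned above.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Iterable, List

# Directory pattern taken from the discussion above; other paths may be needed.
SPEC_DIR_PATTERN = "Ghidra/Processors/*/data/languages"

# Assumption: extensions treated as Sleigh specification files.
SPEC_EXTENSIONS = {".slaspec", ".sinc", ".cspec", ".pspec", ".ldefs", ".opinion"}


def is_spec_file(path: str) -> bool:
    """True if the file sits directly in a languages directory and has a
    recognised specification extension."""
    p = PurePosixPath(path)
    # Note: fnmatch's `*` can also match across `/`, so this is a loose check.
    in_spec_dir = fnmatchcase(str(p.parent), SPEC_DIR_PATTERN)
    return in_spec_dir and p.suffix in SPEC_EXTENSIONS


def filter_spec_files(paths: Iterable[str]) -> List[str]:
    """Keep only specification files, dropping Java, manifest, etc."""
    return [p for p in paths if is_spec_file(p)]
```

Filtering by an allow-list of extensions means Java and manifest files are ignored implicitly, rather than having to maintain a separate ignore list.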