Fix URL-encoded filesystem entries in the Velociraptor loader
Closes #707
Wouldn't this also be an issue on non-zip (directory) Velociraptor targets?
@Schamper yes it will. I would like to enforce the use of a ZIP file as target, because extracting a ZIP file causes all kinds of errors. For example, the only option to extract a ZIP file in an automated manner is with `unzip -o [...]`, which overwrites existing files without prompting and can therefore remove valuable traces.
Should this be explicitly documented in the comments of the loader, or should the loader also support a non-ZIP directory Velociraptor target?
We don't use Velociraptor so I can't really speak for it, but I do think I've heard people say they sometimes work with unzipped collections. If there are no other reasons to drop support for it, I'd suggest leaving it in.
I'm not a huge fan of the explicit URL decoding under a generic `decode_name` flag, but I don't immediately have a better suggestion. I was thinking of maybe passing a callback function, but I'm not sure I like that either :smile:. I'll also throw it up for discussion in the team to see if anyone has a nice idea.
After a short discussion we decided it may be best to tweak the ZipFilesystem and DirectoryFilesystem implementations slightly to facilitate customization by subclasses a little better. You can then subclass these filesystems within your loader and provide the customized behaviour. This subclass can live within the same file as the loader as far as I'm concerned (we also do this in other cases where we e.g. subclass the registry plugin to provide a custom implementation).
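Roughly what I have in mind, as a sketch (the hook name `_resolve_entry_name` and the stand-in base class below are placeholders, not the actual dissect.target API):

```python
from urllib.parse import unquote


class ZipFilesystem:
    """Stand-in for the real dissect.target ZipFilesystem, only to sketch the idea."""

    def _resolve_entry_name(self, name: str) -> str:  # hypothetical hook
        return name


class VelociraptorZipFilesystem(ZipFilesystem):
    """Loader-local subclass: URL-decode Velociraptor entry names,
    e.g. "C%3A" -> "C:" and "%2Essh" -> ".ssh"."""

    def _resolve_entry_name(self, name: str) -> str:
        return unquote(super()._resolve_entry_name(name))


print(VelociraptorZipFilesystem()._resolve_entry_name("uploads/auto/C%3A/Users/Test/%2Essh"))
# -> uploads/auto/C:/Users/Test/.ssh
```

The subclass can then live next to the loader, just like the other loader-specific subclasses we already have.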
I've created #707 and #708 to implement this in the respective filesystems. Would you be up for making these changes?
Yes, but I don't understand how to fix the issues yet. I will add comments to the issues.
It would be nice to keep the support for non-zipped folders, since we noticed a huge performance hit when working with ZIP archives rather than unzipped directories.
@Zawadidone wondering if you see the same problem with zipped vs. unzipped collections.
To add to the need for this: the following path, for example, is not picked up by `target-*`:
`uploads/auto/C%3A/Users/Test/%2Essh/known_hosts`
This makes sense, since the openssh module looks for the `.ssh` folder. Can we include the fix not only for the zip loader, but also for the folder loader?
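Decoded, that path should look like this (plain `urllib.parse` just to show the mapping):

```python
from urllib.parse import unquote

# %3A decodes to ":" and %2E to ".", which is what the openssh module expects.
print(unquote("uploads/auto/C%3A/Users/Test/%2Essh/known_hosts"))
# -> uploads/auto/C:/Users/Test/.ssh/known_hosts
```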
@Zawadidone Under which circumstances do you run into files that are overwritten?
Yes, we have the same performance hit when working with a collection stored in a ZIP file, but only when executing the MFT plugin. The other plugins are 'fast', but that's only because they return fewer records than the MFT plugin.
I haven't had the time for it, but I want to look into solving the following issues:
- Improving the speed of the MFT plugin by reading the MFT records using the segment size (https://github.com/fox-it/dissect.target/issues/665)
- Debugging performance issues with the MFT plugin when using a collection stored in a ZIP file
Yes, this PR will include a fix for both the ZIP and the DIR loader.
Files get overwritten on Linux systems that use symlinks, see https://github.com/fox-it/dissect.target/issues/699, but this could occur on all kinds of systems that use symlinks.
@OlafHaalstra the ZIP implementation is fixed; I am working on a fix for https://github.com/fox-it/dissect.target/issues/708.
@Schamper, I have implemented my suggestion (https://github.com/fox-it/dissect.target/issues/708), which fixes the issue, but the tests are failing because they assume that DirectoryFilesystem should also work when `path` is a `str`.
The tests fail; I will look into this within a month.
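For the `str` case, a minimal sketch of the kind of coercion that would keep those tests passing (the class and attribute names are illustrative, not the actual dissect.target signature):

```python
from pathlib import Path


class DirectoryFilesystem:  # illustrative stand-in for the real filesystem class
    def __init__(self, path):
        # Coerce up front so both str and Path inputs work.
        self.base_path = Path(path)


print(DirectoryFilesystem("/tmp").base_path)        # str works
print(DirectoryFilesystem(Path("/tmp")).base_path)  # Path works too
```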
I've made some suggestions for how I envisioned the filesystem thing to work. I'll leave fixing the unit tests up to you :wink:.
@Schamper the implementation of the loader works, but a test keeps failing due to an unknown bug (https://github.com/fox-it/dissect.target/pull/700/commits/bc076dabf7bfd7cd3027352a749dc5f23b8ad2bd). Do you understand why this doesn't work?
I didn't verify (I can try later if needed), but my guess is that it's because of the blanket `.replace(".", "%2E")`, which will also replace periods that aren't encoded as `%2E` in the zip file (such as in file extensions).
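For illustration (assuming here that the collector only percent-encodes a leading dot and characters like `:` per path component; the exact rules may differ):

```python
# A blanket replace also rewrites periods the zip entry never had encoded,
# such as the one in a file extension:
name = "Users/Test/.ssh/authorized_keys.bak"
print(name.replace(".", "%2E"))
# -> Users/Test/%2Essh/authorized_keys%2Ebak   (the ".bak" extension is mangled)


# A more targeted encoding only touches what the collector is assumed to encode:
def encode_component(part: str) -> str:
    part = part.replace(":", "%3A")
    if part.startswith("."):
        part = "%2E" + part[1:]
    return part


print("/".join(encode_component(p) for p in name.split("/")))
# -> Users/Test/%2Essh/authorized_keys.bak
```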
Oh yeah indeed, I will fix this.
@Schamper the issue is fixed.
The tests are failing unfortunately.
@Zawadidone the unit tests are still failing.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 77.71%. Comparing base (300f110) to head (af1653b). Report is 1 commit behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #700 +/- ##
==========================================
- Coverage 77.71% 77.71% -0.01%
==========================================
Files 326 326
Lines 28543 28559 +16
==========================================
+ Hits 22183 22194 +11
- Misses 6360 6365 +5
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 77.71% <100.00%> (-0.01%) | :arrow_down: |