Layers - Several fixes around maximum_address and chunk sizes
maximum_address fixes in several layers
I think there are still other places to fix but I haven't fixed everything as testing it will be massive. I think we should review all occurrences of layer.maximum_address and where we calculate the data size based on that value. There are other places where I suspect it's being used incorrectly. To mention just a few:
- PdbMSFStream::maximum_address looks like it's also wrongly calculated as
len(self._pages) * self._pdb_layer.page_sizeshould be the next byte following themaximum_address. - Additionally, there are other places that require double-checking. For instance, everywhere the
maximum_addressis in a while condition but the value is not included in the address range i.e.: lime, or avml. Technically looks wrong but I think these could still be ok because they are expecting more than 1 byte to read anyway. So it should mean that the layer/file is truncated/bad format/etc.
CommandLine: file_handler_class_factory ->CLIFileHandler
- Fixed issue when the filename doesn't contain an extension: For instance, if we call the
LayerWriterplugin with--output aaa... it creates a.aaa(hidden filename in linux). Next iterations using the same argument will generate-1.aaa,-2.aaa, etc. insted ofaaa,aaa-1, andaaa-2respectively. - Stored the real filename created in
file_handle._output_filename. That way we can know the final filename including both the "--output-dir" and the "--output" values and its "-1", "-2", etc (if used). Otherwise, the filename reported will be incorrect (and old version of this file), which could result in inaccurate analysis. Unfortunately, the final filename is determined when the file is closed, which might lead to some controversy regarding its implementation.
LayerWriter plugin. Several fixes
- Fixed missing
--outputargument and other bugs, plus some code improvements.
I was wrong with e5a5b895771b655d21c36689c33a534034c31e36, it should be fine. Anyway, I think it is a more common way to express this kind of check and is generally considered more readable.
Tests:
Vmem image
Vol2
$ python2 ../volatility2/vol.py imagecopy -f ../ubuntu2004lts64bit/ubuntu2004lts64bit-Snapshot35.vmem -O ubuntu2004lts64bit-Snapshot35.vmem.raw.vol2
Volatility Foundation Volatility Framework 2.6
Writing data (5.00 MB chunks): |.......................................................................................................|
Vol3
$ python3 ../volatility3/vol.py -f ../ubuntu2004lts64bit/ubuntu2004lts64bit-Snapshot35.vmem LayerWriter --output ubuntu2004lts64bit-Snapshot35.vmem.raw.vol3
Volatility 3 Framework 2.7.0
Progress: 100.00 PDB scanning finished
Status
Progress: 99.61 Writing layer primary
Layer has been written to /home/user/vol3-imagecopy/ubuntu2004lts64bit-Snapshot35.vmem.raw.vol3
Test
-rw------- 1 user user 536870912 May 15 2023 /media/user/ubuntu2004lts64bit/ubuntu2004lts64bit-Snapshot35.vmem
-rw-rw-r-- 1 user user 536870912 May 6 16:09 ubuntu2004lts64bit-Snapshot35.vmem.raw.vol2
-rw------- 1 user user 536870912 May 6 16:09 ubuntu2004lts64bit-Snapshot35.vmem.raw.vol3
$ sha1sum ubuntu2004lts64bit-Snapshot35.vmem.raw.vol2 ubuntu2004lts64bit-Snapshot35.vmem.raw.vol3
48206033995ab862fdf729ed4bf2a8bd104654ab ubuntu2004lts64bit-Snapshot35.vmem.raw.vol2
48206033995ab862fdf729ed4bf2a8bd104654ab ubuntu2004lts64bit-Snapshot35.vmem.raw.vol3
Core image
Vol2
$ python2 ../volatility2/vol.py imagecopy -f ./ram-4.10.0-42.core -O ram-4.10.0-42.core.raw.vol2
Volatility Foundation Volatility Framework 2.6
Writing data (5.00 MB chunks): |...........................................................................................................................................................................................................................................|
Vol3
+ ./vol.py -f ./ram-4.10.0-42.core LayerWriter --output ram-4.10.0-42.core.raw.vol3
Volatility 3 Framework 2.7.0
Progress: 100.00 PDB scanning finished
Status
Progress: 99.98 Writing layer primary
Layer has been written to /home/user/vol3-imagecopy/ram-4.10.0-42.core.raw.vol3
Test
-rw-r--r-- 1 user user 1208166496 May 6 07:51 ./ram-4.10.0-42.core
-rw-rw-r-- 1 user user 4294967296 May 6 16:10 ram-4.10.0-42.core.raw.vol2
-rw------- 1 user user 4294967296 May 6 16:10 ram-4.10.0-42.core.raw.vol3
$ sha1sum ram-4.10.0-42.core.raw.vol2 ram-4.10.0-42.core.raw.vol3
36a504a0f1e203a6d6cb9bc1a4c0657fc3a47c97 ram-4.10.0-42.core.raw.vol2
36a504a0f1e203a6d6cb9bc1a4c0657fc3a47c97 ram-4.10.0-42.core.raw.vol3
@ikelos LayerWriter plugin needs #1142 to be fixed
In future it might be easier to submit these as separate, smaller PRs, just so that the trivial ones are easy to put through, and don't get held up by any of the others. The maximum_address fix seems really valuable, but is being held up by the output stuff in layerwriter...
Cool! I used commits for that. Each commit addresses a specific thing. I always try my best to keep them separated and organized. Sorry, I thought that way will also work for you.
There are only two commits ( 6d22347ce6dd0152746564a2d77022ca1b4d9045 and d7aae3a9828ba03bd5531aa385724a954e48501a) that are not ok. Would you be okay with cherry-picking the commits, or would you prefer if I reverted those ones?
In summary:
Ignore:
- 6d22347ce6dd0152746564a2d77022ca1b4d9045
- d7aae3a9828ba03bd5531aa385724a954e48501a
Merge
- f116c08a7ec60f62e3ef931de7639e402f421d65 Fix bug in segmented layers i.e. core elf.
- a6c77c488436c7b05040b5bf475be79cb59457f3 Fix bug in LayerWriter
- c7f29936dcfb81bb9ae3fe33e261db2bb7c3be85 Unless you want to remove all the functionality in CLIMemFileHandler::close() this is still a bug.
- 32a5f131fd22d5351eee264fd6e58420a1458e9c Still a bug, self.config['output_name'] doesn't exist if you don't want to merge 6d22347ce6dd0152746564a2d77022ca1b4d9045. Otherwise, it will always print "Layer cannot be written to None: <exception>"
Up to you but I think these improves the Python code readability.
- e5a5b895771b655d21c36689c33a534034c31e36
- 1246b20b6521e11110038fe091af912e2b5f4439
I think I'd prefer you to revert them, just so I can do it through the github UI easily, if that's ok? I'm ok with all the ones except the two you marked as ignore (but again, I'll want to check through the whole thing as one to make sure it makes sense, if that's ok)...
Ok, reverted those commits and redid it using the preferred_filename as you suggested. See example below.
It was a bit tricky as preferred_filename has a setter. The file has to be still open to change it. Additionally, the character set used cannot contain "/" .. so, it seems it was intended to be the base name.
$ ./vol.py -f ./ram-4.10.0-42.core LayerWriter --layer primary
Volatility 3 Framework 2.7.0
Progress: 100.00 PDB scanning finished
Status
Progress: 99.98 Writing layer primary
Layer has been written to primary.raw
$ ./vol.py -f ram-5.8.0-53.vmem LayerWriter
Volatility 3 Framework 2.7.0
Progress: 100.00 PDB scanning finished
Status
Progress: 99.61 Writing layer primary
Layer has been written to primary-1.raw
$ ./vol.py -f ram-5.8.0-53.vmem LayerWriter
Volatility 3 Framework 2.7.0
Progress: 100.00 PDB scanning finished
Status
Progress: 99.61 Writing layer primary
Layer has been written to primary-2.raw
$ rm primary.raw
$ ./vol.py -f ./ram-4.10.0-42.core LayerWriter --layer primary
Volatility 3 Framework 2.7.0
Progress: 100.00 PDB scanning finished
Status
Progress: 99.98 Writing layer primary
Layer has been written to primary.raw
@ikelos Another thing I noticed and would like to check with you is the handling of the temporary file. Before renaming it to the final filename, it’s created in the current directory instead of the system’s temporary directory. This might have been intentional since these files are usually large, but I'm not entirely sure. Thoughts?
Last but not least, do you have a way to test what I mentioned in the description? maybe @atcuno can help with that
- PdbMSFStream::maximum_address looks like it's also wrongly calculated as len(self._pages) * self._pdb_layer.page_size should be the next byte following the maximum_address. I think it should be:
diff --git a/volatility3/framework/layers/msf.py b/volatility3/framework/layers/msf.py
index 8d84a774..6ad6b1fa 100644
--- a/volatility3/framework/layers/msf.py
+++ b/volatility3/framework/layers/msf.py
@@ -253,8 +253,11 @@ class PdbMSFStream(linear.LinearlyMappedLayer):
@property
def maximum_address(self) -> int:
- return self.config.get(
- "maximum_size", len(self._pages) * self._pdb_layer.page_size
+ return (
+ self.config.get(
+ "maximum_size", len(self._pages) * self._pdb_layer.page_size
+ )
+ - 1
)
@property
maximum_size and maximum_address are not equivalent. maximum_address = maximum_size - 1
Lemme know if you're happy for this to be merged as is, or if you want to throw in the fix for maximum_size/maximum_address?
Ok, reverted those commits and redid it using the
preferred_filenameas you suggested. See example below. It was a bit tricky aspreferred_filenamehas a setter. The file has to be still open to change it. Additionally, the character set used cannot contain "/" .. so, it seems it was intended to be thebase name.
It is, because a path makes no sense if this is being handled by a web user interface, for example. The whole framework has to be seen through a UI agnostic lens. The way of returning files can be dealt with by a UI however it wants, but we can't make assumptions about the results being tied to a filesystem.
Happy for the temporary file to be constructed in a temporary directory (although kept in memory may be bad depending on how much data someone wants to write to the file).
I don't know enough about the test you're suggesting, but I'm happy to accept your changes if you say they're needed. Might be best to check with @atcuno, but I'm happier with the patch now. Thanks for making the requested changes... 5:)
Yeah, probably better merge it as is, and we can always create a new ticket for the other changes. Thanks