Add support for newer LLVM parsing for Linux - Replace "long unsigned int" with "unsigned long"
Is your feature request related to a problem? Please describe. Scenario: memory acquisition and analysis of Google Kubernetes Engine (GKE) nodes, as well as any Linux kernel image that was compiled with a newer version of clang+LLVM; see commit: https://github.com/llvm/llvm-project/commit/f6a561c4d6754b13165a49990e8365d819f64c86. The failure occurs with Linux kernels 5.10 and 5.15 compiled this way.
The problem occurs when memory is acquired from GKE nodes using AVML and dwarf2json is then used to build a symbols file from the vmlinux matching the current build_id of the GKE node and the active COS version. vmlinux is acquired via: curl -O https://storage.googleapis.com/cos-tools/build_id/vmlinux, e.g. curl -O https://storage.googleapis.com/cos-tools/17412.101.24/vmlinux
The current version of Volatility3 uses "long unsigned int" at the following places: https://github.com/volatilityfoundation/volatility3/blob/d56297c4b1cf0f9f4912f4f4158e232c700acb3f/volatility3/framework/automagic/linux.py#L162 https://github.com/volatilityfoundation/volatility3/blob/d56297c4b1cf0f9f4912f4f4158e232c700acb3f/volatility3/framework/plugins/linux/kmsg.py#L71C32-L71C32
This causes Volatility3 to fail when the following command is run:
python3 volatility3/vol.py -s PATH_TO/dwarf2json_profile.json -f PATH_TO/memory_dump.lime linux.ANY_PLUGIN
Banners and isInfo still work, but any other Linux plugin that uses the Linux framework and related automagic fails and claims that the symbols file or memory dump file is missing.
Describe the solution you'd like At the following lines: https://github.com/volatilityfoundation/volatility3/blob/d56297c4b1cf0f9f4912f4f4158e232c700acb3f/volatility3/framework/automagic/linux.py#L162 https://github.com/volatilityfoundation/volatility3/blob/d56297c4b1cf0f9f4912f4f4158e232c700acb3f/volatility3/framework/plugins/linux/kmsg.py#L71C32-L71C32
Replace: "long unsigned int" With: "unsigned long"
When I tested this change locally, Volatility3 ran perfectly fine, but more testing would be needed before adopting it as the fix.
Describe alternatives you've considered This solution was originally discovered by the Google GKE team and I have no knowledge of other solutions to fix this.
Additional information This issue was first raised in the Volatility community Slack channel: https://volatilitycommunity.slack.com/archives/CP9LZ5KD5/p1700060787864299
Just to provide some more details. The LLVM behavior change in https://github.com/llvm/llvm-project/commit/f6a561c4d6754b13165a49990e8365d819f64c86 can be observed with the following program:
#include <stdio.h>

long unsigned int increment(long unsigned int a) {
    return a + 1;
}

int main(void) {
    long unsigned int b = increment(5);
    /* %lu is the conversion for unsigned long; "%lud" would print a stray 'd' */
    printf("%lu\n", b);
    return 0;
}
After compiling this program with clang, here is a sample of the debug symbols before the change:
$ dwarf2json linux --elf a.out | jq .base_types
{
  "int": {
    "size": 4,
    "signed": true,
    "kind": "int",
    "endian": "little"
  },
  "long unsigned int": {
    "size": 8,
    "signed": false,
    "kind": "int",
    "endian": "little"
  },
  "void": {
    "size": 0,
    "signed": false,
    "kind": "void",
    "endian": "little"
  }
}
and after the change:
$ dwarf2json linux --elf a.out | jq .base_types
{
  "int": {
    "size": 4,
    "signed": true,
    "kind": "int",
    "endian": "little"
  },
  "unsigned long": {
    "size": 8,
    "signed": false,
    "kind": "int",
    "endian": "little"
  },
  "void": {
    "size": 0,
    "signed": false,
    "kind": "void",
    "endian": "little"
  }
}
A change that only replaces usage of long unsigned int with unsigned long will break compatibility with old debug symbols, so we'll need to be careful of that.
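One way to stay compatible with both old and new debug symbols is to try each spelling and use whichever one the symbol table actually contains. The following is only a minimal sketch of that idea: it works on the plain `base_types` dictionary from the dwarf2json output shown above, not on Volatility3's own symbol table objects, and the names `UNSIGNED_LONG_NAMES` and `pick_base_type` are hypothetical.

```python
from typing import Iterable, Mapping

# The two spellings of the same 8-byte unsigned type seen in the dumps above.
# (Hypothetical constant, not part of Volatility3.)
UNSIGNED_LONG_NAMES = ("long unsigned int", "unsigned long")


def pick_base_type(
    base_types: Mapping[str, dict],
    candidates: Iterable[str] = UNSIGNED_LONG_NAMES,
) -> str:
    """Return the first candidate name that exists in base_types."""
    names = tuple(candidates)
    for name in names:
        if name in base_types:
            return name
    raise KeyError(f"none of {names} found in base_types")


if __name__ == "__main__":
    old_symbols = {"long unsigned int": {"size": 8, "signed": False}}
    new_symbols = {"unsigned long": {"size": 8, "signed": False}}
    print(pick_base_type(old_symbols))
    print(pick_base_type(new_symbols))
```

A caller would look up the type under the returned name instead of a hard-coded string, so symbol files generated before and after the LLVM change both keep working.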
Thanks @Monrava and @rkolchmeyer for reporting this issue. I will do my best to help fix this, but I would also need your assistance. Based on the vmlinux you attached to this ticket, it is a Chromium OS kernel.
Linux version 5.15.109+ (builder@a1d285de7a25) (Chromium OS 15.0_pre458507_p20220602-r18 clang version 15.0.0 (/var/tmp/portage/sys-devel/llvm-15.0_pre458507_p20220602-r18/work/llvm-15.0_pre458507_p20220602/clang a58d0af058038595c93de961b725f86997cf8d4a), LLD 15.0.0) #1 SMP Fri Jun 9 10:57:30 UTC 2023
After investing considerable time, I successfully installed Chromium OS in QEMU, following the official Chrome OS VM for Chromium developers procedure. However, I am only halfway through the process and am facing challenges such as retrieving the kernel DWARF info (I only managed to get the BTF), issues with emerge/portage, dev_install not working, and more. Assistance would be greatly appreciated, as the setup process is proving to be time-consuming.
Is there any chance you could prepare a VM for QEMU, VMware, or VirtualBox for me? This would allow me to dump the memory, obtain the vmlinux with dwarf info, System.map, etc. It would also be quite useful for future tests.
I'm wondering if this mightn't be better solved in dwarf2json; I just don't know whether to centralise on always providing one, the other, or both in the basic types.
We could do it in volatility but then there's no scope for operating systems deciding to change them.
Another option would be to go back through all the existing plugins and make a lookup table that points at the right symbol name, but that seems a little like lookup tables of lookup tables.
Hi @gcmoreira - Apologies for the late answer on this. To your question about providing a VM for QEMU, VMware, or VirtualBox: I have to defer to @rkolchmeyer for that, since I haven't tested that approach myself. My testing approach was simply to use a standard GKE cluster with the latest version of COS, and then run dwarf2json + Volatility3 from inside a privileged container. That is where the above error was found, and the root cause was discovered by @rkolchmeyer.
On @ikelos's question: I'd trust your judgement on the best approach for the solution. My only thought is that this issue seems to be relevant only for the combination of dwarf2json + Volatility3, and that the following fix made the combination work again: https://github.com/Monrava/cyberthreat2023/blob/c525bd64300980e344f8af5d2061180e6e7352cb/terraform/modules/create_avml_resources/instance_startup_scripts/install_dependencies.sh#L46C1-L49C92
I'd guess this is either a case unique to COS, which could be treated as an exception and handled separately, or, if more OSes suffer from the same problem, there might be an opportunity to implement this more broadly or to present the user with an argument that allows for "type" filtering.
I honestly don't know what the best option is here either, but unless more users suffer from this, I'd say it works to keep this as a separate use case and problem for COS.
@ilch1 @npetroni Any thoughts on how best to deal with this one? In the generated tables, or by adding an additional alias in volatility? The generated tables are probably accurate and consistent, but we reference them by a different name (I hope, one that's from the C standard). Do we supplement their base types and do we do that in the table or just as a background thing volatility always does (it feels weird making a base type that'd be like pointer, but we do technically tinker with pointer to ensure it's right for the table so...?)
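To make the "additional alias" option concrete, here is a rough sketch of supplementing the base types as a background step. It operates on the JSON document produced by dwarf2json once it has been loaded into a plain dictionary, and it is not how Volatility3 loads its tables today; `ALIAS_GROUPS` and `with_base_type_aliases` are hypothetical names chosen for illustration.

```python
import copy

# Groups of names that describe the same C type. Only the pair discussed in
# this issue is listed; any further pairs would be an assumption.
ALIAS_GROUPS = [("long unsigned int", "unsigned long")]


def with_base_type_aliases(isf: dict, alias_groups=ALIAS_GROUPS) -> dict:
    """Return a copy of isf in which every alias of a present base type exists."""
    result = copy.deepcopy(isf)
    base_types = result.setdefault("base_types", {})
    for group in alias_groups:
        present = [name for name in group if name in base_types]
        if not present:
            continue  # Neither spelling is in this table; nothing to alias.
        definition = base_types[present[0]]
        for name in group:
            # Existing entries are kept; only missing spellings are added.
            base_types.setdefault(name, copy.deepcopy(definition))
    return result


if __name__ == "__main__":
    table = {
        "base_types": {
            "unsigned long": {
                "size": 8, "signed": False, "kind": "int", "endian": "little"
            }
        }
    }
    print(sorted(with_base_type_aliases(table)["base_types"]))
```

With both spellings present in the table, existing plugins could keep referring to "long unsigned int" unchanged, at the cost of the table no longer matching the compiler's output exactly.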
Any update on this? Super curious about the solution for this :)
Worth adding that this issue popped up again on Slack: @Abyss-W4tcher pointed the user BlueNinja to this issue, as it was causing a problem with a memory dump from a GKE node.
Thanks @eve-mem for the info! I'm not sure if a fix for this was ever agreed on, or how big this issue is. I assume that the use of Volatility on GKE memory dumps is limited, but it would still be useful to have support for this. @Abyss-W4tcher - Was the user BlueNinja experiencing a new problem or the same one as described above? If it's the same, I used this approach in my own repo to fix this locally while awaiting a permanent fix: https://github.com/Monrava/cyberthreat2023/blob/c525bd64300980e344f8af5d2061180e6e7352cb/terraform/modules/create_avml_resources/instance_startup_scripts/install_dependencies.sh#L48