Ensure mvt-adb and AndroidQF collect the same forensic data
We have issues where AndroidQF acquisitions contain different raw forensic data than mvt-android. This can be a frustrating experience, as it is often not possible to return to the device owner at a later stage to acquire more data.
We should compare the raw data collected by androidqf and mvt-android check-adb and ensure that both tools collect the same data. It may be necessary to add new modules and data sources to both.
There is a risk that these tools will keep drifting out of sync as we continue development. We should consider how to create some kind of end-to-end integration tests against a virtual Android device, so that we can reproducibly compare the final data and outputs generated by MVT after processing the output directly from ADB and from an AndroidQF dump.
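As a rough illustration of what the comparison step in such a test could look like, here is a minimal sketch that diffs two acquisition directories by file name and content hash. The function names and the flat directory layout are assumptions for illustration, not anything MVT or AndroidQF provides today. Volatile outputs (processes, logcat) will differ between runs even on the same device, which is why content differences are reported separately from missing files.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to the SHA-256 of its content."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_acquisitions(mvt_dir: Path, aqf_dir: Path) -> dict[str, list[str]]:
    """Report files present in only one acquisition and files whose content differs."""
    mvt, aqf = snapshot(mvt_dir), snapshot(aqf_dir)
    shared = mvt.keys() & aqf.keys()
    return {
        "only_in_mvt": sorted(mvt.keys() - aqf.keys()),
        "only_in_aqf": sorted(aqf.keys() - mvt.keys()),
        "content_differs": sorted(k for k in shared if mvt[k] != aqf[k]),
    }
```

A test could then assert that `only_in_mvt` and `only_in_aqf` are empty, and check `content_differs` only against an allow-list of outputs that are expected to change between runs.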
Writing down my thoughts here, as I have been thinking about this issue over the last few days.
Quick Overview over current differences between mvt-android and androidqf extractions
What the acquisitions in MVT and AQF basically do is run a list of adb commands on the device, then parse and store the results. The approach differs slightly: in Androidqf the strategy is to take as much as possible, transfer it to one of our analysis machines and do the analysis there, whereas in the MVT codebase more analysis and parsing happens right away. For example, AQF just stores the complete output of the dumpsys command, while MVT queries granular information about packages and merges it into the package information structs right away.
So the process of getting from a device to analysis results consists of two steps: first the set of adb commands executed, and second the analysis steps performed on the resulting data. Both the commands and the analysis performed currently differ between AQF and MVT.
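To make the first step concrete, the command set can be pictured as plain data that is run and stored without any parsing. This is only a sketch: the `MANIFEST` name, the output file names and the `acquire` helper are hypothetical, and only the commands themselves (`dumpsys`, `getprop`, `ps -A`) come from the modules discussed here. It assumes the `adb` CLI is on the path.

```python
import subprocess
from pathlib import Path
from typing import Callable

# Hypothetical manifest: output file name -> command run via `adb shell`.
# The file names are made up for this example.
MANIFEST = {
    "dumpsys.txt": "dumpsys",
    "getprop.txt": "getprop",
    "processes.txt": "ps -A",
}


def adb_shell(command: str) -> str:
    """Run one command on the connected device through the adb CLI."""
    result = subprocess.run(
        ["adb", "shell", command], capture_output=True, text=True, check=True
    )
    return result.stdout


def acquire(out_dir: Path, run: Callable[[str], str] = adb_shell) -> list[str]:
    """Store the raw, unparsed output of every manifest command in out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for filename, command in MANIFEST.items():
        (out_dir / filename).write_text(run(command), encoding="utf-8")
        written.append(filename)
    return written
```

Passing the runner in as a parameter means the same function can be exercised in tests with a fake runner, without a device attached.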
To keep things more consistent, here is one possible solution:
- The amount of analysis that AQF performs is stripped down to a minimum
- Both MVT and AQF run an identical set of queries, resulting in the same set of output files. It is easier to run the end-to-end integration test on that collection of output files, which just follow a "take everything" strategy. Testing that the output files from an acquisition with MVT and with AQF are identical ensures that they are executing the same commands
- The analysis would be done by a single codebase in MVT for more consistency. So in MVT, too, the acquisition of ADB data and the analysis would be separated. I see the following advantages:
- If we find a bug in the analysis code, we can rerun the analysis on an older mvt acquisition
- We only have one place to maintain the analysis code
- If we wanted to implement the end-to-end integration test right now, we would have to implement the analysis and parsing identically in two different places, so it's possible that the data would be the same but displayed / structured differently
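The separation described in the list above could look roughly like this on the analysis side: a function that only reads files already on disk, so it can be rerun on an old acquisition after a parser fix. This is a simplified sketch and not MVT's actual parser. The `getprop.txt` file name and the function names are assumptions, and multi-line property values are ignored.

```python
import re
from pathlib import Path

# `getprop` prints one property per line as "[key]: [value]".
PROP_LINE = re.compile(r"^\[(?P<key>[^\]]+)\]: \[(?P<value>.*)\]$")


def parse_getprop(raw: str) -> dict[str, str]:
    """Parse raw `getprop` output into a dict, skipping lines that do not match."""
    props = {}
    for line in raw.splitlines():
        match = PROP_LINE.match(line.strip())
        if match:
            props[match["key"]] = match["value"]
    return props


def analyse_acquisition(acq_dir: Path) -> dict[str, str]:
    """Analysis step: reads stored raw output and never talks to the device."""
    raw = (acq_dir / "getprop.txt").read_text(encoding="utf-8")
    return parse_getprop(raw)
```

Because `analyse_acquisition` takes only a directory, the same code path would serve a fresh MVT acquisition, an AndroidQF dump, or an acquisition taken months earlier.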
I have made a quick comparison of MVT check-adb and this repo:
| Module | MVT | Androidqf |
|---|---|---|
| Chrome History | Yes (only root) | No |
| Dumpsys | Yes (7 parsing modules + full) | Yes (only full dumpsys) |
| Files | Yes using find command | Yes using collector + more specific folders |
| GetProp | Yes | Yes |
| Logcat | Yes (both current and old) | Yes (both current and old) |
| Logs | No | Yes |
| Backup | No | Yes (full or only SMS) |
| SMS | Yes (through backup) | No (but backup is available) |
| packages | Yes (apks separately + get dumpsys info) | Yes (with APK and parsing of certificates) |
| Processes | Yes (ps -A) | Yes (ps -A) |
| Services | No | Yes |
| Env | No | Yes |
| Settings | Yes | Yes |
| Temp | Yes (only default tmp folder) | Yes (check system tmp folder) |
| Root binaries | Yes | Yes #22 |
| SELinux status | Yes | Yes #21 |
| WhatsApp | Yes (only root) | No |
Update: SELinux, RootBinaries
I agree with @viktor3002's point of view: the idea of androidqf has always been to keep the extraction as simple as possible to avoid parsing issues, and to have the parsing work done in MVT.
One issue we have is that MVT itself mixes extraction and analysis. That works well when data can be easily parsed and checked against indicators, but not when 1) we don't know how to parse/analyse the data (like logs) or 2) we don't want to do the analysis in MVT because the model is too complex (like apks). In those cases, MVT either doesn't extract the data at all, or does both the analysis and the raw extraction (like dumpsys). I wonder if we should think about MVT doing extraction and analysis separately.
The other question is whether we want to include extractions that only work on rooted phones (like Chrome or WhatsApp here). I have honestly never had any success with them, so I don't see much benefit in adding them to androidqf, but I am pretty open about that.
In any case, of the 10 differences I noted above, 5 can be easily fixed (Logs, Services, Env, Root binaries and SELinux) and 2 need a discussion about rooting (Chrome and WhatsApp). Files and packages need a bit of work to make sure the output is the same (for files, this relates to using the collector in MVT; for packages, to how to gather information from multiple sources).
What do you all think?
Hello!
I totally agree that AQF could be the main tool to do extractions and MVT the tool for analysis, and with all that you already mentioned.
Until today we haven't analyzed rooted phones in real scenarios, so the Chrome and WA modules are something that we haven't used on real phones.
Best. P
From my point of view, AndroidQF gives a greater chance of actually performing an extraction.
With MVT, we need someone who is physically with the device, has mvt installed and is somewhat comfortable on the command line.
With AndroidQF that can much more easily be done at a distance or with someone less technical and in the vast majority of cases it's enough to include a simple "how to" document along with the binary for the defender to follow.
Similar to @penserbjorne, rooted devices are an almost non-occurrence. I've only met one and it had just been replaced with a new non-rooted device.