
feat(tools): extract and redact mobile identities from captures

Open oopsbagel opened this issue 9 months ago • 5 comments

Description

This branch adds two python scripts:

extract.py:

  • reads mobile identities from TS 24.301 NAS EPS Mobility Management messages in pcap files, as strings of digits along with their id type.
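For readers who haven't met the encoding, here is a minimal sketch of how a digit-based identity turns into a string of digits. It is not taken from extract.py (which relies on pycrate); `decode_identity` is a hypothetical name, and the layout assumed is the TS 24.008 one: the first octet carries digit 1, an odd/even flag and a 3-bit type, and each later octet carries two digits, low nibble first, with 0xF as filler when the digit count is even.

```python
def decode_identity(data: bytes) -> tuple[int, str]:
    """Decode a digit-based mobile identity value (IEI and length already stripped).

    Returns (type_code, digits). The type code is returned as a raw number,
    because its meaning differs between specs (only 1 = IMSI is assumed here).
    """
    if not data:
        raise ValueError("empty identity")
    first = data[0]
    type_code = first & 0x07          # bits 3..1: type of identity
    odd = bool(first & 0x08)          # bit 4: 1 means an odd number of digits
    nibbles = [first >> 4]            # bits 8..5: identity digit 1
    for octet in data[1:]:
        nibbles.append(octet & 0x0F)  # low nibble is the earlier digit
        nibbles.append(octet >> 4)    # high nibble is the later digit
    if not odd:
        nibbles.pop()                 # drop the 0xF filler nibble
    if any(n > 9 for n in nibbles):
        raise ValueError("non-decimal nibble in identity")
    return type_code, "".join(str(n) for n in nibbles)


if __name__ == "__main__":
    # Test IMSI 001010123456789, encoded as 09 10 10 10 32 54 76 98
    print(decode_identity(bytes.fromhex("0910101032547698")))
```

Identities that are not digit strings (for example a GUTI) use a different layout and are not handled by this sketch.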

redact.py:

  • redacts TS 24.008 BCD-encoded mobile identities in files of any type by overwriting them with 0xAA bytes.
  • uses yara-x to detect identities and prints the yara rules to stdout.

Note that when TS 24.301 NAS messages are encapsulated in TS 36.331 LTE RRC messages, their byte alignment may change. redact.py attempts to correct for this by calculating several bit-shifted versions of the identities.
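The alignment problem can be illustrated with a short sketch. This is an assumption about the general idea, not a copy of what redact.py does: `encode_imsi` and `shifted_variants` are hypothetical names, and the encoding assumed is the TS 24.008 BCD layout for an IMSI (type code 001). For each bit offset from 1 to 7, the identity spills into one extra byte, so each variant comes with a mask marking which bits actually belong to the identity.

```python
def encode_imsi(imsi: str) -> bytes:
    """BCD-encode an IMSI as a TS 24.008 mobile identity value (no IEI/length)."""
    if not imsi.isdigit():
        raise ValueError("IMSI must be a non-empty string of digits")
    digits = [int(c) for c in imsi]
    odd = len(digits) % 2
    out = bytearray([(digits[0] << 4) | (odd << 3) | 0b001])
    rest = digits[1:]
    if not odd:
        rest.append(0xF)                      # filler nibble for even lengths
    for low, high in zip(rest[0::2], rest[1::2]):
        out.append((high << 4) | low)
    return bytes(out)


def shifted_variants(data: bytes):
    """Yield (bit_offset, pattern, mask) for identities starting 1..7 bits into a byte."""
    value = int.from_bytes(data, "big")
    size = len(data) + 1                      # a shifted identity spans one more byte
    for offset in range(1, 8):
        pattern = (value << (8 - offset)).to_bytes(size, "big")
        first_mask = 0xFF >> offset           # leading bits belong to other fields
        last_mask = (0xFF << (8 - offset)) & 0xFF
        mask = bytes([first_mask]) + b"\xff" * (size - 2) + bytes([last_mask])
        yield offset, pattern, mask


if __name__ == "__main__":
    for offset, pattern, mask in shifted_variants(encode_imsi("001010123456789")):
        print(offset, pattern.hex(), mask.hex())
```

A matcher would then test `(window & mask) == pattern` byte by byte, since the pattern has zeros in every bit the mask excludes.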

This commit makes no attempt to identify encrypted identities or redact encrypted messages.

Users just need to run redact.py *.qmdl *.pcap or equivalent. extract.py is just for developers.

Notes

fixes #154

Because the redaction logic is decoupled from the detection logic, and because redact doesn't inherently require parsing, it would actually be pretty easy to add this to the qmdl/pcap writers in Rust. yara-x itself is written in Rust, and redacting is just overwriting a stream. I chose to have this operate on files in place because that seemed the most useful, but the logic should work with just in-memory edits.

oopsbagel avatar Mar 27 '25 23:03 oopsbagel
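The point above about redaction being "just overwriting a stream" can be sketched in a few lines. This is an illustration only: detection in the PR is done with yara-x, which is swapped here for a plain `bytes.find` so the example stays dependency-free, and `redact_in_place` is a hypothetical name. The 0xAA fill byte is the one named in the description.

```python
def redact_in_place(buf: bytearray, needles: list[bytes], fill: int = 0xAA) -> int:
    """Overwrite every occurrence of each needle with the fill byte.

    The buffer keeps its length, so offsets and framing around the
    redacted bytes are untouched. Returns the number of redactions.
    """
    count = 0
    for needle in needles:
        if not needle:
            continue                              # an empty needle would match everywhere
        start = buf.find(needle)
        while start != -1:
            buf[start:start + len(needle)] = bytes([fill]) * len(needle)
            count += 1
            start = buf.find(needle, start + len(needle))
    return count


if __name__ == "__main__":
    identity = bytes.fromhex("0910101032547698")  # BCD-encoded test IMSI
    data = bytearray(b"\x00" + identity + b"\x01" + identity)
    print(redact_in_place(data, [identity]), data.hex())
```

Because the edit never changes the buffer's length, the same loop would work on a chunk held in memory before it is written out, which is what makes a port to the writers plausible.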

pycrate is very noisy (and I couldn't figure out how to silence it), so I leaned in to it and added my own diagnostic messages. Because stdout contains representations of mobile identities, it shouldn't be shared. It may be worth removing these, at the risk of making debugging harder.

There's still some work to be done around security protected NAS messages. Ciphertext fields may contain sensitive information that we may want to blank out as well.

This may be brittle with IMSIs or IMEISVs of different lengths. I only had a few pcaps from my own hardware to test with.

oopsbagel avatar Mar 27 '25 23:03 oopsbagel

aside: I used uv while developing this but didn't add in pyproject.toml / .python-version / uv.lock, although I honestly think we should: uv is still compatible with requirements.txt (just run uv pip freeze > requirements.txt occasionally), and makes handling python versions and packages fast and easy compared to pip.

oopsbagel avatar Mar 27 '25 23:03 oopsbagel

Usually I'm an advocate for using the defaults for tools, but wireshark is not set up for analysing mobile messages out of the box. I created some highlighting rules, display settings, and filters to help me navigate these pcaps. I'll share my wireshark profile shortly.

Orange is a message containing a mobile identity. Yellow are messages requesting mobile identities.

This is what it looks like with a redacted file: snoring

https://git.sr.ht/~oopsbagel/wireshark-profile-ltenas

oopsbagel avatar Mar 27 '25 23:03 oopsbagel

I also have the beginnings of some tests in place for this that I haven't pushed. The main reason is that the test data would either be very contrived (which is probably fine) or be binary blobs, and I was hoping to get feedback before solidifying any interfaces.

oopsbagel avatar Mar 27 '25 23:03 oopsbagel

It looks like some of the qmdl_store tests have become flaky in main. When running cargo test on my local machine, it usually passes, but sometimes I see one of these fail:

qmdl_store::tests::test_creating_updating_and_closing_entries ... FAILED
qmdl_store::tests::test_create_on_existing_store ... FAILED

My guess is that the changes here: https://github.com/EFForg/rayhunter/pull/206/files#diff-22d0410f27b950c785f5c7b84a665e728e92f08c1394497f97dfec0048a26473L48 had more subtle effects than intended.

oopsbagel avatar Mar 27 '25 23:03 oopsbagel

This is not directly connected to this issue, but I think it would be useful if Rayhunter printed out neighbouring cells when it detects an IMSI catcher. So alongside the metrics (null cipher, 2G downgrade, etc.), I would add metadata about the network, neighbouring cells, etc.

MatejKovacic avatar Apr 02 '25 10:04 MatejKovacic

> Usually I'm an advocate for using the defaults for tools, but wireshark is not set up for analysing mobile messages out of the box. I created some highlighting rules, display settings, and filters to help me navigate these pcaps. I'll share my wireshark profile shortly.
>
> Orange is a message containing a mobile identity. Yellow are messages requesting mobile identities.
>
> This is what it looks like with a redacted file: snoring
>
> https://git.sr.ht/~oopsbagel/wireshark-profile-ltenas

loving this - i was trying to make this better on my end. thank you!

m0xsec avatar Apr 03 '25 02:04 m0xsec

I realise maintainer time is limited, so I could also just move these tools to my own repository. The Rust implementation is the particularly valuable part, and it's not in this branch.

> loving this - i was trying to make this better on my end. thank you!

Thanks for sharing that! I won't post further updates about it in this PR, but I'll mention that I've pushed a more polished version of the config's filter buttons and I unlicensed it. Please feel free to contribute back any additions or publish your own versions etc.

oopsbagel avatar Apr 03 '25 08:04 oopsbagel

thanks a ton @oopsbagel! this week's been insanely busy for me so i haven't had a chance to look at this yet, but this is pretty much exactly what i was hoping for in #154. i'm hoping to get to it sometime next week

wgreenberg avatar Apr 03 '25 19:04 wgreenberg

so I talked with @wgreenberg a bit, and I think you already talked to him as well, but I think I'm gonna reject this redaction pull request even though I appreciate all the work! The main reason is that it doesn't actually cover all the cases where an IMSI could show up, and as much as I love yara I don't think it's quite the right tool for the job here. We also want to redact cell ids, and I think for all of this we want to redact them in an irreversible but deterministic way: IMSI, TMSI, etc. should essentially be hashed, but with a random seed, so that each instance of a given cell id in a given pcap encodes the same way. I think it's best to do this through the existing parser.

cooperq avatar Apr 11 '25 19:04 cooperq
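The "irreversible but deterministic" redaction described in the last comment could look roughly like the sketch below. It is an assumption about one way to do it, not the project's implementation: the comment only says "hashed but with a random seed", and HMAC-SHA256 with a per-capture random key is the technique chosen here. `new_capture_key` and `pseudonymize` are hypothetical names. A keyed hash matters because the space of valid IMSIs is small enough that an unkeyed hash could be brute-forced.

```python
import hashlib
import hmac
import secrets


def new_capture_key() -> bytes:
    """Random key generated once per capture and discarded afterwards."""
    return secrets.token_bytes(32)


def pseudonymize(identity: str, key: bytes) -> str:
    """Map a digit string to a same-length digit string, stable for a given key."""
    if not identity.isdigit():
        raise ValueError("identity must be a non-empty string of digits")
    digest = hmac.new(key, identity.encode("ascii"), hashlib.sha256).digest()
    number = int.from_bytes(digest, "big") % (10 ** len(identity))
    return str(number).zfill(len(identity))


if __name__ == "__main__":
    key = new_capture_key()
    first = pseudonymize("001010123456789", key)
    second = pseudonymize("001010123456789", key)
    print(first, first == second)   # same pseudonym each time within one capture
```

Keeping the output the same length as the input means the pseudonym can be re-encoded into the same field. Once the key is thrown away, pseudonyms from different captures cannot be linked to each other or mapped back to the original.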