
Request for help with reverse-engineering APK

Open sigaloid opened this issue 1 year ago • 12 comments

Hello all,

I'm working on keeping the requests made by Redlib in line with what the official app does, in order to protect against detection. I was once able to MITM the app, but I no longer can. I've tried a lot of things, including apk-mitm and APKLab, and hit various problems. The main one: even once I figure out the splitting of the APKs and patch out the TLS cert check with apk-mitm, I cannot get the app to launch. It seems they have some more serious anti-debugging in place that didn't exist a few months ago. I can still MITM on iOS easily (iOS doesn't treat user certs differently from system certs, whereas Android apps don't trust user certs by default unless you patch the app to allow it), but it's not the same, since there are likely differences in how the two apps send headers, connect, etc. (I think they already spell headers with different capitalizations!).

What I've tried:

  • apk-mitm
  • android-unpinner
  • manually patching out calls to isAppDebuggable, isDebuggerAttached, isEmulator, isRooted, etc
  • APKLab/apktool disassembly

So, if you can figure out how to take the app's APK, patch out the certificate pinning, then also bypass the anti-debugging check, thus providing an easy way to open the app with user-level TLS certificates trusted, and can detail your process, please reply or email me at re@[my site in profile].

EDIT: got it working (you just need root lol)

sigaloid avatar Nov 19 '24 17:11 sigaloid

There is one unofficial Reddit app that still works (without having to login), maybe that can be helpful? It's open source: https://github.com/QuantumBadger/RedReader

I believe that one was still allowed because of its accessibility features.

jimdrie avatar Nov 19 '24 19:11 jimdrie

There is also Glance, which has a Reddit feature. I am using it as a homepage and have the same list of subs in there as I have configured on Redlib. It's working without any issues.

https://github.com/glanceapp/glance

r7l avatar Nov 19 '24 20:11 r7l

As far as I know, both of those rely on low rate limits and .json routes, which are static and based on IP address. Redlib needs a method that is tied to the OAuth token that we generate, so that when we inevitably go over the rate limit in 500 seconds, we can just throw it away and refresh to a new token.

I looked into RedReader, and either I can't extract the OAuth key they distribute or I'm just not looking in the right place. I also don't want to piggyback off theirs: if the same level of serious load that Redlib causes now landed on a small app that relies on Reddit's goodwill to allow its usage, I worry they'd get kicked off.

I'd prefer to reverse-engineer the actual official Reddit app and match its behavior for the anonymous browsing mode, since it's their own app.

sigaloid avatar Nov 19 '24 20:11 sigaloid

From the OP it sounds like you've only tried MitM with user certs. System certs are "easy" to MitM on Android as well, but require a rooted device or emulator. The easiest way to do so is usually to use HTTP Toolkit. I have a gist on the topic if it's helpful, though it's pretty surface level and aimed mostly at those entirely unfamiliar with the concept.

Apologies if Reddit uses explicit certificate pinning or if this isn't useful for your workflow. I tried looking at the mobile app myself and it did seem to successfully intercept some requests to Reddit, but it doesn't seem like I can try any important requests before signing up, which I'm not going to do. Best of luck.

ghost avatar Nov 19 '24 20:11 ghost

@sigaloid I just tried and was able to MITM perfectly with moved system certificates using root and HTTP Toolkit on my computer.

What I noticed is the app uses the GQL API through the gql-fed.reddit.com hostname. I also noticed that the rate limit is significantly higher at x-ratelimit-remaining: 1999.0.

Another thought I had: instead of getting a new token each time we reach the rate limit, could we store a pool of tokens and keep rotating between them as the rate limit gets used? Once all of them are exhausted, we fetch a new token and expand the pool.
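
To make the pool idea concrete, here is a rough sketch in Python (not Redlib's actual code, which is Rust). `Token`, `TokenPool` and `fetch_token` are hypothetical names made up for illustration; the only thing taken from the capture above is the `x-ratelimit-remaining` header. It assumes header names have already been lowercased and it ignores rate-limit window resets and token expiry, both of which a real implementation would need to handle.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Token:
    value: str
    remaining: float  # last x-ratelimit-remaining value seen for this token


@dataclass
class TokenPool:
    # Hypothetical callback that performs the real token request.
    fetch_token: Callable[[], Token]
    tokens: List[Token] = field(default_factory=list)
    _next: int = 0  # round-robin cursor

    def acquire(self) -> Token:
        """Return a pooled token with budget left, rotating round-robin.

        If the pool is empty or every token is exhausted, fetch a new
        token and add it to the pool.
        """
        for _ in range(len(self.tokens)):
            token = self.tokens[self._next % len(self.tokens)]
            self._next += 1
            if token.remaining >= 1:
                return token
        token = self.fetch_token()
        self.tokens.append(token)
        return token

    def record(self, token: Token, headers: Dict[str, str]) -> None:
        """Update a token's remaining budget from response headers.

        Assumes lowercase header names; leaves the budget unchanged
        if the header is absent.
        """
        raw = headers.get("x-ratelimit-remaining")
        if raw is not None:
            token.remaining = float(raw)
```

Usage would be roughly: call `acquire()` before each request, send the request with that token, then pass the response headers to `record()` so the pool knows how much budget is left.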

FireMasterK avatar Nov 19 '24 22:11 FireMasterK

Awesome, I hadn't tried a rooted Android VM. What VM are you running? I'm struggling with getting Waydroid rooted - seem to get stuck on SELinux errors every time.

I do like the idea of using the GQL API if possible - it would be a lot of changes though, as many things are currently built off of the path-based way of constructing requests.

Currently the issue isn't the requesting of new tokens, because that's actually something the app will do by default regularly, but the headers being sent are uniquely identifiable. Granted, it's possible we continue down this path of back-and-forth breaking and fixing, and thus I would need to match behavior that precisely. So for sure something to keep on my radar.
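
For checking how closely our headers match a capture, something like the following could help. It is only a sketch in Python: `diff_headers` is a made-up helper, and it assumes both requests are available as ordered lists of (name, value) pairs, for example exported by hand from the MITM tool. It compares names, capitalization and relative order, and deliberately ignores values, since things like the bearer token will always differ.

```python
from typing import List, Tuple

Header = Tuple[str, str]


def diff_headers(reference: List[Header], candidate: List[Header]) -> List[str]:
    """List differences in header names, capitalization and order.

    `reference` is the captured request from the official app and
    `candidate` is the request our own client sends. Values are ignored.
    """
    problems: List[str] = []
    ref_names = [name for name, _ in reference]
    cand_names = [name for name, _ in candidate]
    # Map lowercase name -> name as actually sent.
    ref_lower = {name.lower(): name for name in ref_names}
    cand_lower = {name.lower(): name for name in cand_names}

    for lower, name in ref_lower.items():
        if lower not in cand_lower:
            problems.append(f"missing header: {name}")
        elif cand_lower[lower] != name:
            problems.append(f"capitalization differs: {name} vs {cand_lower[lower]}")
    for lower, name in cand_lower.items():
        if lower not in ref_lower:
            problems.append(f"extra header: {name}")

    # Compare the relative order of the headers both requests share.
    shared_ref = [n.lower() for n in ref_names if n.lower() in cand_lower]
    shared_cand = [n.lower() for n in cand_names if n.lower() in ref_lower]
    if shared_ref != shared_cand:
        problems.append("header order differs")
    return problems
```

An empty result means the two requests agree on header names, capitalization and order; it says nothing about other fingerprinting signals such as TLS parameters.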

sigaloid avatar Nov 19 '24 23:11 sigaloid

What VM are you running?

On Linux I find it easiest to use Android Studio's built-in device emulation for stuff like this.

ghost avatar Nov 19 '24 23:11 ghost

Awesome, I hadn't tried a rooted Android VM. What VM are you running? I'm struggling with getting Waydroid rooted - seem to get stuck on SELinux errors every time.

I used a real device running Android 15 and the https://github.com/ys1231/MoveCertificate module with KernelSU.

For a VM, I would recommend using the official AVD, taking the necessary boot/ramdisk image from the system img file, and updating the VM's disk files. Follow this for the patching instructions: https://topjohnwu.github.io/Magisk/install.html After this, you can use the module I linked above.

I do like the idea of using the GQL API if possible - it would be a lot of changes though, as many things are currently built off of the path-based way of constructing requests.

I agree, but I see this as the best long-term solution for emulating the app's requests, as both the website and the app seem to fully use it these days.

Currently the issue isn't the requesting of new tokens, because that's actually something the app will do by default regularly, but the headers being sent are uniquely identifiable. Granted, it's possible we continue down this path of back-and-forth breaking and fixing, and thus I would need to match behavior that precisely. So for sure something to keep on my radar.

But aren't the tokens valid for a day? I don't think the app would fetch a new token regularly unless it's really needed? 🤔 The headers should be the first priority indeed like you said, I agree, but we still aren't emulating the app unless we use GQL. We have quite a bit of work to fully emulate the app. 😅

FireMasterK avatar Nov 19 '24 23:11 FireMasterK

For a VM, I would recommend using the official AVD, taking the necessary boot/ramdisk image from the system img file, and updating the VM's disk files. Follow this for the patching instructions: ... After this, you can use the module I linked above.

Note that this is all a bit overkill in cases where the default devices haven't been specifically blocked by whatever you're trying to RE. If you use a non-Google Play image (one that says Google APIs instead), you will be rooted by default. After starting the emulator for the first time, you can launch HTTP Toolkit, click the ADB connector, and get system trust automatically. Then run adb install-multiple com.reddit.frontpage.apk config.en.apk config.mdpi.apk and the app runs without issue.

edit: and if you did want to use a Google Play build or custom kernel, something like rootAVD is probably an easier way to do so, but I have limited experience with this. In my experience, it's pretty uncommon that apps won't work on a standard x86_64 Google APIs image.

ghost avatar Nov 20 '24 00:11 ghost

I’ve been working on a new version of Redlib over the past few weeks because I anticipated that Reddit would implement more sophisticated detection measures. I wanted to develop something that aligns with the project's long-term goals and current needs. So far, I’ve made significant improvements and have implemented several features to address the challenges we’ve been facing.

That said, it's a bit overkill at the moment, and I’m still working through some bugs, as I’m relatively new to coding. I didn’t build this to be easily replicated yet, and there are many parts that are tightly coupled to my local setup. But I’m hoping the ideas might be useful, and perhaps some of you can help improve it further.

This is what I've done to mine:

  • List of real Android app versions
  • Rotates between different device configurations
  • Proper OAuth client IDs for Android devices
  • Token daemon that refreshes tokens
  • Automatic retry logic with fallback
  • Proper Android API endpoints
  • Tracks rate limit remaining
  • Rolling over mechanism
  • Randomization to avoid patterns / adds random jitter to requests to avoid patterns
  • Improve token rotation with proper rate limit tracking
  • Tracks rate limit remaining counts
  • Also added proxy support on my end
  • Rotate proxies on rate limits
  • Randomly rotate proxies (X% chance per request) to avoid patterns
  • Support for Tor (more frequent rotation when using Tor)
  • Location tagging for geographic distribution
  • Limited connection pooling
  • In-memory cache with configurable TTL (default 1 hour)
  • Cache key generation that ignores irrelevant query parameters
  • Configurable cache size limits
  • Automatic cache invalidation
  • Caches only successful GET responses
  • Additional features: a number of small, complex features that I can’t fully explain here
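
To illustrate just the caching items from that list, here is a minimal Python sketch; it is not the code described above, which hasn't been shared. `IGNORED_PARAMS`, `cache_key` and `TTLCache` are hypothetical names, and which query parameters count as "irrelevant" is purely an assumption. The one-hour default TTL, the size limit and the "only successful GET responses" rule come from the list.

```python
import time
from typing import Dict, Optional, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit

# Assumption: these are the parameters treated as irrelevant to the response.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign"}


def cache_key(url: str) -> str:
    """Build a key from the path plus sorted, filtered query parameters."""
    parts = urlsplit(url)
    params = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k not in IGNORED_PARAMS
    )
    query = urlencode(params)
    return f"{parts.path}?{query}" if query else parts.path


class TTLCache:
    def __init__(self, ttl_seconds: float = 3600.0, max_entries: int = 1024):
        self.ttl = ttl_seconds
        self.max_entries = max_entries
        self._store: Dict[str, Tuple[float, str]] = {}  # key -> (expires_at, body)

    def get(self, url: str, now: Optional[float] = None) -> Optional[str]:
        """Return the cached body, or None if absent or expired."""
        now = time.monotonic() if now is None else now
        key = cache_key(url)
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, body = entry
        if now >= expires_at:
            del self._store[key]  # invalidate expired entry on access
            return None
        return body

    def put(self, method: str, status: int, url: str, body: str,
            now: Optional[float] = None) -> None:
        """Store a response, but only for successful (2xx) GET requests."""
        if method != "GET" or not 200 <= status < 300:
            return
        now = time.monotonic() if now is None else now
        key = cache_key(url)
        if key not in self._store and len(self._store) >= self.max_entries:
            # Evict the entry that was inserted first (dicts keep insertion order).
            del self._store[next(iter(self._store))]
        self._store[key] = (now + self.ttl, body)
```

The `now` parameter exists only so expiry can be exercised without sleeping; normal callers would leave it out.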

I’ve built a fancy GUI with many debug features that help monitor the functions above. This makes it easier to track requests, rate limits, and proxies, and troubleshoot when things go wrong.

I also dug deep into other privacy-centered social media projects similar to this one, and the one that has dealt with these issues the most is Nitter. Refer to their solutions for rate limiting, and check out the other approaches they employed to dodge these problems. I think the punishment system for scrapers is perfect, but if you're going to employ it, I recommend publishing the number of requests one is allowed to make.

As mentioned, the project is still massively buggy; simply put, it’s not ready. And since I’m relatively new to coding, a lot of the structure may only work on my computer, but I wanted to throw these ideas out there and see if there’s something here that can be of value to you.

I suggest maybe making a Discord to see if other people want to hop on and help out in a more interactive way. Feel free to reach out if you're interested in taking a look at what I did. Thanks!

JohnD-o avatar Nov 20 '24 08:11 JohnD-o

I don't know if it helps, but I was able to do it with AVD and Google Play by using an older version of Android that didn't treat user certs differently and that the app (UDB App Pro) still supported.

drakeerv avatar Nov 20 '24 15:11 drakeerv

But aren't the tokens valid for a day? I don't think the app would fetch a new token regularly unless it's really needed? 🤔 The headers should be the first priority indeed like you said, I agree, but we still aren't emulating the app unless we use GQL. We have quite a bit of work to fully emulate the app. 😅

The app does regularly request new tokens way before the expiry for seemingly no reason at all. Also many old versions of Reddit did still use the OAuth routes, and they still have to work for approved third party apps like RedReader (the routes not the tokens).

Btw: did indeed get root and installed the certs, MITM going well now!

I have moved closer to matching headers extremely closely here: https://github.com/redlib-org/redlib/commit/6be6f892a4eb159f5a27ce48f0ce615298071eac

I’ve been working on a new version of Redlib over the past few weeks because I anticipated that Reddit would implement more sophisticated detection measures

I'd love to see your changes, you can reach out to me via email (anything @ domain in profile)!

sigaloid avatar Nov 21 '24 06:11 sigaloid