AdAway icon indicating copy to clipboard operation
AdAway copied to clipboard

Write out blocked hosts in more compact form to speed up lookups with lenghty lists

Open zer0def opened this issue 2 years ago • 10 comments

For comparison, with about ~1-1.2M blocked hosts across both IP4 and IP6:

  • one host per line produces a ~68M hosts file with time ping -c1 -w1 <any hostname> on average taking anywhere between 8-10s on each lookup, including hosts expected to be cached
  • all hosts on one line produces a ~54M hosts file with time ping -c1 -w1 <any hostname> on average taking 2-3s

zer0def avatar Aug 14 '22 22:08 zer0def

Marked as draft, as I'm expecting remarks about the test suite requiring relevant brush-ups, but otherwise it's a perfectly reviewable PR.

zer0def avatar Aug 14 '22 22:08 zer0def

Hi @zer0def

Thanks for the PR!

The main question I have in mind is how well having so many hosts on one line will be supported on all Android devices and versions? This might be a good opportunity to test as a feature flag or as a beta version. I think some already discussed it in project issues...

And have you tried an intermediate state? Like have 10 or 20 hosts per line to measure the ping performance benefits?

Performance wise, I am concerned about creating a single line strings with all the blocked hosts. It might OOM on some devices... But it can easily be split and wrote to file if it happens during tests.

PerfectSlayer avatar Aug 15 '22 06:08 PerfectSlayer

Performance wise, I am concerned about creating a single line strings with all the blocked hosts. It might OOM on some devices...

To be perfectly fair, current code doesn't force any eager write flushes and, at least based on my testing, flushes implicitly just don't occur until fd close, so current use of BufferedWriter is just as prone to getting OOM'd.

I've pushed out changes to address this and am flashing an older ROM to double-check conditions on an older Android stack, but considering hosts fifle lookups happen much lower in userspace and have succeeded in the constrained environment I've got for testing right now (~2GB RAM device on A12), I would say that these changes are addressing the biggest currently present lookup bottleneck.

zer0def avatar Aug 15 '22 08:08 zer0def

Just double-checked on A7.1 with the same device:

  • all hosts on a single line results in the same lookup times as on A12
  • lookup times for one host per line have improved

Haven't noticed any particular OOMs or issues with resolving either.

zer0def avatar Aug 15 '22 12:08 zer0def

Hi,

Sorry for the delay, I got a lot on my plate lately (like starting a new job!). Thanks for your tests. I am going to merge it on my side and test it on my device. If everything works fine, it should be part of the next release soon:tm: :wink:

Related issue: #455

PerfectSlayer avatar Aug 28 '22 06:08 PerfectSlayer

Congrats on the new job, hope it treats you at least fairly and decently.

Just for the record, I've picked pageSize a bit arbitrarily, essentially trying to keep the I/O memory buffer low (anywhere below 1MB per flush even with abusively lengthy domain names), so file render times are likely going to be atrocious, but given that it's iterating by record offset, not page number multiplier, one could implement some sort of tweaking I/O performance tracking function to tweak that value as needed.

zer0def avatar Aug 28 '22 07:08 zer0def

Thanks!

I started from your code and made some adjustments to keep the hosts order by names. If there is a redirect, I write the current line of grouped hosts, write the redirection at the expected location, then resume grouping hosts for the following lines.

My main concern is that I failed to test the hosts resolution time improvement. With the default hosts (even with IPv6 enabled), I got a very quick time that is not relevant to show how optimization performs.

# time ping -c1 -w1 abstractedauthority.com
PING abstractedauthority.com (34.160.113.1) 56(84) bytes of data.
64 bytes from 1.113.160.34.bc.googleusercontent.com (34.160.113.1): icmp_seq=1 ttl=116 time=7.45 ms

--- abstractedauthority.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 7.456/7.456/7.456/0.000 ms
    0m00.14s real     0m00.00s user     0m00.02s system

Can you share some public sources to reach the ~70M hosts you use?

PerfectSlayer avatar Aug 29 '22 19:08 PerfectSlayer

Ah, ok, I've misphrased my initial PR comment in that when I used M specifically in relation to the hosts file, I was referring to it's size in MiB. The host list, after deduplication hovers around ~2.5M records (not bytes!) and the device this was tested against is a low-tier device (Redmi 5A/riva, to be precise; yes, I'm aware it's a 2/16 trashphone from 2017).

I'm attaching the exported list of remote sources, backed up directly from my AdAway app instance.

adaway.txt

zer0def avatar Aug 29 '22 19:08 zer0def

Thanks for your hosts!

I can definitely see the difference with and without optimization using your lists but there is two things that bother me.

First, I can't find where does this timing difference comes from? I had a look to the Android source code and hosts file resolution should not be impacted by this optimization (other than having to read the loopback IP more often...): https://cs.android.com/android/platform/superproject/+/master:packages/modules/DnsResolver/getaddrinfo.cpp;drc=344cc11f53c5841aa8719b646894c5e9c22e9640;l=1464

Second, and the worst, ad-blocking fail using a big number of hosts per line (ex 500). On top of the ping command, I also use some Java calls to test to represent application behavior (InetAddress.getByName(hostname) and new URL(hostname).open().connect()) and the redirection to the loopback does no more work.

I might try to find how many host (called alias) I can set per line until it failed but it might be dependent of the Android version, OEM, etc.. 😞

PerfectSlayer avatar Sep 03 '22 13:09 PerfectSlayer

Just re-tested the hosts file on a relatively fresh build of AOSP 13.0.0_r3 and noticed the same thing - every time at least one line in the hosts file exceeds the length of ~6K(i?)B characters, the whole file isn't being considered during lookup at all.

With the max allowed length of an individual label being 63 chars and the max whole record length being 253 chars, I would assume keeping an individual line length under 1KiB to be sensible enough™, making it 4 records per line in the worst case. Will follow up with a complementary fix shortly.

[Edit 2022-09-03 ~7PM UTC] After testing 4 records per line, the same device takes ~5 seconds on each domain lookup, which, while still an improvement, can quickly compound on itself. That effectively forced the total potental line length to become 16 records (assumed max of 4KiB) - every following optimization would require tracking by total substring length, not record count, which is better done during database query time. At 16 records per line, query time takes ~3 seconds.

zer0def avatar Sep 03 '22 17:09 zer0def