ungoogled-chromium

The future of content filtering (declarativeNetRequest, Manifest v3, and beyond)

Open joey04 opened this issue 5 years ago • 128 comments

This is both an FYI and some follow-on questions.

The FYI -- For those who haven't yet heard, Google is planning a severe reduction in extension capability for blockers and other content filters. Their current draft of changes includes deprecating the webRequest API and replacing it with declarativeNetRequest. This will only permit a fixed number of purely declarative rules in a single JSON list. Moreover, the browser will have final say on what requests get blocked or redirected.
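
To illustrate what "purely declarative" means, here is a minimal sketch in Python that builds a ruleset of the general shape used by declarativeNetRequest (an `id`, an `action`, and a `condition` per rule) and serializes it to one JSON list. The field names follow the declarativeNetRequest documentation as I understand it, and the draft may differ. The hostnames, the `make_block_rule` and `build_ruleset` helpers, and the `MAX_RULES` constant are hypothetical; the 30,000 figure is the draft limit discussed further down in this thread.

```python
import json

# Assumption: the draft cap discussed in this thread (30K rules).
MAX_RULES = 30_000


def make_block_rule(rule_id: int, url_filter: str, resource_types: list[str]) -> dict:
    """Build one declarative 'block' rule (hypothetical helper)."""
    return {
        "id": rule_id,
        "priority": 1,
        "action": {"type": "block"},
        "condition": {"urlFilter": url_filter, "resourceTypes": resource_types},
    }


def build_ruleset(filters: list[str]) -> list[dict]:
    """Turn URL filters into a rule list, refusing to exceed the cap."""
    if len(filters) > MAX_RULES:
        raise ValueError(f"{len(filters)} rules exceed the cap of {MAX_RULES}")
    return [
        make_block_rule(i, f, ["script", "image"])
        for i, f in enumerate(filters, start=1)
    ]


# Hypothetical hostnames, written in Adblock-style "||host^" filter syntax.
ruleset = build_ruleset(["||ads.example.com^", "||tracker.example.net^"])
rules_json = json.dumps(ruleset, indent=2)
print(rules_json)
```

The point of the sketch is that the extension only hands over data. There is no callback in which it could run its own logic per request, and the browser decides how the rules are applied.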

For more info, see this uBlock Origin thread. It has all of the relevant links.

I have two follow-on questions:

  1. Is there any interest from the developers here to integrate a filtering engine? I realize this is a big request, but the new API will eliminate the possibility of uBlock Origin, uMatrix, and other robust extensions. Since this project recommends those on the wiki, it will impact this project and its users. My idea is to modify Brave's engine, which is already integrated but has only AdBlock Plus level functionality.

  2. If not, can you recommend how to undertake such an effort? I have experience working with large C++ codebases, but nothing browser-related. Any tips, gotchas, etc. would be appreciated.

joey04 avatar Jan 24 '19 19:01 joey04

@virtadpt asked something similar on Gitter:

Given that Chrome and seemingly Chromium-mainline are going to lose the use of the webRequest API, which is going to break many ad-blocking extensions (https://www.howtogeek.com/fyi/chrome-may-get-faster-ad-blocking-while-breaking-ublock-origin/), are there plans to keep it in Ungoogled Chromium? Or would this be technically infeasible due to ongoing maintenance concerns?

My response:

Google has stated that their draft is subject to change, so I won't plan anything unless it is certain to become a reality.

That being said, I have nothing against such patch as long as it's implemented well. If the patch becomes too difficult to maintain, we could look into other options like Brave's native blocking implementation. We will discuss this on GitHub Issues if it ever gets to that point.

Eloston avatar Jan 24 '19 21:01 Eloston

I advise planning for this sooner rather than later. I have little hope that Google will make any serious concessions on this matter. (Folks can read the thread I linked to, the threads on Reddit, etc. to understand why.)

I have several points to cover in this post, but first a little about me since I'm new here.

I'm a Firefox guy who has never built a browser from source. I've been using Chrome on Windows and Chromium on Linux for a few years as a supplement, but I'm much more familiar with the Mozilla way of doing things. As for content filtering, I've twice forked uBlock Origin (uBO), first for XUL (aka Firefox legacy) and now for WebExtensions. It's been a while since I've used regular uBO :)

I obviously know a lot more about filtering than modifying Chromium. But I've now spent several hours looking into code possibilities. From the readme here I found Bromite with its integrated blocker inherited from the defunct NoChromo code. Csagan5 has done some nice refactoring work, including moving the code from the net component to the chrome component. However, he still has the very simplistic hard-coded header file of 50K+ rules, apparently translated via script from EasyList. (Upon seeing this, I couldn't help but laugh at the irony of Google going this direction with extensions being limited to a single json file of rules.)

On the other hand, Brave has a full-fledged filtering engine (though not robust enough for me) but with a heavyweight integration into Chromium. (Understandably, given their grand ambitions of an entirely new online advertising model, along with other features they've added.)

For me, assuming my big learning curve and setup of a build environment goes well, I'll probably go with a more Bromite way of doing things to start. Eventually I'd incorporate more robust capabilities, possibly borrowing from the Brave code. Lots of details TBD, of course, but I mention it as food for thought here.


For those reading this and fretting about their browsing future, I too have some anxiety about the end of robust extensions. (Not just in Chrome; I would expect Mozilla to follow suit eventually, which is another reason I'm here.) It'll be at least a year before the current extension API is phased out of Chromium, so there's plenty of time to prepare. While there won't be any options as desirable as the robust extensions we're accustomed to, we won't be restricted to just a Google or Mozilla browser with crappy filter extensions.

I advise not putting all your eggs in one basket. A custom Chromium appears to be the best choice to transition to for my primary browser, given its best-possible web compatibility. But another possibility is Firefox forks. I used Pale Moon for several years and can attest to the quality work of Mark Straver and his helpers. He has a clear vision of what he wants his browser to be and has done a good job making it happen. There are inherent drawbacks, unfortunately, like web compatibility, but it's still worth considering, especially since it will always maintain robust XUL extension support. (It'll be easy for me to revive my original uBO fork.)

I expect I'll continue to have a multi-browser setup. It's TBD what will be primary (for the lion's share of browsing), but that's what I intend to figure out in the coming months. Worst-case, I can use regular Chromium with a very limited v3 extension as a supplemental browser starting next year. Like I said before, there are a number of options worth considering.

joey04 avatar Jan 25 '19 23:01 joey04

I'll probably go with a more Bromite way of doing things to start. Eventually I'd incorporate more robust capabilities, possibly borrowing from the Brave code. Lots of details TBD, of course, but I mention it as food for thought here.

Where could I follow your progress on a new Chromium fork?

As you know from the other Nano thread, I'm really interested in a new Chromium fork that would stay backward-compatible with Manifest V2 but would also be compatible with the future Manifest V3.

Regards :octocat:

mikhoul avatar Jan 26 '19 15:01 mikhoul

Where could I follow your progress on a new Chromium fork?

Nowhere. This is a preliminary exploratory time. I can't guarantee I will do anything on Chromium. (The way I wrote it yesterday didn't properly convey that. Sorry about that.)

As I wrote yesterday, I've never built a browser. Chromium is a massive codebase with millions of lines of code. And one very important item I didn't mention before is that I would be purchasing a new machine to do this work.

My current PC is a Windows 8.1 with hyper-threaded dual-core and 8 GB of RAM. So it's capable of building Chromium, but doing so would abuse the hell out of my hardware. I really like my highly-customized Windows setup, in large part because 8.1 is the last version I can control. In fact, I've been intending for several years to transition to a Linux distro for my next PC because I hate Windows 10. But I'm in no hurry to do this and would prefer to continue using my current PC indefinitely. Running it hot for several hours (at least) for an initial build of a Chromium pull is the last thing I'll be doing to it.

At this point, undertaking a very time-consuming Chromium modification project is literally a thousand times more involved for me compared to simply installing and configuring one of Straver's browsers (Pale Moon or Basilisk) as a backup option to my current primary browser of Firefox ESR. (I probably won't do that just yet, because I'm content with my ESR setup, but it wouldn't take me more than a few hours to have a Basilisk with Classic Theme Restorer configured to my liking.)

I mention all of this for context. I'm still interested in modifying Chromium because it will have the best web compatibility in the long run. But, practically-speaking, so long as Firefox or Straver's forks are a viable primary browser for me, I would prefer to stick with them. There are a number of things I prefer about the Firefox way, including privacy settings and the quality of Gecko rendering, compared to Chromium.

Finally, I'll add that while I suspect Mozilla will eventually conform to Google's V3 API at some point, that wouldn't be for at least another year. So there's no urgency for me here.

joey04 avatar Jan 26 '19 19:01 joey04

There's one more point I'd like to make. Please don't think I'm a Mozilla advocate here. That's not my intention.

In fact, I don't like the Mozilla organization. Nor do I trust them to make good or even ethical decisions. But, as I've stated, its ESR browser is my best choice as primary right now.

I must state that my ESR setup is highly customized to the point that the browser does not emit a single packet unless I expressly want it to. (I've verified it with Wireshark.) This took a number of hours to do, using the thorough documentation of the ghacks-user project to disable a bunch of settings. I honestly consider it an "UnMozillad Firefox" :)

It helps a lot that all of my extensions are either my own creation or modified by me. So they're all unsigned, of course, and I had to fully disable add-on updates to not have the browser ping Mozilla's add-on servers.

One advantage of Straver's forks is that I got this same level of control much easier. He doesn't do any shady crap like Mozilla. Seriously, default Firefox settings result in almost Chrome level of phoning home.

(No way I can completely UnGoogle my Chrome install on Windows, of course, but I have done so to the extents possible in settings. As a supplemental browser, I only use Chrome for a few websites, so I'm okay with it.)

joey04 avatar Jan 26 '19 20:01 joey04

It just occurred to me, perhaps there's a way to do Chromium development without need of my own build machine.

Ideally, I could just pull the code here on GitHub and run a Travis build. But I don't know much about the particulars.

Question: Does anyone know if this is feasible? Or is there another online service I could use?

joey04 avatar Jan 27 '19 01:01 joey04

Thanks for describing in depth your problem and motivations. I believe I understand the situation better now.

We would all appreciate new feature development, even if it isn't immediately useful, or even if it doesn't end up useful at all. While I can't guarantee that I will personally contribute code, I would at least like to provide guidance and support for you and this feature.

It looks like you have a vision of what you want this feature to become, so I won't push back with feature requirements or the sort at this time. I think this'll be a great way to let you familiarize yourself with the Chromium codebase, and allow us to form better ideas based on actual work.

You said you've worked on large codebases before, so here are some brief tips I can give (in no particular order of importance):

  • Take a look through Google's design docs on chromium.org and in the Chromium source tree under docs/. At least skim them to get an idea of what's there and to set your bearings straight while in the code.
  • Have a look through ungoogled-chromium's patches to get an idea of how certain components of Chromium work.
  • Try reading through the code mentioned by the design docs or patches. Especially code comments, like those in the header files. Reading helped me get an intuitive "feel" for Chromium's code style and structure.
  • There are several major differences between ungoogled-chromium's build process and Chromium's build process. There are several reasons for doing so, but they should make more sense as you go through them.

I realize that working on Chromium, or any new large codebase, is a daunting task. But like any other large codebase, no one person will know how all of Chromium works, or even what all the individual components are. Thankfully, there is a nice coherent structure and style throughout that makes reading Chromium code a lot easier, even satisfying at times. Reading the design docs covering overall motivations, structures, and processes helped quite a bit too (although they are usually a bit outdated). While the amount of code you will need to look through will not be insignificant, it shouldn't be too overwhelming either.

To keep track of your work, I think it is best to start a fork; perhaps we could maintain a Pull Request here to increase visibility.

Finally, to wrap up with a question you had:

Question: Does anyone know if this is feasible? Or is there another online service I could use?

Have a look through #17. Basically, Travis won't work. You could try OpenSUSE Build Service or maybe something like an Amazon EC2 instance.

Eloston avatar Jan 29 '19 06:01 Eloston

@Eloston thanks for your thoughtful reply. I appreciate your advice. In fact, I suggest adding it to your developing doc.

I read several of the Chromium docs last week when I first looked into this. I was pleasantly surprised how helpful they are. It was a good first impression.

But, all things considered, I probably won't be interested in using a modified Chromium until uBO becomes unviable in Chrome. I'm guessing that will be in the next 18 months, depending on Google's TBA timeframe for phasing out webRequest.

This doesn't mean I'm not interested in exploring the integration of a custom filter engine. I'm actually going to continue looking into that in the near future. I already have very clear notions of what a custom filter engine should be; for me, that's the easy part because I've tailored my current uBO fork to be just that. Of course there are other essential considerations, which is what I'll be focusing on as I look at relevant portions of the code. But without a build machine of my own or a pressing need to make these changes, it'll just be a learning exercise for now.

joey04 avatar Jan 30 '19 00:01 joey04

FYI the chromium build requirements are just crazy. If you want to hack on the chromium source, you really need a beefy machine with a lot of ram, the more cores the better. On my old 8-core-8-thread-32G system it takes about 12 hours to build, and I have to turn off debug symbols or else I run out of ram. On my new 16-core-32-thread-64G system, it takes two or three hours without symbols and three or four with debug symbols.

On the relatively rare occasions I do chromium-related development I usually am worried about packaging for Gentoo, not chromium itself, per-se. Which means, my workflow is quite different from the standard one. But the standard workflow, which I do also have limited experience with, is considerably more, not less, resource intensive.

My point is, you are going to need some horse-power, probably more than you'd like to pay Amazon for, unless you're made of money. You might look into whether the gcc compute cluster, a free resource hosted by FSF France for open-source projects needing access to a bunch of heterogeneous computer systems, could help you.

There are some bureaucratic hurdles to getting access (human review is involved) but they deliver a fair amount of compute for free. If you go this route, please mind your resource consumption; those are shared compute resources like in the bad old days, and there is an assumption that folks will not gobble up all the resources on those machines, so you need to play nice.

For the record, I landed here because I share your concerns about this manifest v3 business, and have come to the same (or worse) conclusions as you about the likely outcomes. In my (hopefully paranoid) imagination, their incentive is to create as much "counter-fud," to coin a phrase, as possible, essentially stringing everyone along with vague hopes that the inconceivable will not happen until it's too late, and coordinated code-drops and a simultaneous web-store revamp drop this on the unsuspecting public like a bomb.

Hopefully I'm wrong. I definitely have no insider knowledge and only a very cursory understanding of the technical stuff involved. But I'm not optimistic at all, personally -- it seems to me they have decided that, with the public increasingly concerned about privacy, now is the time to monetize their investment in web-client software, leveraging it as a means to limit the threat of the privacy arms race to their future revenue.

gmt avatar Feb 02 '19 12:02 gmt

FYI the chromium build requirements are just crazy. If you want to hack on the chromium source, you really need a beefy machine with a lot of ram, the more cores the better. On my old 8-core-8-thread-32G system it takes about 12 hours to build, and I have to turn off debug symbols or else I run out of ram. On my new 16-core-32-thread-64G system, it takes two or three hours without symbols and three or four with debug symbols.

I'm not sure what your setup is like, but that sounds excessive. I have been using a laptop for all my recent ungoogled-chromium builds; a Skylake i5 (dual core, hyper-threaded) with 16 GB of RAM. I can put the entire build tree on tmpfs (no swap), and it compiles in roughly 3-6 hours (time varies due to the particular build configuration and Chromium version). For my Debian builds, the entire build tree takes somewhere between 8 GB to 10 GB, and the compiler and linker are just fine with the remaining amount of RAM (though, I do have to log out of GNOME 3 to prevent an OOM issue during linking).

EDIT: I should note I usually build in release mode, which is why I'm able to fit the build tree into tmpfs.

Eloston avatar Feb 02 '19 18:02 Eloston

My old workstation is pretty darn slow (bulldozer) and largely built with -Og -ggdb3, which leads to high build-time memory demands. Also maybe I'm exaggerating the numbers a little bit... definitely in that ballpark but I never measured scientifically.

Btw I've noticed for my chromium builds ccache always got huge "hit" percentages compared to other projects I've tried it with, when I activate it. This can be a lifesaver under the right circumstances (or lead to segfaults in the wrong ones).

gmt avatar Feb 05 '19 00:02 gmt

There's way too much off-topic conversation here. @Eloston can you hide all these comments (mostly about hardware "needed" to build Chromium) using GitHub's new feature (like this one) and if possible make an official comment, even if it's just:

Right now, there are no concrete plans related to this issue.

pipboy96 avatar Mar 17 '19 04:03 pipboy96

I've always wanted a browser with uMatrix integrated by default. I wonder if ungoogled-chromium could be that browser? It would probably require a change of policy from just being a simple Chromium fork with the spying stuff removed, to being its own browser. But Google and Mozilla are falling deep into the censorship rabbit hole (see the Dissenter extension removal), so we can't rely on them for long. Expect extensions to become more and more gimped as time goes on.

What do you think?

SuperRobinHood avatar Apr 22 '19 06:04 SuperRobinHood

@SuperRobinHood I also think baseline stuff such as blocking and HTTPS Everywhere could be simply integrated into the browser.

pipboy96 avatar Apr 22 '19 06:04 pipboy96

https://groups.google.com/a/chromium.org/forum/m/#!topic/chromium-extensions/veJy9uAwS00/discussion

I've started following this situation, and it appears that Google has expressed no intention of lifting the 30k limit on filters, according to the active thread linked above. The other changes they plan will force list updates to ship as whole extension updates, which, given ungoogled-chromium's divorce from the extension update infrastructure, would break a significant amount of current adblock functionality. The train is going full speed ahead toward destroying uBlock Origin, with no sign of change in the future. If anything, they seem explicitly dedicated to destroying uBO, given the way Google staff keep making excuses for breaking its functionality.

I just want to bring this to awareness of the development team, and others. I will keep a close eye on progress of this situation and try to learn more about the project and about the browser's development as well.

Thank you.

Technically-Alexander avatar May 30 '19 06:05 Technically-Alexander

@Eloston Need your opinion since the situation has changed.

pipboy96 avatar May 30 '19 08:05 pipboy96

edited, wrong link https://groups.google.com/a/chromium.org/forum/m/#!topic/chromium-extensions/WcZ42Iqon_M

Sorry, posting from mobile has been giving me trouble, so I may send too many notifications.

"""Increased Ruleset Size: We will raise the rule limit from the draft 30K value. However, an upper limit is still necessary to ensure performance for users. Block lists have tended to be “push-only”, where new rules are added but obsolete rules are rarely, if ever, removed (external research has shown that 90% of EasyList blocking rules provided no benefit in common blocking scenarios). Having this list continue to grow unbounded is problematic."""

On February 15th, 2019, a Google Chromium dev posted this, stating that they are going ahead with most of their plans. They may implement Google-approved top domains for remote requests and specific types of requests, but in this quote they specifically express an intention to bring an end to EasyList and its additional filters, as they consider it "problematic".

This is Google's response to the endless stream of complaints in the original thread posted in my previous comment.
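
As a rough way to see how a given list measures up against such a cap, the sketch below counts the network (request-blocking) rules in an Adblock-style filter list. It assumes the common EasyList conventions (`!` comments, a bracketed header line, `##` / `#@#` / `#?#` cosmetic filters, `@@` exceptions) and uses a crude heuristic rather than a real parser. The sample list and the `count_network_rules` name are made up, and the 30,000 constant is the draft value quoted above.

```python
DRAFT_RULE_CAP = 30_000  # draft value quoted in this thread


def count_network_rules(filter_text: str) -> int:
    """Count lines that look like network filters in Adblock-style syntax."""
    count = 0
    for raw in filter_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("!"):
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            continue  # header such as "[Adblock Plus 2.0]"
        if "##" in line or "#@#" in line or "#?#" in line:
            continue  # cosmetic (element-hiding) filter, not a request rule
        count += 1  # everything else, including "@@" exception rules
    return count


# Hypothetical sample list; none of these hostnames are real rules.
sample = """[Adblock Plus 2.0]
! Title: hypothetical sample list
||ads.example.com^
||tracker.example.net^$third-party
@@||cdn.example.org^$script
example.com##.ad-banner
"""

n = count_network_rules(sample)
print(f"{n} network rules, {n / DRAFT_RULE_CAP:.4%} of the draft cap")
```

Running the same function over the lists you actually subscribe to gives a quick idea of how far over (or under) the cap a typical setup would be.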

Technically-Alexander avatar May 30 '19 08:05 Technically-Alexander

In that case, we should find a solution that allows us to keep using uBlock Origin with full functionality. My question is: What kind of options do we have? Should we remove the 30k limit (is that even a good idea)? Is it feasible to adopt Firefox's webRequest API (which has advantages mentioned here)? I'm open to suggestions.

Eloston avatar Jun 04 '19 05:06 Eloston

@Eloston Maybe implement basic blocking as part of our patches? So even if uBO is discontinued we can still use the lists.

pipboy96 avatar Jun 04 '19 13:06 pipboy96

image

https://twitter.com/BrendanEich/status/1134141335881912320

image

https://old.reddit.com/r/brave_browser/comments/buhq20/chrome_to_limit_full_ad_blocking_extensions_to/epdmuk5/

Brendan Eich is an American technologist and creator of the JavaScript programming language. He co-founded the Mozilla project, the Mozilla Foundation and now he is the CEO of Brave Software.

@Eloston I'm pretty sure that's the way to go: just base Ungoogled Chromium on Brave instead of Chromium. This way you will benefit from the work of Brave's developers and can focus on the things that make you different from Brave/Chrome, i.e. the things that make you unique.

Regards :octocat:

mikhoul avatar Jun 04 '19 13:06 mikhoul

@mikhoul That's a good suggestion, but I don't want to rush in until we see how they implement it. Not only that, but rebasing ungoogled-chromium on Brave means we implicitly include all the kinds of changes that fit their goals and agendas; I don't know Brave well enough to be comfortable depending on them. For now, I'd rather just cherry-pick changes from them like we've been doing with other projects.

Eloston avatar Jun 06 '19 03:06 Eloston

Off-topic just to clarify (perhaps worth documenting?):

I also think baseline stuff such as blocking and HTTPS Everywhere could be simply integrated into the browser.

I notice that, unlike Firefox, recent Chrome versions default to HTTPS when you type a URL. So in regards to that there is no need for HTTPS Everywhere at all (which itself has anti-privacy implications, just like any extension that requires list updates from remote hosts, however promising those hosts may pretend to be).

Also if you use uMatrix with setting "Forbid mixed content" you are protected from the non-obvious insecure HTTP requests.

To harden this even further you can use flags:

chrome://flags/#enable-mark-http-as = Enabled (mark as actively dangerous)
chrome://flags/#enforce-tls13-downgrade = Enabled
chrome://flags/#disallow-unsafe-http-downloads = Enabled

emanruse avatar Jun 06 '19 17:06 emanruse

@Eloston

Why don't you cooperate with @gorhill and make your browser do what uMatrix does without requiring extensions? I have always wanted a browser which would allow me to block 3rd party requests etc. and enable them selectively. If it is built-in rather than an extension, it can be very fast and optimized.
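
For readers unfamiliar with the idea, the kind of decision being asked for can be sketched as a small rule lookup: block third-party requests by default and allow specific (source, destination, type) combinations. This is only an illustration in Python, not uMatrix's actual code or rule format. `base_domain` is a naive stand-in (a real implementation would consult the Public Suffix List), and all hostnames and function names are hypothetical.

```python
def base_domain(hostname: str) -> str:
    """Naive: keep the last two labels. Wrong for suffixes like co.uk."""
    return ".".join(hostname.lower().split(".")[-2:])


def is_third_party(page_host: str, request_host: str) -> bool:
    """A request is third-party when the base domains differ."""
    return base_domain(page_host) != base_domain(request_host)


# Explicit allow rules: (page base domain, request base domain, request type).
# "*" as the type allows every request type for that pair.
ALLOW_RULES = {
    ("example.com", "examplecdn.net", "script"),
    ("example.com", "example.org", "*"),
}


def should_block(page_host: str, request_host: str, request_type: str) -> bool:
    """Default-deny third-party requests unless a rule allows them."""
    if not is_third_party(page_host, request_host):
        return False  # first-party requests always pass in this sketch
    src, dst = base_domain(page_host), base_domain(request_host)
    allowed = (src, dst, request_type) in ALLOW_RULES or (src, dst, "*") in ALLOW_RULES
    return not allowed


print(should_block("www.example.com", "static.example.com", "image"))  # first-party
print(should_block("www.example.com", "js.examplecdn.net", "script"))  # allowed pair
print(should_block("www.example.com", "js.examplecdn.net", "image"))   # type not allowed
print(should_block("www.example.com", "ads.tracker.test", "script"))   # no rule
```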

emanruse avatar Jun 06 '19 17:06 emanruse

@emanruse Not quite true, there are cases when simply doing s/^http:/https:/ causes redirect to wrong/broken page.
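
To make that concrete: the substitution above amounts to the Python one-liner below (the function name is hypothetical). It only rewrites the scheme, so it has no way of knowing whether the HTTPS version of a site serves the same content, lives on a different host, or exists at all.

```python
import re


def naive_https_upgrade(url: str) -> str:
    """Apply s/^http:/https:/ and nothing else."""
    return re.sub(r"^http:", "https:", url)


print(naive_https_upgrade("http://example.com/page"))
print(naive_https_upgrade("https://example.com/page"))  # already HTTPS, unchanged
print(naive_https_upgrade("ftp://example.com/file"))    # other scheme, unchanged
```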

pipboy96 avatar Jun 06 '19 17:06 pipboy96

@Eloston

Please keep in mind that uBlock Origin is not the only extension affected by these changes. There are many extensions which rely on the blocking and modifying abilities of the webRequest API and a lot of them will have their functionality reduced and some will not be able to exist anymore because of these changes.

The main aim here should be to preserve the webRequest API to its full extent. Expanding the API or adopting Firefox's version would be an improvement on Chrome's webRequest API but depends on whether you or someone else could implement them.

I don't think it's a good idea to base on Brave either. At least not yet.

xEIkiFo avatar Jun 06 '19 17:06 xEIkiFo

@emanruse Not quite true, there are cases when simply doing s/^http:/https:/ causes redirect to wrong/broken page.

Which part exactly is not true? And where are you doing this replacement?

emanruse avatar Jun 06 '19 18:06 emanruse

@emanruse

So in regards to that there is no need for HTTPS Everywhere at all

pipboy96 avatar Jun 06 '19 18:06 pipboy96

@pipboy96 I don't know what you are talking about. In any case this is off-topic, so perhaps open a separate issue to discuss there.

emanruse avatar Jun 06 '19 18:06 emanruse

The main aim here should be to preserve the webRequest API to its full extent

Amen! Also keep the ability of extensions like userscript/userstyle managers to download external resources (userscripts & userstyles).

Background pages will also be deprecated.

mikhoul avatar Jun 06 '19 18:06 mikhoul

@mikhoul A side effect of background page deprecation is that <script type="module"> will no longer work.

pipboy96 avatar Jun 06 '19 18:06 pipboy96