mixer-tools

Mixer doesn't support autoproxy.

Open ahkok opened this issue 6 years ago • 22 comments

You could exec curl to get proper autoproxy support for downloads. Or use a curl binding.

ahkok avatar Sep 05 '18 20:09 ahkok

https://github.com/andelf/go-curl

ahkok avatar Sep 05 '18 21:09 ahkok

I'll take a look at it, but I'd like to try to avoid yet another dep if possible.

Do the other tools shipped with clear (e.g. swupd) support it? Since this seems to be exclusive to libcurl, does this imply that all our projects must use it?

rchiossi avatar Sep 05 '18 21:09 rchiossi

There is a dbus interface exposed by the proxy related daemons. Applications using "libproxy" can benefit from it directly. Our libcurl was patched to take advantage of it.

I had some old code from last year to do the DBus call, which might be useful. It does pull in a dbus library as a dependency. I prefer this approach over the libcurl binding, as it is likely a less intrusive change.

package main

import (
        "io"
        "log"
        "net/http"
        "net/url"
        "os"
        "strings"

        "github.com/godbus/dbus"
)

func FindProxyForURLWithPacRunner(req *http.Request) (*url.URL, error) {
        // Don't close conn since this is a shared connection by dbus library.
        conn, err := dbus.SystemBus()
        if err != nil {
                // No system bus available: treat as "no autoproxy", not an error.
                return nil, nil
        }

        var s string
        obj := conn.Object("org.pacrunner", "/org/pacrunner/client")
        err = obj.Call("org.pacrunner.Client.FindProxyForURL", 0, req.URL.String(), req.Host).Store(&s)
        if err != nil {
                return nil, err
        }

        const pacProxyPrefix = "PROXY "
        if !strings.HasPrefix(s, pacProxyPrefix) {
                return nil, nil
        }
        s = strings.TrimPrefix(s, pacProxyPrefix)

        if !strings.HasPrefix(s, "http://") && !strings.HasPrefix(s, "https://") {
                s = "http://" + s
        }
        url, err := url.Parse(s)
        if err == nil {
                log.Printf("Using proxy from AutoProxy: %s", url)
        }
        return url, err
}

func AutoProxy(req *http.Request) (*url.URL, error) {
        url, err := http.ProxyFromEnvironment(req)
        if url != nil || err != nil {
                log.Printf("Using proxy from environment: %s", url)
                return url, err
        }

        // Using PacRunner is an optional fallback, so a failure to use it
        // should not propagate.
        url, _ = FindProxyForURLWithPacRunner(req)
        return url, nil
}

func main() {
        client := &http.Client{
                Transport: &http.Transport{
                        Proxy: AutoProxy,
                },
        }

        resp, err := client.Get("https://download.clearlinux.org/latest")
        if err != nil {
                log.Fatal(err)
        }
        defer resp.Body.Close()

        io.Copy(os.Stdout, resp.Body)
}
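
The snippet above consumes only the first "PROXY " entry of the pacrunner reply. PAC-style replies can list several candidates separated by semicolons, with DIRECT meaning "connect without a proxy". A standalone sketch of a fuller parse (the parsePacReply helper is hypothetical, not part of mixer or of the snippet above):

```go
package main

import (
        "fmt"
        "net/url"
        "strings"
)

// parsePacReply returns the proxies named in a PAC-style reply such as
// "PROXY proxy1:3128; PROXY proxy2:3128; DIRECT", in order.
// A DIRECT entry yields a nil URL, meaning "no proxy".
func parsePacReply(reply string) ([]*url.URL, error) {
        var out []*url.URL
        for _, entry := range strings.Split(reply, ";") {
                entry = strings.TrimSpace(entry)
                switch {
                case entry == "DIRECT":
                        out = append(out, nil)
                case strings.HasPrefix(entry, "PROXY "):
                        host := strings.TrimSpace(strings.TrimPrefix(entry, "PROXY "))
                        if !strings.Contains(host, "://") {
                                host = "http://" + host
                        }
                        u, err := url.Parse(host)
                        if err != nil {
                                return nil, err
                        }
                        out = append(out, u)
                // Other entry types ("SOCKS", "HTTPS") are ignored in this sketch.
                }
        }
        return out, nil
}

func main() {
        proxies, err := parsePacReply("PROXY proxy1:3128; DIRECT")
        if err != nil {
                panic(err)
        }
        fmt.Println(len(proxies), proxies[0])
}
```

With a candidate list like this, the caller can try each proxy in turn and fall back to a direct connection on a nil entry.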

cmarcelo avatar Sep 05 '18 21:09 cmarcelo

It supports autoproxy on my system and others... more info please.

tmarcu avatar Sep 06 '18 15:09 tmarcu

It doesn't, and it's easy to prove that it doesn't.

Steps to reproduce:

  1. open a terminal and type sudo busctl monitor | grep org.pacrunner.client

Note that you'll see messages here whenever the autoproxy is queried for an address.

  2. open a second terminal and disable all manual proxy settings:

unset http_proxy https_proxy no_proxy
sudo systemctl stop redsocks

Then, use e.g. curl to show autoproxy is functional:

curl https://github.com/clearlinux/mixer-tools

In terminal 1, you'll see a pacrunner request appear:

Monitoring bus message stream.
  Sender=:1.84984  Destination=org.pacrunner  Path=/org/pacrunner/client  Interface=org.pacrunner.Client  Member=FindProxyForURL

Now use mixer in terminal 2:

mkdir testmix
cd testmix
mixer init --local-rpms

the end result is:

ERROR: Failed to retrieve latest published upstream version. Missing proxy configuration? : Get https://download.clearlinux.org/latest: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

And, more importantly, no FindProxyForURL request ever arrives at pacrunner over dbus; the output of the busctl command shows that mixer isn't asking pacrunner anything.

ahkok avatar Sep 06 '18 16:09 ahkok

I appreciate the lengthy response :+1: If I understand correctly, making this fail takes 1) manually removing all your proxy settings and 2) stopping a working auto proxy solution, then using another, specific auto proxy implementation that is catered to working only with curl calls. That's purposely removing two (IMHO better) solutions to get this to fail, and it is not a good reason to manually shell out to curl in the code for every single file download; it just failed on the first of many. This is specific to running in an environment that requires manual proxy intervention to do ANY outside network communication, and it *only* works with curl because we manually patch the actual curl package to recognize pacrunner and read/send messages to it over dbus. This is not a general case/solution.

I think a better solution is to just allow users to set a proxy in builder.conf under a [Proxy] section, which can auto-populate .docker/config.json and automatically pass proxy information into the container at call time, so everything just works. That requires no code change to the download paths, just reading a new section.

@cmarcelo's suggestion is viable as well, but I don't particularly want to make this specific to pacrunner and pull in another dependency if it's not needed.

tmarcu avatar Sep 06 '18 17:09 tmarcu

No, I didn't remove any working autoproxy in my replication case.

All I did was replicate a freshly installed clearlinux system where autoproxy is working and where mixer isn't capable of talking to it. That means making sure to disable redsocks and to unset any manually set environment variables, both of which can make golang programs appear to function with autoproxy while actually bypassing it.

ahkok avatar Sep 06 '18 17:09 ahkok

I have no issues with additional ways to manually support proxies.

But all our development tools must be able to work with autoproxy, and they don't. We've patched autospec before to use pycurl, for instance (I did so myself). Now it's time for mixer to do this. People, and customers, are expecting this to work.

ahkok avatar Sep 06 '18 17:09 ahkok

Specifically, people with weird internally proxied network setups who are running only pacrunner and no environment proxies would experience this. Again, the only reason curl works (in Clear) is that curl is hot patched to talk to pacrunner... I'm not disputing that it's broken with only pacrunner running.

There are already 2 other options to get past this issue, so I am not willing to change an entire download API to support a 3rd with shell calls; I think it can be solved in different ways like I explained above. I also think the pacrunner setup is hacky and shouldn't be the default to begin with, but that's another issue :smirk: Waiting to see if Bun's test of my 1st suggestion works and we'll take it from there.

tmarcu avatar Sep 06 '18 17:09 tmarcu

Note that the code snippet I've posted falls back to the previous methods if it can't find / use the autoproxy.

Note you can also call the "proxy" utility, which will talk dbus for you. Not sure it's a real win, though.

cmarcelo avatar Sep 06 '18 17:09 cmarcelo

@tmarcu At present, in order for mixer to work behind a proxy, one must carry out one of these steps manually:

  • set proxy environment variables
  • enable a non-default autoproxy, like redsocks

I think the point of this bug report is that mixer does not work with the default Clear Linux autoproxy services (pacrunner/pacdiscovery), and this leads to a less-than-optimal user experience. The motivation for shipping autoproxy services by default is to enable a working out-of-the-box experience, so users do not have to configure anything proxy-related.

The current implementation is limited to tools/services using libcurl, but the autoproxy was never meant to be limited in this way. IMO, the other ideas being proposed to address this limitation should be considered. I don't think we should settle for requiring manual proxy setup when an autoproxy is readily available to use.

phmccarty avatar Sep 06 '18 18:09 phmccarty

@phmccarty I agree we have a hacky & limited default implementation, and as I said I agree it's an issue, but I don't agree with the suggested fix. It's an ugly band-aid solution.

After testing, setting the proxy URL inside ~/.docker/config.json per the docker documentation automagically passes proxies inside the container. As it stands, the only way for that to be accomplished 100% automatically is to query pacrunner over dbus... I'd rather just document & warn that a proxy must be set inside of that file (or builder.conf which can then add it automatically for the user), and everything else will just work from there as it should. The user already has to manually edit things in the mixer workspace for any mix, so 1 extra line is insignificant, and only required by users running in a locked down intranet that requires proxy intervention.

tmarcu avatar Sep 06 '18 23:09 tmarcu

The issue reports a problem running mixer init, which happens outside a container, so I don't expect Docker to be involved.

After testing, setting the proxy URL inside ~/.docker/config.json per the docker documentation automagically passes proxies inside the container.

One downside is for people moving in and out of a network that needs a proxy: that file would then have to change every time the network setup changes.

I'd assume that once the mixer command can find the right proxy by itself, it would take care of propagating it to the container execution (that is, if the container can't talk to the autoproxy service by itself).

Given the number of "network works but mixer fails" reports we got in the past, I think this issue is worth pursuing, as it would certainly reduce those reports. Improving Autoproxy itself to be transparent to applications is also a good endeavor (something like redsocks), but until that hits the distribution, I suggest taking advantage of the resources we have (the DBus API call or shelling out to the proxy tool; I don't recommend changing the code to use libcurl bindings).
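
A sketch of that propagation: once mixer has resolved a proxy (from the environment or from pacrunner), it could hand it to the container as environment variables on the docker invocation instead of expecting the container to reach the autoproxy service itself. The image name and command below are illustrative, not mixer's actual container entry point:

```go
package main

import (
        "fmt"
        "os/exec"
)

// proxyEnvArgs builds the -e flags that forward a resolved proxy to docker run.
// An empty proxy URL yields no flags, so a direct connection is unaffected.
func proxyEnvArgs(proxyURL string) []string {
        if proxyURL == "" {
                return nil
        }
        return []string{
                "-e", "http_proxy=" + proxyURL,
                "-e", "https_proxy=" + proxyURL,
        }
}

func main() {
        args := append([]string{"run", "--rm"}, proxyEnvArgs("http://proxy.example.com:3128")...)
        args = append(args, "clearlinux/mixer", "mixer", "build", "bundles")
        cmd := exec.Command("docker", args...)
        fmt.Println(cmd.Args) // the command line that would be executed
}
```

This keeps the proxy decision in one place (the host-side mixer process) and makes the container indifferent to how the proxy was discovered.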

cmarcelo avatar Sep 07 '18 00:09 cmarcelo

@cmarcelo You have to expect docker to be involved always now, because it's a default part of the architecture. You're right about the network switches, though; that is an annoying case we have to support. Unfortunately, mixer is not the only tool in the world that doesn't use libcurl, so we're going to hit this again and again at distro scale. I think moving up the stack and fixing the autoproxy implementation itself is a better path forward; @phmccarty is looking into something now after a hallway conversation that may be more generic :)

tmarcu avatar Sep 07 '18 00:09 tmarcu

My point is that even if I put in the docker-specific configuration, the reported command would still fail... I think the issue reported here is legitimate.

I think moving up the stack and fixing the autoproxy implementation itself is a better path forward, @phmccarty is looking into something now after a hallway conversation that may be more generic :)

OK :-) Looking forward to knowing what the fix is.

cmarcelo avatar Sep 07 '18 00:09 cmarcelo

@cmarcelo I believe the biggest concern here is just turning the "network works but mixer fails" problem into "mixer init works but mixer build fails" errors. Using your sample does fix the issue with mixer init, but unless you propagate that to docker, you are just turning one problem into another. At the end of the day, I do agree that this is a legitimate issue; we just need to figure out a solution that will work for the bigger picture.

rchiossi avatar Sep 07 '18 11:09 rchiossi

@rchiossi thanks for the clarification. I'm aware the issue is not simple, hence my previous comment :-)

I'd assume that once the mixer command can find the right proxy by itself, it would take care of propagating it to the container execution (that is, if the container can't talk to the autoproxy service by itself).

cmarcelo avatar Sep 07 '18 16:09 cmarcelo

Ideally, docker doesn't go out on the network and has all the data it needs. Making docker go on a public network defies a large part of the reason why you should use docker.

Can we avoid this?

I find the suggestion that we could pass a proxy host to the container actually pretty elegant.

ahkok avatar Sep 07 '18 17:09 ahkok

Can we avoid this?

No; by its core definition it has to go over the network on every single build to pull RPMs down, unless you happen to be the very rare use case that has cloned all RPMs locally so you can build completely offline. These cases exist and are supported (hence the --offline flag), but they are far from the norm, and those users do it because of other network-related issues where a full mirror is actually more feasible for them.

However I think this issue will keep popping up in other places, and I would really like to improve the base autoproxy solution to whack all these kinds of bugs out of Clear.

tmarcu avatar Sep 07 '18 22:09 tmarcu

I think this is a core design problem when docker was implemented. The container shouldn't have to go out over the network. It should purely exist to 'do the root bits' in a safe environment, and it should just be given all the assets that it needs to do the job, so it can function without networking entirely.

Maybe it's overly aggressive, but if you do root bits, and you do fetch internet bits at the same time, you're not doing separation properly imho....

ahkok avatar Sep 13 '18 21:09 ahkok

I totally agree with you. From a security perspective, this is very wrong. I think the problem is that the reason docker is used in mixer is a bit misleading. As I understand it, docker was never added in order to run the root bits, but rather to simulate an environment with the correct tooling for a particular update format, given that a mixer release cannot generate updates for formats older than the one it was built for.

rchiossi avatar Sep 14 '18 16:09 rchiossi

As I understand, it was never added in order to run the root bits, but rather to simulate an environment with the correct tooling for a particular update format

Actually, running as non-root was specifically required as part of a security review. Docker was meant to originally solve that problem as well as solve the format bump issue. Two birds with one stone sort of thing.

I don't like using docker, as I've said before, but I'm afraid we are stuck with it due to functional requirements (format bumps) and policy requirements (non-root builds).

matthewrsj avatar Sep 14 '18 16:09 matthewrsj