sourced-ce
Installation Error message (inside corporate firewall)
Here is the error message:
```
error while getting docker-compose container: error downloading https://github.com/docker/compose/releases/download/1.24.0/run.sh: Get https://github.com/docker/compose/releases/download/1.24.0/run.sh: dial tcp: lookup github.com on 171.172.3.251:53: no such host
```
Comments from @creachadair on this topic:
> It makes me realize, though, that we should make it possible to build our stuff without having to pierce the firewall. I think right now they have to import fully-built images, because building would require them to fetch stuff from GH. I was thinking they could just build everything inside, but our build scripts have a bunch of special cases. At least for Babelfish, and I suspect other teams too. Fixable, but that's the current state and it's probably worth looking into.
In this case the error is happening because the system does not have `docker-compose` installed, and `sourced-ce` tries to install a containerized alternative.
The solution is to simply install `docker-compose` following the instructions here: https://docs.docker.com/compose/install/
But `sourced-ce` will fail again when it tries to download the `docker-compose.yml` file from this repository. If opening the firewall is not an option, the user will have to download it manually:
- Somehow download https://raw.githubusercontent.com/src-d/sourced-ce/v0.15.1/docker-compose.yml
- Place it in `~/.sourced/compose-files/v0.15.1/docker-compose.yml`
- Enable it with `sourced compose set v0.15.1`
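For illustration, on a machine that can reach GitHub the manual steps above might look like this (a sketch; v0.15.1 is the release used as the example in this thread):

```sh
# Fetch the compose file for the matching release and put it where
# sourced-ce expects it.
mkdir -p ~/.sourced/compose-files/v0.15.1
curl -fsSL -o ~/.sourced/compose-files/v0.15.1/docker-compose.yml \
  https://raw.githubusercontent.com/src-d/sourced-ce/v0.15.1/docker-compose.yml

# Mark that version as the one to use.
sourced compose set v0.15.1
```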
Of course, without access to github, `sourced-ce` will not be able to download any organization metadata. Only `sourced init local` will work; `sourced init orgs` will be pointless.
@carlosms @vcoisne Do you think we should document something like "How to use source{d} offline?", explaining what you wrote here?
I think it could be a recurring problem in some scenarios (if I'm not wrong, there was no regular Internet access at other enterprises we tried).
I'd strongly prefer not to, to be honest. We use docker images that need to be downloaded, and the main use case is going to be downloading git repositories and metadata...
If there are firewalls in place, our error messages are clear enough to understand which URLs we are trying to access.
What about @creachadair's comment? Should we create a separate issue to discuss alternatives for setup behind firewalls?
Ok, let's create an issue (in backlog? feature-idea?) if you think this is a use case we will need to support.
There are some things we could do in `sourced-ce` to avoid downloads (like embedding file blobs in the binary and creating the necessary files without internet access).
But to be honest, building the whole `sourced-ce` with its dependencies without internet access is something I would not know where to start with. For example, to build the `sourced-ui` image we download the base docker images, python packages, npm packages, debian packages...
I think for this case, operating completely locally would be OK: The goal here is to test CE inside the corp firewall, where direct GitHub access is not possible—and as I understand it, he only needs to target local repositories. The problem, though, is that he wasn't able to start up the tool even for local repos, because of missing dependencies.
It sounds like fetching more containers explicitly might help in this case. More generally, I feel like we probably should have a way to do completely-local testing after some initial setup. It's fine if the user has to fetch some stuff (or maybe we give them a script to do it), but I think we ought to have some point after which local access should "just work" without further chatting to GH.
cc @rpau @eiso What do you think?
To be clear, I'm against adding any possible deployment mode by default to the docs, without further discussion on what we really want to support and to what extent. But I'm not against supporting this use case specifically, if we think it's needed.
As it is, the code will work locally with the differences mentioned above:
- `docker-compose` is not optional; it's a requirement.
- The `docker-compose.yml` file has to be placed manually in `~/.sourced/...`, and enabled with `sourced compose set`.
- The docker images need to be available.
After this is done, either manually or by running `sourced init local` once with internet access, the user can run `init local` offline from that point.
(This is off the top of my head, somebody please correct me if I'm wrong!)
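For the third point, one conceivable side channel for getting the images inside the firewall (an assumption on my part; the thread only says they were imported by separate arrangement) is `docker save`/`docker load`:

```sh
# On a machine with registry access: pull everything the compose file
# references, then export the images to a tarball.
docker-compose -f docker-compose.yml pull
docker save -o sourced-images.tar srcd/sourced-ui  # placeholder; list every image named in the yml

# On the firewalled host: import the tarball into the local Docker daemon.
docker load -i sourced-images.tar
```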
Now, the question is: if we want to support it, to what extent? Should this be just an entry in the FAQ, and we consider it a workaround? Or should this be a first-class use case?
If it's the latter, it may have some implications beyond the docs. For example, we may want to add tests for it.
Also, as I commented, there are ways to embed the `docker-compose.yml` file into the released `sourced-ce` binary, to avoid the download.
As I understand it, he was using docker-compose, had fetched the images locally (by separate arrangement), and had set up the `docker-compose.yml` as you describe. I believe the trouble was not in fetching the images, but that some part of the startup tried to grab a copy of `run.sh` directly from GitHub.
Unfortunately I don't have all the details of the transaction, but broadly the issue is not that he wants to avoid using Docker at all, but that pulling in data from outside the corp network requires special permissions (and I believe he had to fetch and/or build the images manually via some other side channel and import them). Once all those pieces are in place, it would be good if CE could be started up without additional pulls from remote sources.
Edit: Also, I have in my notes that he tried fetching `run.sh` separately, but could not figure out where to put it so that it would work correctly with the other files.
> Once all those pieces are in place, it would be good if CE could be started up without additional pulls from remote sources.

This is already the case if the requirements I described above are met. Most probably their system did not have `docker-compose` installed. Following the official Docker docs, `docker` and `docker-compose` are installed separately.
The `run.sh` file is an alternative we use when `sourced-ce` fails to find the `docker-compose` binary installed. This is why we say in our docs that `docker-compose` is an optional dependency, and why they probably didn't install it in their system.
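(A quick way for a user to confirm which path will be taken; if this prints a version, `sourced-ce` finds the local binary and never tries to fetch `run.sh`:)

```sh
# Verify the docker-compose binary is on PATH.
docker-compose --version
```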
> The `run.sh` file is an alternative we use when `sourced-ce` fails to find the `docker-compose` binary installed. This is why we say in our docs that `docker-compose` is an optional dependency, and why they probably didn't install it in their system.
Aha, thanks for the clarification. I thought this had come up while running `docker-compose`. So if he installs the `docker-compose` tool (and has the other files in place as described), it should be self-contained?
`sourced-ce` makes calls to GitHub to check for a new version on every run, but it shouldn't fail if GH is unavailable; it will just print a warning.
> So if he installs the `docker-compose` tool (and has the other files in place as described), it should be self-contained?
Yes.
> `sourced-ce` makes calls to GitHub to check for a new version on every run, but it shouldn't fail if GH is unavailable; it will just print a warning.
:+1: I've checked the code just in case. We don't even print a warning, just ignore it silently.
About the error getting `run.sh`, which caused this issue, I think it could be more user-friendly, and point to proper docs about what was needed, why, and how to solve it.
> About the error getting `run.sh`, which caused this issue, I think it could be more user-friendly, and point to proper docs about what was needed, why, and how to solve it.
True, I created #245 to move the discussion there. There might be other errors that we can improve.
@creachadair @vcoisne Is this a firewalled environment (e.g. has a corporate HTTP proxy) or airgapped?
> @creachadair @vcoisne Is this a firewalled environment (e.g. has a corporate HTTP proxy) or airgapped?
As I understand it, it is primarily firewalled, but I think it's more than just an HTTP proxy; I believe they also proxy DNS and other key services as well. He got permission to import the components like Docker and the Docker images, but got stuck when trying to start up. I suspect @carlosms's diagnosis is right, that he didn't install `docker-compose`, and that's why it was trying to reach out to GitHub for the fallback script.
@vcoisne which data do they want to download? Is it from public GH or GHE?
> @vcoisne which data do they want to download? Is it from public GH or GHE?
Their data are all internal to their network: He's trying to use CE to verify that the stack works on their internal infrastructure. So the short answer is: Neither. After verifying that CE works on his own machine (outside the corp network), he got permission to set it up internally for testing. The only reason GitHub got involved in this case (I think) was that he tried to start up CE and it attempted to fetch the `run.sh` file from our GitHub repo.
Based on the discussion above, I believe the likely issue was he didn't realize he had to install `docker-compose`, and that if he does that he won't need to contact GH anymore.
Then I think we would need to document the required firewall rules (the domains and ports we access), improve the software requirements documentation, possibly with better self-documenting error messages, and maybe embed a docker-compose.yml as a fallback.
That said, the embedded fallback would not be so important if we have clear documentation on the requirements.
Would the error messages described by #247 satisfy your suggestions? Should we iterate on this?
> [...] this case (I think) was that he tried to start up CE and it attempted to fetch the `run.sh` file from our GitHub repo. Based on the discussion above, I believe the likely issue was he didn't realize he had to install `docker-compose`, and that if he does that he won't need to contact GH anymore.

https://github.com/src-d/sourced-ce/issues/241#issuecomment-533634365 by @creachadair
This new FAQ lists the resources that source{d} gets from the net. The FAQ is linked from the Dependencies docs, where it is noted that source{d} requires Internet access to use all its features.
1. As you said, `run.sh` is downloaded as an alternative if `docker-compose` is not locally available.
2. Then, the `docker-compose.yml` for the very same version of source{d} being used is needed. It's downloaded the first time `sourced` is used.
3. Then, the images described by that `docker-compose.yml` are downloaded from Docker repos.
4. Since the user is trying to analyze local repos, no other resource will be downloaded.
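(Putting 1-4 together: once the first three are satisfied, analyzing local repositories should work offline. A sketch, assuming `sourced init local` accepts a path to a directory of repositories:)

```sh
# With docker-compose installed, the compose file in place, and the
# images cached locally, this should not need to reach GitHub.
sourced init local ~/my-repos  # placeholder path to local repositories
```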
The place where the files from 1 and 2 must be stored is neither documented nor exposed in the error messages, because source{d} was not meant to be used offline, but adding that info to the error messages could be easily done if you agree.
SGTM, nice work @dpordomingo
> The place where the files from 1 and 2 must be stored is neither documented nor exposed in the error messages, because source{d} was not meant to be used offline, but adding that info to the error messages could be easily done if you agree.
I'd prefer not to add any other information to the errors to cover this uncommon and not really intended use case. IMO the error message is clear enough that you require an internet connection. Moreover, a user that faces this is probably used to this kind of error from working in a firewalled environment. On the other hand, adding a subsection to the docs explaining how to provide those files seems good to me.
+1 for @se7entyse7en's comment. However, I think that we can provide a more explanatory message for the user, with a link to the explanation.
However, IMO it looks inconsistent to be able to process local repositories but not be able to boot. That feature seems designed for offline machines. Could you give me more context?
I don't like the idea of adding extra details to the markdown docs, because then you're exposing internals, tying yourself to them, and being forced to update the docs if the implementation changes. If it is done in error messages instead, the user will only see the hint when it is really needed, and no further updates will be needed, because the reported location will be taken directly from the function causing the error.
> However, I think that we can provide a more explanatory message for the user, with a link to the explanation.
We already provide a more explanatory message; it has been fixed here.
> However, IMO it looks inconsistent to be able to process local repositories but not be able to boot. That feature seems designed for offline machines. Could you give me more context?
@dpordomingo described here what the requirements are. Once you have the requirements, it can be used offline for local repos.
> @dpordomingo described here what the requirements are. Once you have the requirements, it can be used offline for local repos.
The problem I see is that it's not explained how to fulfill the 2nd. And it might not be trivial for the user to guess that the `docker-compose.yml` could be manually copied into `~/.sourced/compose-files/__active__/docker-compose.yml`.
The rest of the requirements (1, 3, and 4) are clear in my opinion.
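(For the 2nd requirement, a minimal sketch of that manual copy, using the `__active__` path quoted above:)

```sh
# Place a previously obtained docker-compose.yml where sourced-ce
# looks for the active compose file.
mkdir -p ~/.sourced/compose-files/__active__
cp docker-compose.yml ~/.sourced/compose-files/__active__/docker-compose.yml
```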