Add custom docker image support
Fixes #315
cc @jyn514
If this needs tests, where should they go?
@devsnek there aren't currently tests for a full build: #822. If you tested locally and it works, that's good enough; I'll also test it manually myself before merging.
One potential issue I see here is that we'll run out of disk space on the build server really quickly: right now we have 20 GB free (out of 100 GB), but we'll burn through that fast if every project has its own Dockerfile. The current workaround is to run `docker container prune && docker image prune -a` about once a week, but if each docker image is a separate tag, I don't think they'll ever be removed, and we'll run out of space pretty quickly.
@jyn514 I don't know much about Docker storage, but I was thinking you would probably want to delete the image right after the build.
Yup, that would work! Can you add logic for that here? I think `docker image rm <image>` would work.
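For what it's worth, a minimal sketch of that cleanup could look like this (assuming we just shell out to the Docker CLI; `remove_image` is an illustrative name, not an existing function in this codebase):

```rust
use std::process::Command;

// Hypothetical post-build cleanup: remove the custom image once the build
// finishes so per-crate tags don't pile up on the build server.
fn remove_image(image: &str) -> std::io::Result<()> {
    let status = Command::new("docker")
        .args(&["image", "rm", image])
        .status()?;
    if !status.success() {
        // Not fatal: the next prune will catch anything we miss.
        eprintln!("warning: failed to remove docker image {}", image);
    }
    Ok(())
}
```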
@jyn514 does the dummy crate need to be built in the custom docker image?
@devsnek I don't quite follow - the dummy crate doesn't configure a docker image, right? So it should be built in the default image, which is https://github.com/rust-lang/crates-build-env.
Ah yeah, that makes sense. I was thinking the `Cargo.toml` of the target crate would somehow be loaded for the dummy crate, which makes no sense of course...
So, my understanding is there are two problems to solve:
- We need to ensure the images we download are not too large, to avoid exhausting disk space on the server.
- We need to ensure we don't hit Docker Hub's rate limits.
The first issue can be fixed with an experimental command, `docker manifest inspect`, which lets us fetch information about a remote image before downloading it:
```
pietro@january: ~/tmp$ docker manifest inspect rustops/crates-build-env
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "size": 4686,
    "digest": "sha256:1a347871fe6e52601397c10649f49781e704fe732ca1194f762c06d62dd2a9fc"
  },
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 28558714,
      "digest": "sha256:6a5697faee43339ef8e33e3839060252392ad99325a48f7c9d7e93c22db4d4cf"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 847,
      "digest": "sha256:ba13d3bc422b493440f97a8f148d245e1999cb616cb05876edc3ef29e79852f2"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 162,
      "digest": "sha256:a254829d9e55168306fd80a49e02eb015551facee9c444d9dce3b26d19238b82"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 5176,
      "digest": "sha256:956656aa3a91375bfd9eaf0c2e429fe88ffc8e0da6a415bf0621dfacd74a01a7"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 2065351348,
      "digest": "sha256:6dc46121fa2b1fb5b1eb401c30a4b2272b13d9f411a049d0794118cad4e7471e"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 159,
      "digest": "sha256:e619300b8e707992732abccb37e3d9950ccb23ab8d4de064d99c802668af5050"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 670,
      "digest": "sha256:25fa93f27f0567e5ba7a7ddaed764bb269967125c7e99aece3986415cf0d7ca4"
    }
  ]
}
```
Then we can sum the size of each layer; if the total is greater than the limit, we abort the build with an error. Otherwise, we fetch the image by its `config.digest`, to ensure we're fetching exactly the image we inspected. The problem is, calling this command consumes yet another Docker Hub rate limit point, which is far from great.
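To make the check concrete, here's a minimal sketch (assuming we parse the `docker manifest inspect` output with `serde_json`; `image_size` is an illustrative helper, not an existing API here or in Rustwide):

```rust
use std::error::Error;
use std::process::Command;

// Inspect a remote image's manifest and return the sum of its layer sizes
// in bytes, without pulling anything.
fn image_size(image: &str) -> Result<u64, Box<dyn Error>> {
    let output = Command::new("docker")
        .args(&["manifest", "inspect", image])
        .output()?;
    let manifest: serde_json::Value = serde_json::from_slice(&output.stdout)?;
    let layers = manifest["layers"]
        .as_array()
        .ok_or("manifest has no layers")?;
    // Sum the compressed size of every layer in the manifest.
    Ok(layers.iter().filter_map(|layer| layer["size"].as_u64()).sum())
}
```

The caller would compare the returned total against the configured limit and abort before pulling anything.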
To "solve" the rate limiting problem, and to prevent slowing down the queue too much, we could gate using custom Docker image with a false-by-default limit, which we lift only if a crate opens an issue and can't add the dependency to crates-build-env (like the nginx case we had). What do y'all think?
On the implementation side, I'd move the size check over to Rustwide, making a breaking change to the `SandboxImage::remote()` function so it becomes `fn remote(name: &str, size_limit: Option<usize>) -> ...`.
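Something along these lines; only the signature comes from the proposal above, and the error handling and internals are heavily simplified stand-ins, not Rustwide's actual code:

```rust
// Rough sketch of the proposed breaking change; `manifest_size` is a
// placeholder for the manifest check sketched in the previous comment.
pub struct SandboxImage {
    name: String,
}

impl SandboxImage {
    pub fn remote(name: &str, size_limit: Option<usize>) -> Result<SandboxImage, String> {
        if let Some(limit) = size_limit {
            // Check the manifest *before* pulling, so an oversized image
            // never touches the disk.
            let size = manifest_size(name)?;
            if size > limit {
                return Err(format!(
                    "image {} is {} bytes, over the {} byte limit",
                    name, size, limit
                ));
            }
        }
        // ...then pull the image by its config digest, as before.
        Ok(SandboxImage { name: name.to_string() })
    }
}

fn manifest_size(name: &str) -> Result<usize, String> {
    // Placeholder: in practice this would parse `docker manifest inspect`.
    let _ = name;
    Ok(0)
}
```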
To "solve" the rate limiting problem, and to prevent slowing down the queue too much, we could gate using custom Docker image with a false-by-default limit, which we lift only if a crate opens an issue and can't add the dependency to crates-build-env (like the nginx case we had). What do y'all think?
I really like this idea; it gets us three things:
- People aren't using a custom docker image when they don't need to, so crater still benefits from the new dependencies in crates-build-env most of the time
- We're downloading fewer docker images overall, so it's less strain on our resources
- We know who requested an override, so a) we can verify the docker image is a reasonable size, and b) if they change the image, we know exactly who to ask pointed questions.
@devsnek can you add a resource limit for docker images that's disabled by default? It would go in `Limits`, under `docbuilder/limits.rs`.
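A rough sketch of what that could look like (the existing fields of `Limits` are elided, and the field name here is illustrative, not the real struct layout):

```rust
// Hypothetical addition to the Limits struct in docbuilder/limits.rs.
pub struct Limits {
    // ...existing limits (memory, timeouts, ...)
    /// Maximum allowed size in bytes for a custom Docker image.
    /// `None` (the default) means custom images are not allowed at all.
    custom_docker_image_size: Option<usize>,
}

impl Limits {
    pub fn custom_docker_image_size(&self) -> Option<usize> {
        self.custom_docker_image_size
    }
}
```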
@jyn514 do you also want to have size limits for images implemented?
Sure, size limits seem reasonable - maybe 5 GB to start?