gitian-builder
Consider integration with Docker.io images/containers
I'm curious, how does one know that a docker image is free of tampering?
Helping to get a deterministic build of docker images would be great.
Agreed. However, that's a pretty big project. I believe the Whonix guy did some groundwork for getting images to be deterministic.
I am going to give this a try
@devrandom I made progress on this and I successfully built bitcoin using gitian-builder inside a docker container.
Now I am wrapping up the results and going to publish them, thus you might consider integrating them here.
The key steps are:
- create a base Debian Wheezy image built through debootstrap (not downloaded from index.docker.io)
- create a "gitian host" image, inheriting from the above, that will be run as a privileged container
- some scripts borrowed and adapted from https://github.com/jpetazzo/dind to have LXC work inside a docker container (spoiler: works flawlessly with --privileged containers), which I am going to publish too
- inside a running gitian host container, one uses vmbuilder to build the VMs (I bothered only with Linux i386 and x64, no reason others won't work)
- still inside the gitian host container, one installs apt-cacher-ng and starts it (I couldn't manage to use an external apt-cacher-ng, although I believe relaying proxies should be possible)
- at this point it's enough to run the gbuild (with USE_LXC=1) as usual
I will give a ping back once I complete publishing the (Docker)files and instructions
NOTE about step 1: you could point index.docker.io at 127.0.0.1 for all it matters; no external sources (except Debian/Ubuntu repositories) are used to create the gitian host and the VMs. This answers your question about tampering.
Nice :)
@devrandom I have to temper my enthusiasm: I have successfully built bitcoin, but the signatures do not fully match. I am reporting some issues to the bitcoin repository, but I haven't yet determined whether it's because of the environment (Docker/LXC) or because of incomplete descriptors.
So far I lean toward the latter, if only because I have not yet been able to reproduce the binaries using a VirtualBox VM in the first place.
I truly believe that gitian builds should not be tied to when you make them; it should be possible to reproduce them after a while.
I think gitian-builder already supports package-pinning of the builder VM? Another improvement I'd like to propose here is to allow parallel creation of VMs (and then use them for parallel gbuilds). It's doable, but for example the LXC code would clash in bin/make-base-vm, and I haven't checked parallel gbuilds yet.
However, problems aside, what I said in my previous post still holds. It's perfectly doable to run gitian-builder inside a privileged docker container, and it gives absolutely 0 errors. The problems I am running into are probably project-specific.
Package pinning is not yet implemented, although the package versions are recorded.
Are you able to get reproducible builds?
@devrandom yes
That's good. Is the difference you are seeing in bitcoin-qt or in bitcoind? That would help narrow down the culprit.
@devrandom both of them, but the cli is not affected for some reason! You can see here. Also, the src archive does not match. Is it retrieved with the .git folder too? Because that would never work if one fetches the tags (with git fetch --tags) and there is a new tag, for example.
@gdm85 I'm crossing threads here, but this PhD thesis on reproducible builds is why I'm guessing that your problems are fairly shallow. The security comes not from having deterministic builds but from having reproducible ones.
Of course, I am simply navel gazing, so please shut me down if I'm being stupid : P
@gdm85 what do you mean by "the cli is not affected"? all the hashes seem different to me.
The source archive should have been identical. Perhaps compare it with the source distributed on bitcoin.org?
@devrandom 98546912776c6cc61ef22ed4121067045af7f01d012ab78e66dd0c31af2df520 for bitcoin-cli matches, and the same can be said for the hash of bitcoin-cli.static (reference: here)
Regarding the source archive: where is it on bitcoin.org? Not really easy to find :) If I cannot manage to get the same intermediate deps and src archive, I surely cannot expect the final binaries to match.. :\
Btw, I am meanwhile trying to push forward the VM build, so once I have a working one (0.9.1) I can compare
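For anyone following along, here is a minimal sketch of the kind of hash comparison discussed above. It is only an illustration, not part of gitian-builder: the function names `sha256_of` and `compare_outputs` and the shape of the `reference` mapping are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_outputs(build_dir, reference):
    """Check files in `build_dir` against expected digests.

    `reference` is a hypothetical dict of file name -> expected hex digest.
    The result maps each name to True (match), False (mismatch),
    or None (file missing from `build_dir`).
    """
    results = {}
    for name, expected in reference.items():
        candidate = Path(build_dir) / name
        if not candidate.is_file():
            results[name] = None
        else:
            results[name] = sha256_of(candidate) == expected.lower()
    return results
```

Running this over both the intermediate dependencies and the final binaries should show at which stage the two builds first diverge.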
@indolering David A. Wheeler, this name rings a bell :) I had read about DDC already. If builds are not reproducible then I'd have to give it all up, be it VM-approach or LXC-approach.
Perhaps in the future projects should use only certain deterministic compilers?
@indolering I understand what you mean by deterministic vs reproducible. Perhaps I have been expecting the former (when building bitcoin 0.9.1), while I should accept that there is a certain realistic time window in which a build can be reproduced (and after that, you're on your own).
Still, for the topic of this issue, I am going to publish the scripts and Dockerfiles to have gitian-builder in docker containers.
NOTE: this is not the same as patching gitian-builder to use docker containers instead of VMs; I hope this was clear from the beginning
@devrandom the next step could be to try patching it to use docker containers directly instead of VMs..
@gdm85 oh, you are right the cli matches.
Download the linux tgz from https://bitcoin.org/en/download. It contains the src snapshot.
Sounds good regarding the direction (gitian-builder inside docker containers). Anything that makes it easier for new people to get up to speed is desirable.
Re discrepancies - one of the ways to approach this is to get an assembly level dump of the binaries and compare. objdump -d is useful for this.
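The objdump comparison suggested above could be sketched roughly as follows. This assumes binutils' `objdump` is on the PATH; the helper names (`disassemble`, `strip_header`, `diff_listings`) are made up for this example, and dropping the "file format" line is just a heuristic so that the two file paths do not show up as a difference.

```python
import difflib
import subprocess


def disassemble(path):
    """Return the `objdump -d` listing for `path` as a list of lines.

    Assumes binutils' objdump is installed and on PATH.
    """
    result = subprocess.run(
        ["objdump", "-d", path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()


def strip_header(lines):
    """Drop the header line that names the file, since paths always differ."""
    return [line for line in lines if "file format" not in line]


def diff_listings(left, right, context=3):
    """Return a unified diff (list of lines) of two disassembly listings."""
    return list(difflib.unified_diff(
        strip_header(left), strip_header(right),
        fromfile="reference", tofile="candidate",
        lineterm="", n=context,
    ))
```

Usage would be something like `diff_listings(disassemble("a/bitcoind"), disassemble("b/bitcoind"))`, where both paths are placeholders; an empty list means the disassembly is identical.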
@devrandom yes, I can do that (objdump approach) or just a file compare. But the first step would be to get a reproducible build via a VirtualBox VM; I am busy with that.
Meanwhile, all interested parties can start using gitian-builder in a (privileged) docker container: https://github.com/gdm85/tenku/tree/master/docker/gitian-host
All scripts and Dockerfiles are tested; the generated image is generic and can be used to build any project (no bitcoin references in it).
It does not use the Docker Index registry (although I will publish the images there eventually), everybody can debootstrap the full set autonomously.
a recap:
- builds are reproducible, i.e. I always get the same hashes when using the gitian-host Docker container
- I produced valid signatures with the VirtualBox VM, and I am using the byproducts of this to compare with the Docker container build
- one first difference that I found is in the source tar files: the order of files in the archive is different
I will eventually check the other subtle differences that make the hashes of the binaries differ
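A quick way to confirm the tar ordering difference mentioned in the recap is to compare the member lists of the two archives. The sketch below uses only Python's standard `tarfile` module; the function names and the three result strings are invented for this example, and it compares names only, not contents or metadata such as timestamps and owners.

```python
import tarfile


def member_names(archive_path):
    """Return member names in the order they are stored in the archive."""
    with tarfile.open(archive_path) as archive:
        return archive.getnames()


def compare_member_order(left_path, right_path):
    """Classify two archives by their member lists.

    Returns "identical", "same files, different order", or "different files".
    Only names are compared, not file contents or metadata.
    """
    left, right = member_names(left_path), member_names(right_path)
    if left == right:
        return "identical"
    if sorted(left) == sorted(right):
        return "same files, different order"
    return "different files"
```

If the result is "same files, different order", the archives hold the same set of names and differ at least in how the members were ordered when the tar was created, which is enough to change the archive hash.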
@devrandom the issue was solely that the LC_* variables are allowed to leak into the LXC container; I think it somehow screws up the base VM.
I am not able to submit a PR for this because the fix is basically to always unset all LC_* and LANG variables before running any gitian-builder command. Can you think of a way to build this fix into gitian-builder, so that no command passes the host environment on to the LXC VMs it creates? I know you didn't introduce LXC originally.. but I am not an expert either :(
For the record: if your host has none of the LC_* variables set and only LANG=en_US.UTF-8 (LANG=C is undetermined), then the LXC containers created with my gitian-host docker container will generate 100% matching hashes :)
The LC_* environment variables problem has been split out into #56, and I have made an integration that lets gitian-builder run in a privileged docker container. So now (after a bit of OT comments, sorry) perhaps it's time to treat this ticket as a feature request to support docker containers in place of the KVM/LXC VMs? Something like a --docker option?
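As a workaround until something like this lives in gitian-builder itself, the environment cleanup described above can be sketched as a small wrapper. This is only an illustration under the assumption that dropping every `LC_*` variable and forcing `LANG=en_US.UTF-8` reproduces the host setup reported to give matching hashes; the names `sanitized_env` and `run_clean` are hypothetical.

```python
import os
import subprocess


def sanitized_env(env=None, lang="en_US.UTF-8"):
    """Return a copy of `env` without LC_* variables and with LANG fixed.

    Defaults to the current process environment. The input is not modified.
    """
    source = os.environ if env is None else env
    clean = {k: v for k, v in source.items() if not k.startswith("LC_")}
    clean["LANG"] = lang
    return clean


def run_clean(command, **kwargs):
    """Run `command` (a list of arguments) with the sanitized environment."""
    return subprocess.run(command, env=sanitized_env(), check=True, **kwargs)
```

For example, `run_clean(["./bin/gbuild", "path/to/descriptor.yml"])`, where the descriptor path is a placeholder, would start the build without any locale settings inherited from the host shell.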
How does this relate with #48?
@gdm85 good job. Btw, maybe I am wrong, but I have some concerns regarding the combination of gitian + docker: wouldn't it be better to create a deterministic build process without gitian, just using docker?
@canercandan no, the way I currently implemented it is actually more deterministic (and consistent with the advised approach for Bitcoin gitian builds, for example), because you have a further layer that acts as a sort of VM (the Docker container for the host running gitian-build).
One can argue, instead, that this is unnecessary to make a deterministic build, and actually I would like to see native Docker support in gitian-builder, but it takes some work to make such integration