feat: build the Docker Driver for arm64

Open tucksaun opened this issue 2 years ago • 25 comments

What this PR does / why we need it: Add ARM64 build and release of the Docker driver in Drone pipeline

Which issue(s) this PR fixes: Fixes #5682

Special notes for your reviewer: I would have loved to have a unified x64 and arm64 build, but apparently Docker drivers do not support multi-arch images. Instead, I tweaked the build steps to allow cross-building the image for ARM64 and added the instructions to do so in drone.yml. There also seems to be a drift between the images published on Docker Hub and the one documented for use. I updated it here, but this might be wrong.

Checklist

  • [x] Reviewed the CONTRIBUTING.md guide (required)
  • [x] Documentation added
  • [X] Tests updated (no tests for this AFAIK)
  • [x] CHANGELOG.md updated
  • [x] Changes that require user attention or interaction to upgrade are documented in docs/sources/upgrading/_index.md

tucksaun avatar Apr 23 '23 20:04 tucksaun

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Apr 23 '23 20:04 CLAassistant

@tucksaun could you resolve the conflicts?

jeschkies avatar Aug 30 '23 05:08 jeschkies

@jeschkies conflicts resolved

tucksaun avatar Sep 09 '23 06:09 tucksaun

Is there any issue preventing the advancement of this pull request?

guvmao avatar Oct 05 '23 08:10 guvmao

Is there any issue preventing the advancement of this pull request?

There were new conflicts, which I just resolved. Apart from those, I believe someone (@jeschkies?) has to regenerate drone.yml now that the jsonnet file is updated, and that's it?

tucksaun avatar Oct 05 '23 17:10 tucksaun

@tucksaun I'm sorry. I've switched teams and have missed GitHub notifications.

I would have loved to have a unified x64 and arm64 build, but apparently Docker drivers do not support multi-arch images.

I was about to mention the multi-arch image. How is the ARM image tagged differently?

jeschkies avatar Aug 22 '24 20:08 jeschkies

IIRC Docker plugins do not support multi-arch images (at least they did not when I worked on this).

tucksaun avatar Aug 22 '24 20:08 tucksaun

Hi, I think your PR would be very helpful

Is it a lot of work to resolve the conflicts so this PR can be accepted?

francescor avatar Oct 04 '24 07:10 francescor

@francescor not much for the Dockerfile so this is done.

However, this is a bit different for the pipeline, because Drone has recently been removed (see https://github.com/grafana/loki/pull/14273) and it seems like the Docker plugin is neither built nor released anymore (I can't find any reference to it anyway).

@jeschkies, this means I'm not able to update the pipeline for testing or releasing anymore 🤷

tucksaun avatar Oct 04 '24 07:10 tucksaun

I see this one https://hub.docker.com/r/miacis/loki-docker-driver

Is it related to this context?

francescor avatar Oct 04 '24 11:10 francescor

I see this one https://hub.docker.com/r/miacis/loki-docker-driver

Is it related to this context?

I have no idea

tucksaun avatar Oct 04 '24 13:10 tucksaun

I'm fine with landing this but as I understand we would not be able to publish the ARM Docker image as it would override

jeschkies avatar Oct 08 '24 14:10 jeschkies

Hi, we really badly need the ARM images; they will allow us to move our whole setup to ARM with considerable savings :) @tucksaun do you have the ARM image available? We could take it from you until this PR is merged. Thanks

francescor avatar Oct 10 '24 06:10 francescor

I'm fine with landing this but as I understand we would not be able to publish the ARM Docker image as it would override

They should be built with different tags. The ARM version uses an explicit tag so this should not impact x64 users
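
The per-architecture tagging described here can be sketched as a small shell helper. This is only an illustration: `plugin_tag` is a hypothetical name, and the `-arm64` suffix convention is an assumption inferred from the `main-arm64` tag used later in this thread, so the naming in the actual release pipeline may differ.

```shell
# Hypothetical helper: derive the plugin tag for a base tag and architecture.
# Assumption: amd64 keeps the plain tag, arm64 gets an explicit "-arm64"
# suffix, so publishing the ARM build never overwrites the x64 tag.
plugin_tag() {
  case "$2" in
    amd64|x86_64)  printf '%s\n' "$1" ;;
    arm64|aarch64) printf '%s-arm64\n' "$1" ;;
    *) printf 'unsupported architecture: %s\n' "$2" >&2; return 1 ;;
  esac
}
```

For example, `plugin_tag main "$(uname -m)"` would select the tag matching the current host.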

tucksaun avatar Oct 10 '24 07:10 tucksaun

We'd still have to publish the new image.

AFAICT the full release process for the plugin needs to be restored.

tucksaun avatar Oct 10 '24 07:10 tucksaun

Hi, we really badly need the ARM images, it will allow us to move all our setup into ARM with considerable savings :)

@tucksaun do you have the ARM image available? We could take it from you until this PR is merged

thanks

I haven't built it for a while (and it's bad timing, as I'm away for conferences), but I can try to have a look in the coming days.

tucksaun avatar Oct 10 '24 07:10 tucksaun

@francescor I just pushed the plugin at tucksaun/loki-docker-driver:main-arm64

please note I didn't have the opportunity to test it so this comes with no warranty 😁

tucksaun avatar Oct 10 '24 13:10 tucksaun

lovely, thank you!

please note I didn't have the opportunity to test it so this comes with no warranty 😁

yes, of course :)

francescor avatar Oct 11 '24 09:10 francescor

@francescor I just pushed the plugin at tucksaun/loki-docker-driver:main-arm64

forgive me, @tucksaun how do I pull your image?

docker pull ghcr.io/tucksaun/loki-docker-driver:main-arm64

francescor avatar Oct 11 '24 09:10 francescor

docker plugin install tucksaun/loki-docker-driver:main-arm64 --alias loki --grant-all-permissions should make it work
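
If a plugin is already installed under the `loki` alias, it most likely has to be removed before the install command above can succeed. A minimal sketch of that sequence, assuming the alias is `loki` and that no running container still uses the plugin (for example because the swarm node has been drained); `reinstall_loki_plugin` is a hypothetical helper name:

```shell
# Replace an installed Loki logging plugin with a different plugin reference.
# Assumptions: the plugin alias defaults to "loki", and no running container
# is still using the plugin when this is called.
reinstall_loki_plugin() {
  plugin_ref="$1"              # e.g. tucksaun/loki-docker-driver:main-arm64
  plugin_alias="${2:-loki}"

  # Ignore failures here: the plugin may not be installed yet.
  docker plugin disable --force "$plugin_alias" 2>/dev/null || true
  docker plugin rm --force "$plugin_alias" 2>/dev/null || true

  docker plugin install "$plugin_ref" --alias "$plugin_alias" --grant-all-permissions
}
```

Usage would look like `reinstall_loki_plugin tucksaun/loki-docker-driver:main-arm64`.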

tucksaun avatar Oct 11 '24 12:10 tucksaun

thanks @tucksaun

I can install it

# drain node from swarm manager
$ docker node update --availability drain my_arm_node
# then, in that node
$ docker --version 
Docker version 26.1.4, build 5650f9b
$ docker plugin ls
ID        NAME      DESCRIPTION   ENABLED
$ docker ps 
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

# install plugin
$ docker plugin install tucksaun/loki-docker-driver:main-arm64 --alias loki --grant-all-permissions
main-arm64: Pulling from tucksaun/loki-docker-driver
Digest: sha256:fb6b5790f298972624e01ab1bbee6e7a2eb1c4fdd8609426c65fd589d826750f
c4a9207ddf3d: Complete 
Installed plugin tucksaun/loki-docker-driver:main-arm64

but then apparently the "old" plugin keeps showing up :(

$ docker plugin ls 
ID             NAME          DESCRIPTION           ENABLED
2d6ac6b33590   loki:latest   Loki Logging Driver   true

and indeed, I still see that x86_64 emulator:

$ ps aux | grep x86_64-binfmt
root     3288290  3.8  1.2 1939332 202244 ?      Ssl  06:30   0:02 /usr/libexec/qemu-binfmt/x86_64-binfmt-P /bin/docker-driver /bin/docker-driver
root     3288337  0.0  0.0   6020  1920 pts/4    S+   06:31   0:00 grep --color=auto x86_64-binfmt
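
The check above can be wrapped in a small helper that takes a process listing and reports whether the driver is running through the x86_64 emulator. This is a sketch only: `driver_is_emulated` is a hypothetical name, and it assumes the interpreter path contains `x86_64-binfmt` as in the listing above, which may not hold on other distributions.

```shell
# Return 0 when the given process listing shows /bin/docker-driver running
# through the qemu x86_64 binfmt interpreter, 1 otherwise.
# Assumption: the interpreter path contains "x86_64-binfmt".
driver_is_emulated() {
  # Drop the line of the grep process itself, then look for the emulator
  # followed by the driver binary on the same line.
  printf '%s\n' "$1" | grep -v 'grep ' | grep -q 'x86_64-binfmt.*/bin/docker-driver'
}
```

For example: `if driver_is_emulated "$(ps aux)"; then echo "plugin is still the x86_64 build"; fi`.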

Here is the inspect output (which correctly shows your plugin):

$ systemctl restart docker
$ docker plugin inspect loki
[
    {
        "Config": {
            "Args": {
                "Description": "",
                "Name": "",
                "Settable": null,
                "Value": null
            },
            "Description": "Loki Logging Driver",
            "DockerVersion": "27.2.0",
            "Documentation": "https://github.com/grafana/loki",
            "Entrypoint": [
                "/bin/docker-driver"
            ],
            "Env": [
                {
                    "Description": "Set log level to output for plugin logs",
                    "Name": "LOG_LEVEL",
                    "Settable": [
                        "value"
                    ],
                    "Value": "info"
                },
                {
                    "Description": "Activate pprof debugging endpoint for the given port.",
                    "Name": "PPROF_PORT",
                    "Settable": [
                        "value"
                    ],
                    "Value": ""
                }
            ],
            "Interface": {
                "Socket": "loki.sock",
                "Types": [
                    "docker.logdriver/1.0"
                ]
            },
            "IpcHost": false,
            "Linux": {
                "AllowAllDevices": false,
                "Capabilities": null,
                "Devices": null
            },
            "Mounts": null,
            "Network": {
                "Type": "host"
            },
            "PidHost": false,
            "PropagatedMount": "",
            "User": {},
            "WorkDir": "",
            "rootfs": {
                "diff_ids": [
                    "sha256:c4a9207ddf3d51da7effcf838f81641ef9c37d4861908f319bcc9e86bb767aa4"
                ],
                "type": "layers"
            }
        },
        "Enabled": true,
        "Id": "2d6ac6b33590d8cd13657347c1e0bdbfcc183ebc5937f2234ea3b50c10091164",
        "Name": "loki:latest",
        "PluginReference": "docker.io/tucksaun/loki-docker-driver:main-arm64",
        "Settings": {
            "Args": [],
            "Devices": [],
            "Env": [
                "LOG_LEVEL=info",
                "PPROF_PORT="
            ],
            "Mounts": []
        }
    }
]

francescor avatar Oct 14 '24 06:10 francescor

The host is an AWS t4g.xlarge

# uname -a
Linux boat-worker-arm-1d15 6.2.0-1018-aws #18~22.04.1-Ubuntu SMP Wed Jan 10 22:31:58 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux

with qemu-user-static and docker-buildx-plugin installed

If I remove it with apt purge qemu-user-static and then remove and reinstall the plugin, I cannot enable it

francescor avatar Oct 14 '24 07:10 francescor

@francescor I managed to reproduce the issue, I will try to have a look today or tomorrow

tucksaun avatar Oct 14 '24 09:10 tucksaun

@francescor should be fixed

tucksaun avatar Oct 14 '24 13:10 tucksaun

Super!

I confirm your loki image does not need x86_64-binfmt (so it can be removed; on Ubuntu 22.04, with apt purge qemu-user-static)

thank you so much @tucksaun

I'll provide more feedback in case I hit issues, but I can see logs in our Grafana, so it should be OK

Now we can definitely replace our swarm nodes with ARM instances, with great savings!

Thanks!

francescor avatar Oct 14 '24 15:10 francescor

So @trevorwhitney has been working on the new release process. I don't know if it changed the way the Docker driver was released. I defer to him for reviewing this.

jeschkies avatar Nov 05 '24 15:11 jeschkies

can we add an entry here https://github.com/grafana/loki/blob/main/.github/release-workflows.jsonnet#L8 to add this to the image jobs that get run on release?

Sure. Though, AFAICT the docker-driver is not present for amd64 so I don't have a "template" to follow. Any requirements on your side regarding naming or such (to be sure to meet them)?

tucksaun avatar Nov 06 '24 08:11 tucksaun

Naming should be grafana/loki-docker-driver, which is how it used to be published. As for a template, you should be able to use the jsonnet functions to just add another image job in release-workflows.jsonnet and then run make release-workflows. Thanks!

trevorwhitney avatar Nov 06 '24 22:11 trevorwhitney

@trevorwhitney this is added. But this required more changes than you probably anticipated (Docker plugins are not shipped as images...). Also, the use of jsonnet does not make it easy to try the changes locally, so please be thorough during review :) I added it as a second commit to make it easier for you to review. I also extracted the changes into a PR against https://github.com/grafana/loki-release: https://github.com/grafana/loki-release/pull/162
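
To make the "plugins are not shipped as images" point concrete, here is a rough sketch of the usual steps for turning an image into a Docker plugin. It is not the workflow from this PR: `run`, `build_plugin`, the `rootfsimage` and `plugin-rootfs` names, and all arguments are hypothetical, and it assumes the plugin directory already holds a `config.json`.

```shell
# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Sketch: build an image for one platform, export its filesystem as the
# plugin rootfs, then create and push the plugin.
# Assumptions (hypothetical, not taken from the PR): image and container
# names, and a plugin directory that already contains config.json.
build_plugin() {
  plugin_ref="$1"     # e.g. grafana/loki-docker-driver:main-arm64
  platform="$2"       # e.g. linux/arm64
  plugin_dir="$3"     # directory containing the plugin's config.json
  dockerfile="$4"     # Dockerfile that builds the driver binary
  rootfs_tar="${TMPDIR:-/tmp}/plugin-rootfs.tar"

  run docker build --platform "$platform" -t rootfsimage -f "$dockerfile" .
  run docker create --name plugin-rootfs rootfsimage
  run mkdir -p "$plugin_dir/rootfs"
  run docker export --output "$rootfs_tar" plugin-rootfs
  run tar -x -f "$rootfs_tar" -C "$plugin_dir/rootfs"
  run rm -f "$rootfs_tar"
  run docker rm plugin-rootfs
  run docker plugin create "$plugin_ref" "$plugin_dir"
  run docker plugin push "$plugin_ref"
}
```

Setting `DRY_RUN=1` before calling `build_plugin` prints the commands instead of running them, which is one way to review the sequence without a Docker daemon.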

tucksaun avatar Nov 07 '24 15:11 tucksaun

awesome, thank you for tackling this!

you're welcome 🙂

unfortunately we can't edit the vendored jsonnet directly, as that will get wiped out. that lives in grafana/loki-release, so we'll need to port the changes over there.

yes, this is why I also extracted the changes and opened https://github.com/grafana/loki-release/pull/162

in the process, is there any existing GitHub action for creating the docker plugin, or do we have to do it by hand? thanks!

not that I know of

tucksaun avatar Nov 08 '24 15:11 tucksaun