vector
feat(platforms): Add ARMv6 builds
Add ARMv6 builds using the work started in PR #18514.
Extends CI pipeline for the new target architecture to publish to docker and s3.
fixes #18445
I was going to add CI targets for .rpm and .deb, but there's a small conflict that I'm not knowledgeable enough to resolve.
As far as I understand it, Debian-derived distros understand armhf as "Arm v7 with hardware floating-point registers". This would conflict with the new target's use of armhf as "Arm v6 with hardware floating-point registers". As such, I thought it would be acceptable to at least get started by publishing only to S3 and Docker, where the triple is fully specified.
This still needs work: the Docker image fails to build because the artifact isn't copied in under the expected name. Currently investigating.
I have converted this to a draft to indicate this status.
Perfect, thanks, didn't look like I could do it myself.
I've been running into some issues running cross locally. It turns out cross doesn't work with Docker running on btrfs file systems.
Additionally, the containerised dev environment in this repo doesn't support cgroups v2; this call here returns an empty string. I'll raise a separate issue for that, as fixing it without breaking Windows and Mac will likely require a core maintainer with a couple of other systems to test on. I mitigated the issue by commenting out the lines that set $HOSTNAME and switching to non-host networking by running ENVIRONMENT_NETWORK="" make environment. This then correctly sets $HOSTNAME to the container ID. I'm unsure whether this has other impacts; I imagine it does for people trying to connect to the container.
:wave: This is now ready for review. I decided to forgo hf (aka hardware float) in the target. This will have a small impact on efficiency on Raspberry Pis, but to the benefit of compatibility: this binary will work on all Arm v6 devices. A Debian archive is now also built, but I skipped creating an RPM because, as far as I can tell, there are no RPM-based distros that support Arm v6.
Awesome, this is fantastic @wtaylor ! I'll give it a closer review soon.
I kicked off the workflow just to test: https://github.com/vectordotdev/vector/actions/runs/8112047694. This caused me to notice a couple of other things:
- I think the "generate checksums" job needs to depend on the added armv7 package builds (this job was added after you started this PR)
- For some reason, I thought this build was building debs (and not rpms) but it looks like not. I think that's OK if we just publish tarballs for the time being. We can add deb (and maybe rpm) package builds later
I'll add the generate checksums job momentarily. It does actually build a deb for glibc, which is used in the Debian-based Docker image. Am I missing a step to publish it or something?
Ah, also this needs a changelog entry 😄 See https://github.com/vectordotdev/vector/blob/master/changelog.d/README.md
Ah, you are right. I think what was confusing me is that the "Verify DEB" job only verifies the x86_64 deb packages and not armv6 (or armv7 or aarch).
Thanks @wtaylor ! This looks great. I'm happy to see both the glibc and musl-based builds. I just left one comment.
Out of curiosity: what issues did you run into trying to create the distroless docker images?
We'll also need to update the docs, but we can do that as a second pass after this is merged.
Ah yeah, so the core problem is simply that there aren't Arm v6 builds of distroless; they only build v7 upwards. We could circumvent that by building the base images too, but you'd then need a separate pipeline and maintenance for that, and it seems more hassle than it's worth. People with Arm v6 will be more than happy to use Debian- or Alpine-based images :raised_hand_with_fingers_splayed: myself included.
@jszwedko I've updated my forked branch and rebased onto the latest main whilst I was at it, but my changes aren't showing here. You may need to convert to draft and convert back. I'll stick to linear additions from now on, my mistake :sweat_smile:
No problem! Just let me know when it's ready for another round of review.
I noticed that one of the jobs fails:
2024-03-01T14:14:52.1956444Z > [linux/amd64 builder 4/5] RUN ARCH=$(if [ "linux/amd64" = "linux/arm/v6" ]; then echo "arm"; else cat /etc/apk/arch; fi) tar -xvf vector-0*-"$ARCH"-unknown-linux-musl*.tar.gz --strip-components=2:
2024-03-01T14:14:52.1957866Z 0.120 tar: can't open 'vector-0*--unknown-linux-musl*.tar.gz': No such file or directory
2024-03-01T14:14:52.1958688Z ------
2024-03-01T14:14:52.1988779Z Dockerfile:10
2024-03-01T14:14:52.2002836Z --------------------
2024-03-01T14:14:52.2003767Z 9 | # special case for arm v6 builds, /etc/apk/arch reports armhf which conflicts with the armv7 package
2024-03-01T14:14:52.2005089Z 10 | >>> RUN ARCH=$(if [ "$TARGETPLATFORM" = "linux/arm/v6" ]; then echo "arm"; else cat /etc/apk/arch; fi) \
2024-03-01T14:14:52.2012184Z 11 | >>> tar -xvf vector-0*-"$ARCH"-unknown-linux-musl*.tar.gz --strip-components=2
2024-03-01T14:14:52.2012791Z 12 |
2024-03-01T14:14:52.2013076Z --------------------
2024-03-01T14:14:52.2015439Z ERROR: failed to solve: process "/bin/sh -c ARCH=$(if [ \"$TARGETPLATFORM\" = \"linux/arm/v6\" ]; then echo \"arm\"; else cat /etc/apk/arch; fi) tar -xvf vector-0*-\"$ARCH\"-unknown-linux-musl*.tar.gz --strip-components=2" did not complete successfully: exit code: 1
2024-03-01T14:14:52.2018370Z Error: command: cd "/home/runner/work/vector/vector" && "/home/runner/work/vector/vector/scripts/build-docker.sh"
2024-03-01T14:14:52.2019067Z failed with exit code: 1
2024-03-01T14:14:52.2019486Z make: *** [Makefile:626: release-docker] Error 1
2024-03-01T14:14:52.2035906Z ##[error]Process completed with exit code 2.
I attached the full log here since I'm not sure if outside contributors can see it in GHA.
Ah, it looks like a partial outage is the reason the PR is not synchronized, not my rebasing :D. May as well keep it as a draft until I fix that expression anyway.
Amended and tested the Alpine Dockerfile locally. To be honest, I'm not a Unix shell expert and not sure why separating the commands works; I can only speculate that it's something to do with subshell execution. Anyway, ready for review again.
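A likely explanation for the failure in the log above, offered as a sketch and not a confirmed diagnosis: in `VAR=value command args`, the shell expands `$VAR` inside `args` before the prefix assignment takes effect, so `"$ARCH"` expands to an empty string and the glob becomes `vector-0*--unknown-linux-musl*.tar.gz`. The snippet below reproduces this with `echo` standing in for `tar`; the file-name pattern is shortened for illustration.

```shell
#!/bin/sh
# Start from a clean state so the demonstration does not depend on the caller's environment.
unset ARCH

# Prefix assignment on the same command: "$ARCH" in the argument is expanded
# BEFORE the assignment applies, so it is empty here.
one_line=$(ARCH=arm echo "vector-${ARCH}-unknown-linux-musl")

# Assignment as its own command: the variable is set before the next command
# line is expanded, so "$ARCH" has the expected value.
ARCH=arm
two_step=$(echo "vector-${ARCH}-unknown-linux-musl")

echo "$one_line"   # vector--unknown-linux-musl
echo "$two_step"   # vector-arm-unknown-linux-musl
```

Splitting the assignment and the `tar` invocation into separate commands (for example, joined with `&&` or `;` in the same `RUN` step) avoids the empty expansion.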
Thanks @wtaylor ! It got farther this time, but ran into a different error:
2024-03-04T16:54:37.5466028Z + build distroless-static 0.37.0.custom.8e2b116
2024-03-04T16:54:37.5466873Z + local BASE=distroless-static
2024-03-04T16:54:37.5467458Z + local VERSION=0.37.0.custom.8e2b116
2024-03-04T16:54:37.5468700Z + local TAG=timberio/vector:0.37.0.custom.8e2b116-distroless-static
2024-03-04T16:54:37.5469885Z + local DOCKERFILE=distribution/docker/distroless-static/Dockerfile
2024-03-04T16:54:37.5474221Z + '[' -n linux/amd64,linux/arm64,linux/arm/v7,linux/arm/v6 ']'
2024-03-04T16:54:37.5474996Z + ARGS=()
2024-03-04T16:54:37.5475443Z + [[ true == \t\r\u\e ]]
2024-03-04T16:54:37.5476071Z + ARGS+=(--push)
2024-03-04T16:54:37.5476870Z ++ evaluate_supported_platforms_for_base distroless-static
2024-03-04T16:54:37.5477842Z ++ local BASE=distroless-static
2024-03-04T16:54:37.5478420Z ++ IFS=,
2024-03-04T16:54:37.5478931Z ++ read -ra SUPPORTED_PLATFORMS_FOR_BASE
2024-03-04T16:54:37.5540412Z ++ local BUILDABLE_PLATFORMS=
2024-03-04T16:54:37.5541106Z ++ for platform in "${REQUESTED_PLATFORMS[@]}"
2024-03-04T16:54:37.5542354Z ++ [[ linux/amd64 linux/arm/v7 linux/arm64/v8 =~ linux/amd64 ]]
2024-03-04T16:54:37.5543288Z ++ BUILDABLE_PLATFORMS+=linux/amd64,
2024-03-04T16:54:37.5546667Z ++ for platform in "${REQUESTED_PLATFORMS[@]}"
2024-03-04T16:54:37.5547679Z ++ [[ linux/amd64 linux/arm/v7 linux/arm64/v8 =~ linux/arm64 ]]
2024-03-04T16:54:37.5548589Z ++ BUILDABLE_PLATFORMS+=linux/arm64,
2024-03-04T16:54:37.5549369Z ++ for platform in "${REQUESTED_PLATFORMS[@]}"
2024-03-04T16:54:37.5550648Z ++ [[ linux/amd64 linux/arm/v7 linux/arm64/v8 =~ linux/arm/v7 ]]
2024-03-04T16:54:37.5551512Z ++ BUILDABLE_PLATFORMS+=linux/arm/v7,
2024-03-04T16:54:37.5552290Z ++ for platform in "${REQUESTED_PLATFORMS[@]}"
2024-03-04T16:54:37.5553179Z ++ [[ linux/amd64 linux/arm/v7 linux/arm64/v8 =~ linux/arm/v6 ]]
2024-03-04T16:54:37.5554741Z ++ echo 'WARN: skipping linux/arm/v6 for distroless-static, no base image for platform'
2024-03-04T16:54:37.5556242Z WARN: skipping linux/arm/v6 for distroless-static, no base image for platform
2024-03-04T16:54:37.5557225Z ++ echo linux/amd64,linux/arm64,linux/arm/v7
2024-03-04T16:54:37.5558075Z + local BUILDABLE_PLATFORMS=linux/amd64,linux/arm64,linux/arm/v7
2024-03-04T16:54:37.5560667Z + docker buildx build --platform=linux/amd64,linux/arm64,linux/arm/v7,linux/arm/v6 --tag timberio/vector:0.37.0.custom.8e2b116-distroless-static target/artifacts -f distribution/docker/distroless-static/Dockerfile --push
2024-03-04T16:54:37.9137919Z #0 building with "builder-5f88644b-d7cb-41ac-bd1b-b27fa84e369d" instance using docker-container driver
2024-03-04T16:54:37.9139102Z
2024-03-04T16:54:37.9139374Z #1 [internal] load build definition from Dockerfile
2024-03-04T16:54:37.9139945Z #1 transferring dockerfile: 617B done
2024-03-04T16:54:37.9140699Z #1 DONE 0.0s
2024-03-04T16:54:37.9140909Z
2024-03-04T16:54:37.9141255Z #2 [linux/amd64 internal] load metadata for docker.io/library/alpine:3.18
2024-03-04T16:54:37.9141893Z #2 DONE 0.1s
2024-03-04T16:54:37.9142085Z
2024-03-04T16:54:37.9142432Z #3 [linux/arm/v6 internal] load metadata for docker.io/library/alpine:3.18
2024-03-04T16:54:37.9143060Z #3 DONE 0.1s
2024-03-04T16:54:37.9143247Z
2024-03-04T16:54:37.9143516Z #4 [linux/arm64 internal] load metadata for docker.io/library/alpine:3.18
2024-03-04T16:54:37.9144039Z #4 DONE 0.1s
2024-03-04T16:54:37.9144192Z
2024-03-04T16:54:37.9144477Z #5 [linux/arm/v7 internal] load metadata for docker.io/library/alpine:3.18
2024-03-04T16:54:37.9144992Z #5 DONE 0.1s
2024-03-04T16:54:37.9145139Z
2024-03-04T16:54:37.9145434Z #6 [linux/amd64 internal] load metadata for gcr.io/distroless/static:latest
2024-03-04T16:54:37.9778743Z #6 ...
2024-03-04T16:54:37.9779399Z
2024-03-04T16:54:37.9780245Z #7 [linux/arm/v6 internal] load metadata for gcr.io/distroless/static:latest
2024-03-04T16:54:37.9781579Z #7 ERROR: no match for platform in manifest: not found
2024-03-04T16:54:37.9866351Z
2024-03-04T16:54:37.9866891Z #6 [linux/amd64 internal] load metadata for gcr.io/distroless/static:latest
2024-03-04T16:54:37.9867473Z #6 CANCELED
2024-03-04T16:54:37.9867637Z
2024-03-04T16:54:37.9867996Z #8 [linux/arm/v7 internal] load metadata for gcr.io/distroless/static:latest
2024-03-04T16:54:37.9868561Z #8 CANCELED
2024-03-04T16:54:37.9868717Z
2024-03-04T16:54:37.9868981Z #9 [linux/arm64 internal] load metadata for gcr.io/distroless/static:latest
2024-03-04T16:54:37.9870624Z #9 CANCELED
2024-03-04T16:54:37.9871205Z ------
2024-03-04T16:54:37.9871894Z > [linux/arm/v6 internal] load metadata for gcr.io/distroless/static:latest:
2024-03-04T16:54:37.9872781Z ------
2024-03-04T16:54:37.9883477Z Dockerfile:12
2024-03-04T16:54:37.9884185Z --------------------
2024-03-04T16:54:37.9884829Z 10 | # distroless doesn't use static tags
2024-03-04T16:54:37.9885548Z 11 | # hadolint ignore=DL3007
2024-03-04T16:54:37.9886073Z 12 | >>> FROM gcr.io/distroless/static:latest
2024-03-04T16:54:37.9886460Z 13 |
2024-03-04T16:54:37.9886880Z 14 | COPY --from=builder /vector/bin/* /usr/local/bin/
2024-03-04T16:54:37.9887322Z --------------------
2024-03-04T16:54:37.9887885Z ERROR: failed to solve: gcr.io/distroless/static:latest: no match for platform in manifest: not found
2024-03-04T16:54:37.9909500Z Error: command: cd "/home/runner/work/vector/vector" && "/home/runner/work/vector/vector/scripts/build-docker.sh"
2024-03-04T16:54:37.9910767Z failed with exit code: 1
2024-03-04T16:54:37.9919532Z make: *** [Makefile:626: release-docker] Error 1
Full log: job-logs.txt
The bane of PRs to CI systems. Fixed.
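For readers following the trace: it shows the filtered list (`linux/amd64,linux/arm64,linux/arm/v7`) being computed, while `docker buildx build` still receives the unfiltered `--platform` list that includes `linux/arm/v6`. The sketch below illustrates the general pattern of filtering first and then using the filtered result. The function name `filter_platforms` and its argument layout are hypothetical and are not taken from `scripts/build-docker.sh`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: keep only the requested platforms the base image supports.
#   $1: comma-separated requested platforms
#   $2: space-separated platforms supported by the base image
filter_platforms() {
  local requested buildable="" platform
  IFS=, read -ra requested <<< "$1"
  for platform in "${requested[@]}"; do
    # Substring match, similar in spirit to the =~ test visible in the trace
    # (so linux/arm64 matches linux/arm64/v8).
    if [[ "$2" == *"$platform"* ]]; then
      buildable+="${platform},"
    else
      echo "WARN: skipping ${platform}, no base image for platform" >&2
    fi
  done
  # Drop the trailing comma.
  echo "${buildable%,}"
}

PLATFORMS=$(filter_platforms \
  "linux/amd64,linux/arm64,linux/arm/v7,linux/arm/v6" \
  "linux/amd64 linux/arm/v7 linux/arm64/v8")

# The filtered value, not the original request, is what should reach --platform.
echo "--platform=${PLATFORMS}"
```

With the inputs above, the warning for `linux/arm/v6` goes to stderr and the printed flag contains only the three supported platforms.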
For sure 😓 I'll kick off another build.
It looks like the custom build workflow jobs all passed 🎉 I'll give this another look shortly.
It looks like there were some shell lints flagged if you get some time.
A new tool on the tool belt for me, all shellcheck issues fixed
Regression Detector Results
Run ID: 35b5c8dd-c1eb-44ec-b62f-e9bb25d4de11 Baseline: cbcb874a9944801e8a89d42e44ecf551db55071a Comparison: 2f58bbcf9bd50532ac2a27b357c57cc06ee909d5 Total CPUs: 7
Performance changes are noted in the perf column of each table:
- ✅ = significantly better comparison variant performance
- ❌ = significantly worse comparison variant performance
- ➖ = no significant change in performance
No significant changes in experiment optimization goals
Confidence level: 90.00% Effect size tolerance: |Δ mean %| ≥ 5.00%
There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.
Fine details of change detection per experiment
perf | experiment | goal | Δ mean % | Δ mean % CI |
---|---|---|---|---|
➖ | http_text_to_http_json | ingress throughput | +0.97 | [+0.83, +1.11] |
➖ | syslog_log2metric_splunk_hec_metrics | ingress throughput | +0.87 | [+0.73, +1.02] |
➖ | file_to_blackhole | egress throughput | +0.87 | [-1.62, +3.35] |
➖ | syslog_humio_logs | ingress throughput | +0.69 | [+0.58, +0.81] |
➖ | datadog_agent_remap_datadog_logs | ingress throughput | +0.63 | [+0.52, +0.75] |
➖ | http_elasticsearch | ingress throughput | +0.52 | [+0.45, +0.59] |
➖ | datadog_agent_remap_blackhole | ingress throughput | +0.42 | [+0.33, +0.51] |
➖ | splunk_hec_route_s3 | ingress throughput | +0.36 | [-0.12, +0.84] |
➖ | http_to_http_noack | ingress throughput | +0.22 | [+0.12, +0.33] |
➖ | datadog_agent_remap_blackhole_acks | ingress throughput | +0.12 | [+0.01, +0.24] |
➖ | otlp_grpc_to_blackhole | ingress throughput | +0.09 | [-0.00, +0.17] |
➖ | syslog_log2metric_tag_cardinality_limit_blackhole | ingress throughput | +0.07 | [-0.06, +0.20] |
➖ | http_to_http_json | ingress throughput | +0.04 | [-0.04, +0.11] |
➖ | otlp_http_to_blackhole | ingress throughput | +0.03 | [-0.12, +0.19] |
➖ | splunk_hec_to_splunk_hec_logs_acks | ingress throughput | -0.00 | [-0.16, +0.16] |
➖ | splunk_hec_indexer_ack_blackhole | ingress throughput | -0.00 | [-0.14, +0.14] |
➖ | http_to_s3 | ingress throughput | -0.02 | [-0.29, +0.26] |
➖ | splunk_hec_to_splunk_hec_logs_noack | ingress throughput | -0.05 | [-0.17, +0.06] |
➖ | enterprise_http_to_http | ingress throughput | -0.08 | [-0.14, -0.02] |
➖ | syslog_splunk_hec_logs | ingress throughput | -0.18 | [-0.25, -0.11] |
➖ | syslog_loki | ingress throughput | -0.22 | [-0.27, -0.17] |
➖ | syslog_regex_logs2metric_ddmetrics | ingress throughput | -0.39 | [-0.50, -0.27] |
➖ | socket_to_socket_blackhole | ingress throughput | -0.50 | [-0.59, -0.42] |
➖ | fluent_elasticsearch | ingress throughput | -0.74 | [-1.20, -0.28] |
➖ | datadog_agent_remap_datadog_logs_acks | ingress throughput | -0.77 | [-0.85, -0.69] |
➖ | syslog_log2metric_humio_metrics | ingress throughput | -0.91 | [-1.02, -0.80] |
➖ | http_to_http_acks | ingress throughput | -0.99 | [-2.29, +0.31] |
Explanation
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:
- Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
- Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
- Its configuration does not mark it "erratic".
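As an illustration only, the three criteria can be written as a small shell predicate. This is a sketch of the decision rule as described in the explanation, not the Regression Detector's actual code; the function name `is_regression` and the `true`/`false` erratic flag are assumptions.

```shell
#!/bin/sh
# Hypothetical predicate: succeeds (exit 0) when a row counts as a regression.
#   $1: delta mean %   $2: CI lower bound   $3: CI upper bound
#   $4: "true" if the experiment is marked erratic, otherwise "false"
is_regression() {
  # Criterion 3: erratic experiments are never flagged.
  [ "$4" = "false" ] || return 1
  awk -v d="$1" -v lo="$2" -v hi="$3" 'BEGIN {
    d += 0; lo += 0; hi += 0             # force numeric interpretation
    abs = (d < 0) ? -d : d
    big_enough = (abs >= 5.0)            # criterion 1: |delta mean %| >= 5.00
    excludes_zero = (lo > 0 || hi < 0)   # criterion 2: CI does not contain zero
    exit !(big_enough && excludes_zero)
  }'
}

# Row taken from the table above: the CI excludes zero but the effect is under 5%.
if is_regression -0.77 -0.85 -0.69 false; then
  echo "datadog_agent_remap_datadog_logs_acks: regression"
else
  echo "datadog_agent_remap_datadog_logs_acks: no significant change"
fi
```

This is why rows whose interval excludes zero are still reported with ➖: the effect size stays below the 5.00% tolerance.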
Regression Detector Results
Run ID: 07bc0d0b-c931-49aa-a112-899a19ad54ad Baseline: b35eaf53315532a7668cd36342f72af2d4e00488 Comparison: e9815e1f328a4ef59099c3d07918f167947c2e1f Total CPUs: 7
Performance changes are noted in the perf column of each table:
- ✅ = significantly better comparison variant performance
- ❌ = significantly worse comparison variant performance
- ➖ = no significant change in performance
No significant changes in experiment optimization goals
Confidence level: 90.00% Effect size tolerance: |Δ mean %| ≥ 5.00%
There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.
Fine details of change detection per experiment
perf | experiment | goal | Δ mean % | Δ mean % CI |
---|---|---|---|---|
➖ | syslog_log2metric_splunk_hec_metrics | ingress throughput | +3.54 | [+3.39, +3.70] |
➖ | syslog_regex_logs2metric_ddmetrics | ingress throughput | +2.46 | [+2.25, +2.67] |
➖ | datadog_agent_remap_datadog_logs_acks | ingress throughput | +1.30 | [+1.20, +1.39] |
➖ | syslog_humio_logs | ingress throughput | +1.25 | [+1.17, +1.33] |
➖ | syslog_log2metric_humio_metrics | ingress throughput | +1.02 | [+0.89, +1.15] |
➖ | otlp_http_to_blackhole | ingress throughput | +0.70 | [+0.55, +0.85] |
➖ | splunk_hec_route_s3 | ingress throughput | +0.60 | [+0.11, +1.09] |
➖ | http_text_to_http_json | ingress throughput | +0.34 | [+0.22, +0.47] |
➖ | datadog_agent_remap_blackhole | ingress throughput | +0.25 | [+0.17, +0.34] |
➖ | http_to_http_acks | ingress throughput | +0.23 | [-1.09, +1.56] |
➖ | fluent_elasticsearch | ingress throughput | +0.17 | [-0.30, +0.64] |
➖ | otlp_grpc_to_blackhole | ingress throughput | +0.16 | [+0.08, +0.25] |
➖ | http_to_http_noack | ingress throughput | +0.14 | [+0.04, +0.25] |
➖ | http_elasticsearch | ingress throughput | +0.08 | [+0.01, +0.15] |
➖ | http_to_http_json | ingress throughput | +0.02 | [-0.05, +0.10] |
➖ | splunk_hec_to_splunk_hec_logs_acks | ingress throughput | -0.00 | [-0.16, +0.15] |
➖ | splunk_hec_indexer_ack_blackhole | ingress throughput | -0.01 | [-0.14, +0.13] |
➖ | splunk_hec_to_splunk_hec_logs_noack | ingress throughput | -0.03 | [-0.15, +0.08] |
➖ | enterprise_http_to_http | ingress throughput | -0.08 | [-0.16, -0.01] |
➖ | http_to_s3 | ingress throughput | -0.10 | [-0.38, +0.19] |
➖ | datadog_agent_remap_datadog_logs | ingress throughput | -0.48 | [-0.60, -0.36] |
➖ | socket_to_socket_blackhole | ingress throughput | -0.85 | [-0.93, -0.77] |
➖ | datadog_agent_remap_blackhole_acks | ingress throughput | -0.89 | [-0.98, -0.79] |
➖ | file_to_blackhole | egress throughput | -1.07 | [-3.45, +1.31] |
➖ | syslog_log2metric_tag_cardinality_limit_blackhole | ingress throughput | -1.08 | [-1.21, -0.95] |
➖ | syslog_loki | ingress throughput | -1.10 | [-1.17, -1.03] |
➖ | syslog_splunk_hec_logs | ingress throughput | -2.52 | [-2.62, -2.42] |
Explanation
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:
- Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
- Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
- Its configuration does not mark it "erratic".