Migrate to Bazel docker rules

Open pudelkoM opened this issue 5 years ago • 5 comments

This PR migrates from Dockerfiles and bash scripts to rules_docker.

The resulting image can be loaded onto a switch and starts fine.

Docker inspect
# docker image inspect bazel/stratum/hal/bin/barefoot:stratum_bfrt_docker
[
    {
        "Id": "sha256:0aa0511807a5777be5feba99c297089106cad8a9d88f8aa582764c176564a137",
        "RepoTags": [
            "bazel/stratum/hal/bin/barefoot:stratum_bfrt_docker"
        ],
        "RepoDigests": [],
        "Parent": "",
        "Comment": "",
        "Created": "2021-01-05T02:40:04Z",
        "Container": "",
        "ContainerConfig": {
            "Hostname": "",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": null,
            "Cmd": null,
            "Image": "",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": null
        },
        "DockerVersion": "",
        "Author": "Bazel",
        "Config": {
            "Hostname": "",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "ExposedPorts": {
                "28000/tcp": {},
                "9339/tcp": {},
                "9559/tcp": {}
            },
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": null,
            "Cmd": [
                "/bin/bash"
            ],
            "Image": "14c86ccce4abfac016d7c28c5640ac6814ef21f76b8e1be0c345d3a068da2876",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": [
                "start-stratum.sh"
            ],
            "OnBuild": null,
            "Labels": {
                "bf-sde-version": "9.2.0",
                "build-machine": "02d908cc597c",
                "build-timestamp": "1609814404",
                "description": "This Docker image includes runtime library for Barefoot Tofino switches",
                "maintainer": "Stratum dev <[email protected]>",
                "org.opencontainers.image.revision": "todo",
                "org.opencontainers.image.source": "todo",
                "org.opencontainers.image.version": "todo",
                "stratum-target": "stratum-bfrt"
            }
        },
        "Architecture": "amd64",
        "Os": "linux",
        "Size": 149776775,
        "VirtualSize": 149776775,
        "GraphDriver": {
            "Data": {
                "LowerDir": "/var/lib/docker/overlay2/af45678163d48fe6ecec0ee0784d6bec3c26263e400f34e12ac68564f050b69e/diff:/var/lib/docker/overlay2/db533bd8baa5cca7a7c3082f147c78cf70844a5db12a084b0afa1d7204964bd9/diff",
                "MergedDir": "/var/lib/docker/overlay2/f5c2221c5d8c6fa3ddbd4ab2506296c8cdb6868f68e2282d629cad428db87b3a/merged",
                "UpperDir": "/var/lib/docker/overlay2/f5c2221c5d8c6fa3ddbd4ab2506296c8cdb6868f68e2282d629cad428db87b3a/diff",
                "WorkDir": "/var/lib/docker/overlay2/f5c2221c5d8c6fa3ddbd4ab2506296c8cdb6868f68e2282d629cad428db87b3a/work"
            },
            "Name": "overlay2"
        },
        "RootFS": {
            "Type": "layers",
            "Layers": [
                "sha256:b2be8bdd1ef94f1a3115419a827078473a3a9eb15d291f8f08878828a4be3e9b",
                "sha256:e46c47b94c070a764e5eaf1c8430e6baa17dbfc26939547234132b4729c6e9fc",
                "sha256:b3f5b8edb2cbe58c5cc7ace2308e4ec53a58db86d368e0b02eb6ec52a934edcd"
            ]
        },
        "Metadata": {
            "LastTagTime": "0001-01-01T00:00:00Z"
        }
    }
]
Stratum logs
# ./start-stratum-container.sh --bf-sim --bf-switchd-background=false --incompatible_enable_register_reset_annotations
+ [[ -z '' ]]
+ [[ -f /etc/onl/platform ]]
++ cat /etc/onl/platform
+ PLATFORM=x86-64-accton-wedge100bf-32qs-r0
+ '[' -d /etc/onl ']'
++ awk '{print "-v " $1 ":" $1 " " }'
++ ls /lib/x86_64-linux-gnu/libonlp-platform-defaults.so /lib/x86_64-linux-gnu/libonlp-platform-defaults.so.1 /lib/x86_64-linux-gnu/libonlp-platform.so /lib/x86_64-linux-gnu/libonlp-platform.so.1 /lib/x86_64-linux-gnu/libonlp.so /lib/x86_64-linux-gnu/libonlp.so.1 /lib/x86_64-linux-gnu/libonlp-x86-64-accton-wedge100bf-32qs.so.1
+ ONLP_MOUNT='-v /lib/x86_64-linux-gnu/libonlp-platform-defaults.so:/lib/x86_64-linux-gnu/libonlp-platform-defaults.so
-v /lib/x86_64-linux-gnu/libonlp-platform-defaults.so.1:/lib/x86_64-linux-gnu/libonlp-platform-defaults.so.1
-v /lib/x86_64-linux-gnu/libonlp-platform.so:/lib/x86_64-linux-gnu/libonlp-platform.so
-v /lib/x86_64-linux-gnu/libonlp-platform.so.1:/lib/x86_64-linux-gnu/libonlp-platform.so.1
-v /lib/x86_64-linux-gnu/libonlp.so:/lib/x86_64-linux-gnu/libonlp.so
-v /lib/x86_64-linux-gnu/libonlp.so.1:/lib/x86_64-linux-gnu/libonlp.so.1
-v /lib/x86_64-linux-gnu/libonlp-x86-64-accton-wedge100bf-32qs.so.1:/lib/x86_64-linux-gnu/libonlp-x86-64-accton-wedge100bf-32qs.so.1 '
+ ONLP_MOUNT='-v /lib/x86_64-linux-gnu/libonlp-platform-defaults.so:/lib/x86_64-linux-gnu/libonlp-platform-defaults.so
-v /lib/x86_64-linux-gnu/libonlp-platform-defaults.so.1:/lib/x86_64-linux-gnu/libonlp-platform-defaults.so.1
-v /lib/x86_64-linux-gnu/libonlp-platform.so:/lib/x86_64-linux-gnu/libonlp-platform.so
-v /lib/x86_64-linux-gnu/libonlp-platform.so.1:/lib/x86_64-linux-gnu/libonlp-platform.so.1
-v /lib/x86_64-linux-gnu/libonlp.so:/lib/x86_64-linux-gnu/libonlp.so
-v /lib/x86_64-linux-gnu/libonlp.so.1:/lib/x86_64-linux-gnu/libonlp.so.1
-v /lib/x86_64-linux-gnu/libonlp-x86-64-accton-wedge100bf-32qs.so.1:/lib/x86_64-linux-gnu/libonlp-x86-64-accton-wedge100bf-32qs.so.1                -v /lib/platform-config:/lib/platform-config               -v /etc/onl:/etc/onl'
+ '[' -n '' ']'
+ '[' -n /root/chassis_config.pb.txt ']'
+ CHASSIS_CONFIG_MOUNT='-v /root/chassis_config.pb.txt:/etc/stratum/x86-64-accton-wedge100bf-32qs-r0/chassis_config.pb.txt'
+ LOG_DIR=/var/log
+ SDE_VERSION=9.2.0
+ DOCKER_IMAGE=bazel/stratum/hal/bin/barefoot
+ DOCKER_IMAGE_TAG=stratum_bfrt_docker
++ uname -r
++ uname -r
+ docker run -it --rm --privileged -v /dev:/dev -v /sys:/sys -v /lib/modules/4.14.49-OpenNetworkLinux:/lib/modules/4.14.49-OpenNetworkLinux --env PLATFORM=x86-64-accton-wedge100bf-32qs-r0 -v /lib/x86_64-linux-gnu/libonlp-platform-defaults.so:/lib/x86_64-linux-gnu/libonlp-platform-defaults.so -v /lib/x86_64-linux-gnu/libonlp-platform-defaults.so.1:/lib/x86_64-linux-gnu/libonlp-platform-defaults.so.1 -v /lib/x86_64-linux-gnu/libonlp-platform.so:/lib/x86_64-linux-gnu/libonlp-platform.so -v /lib/x86_64-linux-gnu/libonlp-platform.so.1:/lib/x86_64-linux-gnu/libonlp-platform.so.1 -v /lib/x86_64-linux-gnu/libonlp.so:/lib/x86_64-linux-gnu/libonlp.so -v /lib/x86_64-linux-gnu/libonlp.so.1:/lib/x86_64-linux-gnu/libonlp.so.1 -v /lib/x86_64-linux-gnu/libonlp-x86-64-accton-wedge100bf-32qs.so.1:/lib/x86_64-linux-gnu/libonlp-x86-64-accton-wedge100bf-32qs.so.1 -v /lib/platform-config:/lib/platform-config -v /etc/onl:/etc/onl -p 28000:28000 -p 9339:9339 -p 9559:9559 -v /root/chassis_config.pb.txt:/etc/stratum/x86-64-accton-wedge100bf-32qs-r0/chassis_config.pb.txt -v /var/log:/var/log/stratum bazel/stratum/hal/bin/barefoot:stratum_bfrt_docker --bf-sim --bf-switchd-background=false --bf-sim --bf-switchd-background=false --incompatible_enable_register_reset_annotations
Mounting hugepages...
bf_kdrv_mod found! Unloading first...
loading bf_kdrv_mod...
bf_sysfs_fname /sys/class/bf/bf0/device/dev_add
Install dir: /usr (0x558157952020)
bf_switchd: system services initialized
bf_switchd: loading conf_file /usr/share/stratum/tofino_skip_p4_no_bsp.conf...
bf_switchd: processing device configuration...
Configuration for dev_id 0
  Family        : Tofino
  pci_sysfs_str : /sys/devices/pci0000:00/0000:00:03.0/0000:05:00.0
  pci_domain    : 0
  pci_bus       : 5
  pci_fn        : 0
  pci_dev       : 0
  pci_int_mode  : 1
  sbus_master_fw: /usr/
  pcie_fw       : /usr/
  serdes_fw     : /usr/
  sds_fw_path   : /usr/
  microp_fw_path:
bf_switchd: processing P4 configuration...
P4 profile for dev_id 0
  p4_name: dummy
    libpd:
    libpdthrift:
    context:
    config:
  diag:
  accton diag:
  board-port-map: /usr/share/port_map.json
  non_default_port_ppgs: 0
  SAI default initialize: 1
board-port-map path: /usr/share/port_map.json
bf_switchd: loading board-map conf file /usr/share/port_map.json...
enable-debug-log:yes
Parsing Board-lane-entries 33 found
num-of-connectors:33

Connector->1 device_id:0 mac_block:23 media:copper enable-auto-neg:1
lane0 mac_ch:0 tx_lane:2, rx_lane:2, tx_pn_swap:1 rx_pn_swap:1
tx_eq_pre:6 tx_eq_post:0 tx_eq_attn:0
lane1 mac_ch:1 tx_lane:0, rx_lane:0, tx_pn_swap:1 rx_pn_swap:1
tx_eq_pre:6 tx_eq_post:0 tx_eq_attn:0
lane2 mac_ch:2 tx_lane:3, rx_lane:3, tx_pn_swap:1 rx_pn_swap:0
tx_eq_pre:6 tx_eq_post:0 tx_eq_attn:0
lane3 mac_ch:3 tx_lane:1, rx_lane:1, tx_pn_swap:1 rx_pn_swap:0
tx_eq_pre:6 tx_eq_post:0 tx_eq_attn:0

Operational mode set to ASIC
ASIC detected at PCI /sys/class/bf/bf0/device
ASIC pci device id is 16
bf_switchd: drivers initialized
Skipping P4 program load for dev_id 0
detecting.. IOMMU not enabled on the platform
Setting core_pll_ctrl0=cd44cbfe

bf_switchd: dev_id 0 initialized

bf_switchd: initialized 1 devices
Skip p4 lib init
Skip mav diag lib init
bf_switchd: spawning cli server thread
bf_switchd: spawning driver shell
bf_switchd: server started - listening on port 9999
I20210105 02:45:52.917454     1 main_bfrt.cc:78] switchd started successfully
W20210105 02:45:52.917532     1 credentials_manager.cc:45] Using insecure server credentials
Cannot read termcap database;
using dumb terminal settings.
I20210105 02:45:52.917764     1 hal.cc:128] Setting up HAL in COLDBOOT mode...
I20210105 02:45:52.917840     1 config_monitoring_service.cc:94] Pushing the saved chassis config read from /etc/stratum/x86-64-accton-wedge100bf-32qs-r0/chassis_config.pb.txt...
bf-sde> I20210105 02:45:52.920886     1 bf_chassis_manager.cc:986] Port status notification callback registered successfully
I20210105 02:45:52.921017     1 bf_chassis_manager.cc:87] Adding port 36 in node 1 (SDK Port 36).
I20210105 02:45:53.333361     1 bf_chassis_manager.cc:116] Enabling port 36 in node 1 (SDK Port 36).
I20210105 02:45:53.333573     1 bf_chassis_manager.cc:87] Adding port 37 in node 1 (SDK Port 37).
I20210105 02:45:53.746049     1 bf_chassis_manager.cc:116] Enabling port 37 in node 1 (SDK Port 37).
I20210105 02:45:53.746248     1 bf_chassis_manager.cc:87] Adding port 272 in node 1 (SDK Port 272).
I20210105 02:45:54.165330     1 bf_chassis_manager.cc:116] Enabling port 272 in node 1 (SDK Port 272).
I20210105 02:45:54.165588     1 bf_chassis_manager.cc:87] Adding port 280 in node 1 (SDK Port 280).
I20210105 02:45:54.583974     1 bf_chassis_manager.cc:116] Enabling port 280 in node 1 (SDK Port 280).
I20210105 02:45:54.584111     1 bf_chassis_manager.cc:87] Adding port 256 in node 1 (SDK Port 256).
I20210105 02:45:55.002215     1 bf_chassis_manager.cc:116] Enabling port 256 in node 1 (SDK Port 256).
I20210105 02:45:55.002486     1 bf_chassis_manager.cc:87] Adding port 264 in node 1 (SDK Port 264).
I20210105 02:45:55.421836     1 bf_chassis_manager.cc:116] Enabling port 264 in node 1 (SDK Port 264).
I20210105 02:45:55.421993     1 bfrt_switch.cc:60] Chassis config pushed successfully.
I20210105 02:45:55.422106    41 bf_chassis_manager.cc:879] State of port 36 in node 1 (SDK port 36): UP.
I20210105 02:45:55.425225     1 p4_service.cc:121] Pushing the saved forwarding pipeline configs read from /var/run/stratum/pipeline_cfg.pb.txt...
E20210105 02:45:55.425290     1 utils.cc:109] StratumErrorSpace::ERR_FILE_NOT_FOUND: /var/run/stratum/pipeline_cfg.pb.txt not found.
E20210105 02:45:55.425487     1 utils.cc:65] Return Error: ReadFileToString(filename, &text) failed with StratumErrorSpace::ERR_FILE_NOT_FOUND: /var/run/stratum/pipeline_cfg.pb.txt not found.
W20210105 02:45:55.425506     1 p4_service.cc:130] No saved forwarding pipeline config found at /var/run/stratum/pipeline_cfg.pb.txt. This is normal when the switch is just installed and no master controller is connected yet.
E0105 02:45:55.426722924       1 server_chttp2.cc:40]        {"created":"@1609814755.426684341","description":"Only 1 addresses added out of total 2 resolved","file":"external/com_github_grpc_grpc/src/core/ext/transport/chttp2/server/chttp2_server.cc","file_line":406,"referenced_errors":[{"created":"@1609814755.426680271","description":"Address family not supported by protocol","errno":97,"file":"external/com_github_grpc_grpc/src/core/lib/iomgr/socket_utils_common_posix.cc","file_line":403,"os_error":"Address family not supported by protocol","syscall":"socket","target_address":"[::1]:28000"}]}
E20210105 02:45:55.426998     1 hal.cc:221] Stratum external facing services are listening to 0.0.0.0:28000, 0.0.0.0:9339, 0.0.0.0:9559, localhost:28000...
E20210105 02:45:55.427639     1 hal.cc:363] StratumErrorSpace::ERR_INTERNAL: Failed to check in with procmon: failed to connect to all addresses
E20210105 02:45:55.427714     1 hal.cc:231] Error when checking in with procmon: Failed to check in with procmon: failed to connect to all addresses.
I20210105 02:45:55.539577    41 bf_chassis_manager.cc:879] State of port 37 in node 1 (SDK port 37): UP.
I20210105 02:45:56.791774    41 bf_chassis_manager.cc:879] State of port 280 in node 1 (SDK port 280): UP.
I20210105 02:45:56.823993    41 bf_chassis_manager.cc:879] State of port 256 in node 1 (SDK port 256): UP.
I20210105 02:45:56.861732    41 bf_chassis_manager.cc:879] State of port 264 in node 1 (SDK port 264): UP.
I20210105 02:45:57.342635    41 bf_chassis_manager.cc:879] State of port 272 in node 1 (SDK port 272): UP.

bf-sde>

These rules require access to a running Docker daemon because we have to run the installation scripts of the debs.

TODOs:

  • [ ] Make Bazel run as a normal user, not via sudo (currently needed for Docker socket access). This will require some custom setup: --group-add $(getent group docker | cut -d: -f3) (Linux)
  • [x] Figure out how to make Docker-in-Docker work (setup_dev_env.sh). Mount Docker socket into dev container.
  • [x] Fix repository name and tag. Currently: stratumproject/stratum/hal/bin/barefoot:stratum_bfrt_docker. Fixed in the push command.
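
The first TODO could be approached roughly as follows. This is only a sketch, not what the PR implements: the function name `docker_socket_args` is hypothetical, and the socket path `/var/run/docker.sock` is an assumed default. It prints the extra `docker run` arguments that mount the host's Docker socket and add the container user to the host's docker group, using the same `getent` lookup as above.

```shell
# Sketch: print the extra `docker run` arguments that let a non-root user
# inside the dev container reach the host's Docker socket.
# $1: GID of the docker group (default: looked up with getent, Linux only)
# $2: path of the Docker socket (default is an assumption, not from the PR)
docker_socket_args() {
  local gid sock
  gid="${1:-$(getent group docker | cut -d: -f3)}"
  sock="${2:-/var/run/docker.sock}"
  if [ -z "$gid" ]; then
    echo "docker group not found on this host" >&2
    return 1
  fi
  # Mount the socket at the same path and join the matching group.
  printf '%s %s:%s %s %s\n' -v "$sock" "$sock" --group-add "$gid"
}
```

The output is meant to be spliced unquoted into the `docker run` invocation of the dev environment script, for example `docker run $(docker_socket_args) ...`.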

pudelkoM avatar Jan 05 '21 02:01 pudelkoM

Codecov Report

Merging #489 (0c2508f) into master (41f3f04) will not change coverage. The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #489   +/-   ##
=======================================
  Coverage   78.61%   78.61%           
=======================================
  Files         331      331           
  Lines       28822    28822           
=======================================
  Hits        22658    22658           
  Misses       6164     6164           

codecov[bot] avatar Jan 05 '21 06:01 codecov[bot]

For reference: https://github.com/bazelbuild/rules_docker/issues/1575

pudelkoM avatar Jan 06 '21 01:01 pudelkoM

The "re-use deb and docker run" approach creates an image of nearly the same size as our current approach:

REPOSITORY                                     TAG                       IMAGE ID            CREATED             SIZE
stratumproject/stratum/hal/bin/barefoot        stratum_bfrt_final        26821ee36074        45 seconds ago      227MB
stratumproject/stratum-bfrt                    20.12-9.3.0               99487cd66ac9        2 weeks ago         228MB

There is still some room for improvement according to dive; almost all of the wasted space comes from the .deb file:

# env CI=true dive 26821ee36074
  Using default CI config
Image Source: docker://26821ee36074
Fetching image... (this can take a while for large images)
Analyzing image...
  efficiency: 65.0752 %
  wastedBytes: 81165627 bytes (81 MB)
  userWastedPercent: 46.7508 %
Inefficient Files:
Count  Wasted Space  File Path
    2         78 MB  /stratum_bfrt_deb.deb
    2        1.5 MB  /var/cache/debconf/templates.dat
    2        1.5 MB  /var/cache/debconf/templates.dat-old
    2        153 kB  /var/lib/dpkg/status
    2        153 kB  /var/lib/dpkg/status-old
    2         70 kB  /var/lib/dpkg/info/perl-base.list
    2         24 kB  /var/lib/dpkg/info/passwd.list
    2         21 kB  /var/lib/dpkg/info/util-linux.list
    2         19 kB  /var/lib/dpkg/info/apt.list
    2         18 kB  /var/lib/dpkg/info/coreutils.list
    2         16 kB  /var/lib/dpkg/info/login.list
    2         16 kB  /var/lib/dpkg/info/dpkg.list
    2         16 kB  /var/lib/dpkg/info/libpam-runtime.list
    2         15 kB  /var/lib/dpkg/info/debconf.list
    2         14 kB  /etc/ld.so.cache
    2         12 kB  /var/lib/dpkg/info/adduser.list
    2        9.1 kB  /var/cache/debconf/config.dat
    2        9.1 kB  /var/lib/dpkg/info/libpam-modules:amd64.list
    2        8.9 kB  /var/cache/debconf/config.dat-old
    2        8.6 kB  /var/lib/dpkg/info/libapt-pkg5.0:amd64.list
    2        8.1 kB  /var/lib/dpkg/info/grep.list
    2        8.1 kB  /var/lib/dpkg/info/bash.list
    2        7.8 kB  /var/lib/dpkg/info/findutils.list
    2        7.3 kB  /var/lib/dpkg/info/tar.list
    2        7.2 kB  /var/lib/dpkg/info/sed.list
    2        6.3 kB  /var/lib/dpkg/info/diffutils.list
    2        6.0 kB  /var/lib/dpkg/info/debianutils.list
    2        4.0 kB  /var/lib/dpkg/info/libgpg-error0:amd64.list
    2        3.1 kB  /var/lib/dpkg/info/sensible-utils.list
    2        2.7 kB  /var/lib/dpkg/info/libc-bin.list
    2        2.2 kB  /var/lib/dpkg/info/mount.list
    2        2.2 kB  /var/lib/dpkg/info/base-passwd.list
    2        2.2 kB  /etc/passwd
    2        2.1 kB  /etc/passwd-
    2        2.0 kB  /var/lib/dpkg/info/gzip.list
    2        2.0 kB  /var/lib/dpkg/info/init-system-helpers.list
    2        1.7 kB  /var/lib/dpkg/info/bsdutils.list
    2        1.6 kB  /var/lib/dpkg/info/insserv.list
    2        1.5 kB  /var/lib/dpkg/info/mawk.list
    2        1.4 kB  /var/lib/dpkg/info/libpam-modules-bin.list
    2        1.2 kB  /var/lib/dpkg/info/libpcre3:amd64.list
    2        1.2 kB  /var/lib/apt/extended_states
    2        1.2 kB  /etc/shadow-
    2        1.2 kB  /etc/shadow
    2        1.1 kB  /var/lib/dpkg/info/sysvinit-utils.list
    2        1.0 kB  /etc/group
    2         985 B  /etc/group-
    2         948 B  /var/lib/dpkg/info/hostname.list
    2         851 B  /etc/gshadow
    2         816 B  /var/lib/dpkg/info/dash.list
    2         740 B  /var/lib/dpkg/info/startpar.list
    2         644 B  /var/lib/dpkg/info/libsemanage-common.list
    2         578 B  /var/lib/dpkg/info/libaudit-common.list
    2         540 B  /var/lib/dpkg/info/gpgv.list
    2          82 B  /etc/init.d/.depend.boot
    2          48 B  /etc/init.d/.depend.start
    2          42 B  /etc/init.d/.depend.stop
    2           0 B  /var/cache/apt
    2           0 B  /var/lib/dpkg/triggers/Lock
    2           0 B  /tmp
    2           0 B  /var/cache/debconf/passwords.dat
    2           0 B  /var/lib/dpkg/updates
    2           0 B  /var/lib/apt/lists
    2           0 B  /etc/.pwd.lock
    2           0 B  /var/lib/dpkg/lock
Results:
  FAIL: highestUserWastedPercent: too many bytes wasted, relative to the user bytes added (%-user-wasted-bytes=0.4675079879226798 > threshold=0.1)
  SKIP: highestWastedBytes: rule disabled
  FAIL: lowestEfficiency: image efficiency is too low (efficiency=0.6507519729927197 < threshold=0.9)
Result:FAIL [Total:3] [Passed:0] [Failed:2] [Warn:0] [Skipped:1]

pudelkoM avatar Jan 06 '21 02:01 pudelkoM

There are some issues with CircleCI and the Docker socket: container_run_and_commit doesn't seem to pick up the Docker socket. Maybe this PR has to wait until the migration to Jenkins.

CircleCI provides the docker socket in an env var: DOCKER_HOST=tcp://35.229.50.57:2376

By default, Bazel uses an empty environment for reproducible builds. By passing through the relevant variables, we should have access to Docker again: --action_env=DOCKER_HOST
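
A minimal sketch of how the pass-through could be scripted in CI. The helper name `bazel_docker_env_flags` is hypothetical, and forwarding `DOCKER_TLS_VERIFY` and `DOCKER_CERT_PATH` alongside `DOCKER_HOST` is an assumption (port 2376 suggests a TLS endpoint); only `DOCKER_HOST` is mentioned above.

```shell
# Sketch: emit one --action_env=NAME flag for each Docker client variable
# that is exported in the current environment. Without a value,
# --action_env=NAME makes Bazel take NAME from the environment it was
# invoked in, so unset variables are skipped.
bazel_docker_env_flags() {
  local flags var
  flags=""
  # DOCKER_TLS_VERIFY and DOCKER_CERT_PATH are assumed to be relevant too.
  for var in DOCKER_HOST DOCKER_TLS_VERIFY DOCKER_CERT_PATH; do
    if printenv "$var" >/dev/null 2>&1; then
      flags="${flags:+$flags }--action_env=$var"
    fi
  done
  printf '%s\n' "$flags"
}
```

It would be used as `bazel build $(bazel_docker_env_flags) //stratum/hal/bin/barefoot:stratum_bfrt_docker`, where the target label is inferred from the image tag shown earlier and may differ in the actual BUILD file.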

pudelkoM avatar Jan 12 '21 01:01 pudelkoM

We still need to figure out local development.

bocon13 avatar Jul 15 '21 20:07 bocon13