Issues cross-compiling for Nerves

Open lawik opened this issue 1 year ago • 11 comments

Relevant output log:

Error seems to be:

  aarch64-nerves-linux-gnu-gcc: error: unrecognized command-line option '-m64'
  thread 'main' panicked at /home/lawik/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ring-0.16.20/build.rs:656:9:

This? https://github.com/briansmith/ring/issues/2009

...
   Compiling zip v0.6.4
   Compiling tar v0.4.38
error: failed to run custom build command for `ring v0.16.20`

Caused by:
  process didn't exit successfully: `/home/lawik/nervescloud/nerves_cloud_kiosk/_build/frio_rpi4_dev/lib/ortex/native/ortex/release/build/ring-e238304c7cffd031/build-script-build` (exit status: 101)
  --- stdout
  OPT_LEVEL = Some("0")
  TARGET = Some("x86_64-unknown-linux-gnu")
  HOST = Some("x86_64-unknown-linux-gnu")
  cargo:rerun-if-env-changed=CC_x86_64-unknown-linux-gnu
  CC_x86_64-unknown-linux-gnu = None
  cargo:rerun-if-env-changed=CC_x86_64_unknown_linux_gnu
  CC_x86_64_unknown_linux_gnu = None
  cargo:rerun-if-env-changed=HOST_CC
  HOST_CC = None
  cargo:rerun-if-env-changed=CC
  CC = Some("/home/lawik/projects/nerves_systems/o/membrane-rpi4/host/bin/aarch64-nerves-linux-gnu-gcc")
  cargo:rerun-if-env-changed=CFLAGS_x86_64-unknown-linux-gnu
  CFLAGS_x86_64-unknown-linux-gnu = None
  cargo:rerun-if-env-changed=CFLAGS_x86_64_unknown_linux_gnu
  CFLAGS_x86_64_unknown_linux_gnu = None
  cargo:rerun-if-env-changed=HOST_CFLAGS
  HOST_CFLAGS = None
  cargo:rerun-if-env-changed=CFLAGS
  CFLAGS = Some("-mabi=lp64 -fstack-protector-strong -mcpu=cortex-a72 -fPIE -pie -Wl,-z,now -Wl,-z,relro -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64  -pipe -O2 --sysroot /home/lawik/projects/nerves_systems/o/membrane-rpi4/staging")
  cargo:rerun-if-env-changed=CRATE_CC_NO_DEFAULTS
  CRATE_CC_NO_DEFAULTS = None
  DEBUG = Some("false")
  CARGO_CFG_TARGET_FEATURE = Some("fxsr,sse,sse2")

  --- stderr
  running "/home/lawik/projects/nerves_systems/o/membrane-rpi4/host/bin/aarch64-nerves-linux-gnu-gcc" "-O0" "-ffunction-sections" "-fdata-sections" "-fPIC" "-m64" "-mabi=lp64" "-fstack-protector-strong" "-mcpu=cortex-a72" "-fPIE" "-pie" "-Wl,-z,now" "-Wl,-z,relro" "-D_LARGEFILE_SOURCE" "-D_LARGEFILE64_SOURCE" "-D_FILE_OFFSET_BITS=64" "-pipe" "-O2" "--sysroot" "/home/lawik/projects/nerves_systems/o/membrane-rpi4/staging" "-I" "include" "-pedantic" "-pedantic-errors" "-Wall" "-Wextra" "-Wcast-align" "-Wcast-qual" "-Wconversion" "-Wenum-compare" "-Wfloat-equal" "-Wformat=2" "-Winline" "-Winvalid-pch" "-Wmissing-field-initializers" "-Wmissing-include-dirs" "-Wredundant-decls" "-Wshadow" "-Wsign-compare" "-Wsign-conversion" "-Wundef" "-Wuninitialized" "-Wwrite-strings" "-fno-strict-aliasing" "-fvisibility=hidden" "-fstack-protector" "-g3" "-DNDEBUG" "-c" "-o/home/lawik/nervescloud/nerves_cloud_kiosk/_build/frio_rpi4_dev/lib/ortex/native/ortex/release/build/ring-b8b4d4cd8b26833b/out/aesni-x86_64-elf.o" "/home/lawik/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ring-0.16.20/pregenerated/aesni-x86_64-elf.S"
  aarch64-nerves-linux-gnu-gcc: error: unrecognized command-line option '-m64'
  thread 'main' panicked at /home/lawik/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ring-0.16.20/build.rs:656:9:
  execution failed
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
warning: build failed, waiting for other jobs to finish...
error: failed to run custom build command for `ring v0.16.20`

lawik avatar Nov 25 '24 14:11 lawik

It also reproduces if you:

mix archive.install hex nerves_bootstrap
mix nerves.new ccortex
cd ccortex
# add to mix.exs deps: ortex and nx
export MIX_TARGET=rpi4
mix deps.get
mix firmware

lawik avatar Nov 25 '24 14:11 lawik

Confirmed that the issue exists on 0.1.10, which is the current latest :)

lawik avatar Dec 09 '24 09:12 lawik

It seems weird that the log output indicates the TARGET to be x86.
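A quick way to check what the build script saw is to filter the log for the lines it prints in the `NAME = Some(...)` form shown above. This is only a sketch: it assumes the build output was saved to a file, and the `build_vars` name and the log path are hypothetical.

```shell
#!/bin/sh
# Print the TARGET, HOST, CC and CFLAGS lines that a cargo build script
# echoed into a saved build log, with leading indentation removed.
# "build_vars" is an illustrative helper name, not part of any tool.
build_vars() {
  grep -E '^[[:space:]]*(TARGET|HOST|CC|CFLAGS) = ' "$1" | sed 's/^[[:space:]]*//'
}

# Usage (log path is an assumption):
#   mix firmware 2>&1 | tee build.log
#   build_vars build.log
```

If TARGET and HOST come out identical while CC points at the aarch64 compiler, the build script is compiling for the host with the cross compiler, which matches the '-m64' failure.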

lawik avatar Dec 09 '24 09:12 lawik

I think this tracks. I'm guessing it has to do with there not being a path in ort to auto-download the shared libonnxruntime library for ARM (if not, I bet it will fail when it gets there). Getting ortex running on Nerves for rpi4 (and Jetson boards) is something I'd love to see, and there's no reason it shouldn't work. We'll just need to build the correct libonnxruntime and add logic to the build steps.

I may poke at this over the holidays if I can find the time :wink:

mortont avatar Dec 10 '24 16:12 mortont

Would love it if you can :)

lawik avatar Dec 10 '24 17:12 lawik

The initial problem is not the libonnxruntime. The initial problem is the TLS library for the HTTP client.

I think I fixed it:

index 4052470..f401cd0 100644
--- a/native/ortex/Cargo.toml
+++ b/native/ortex/Cargo.toml
@@ -11,16 +11,12 @@ crate-type = ["cdylib"]
 
 [dependencies]
 rustler = "0.29.0"
-ort = { version = "2.0.0-rc.8" }
+ort = { version = "2.0.0-rc.8", default-features = false, features = [ "half", "ndarray", "copy-dylibs", "load-dynamic" ] }
 ndarray = "0.16.1"
 half = "2.2.1"
 tracing-subscriber = { version = "0.3", features = [ "env-filter", "fmt" ] }
 num-traits = "0.2.15"
-rustls = "0.22.4"
 
 [features]
 # ONNXRuntime Execution providers
-directml = ["ort/directml"]
-coreml = ["ort/coreml"]
-cuda = ["ort/cuda"]
-tensorrt = ["ort/tensorrt"]
+acl = ["ort/acl"]

The important bit is default-features = false, which removes the download-binaries behavior. I tried adding the ARM execution provider acl, but I'm not sure that's actually meaningful right now.

I added the libonnxruntime, not sure that helped:

export ORT_LIB_LOCATION="/home/lawik/Downloads/onnxruntime-linux-aarch64-1.20.1"

But now it builds with warnings. But Nerves detects a real problem:

|nerves| Building OTP Release...

* [Nerves] validating vm.args
* skipping runtime configuration (config/runtime.exs not found)
* creating _build/rpi4_dev/rel/ccortex/releases/0.1.0/vm.args
Updating base firmware image with Erlang release...
scrub-otp-release.sh: ERROR: Unexpected executable format for '/home/lawik/ccortex/_build/rpi4_dev/_nerves-tmp/rootfs_overlay/srv/erlang/lib/ortex-0.1.10/priv/native/libortex.so'

Got:
 readelf:Advanced Micro Devices X86-64;0x0

Expecting:
 readelf:AArch64;0x0

This file was compiled for the host or a different target and probably
will not work.

I will check if I just need to clean up something in my project but I suspect ortex needs a hint to build for ARM.
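To check a built library by hand, the machine type can be read straight from the ELF header. The sketch below uses od rather than readelf so it has no binutils dependency. It assumes a little-endian ELF file (true for both x86-64 and AArch64 Linux), and the `elf_machine` name is hypothetical. The codes 62 and 183 are the standard ELF e_machine values for x86-64 and AArch64.

```shell
#!/bin/sh
# Print the ELF e_machine value of a file (bytes 18-19 of the ELF header).
# Assumes a little-endian ELF. "elf_machine" is an illustrative name.
elf_machine() {
  # od prints the two bytes as unsigned decimals, e.g. " 183   0";
  # "set --" splits them into $1 (low byte) and $2 (high byte).
  set -- $(od -An -tu1 -j18 -N2 "$1")
  code=$(( $1 + $2 * 256 ))
  case "$code" in
    62)  echo "x86-64 ($code)" ;;
    183) echo "AArch64 ($code)" ;;
    *)   echo "other ($code)" ;;
  esac
}

# Usage (path is an assumption):
#   elf_machine priv/native/libortex.so
```

For the rpi4 target the library should report AArch64 (183). A result of x86-64 (62) means the NIF was built for the host, which is what scrub-otp-release.sh is complaining about.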

lawik avatar Jan 03 '25 07:01 lawik

Make sure the aarch64 target listed in ortex's .cargo/config.toml is installed:

rustup target add aarch64-unknown-linux-gnu
export CARGO_BUILD_TARGET=aarch64-unknown-linux-gnu

The error is then in cross-compilation fun-time land:

error: linking with `cc` failed: exit status: 1
  |
  = note: LC_ALL="C" PATH="/home/lawik/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/bin:/home/lawik/.nerves/artifacts/nerves_toolchain_aarch64_nerves_linux_gnu-linux_x86_64-13.2.0/bin:/home/lawik/.asdf/installs/erlang/27.0.1/er
(..snip..)
 = note: /usr/bin/ld: /home/lawik/ccortex/_build/rpi4_dev/lib/ortex/native/ortex/aarch64-unknown-linux-gnu/release/deps/ortex.ortex.3ecc0e640d3743e-cgu.00.rcgu.o: Relocations in generic ELF (EM: 183)
          /usr/bin/ld: /home/lawik/ccortex/_build/rpi4_dev/lib/ortex/native/ortex/aarch64-unknown-linux-gnu/release/deps/ortex.ortex.3ecc0e640d3743e-cgu.00.rcgu.o: Relocations in generic ELF (EM: 183)
          /usr/bin/ld: /home/lawik/ccortex/_build/rpi4_dev/lib/ortex/native/ortex/aarch64-unknown-linux-gnu/release/deps/ortex.ortex.3ecc0e640d3743e-cgu.00.rcgu.o: Relocations in generic ELF (EM: 183)
(.. snip because repeats a lot ..)

My guess is that it is using my local cc and ld instead of the toolchain. But I don't have enough of a grasp on that stuff and Cargo to figure out how to get it right.

lawik avatar Jan 03 '25 08:01 lawik

You may need to set CC and CXX env vars to your cross compiler toolchain compilers? I'm not very familiar with how nerves does cross compilation, admittedly.

mortont avatar Jan 06 '25 20:01 mortont

I believe those should be set but I'm not sure they survive Rustler -> Cargo and friends.

lawik avatar Jan 06 '25 20:01 lawik

Abelino is our savior and found a config that worked:

Goes in your config/target.exs:

config :ortex, Ortex.Native,
  target: "aarch64-unknown-linux-gnu",
  env: [
    {"CC", ""},
    {"CFLAGS", ""},
    {"CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER", "aarch64-nerves-linux-gnu-gcc"}
  ]

lawik avatar Feb 13 '25 07:02 lawik

I have a public example Nerves firmware repo here: abelino/ortex_fw.

For those interested in the details: cargo ignores the user-specified target, whether passed via cargo --target or CARGO_BUILD_TARGET, when the CC and CFLAGS environment variables are defined. When either of those variables exists in the environment, cargo ignores the user-provided target and assumes TARGET is the same as HOST. The trick is to override CC and CFLAGS with an empty string so that cargo picks up the user-provided TARGET. I have yet to dive into why cargo behaves this way; I will eventually look into it.

Nerves also adds the bin directory containing aarch64-nerves-linux-gnu-gcc to the PATH, so we don't need to provide the full path to the linker, just its name, and we can use CARGO_TARGET_<triple>_LINKER for that.
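For other targets, the variable name follows Cargo's convention of upper-casing the triple and replacing hyphens with underscores. A small sketch of that mapping, where the `linker_var` helper name is hypothetical:

```shell
#!/bin/sh
# Build the CARGO_TARGET_<triple>_LINKER variable name for a target triple:
# upper-case the triple and turn hyphens into underscores.
# "linker_var" is an illustrative helper name.
linker_var() {
  triple_upper=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr '-' '_')
  printf 'CARGO_TARGET_%s_LINKER\n' "$triple_upper"
}

# Usage:
#   linker_var aarch64-unknown-linux-gnu
#   prints CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER
```

The printed name is what goes in the env list of the config above, with the cross compiler's name as its value.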

abelino avatar Feb 13 '25 19:02 abelino