
Risc-V compilation fails with "c++: error: '-march=native': ISA string must begin with rv32 or rv64"

Open xawos opened this issue 1 year ago • 6 comments

Full error after executing this step on the SG2000 CPU of a Milk-V Duo S:

[ 33%] Building CXX object CMakeFiles/sd.dir/sd.cpp.o
c++: error: '-march=native': ISA string must begin with rv32 or rv64
gmake[2]: *** [CMakeFiles/sd.dir/build.make:79: CMakeFiles/sd.dir/sd.cpp.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:87: CMakeFiles/sd.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2

In the file OnnxStream/src/build/CMakeFiles/sd.dir/flags.make, line 9 reads CXX_FLAGS = -std=gnu++20 -O3 -march=native. I edited it even though it is autogenerated and told me not to edit it, sue me.

Changing that line to CXX_FLAGS = -std=gnu++20 -O3 -march=rv64imafdc_zicsr_zifencei makes it compile successfully (after adding a swapfile as the device only has 512MB of RAM) with some warnings:

debian@duos:~/OnnxStream/src/build$ cmake --build . --config Release
[ 33%] Building CXX object CMakeFiles/sd.dir/onnxstream.cpp.o
/home/debian/OnnxStream/src/onnxstream.cpp: In destructor "onnxstream::XnnPack::~XnnPack()":
/home/debian/OnnxStream/src/onnxstream.cpp:520:21: warning: "throw" will always call "terminate" [-Wterminate]
  520 |                     throw std::runtime_error("failed to delete operator");
      |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/debian/OnnxStream/src/onnxstream.cpp:520:21: note: in C++11 destructors default to "noexcept"
[ 66%] Linking CXX executable sd
[100%] Built target sd

It does unfortunately explode when executed, as follows:

debian@duos:~$ ./sd
----------------[start]------------------
positive_prompt: a photo of an astronaut riding a horse on mars
negative_prompt: ugly, blurry
output_png_path: ./result.png
steps: 10
seed: 799744
----------------[prompt]------------------
Token: "a</w>"
Token: "photo</w>"
Token: "of</w>"
Token: "an</w>"
Token: "astronaut</w>"
Token: "riding</w>"
Token: "a</w>"
Token: "horse</w>"
Token: "on</w>"
Token: "mars</w>"
Illegal instruction

Last few lines of strace:

futex(0x2aca37754c, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=8992, tv_nsec=90815628}, FUTEX_BITSET_MATCH_ANY) = 0
futex(0x2aca377560, FUTEX_WAKE_PRIVATE, 1) = 0
brk(0x2aca3d2000)                       = 0x2aca3d2000
munmap(0x3fbaf3f000, 151785472)         = 0
brk(0x2aca40c000)                       = 0x2aca40c000
futex(0x2ac9b73e98, FUTEX_WAKE_PRIVATE, 2147483647) = 0
--- SIGILL {si_signo=SIGILL, si_code=ILL_ILLOPC, si_addr=0x2ac9b2aba0} ---
+++ killed by SIGILL +++
Illegal instruction

I see you mention in #10 that the compilation hangs in certain circumstances, and also an Illegal Instruction error in Termux. In this case both happen, but I don't have any occurrence of -march=native.

Not sure if I should continue troubleshooting this, imma throw in the towel for today. I'm opening this issue to help anyone else who runs into the same error and to suffer with me in the meantime.

p.s. love this project, thanks ❤️

xawos avatar Dec 11 '24 00:12 xawos

hi,

as you wrote, to solve the problem of GCC hanging, simply add some swap space.

For the "Illegal Instruction" problem, can you try deleting the build directory and rerunning cmake by specifying MAX_SPEED=OFF? This should build a binary using only generic RISC-V instructions (in GCC, this is equivalent to not specifying march at all).

Vito

vitoplantamura avatar Dec 11 '24 19:12 vitoplantamura
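
For anyone following along, the two suggestions above (add swap, then rebuild from a fresh build directory with MAX_SPEED=OFF) could be sketched roughly as follows. This is only a sketch: the 2G swap size, the /swapfile path and the $HOME directory layout are assumptions, not values from this thread, and the `run` helper is a hypothetical wrapper that prints each command instead of executing it unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: swap size, swapfile path and directory layout are assumptions.
# With DRY_RUN=1 (the default) each command is printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create and enable a swapfile so the compiler has more memory to work with.
add_swap() {
    size="${1:-2G}"          # assumed size
    file="${2:-/swapfile}"   # assumed location
    run sudo fallocate -l "$size" "$file"
    run sudo chmod 600 "$file"
    run sudo mkswap "$file"
    run sudo swapon "$file"
}

# Delete the old build directory, then configure and build with MAX_SPEED=OFF.
clean_rebuild() {
    src="${1:-$HOME/OnnxStream/src}"   # assumed checkout location
    xnnpack="${2:-$HOME/XNNPACK}"      # assumed XNNPACK location
    run rm -rf "$src/build"
    run mkdir -p "$src/build"
    run cmake -S "$src" -B "$src/build" -DMAX_SPEED=OFF -DXNNPACK_DIR="$xnnpack"
    run cmake --build "$src/build" --config Release
}
```

Calling `add_swap` and then `clean_rebuild` prints the commands for review; running the script with DRY_RUN=0 executes them, including the `rm -rf` of the build directory, so check the paths first. On filesystems where `fallocate` does not produce a usable swapfile, creating the file with `dd` is the usual fallback.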

Thanks for the suggestion. I tried that yesterday but without deleting the files; for some reason I thought that was sufficient. I just tried again after deleting and recreating the build folder, and unfortunately it gives the same error. Adding the output with the --ops-printf parameter:

debian@duos:~$ ./sd --ops-printf
----------------[start]------------------
positive_prompt: a photo of an astronaut riding a horse on mars
negative_prompt: ugly, blurry
output_png_path: ./result.png
steps: 10
seed: 821248
----------------[prompt]------------------
Token: "a</w>"
Token: "photo</w>"
Token: "of</w>"
Token: "an</w>"
Token: "astronaut</w>"
Token: "riding</w>"
Token: "a</w>"
Token: "horse</w>"
Token: "on</w>"
Token: "mars</w>"
#0) Reshape (Reshape_113)
#1) Gather (Gather_114)
#2) Add (Add_116)
Illegal instruction

xawos avatar Dec 11 '24 20:12 xawos

Thanks for your patience: from the --ops-printf output, it's clear that this is an XNNPACK issue.

Can you try specifying "-DXNNPACK_ENABLE_RISCV_VECTOR=OFF" when you run cmake (when building XNNPACK)?

I.e.: cmake -DXNNPACK_BUILD_TESTS=OFF -DXNNPACK_BUILD_BENCHMARKS=OFF -DXNNPACK_ENABLE_RISCV_VECTOR=OFF ..

PS: please remove and recreate the XNNPACK "build" directory, before executing the command.

Thank you, Vito

vitoplantamura avatar Dec 11 '24 21:12 vitoplantamura

Awesome, thanks, we are making progress, although it still fails generation on the Diffusion step with an error coming from this line.

Full pastebin here, last relevant lines:

#666) ReduceMean (ReduceMean_859)
#667) Add (Add_861)
#668) Sqrt (Sqrt_862)
#669) Div (Div_863)
#670) Mul (Mul_864)
#671) Add (Add_865)
----------------[diffusion]---------------
step:0          #0) Unsqueeze (/time_proj/Unsqueeze)
#1) Mul (/time_proj/Mul)
=== ERROR === failed to create multiply operation

Commands I've run for XNNPACK, as you asked:

cmake -DXNNPACK_BUILD_TESTS=OFF -DXNNPACK_BUILD_BENCHMARKS=OFF -DXNNPACK_ENABLE_RISCV_VECTOR=OFF ..
cmake --build . --config Release

and for OnnxStream:

cmake -DMAX_SPEED=OFF -DOS_LLM=OFF -DOS_CUDA=OFF -DXNNPACK_DIR=/home/debian/XNNPACK ..
cmake --build . --config Release

xawos avatar Dec 12 '24 00:12 xawos

So yes, using --rpi does solve it and generates images, I'm wondering if it's due to this XNNPACK issue though

xawos avatar Dec 12 '24 00:12 xawos

Awesome!

You need to specify "--rpi" because this particular processor does not support native fp16 arithmetic.

Vito

vitoplantamura avatar Dec 12 '24 08:12 vitoplantamura
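
As a rough way to see which extensions a board advertises, the ISA string the kernel prints in /proc/cpuinfo can be inspected. The sketch below assumes that scalar half-precision support would show up there as the `zfh` extension and vector support as the single letter `v`; `has_ext` is a hypothetical helper, and what the kernel reports may not match what XNNPACK detects or what the core actually implements.

```shell
#!/bin/sh
# has_ext STR NAME: succeed if the ISA string STR lists the extension NAME.
# Single-letter names (e.g. v) are looked up in the part before the first
# underscore; multi-letter names (e.g. zfh) must match a whole "_name" token.
# Shorthand such as "g" in rv64gc is not expanded.
has_ext() {
    str="$1"
    name="$2"
    base="${str%%_*}"        # e.g. rv64imafdc
    letters="${base#rv??}"   # strip the rv32/rv64 prefix, e.g. imafdc
    case "$name" in
        ?) case "$letters" in *"$name"*) return 0 ;; esac ;;
        *) case "_${str#*_}_" in *"_${name}_"*) return 0 ;; esac ;;
    esac
    return 1
}

# Read the first "isa" line, if any (empty on non-RISC-V machines).
isa="$(awk -F': *' '/^isa/ {print $2; exit}' /proc/cpuinfo 2>/dev/null)"
if [ -n "$isa" ]; then
    if has_ext "$isa" zfh; then
        echo "zfh reported in: $isa"
    else
        echo "zfh not reported in: $isa"
    fi
fi
```

For the -march string used earlier in this thread, `has_ext rv64imafdc_zicsr_zifencei zfh` fails and `has_ext rv64imafdc_zicsr_zifencei zicsr` succeeds.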