grpc-java

Add riscv64 support

Open 17999824wyj opened this issue 3 months ago • 11 comments

Why

Feat: #12316

Others

In particular, for 1.55.x we may need to patch buildscripts/kokoro/linux_artifacts.sh in a different way, but I don't know whether that requires a separate PR.

If you run into issues using this library together with "protoc-jar" on RISC-V64, you can refer to the blog: links

or to the repo that hosts that resource: links

17999824wyj avatar Aug 28 '25 02:08 17999824wyj

CLA Signed

The committers listed above are authorized under a signed CLA.

  • :white_check_mark: login: 17999824wyj / name: wangyijia (72ebcd5c96621a56db7a1016c51876bce69f1480, 1d2491dd95c2e90110743d79023c49172796f8d7, 29bcce42f75587d52368381f8aa51b513b663d8b, 70ccae99fd687616f638d963de8a10f54b7bf8b7, 00882620c8e3174bc4e46b8791da8bd28e813b23)

The CI is failing:

+ GCC_ARCH=riscv64-unknown-linux-gnu
+ cmake .. -DCMAKE_CXX_STANDARD=14 -Dprotobuf_BUILD_TESTS=OFF -DBUILD_SHARED_LIBS=OFF -DCMAKE_INSTALL_PREFIX=/tmp/protobuf-cache/22.5/Linux-riscv64 -DABSL_INTERNAL_AT_LEAST_CXX17=0 -Dcrosscompile_ARCH=riscv64-unknown-linux-gnu -DCMAKE_TOOLCHAIN_FILE=/grpc-java/buildscripts/toolchain.cmake -B.
-- The C compiler identification is unknown
-- The CXX compiler identification is unknown
CMake Error at CMakeLists.txt:32 (project):
  The CMAKE_C_COMPILER:

    riscv64-unknown-linux-gnu-gcc

  is not a full path and was not found in the PATH.

  Tell CMake where to find the compiler by setting either the environment
  variable "CC" or the CMake cache entry CMAKE_C_COMPILER to the full path to
  the compiler, or to the compiler name if it is in the PATH.


CMake Error at CMakeLists.txt:32 (project):
  The CMAKE_CXX_COMPILER:

    riscv64-unknown-linux-gnu-g++

  is not a full path and was not found in the PATH.

  Tell CMake where to find the compiler by setting either the environment
  variable "CXX" or the CMake cache entry CMAKE_CXX_COMPILER to the full path
  to the compiler, or to the compiler name if it is in the PATH.


-- Configuring incomplete, errors occurred!
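
The CMake error above means that neither riscv64-unknown-linux-gnu-gcc nor riscv64-unknown-linux-gnu-g++ could be resolved through PATH inside the build image. A minimal sketch of a pre-flight check that could run before cmake is shown below. The function name check_cross_toolchain is hypothetical and is not part of the grpc-java build scripts; only the GCC_ARCH value comes from the log.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from grpc-java): verify that the cross compilers
# for a given target prefix are reachable through PATH before invoking cmake.
check_cross_toolchain() {
  local prefix="$1" missing=0 tool
  for tool in gcc g++; do
    if ! command -v "${prefix}-${tool}" >/dev/null 2>&1; then
      echo "missing: ${prefix}-${tool}" >&2
      missing=1
    fi
  done
  # Exit status 0 when both tools were found, 1 otherwise.
  return "$missing"
}

# Example call, using the prefix from the CI log:
#   check_cross_toolchain riscv64-unknown-linux-gnu || exit 1
```

Debian and Ubuntu cross-compiler packages typically install their tools under the riscv64-linux-gnu- prefix, so it may be worth running the check against both prefixes when diagnosing an image.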

ejona86 avatar Aug 28 '25 19:08 ejona86

It looks like some dependency packages were missing during the build, so I’ve added them in Dockerfile.multiarch.base.

As for the failed workflow, I can't reproduce it locally; that job passed in my repo: links

Let me try to force push again to restart the workflow.

17999824wyj avatar Aug 29 '25 03:08 17999824wyj

As described in our CONTRIBUTING.md, please address issues with additional commits. The force-pushes require us to re-review the full PR instead of just looking at the changes. (Nothing to be done at this point right now, but for the future.)

ejona86 avatar Aug 29 '25 18:08 ejona86

Alright, I’ve double-checked: my repo’s CI passes all tests now, but the PR’s CI is failing... :(

17999824wyj avatar Sep 02 '25 01:09 17999824wyj

The only CI job that should fail because of this change is "Linux artifacts", which you can't run on your own.

It is currently failing, but for a very different reason (PROGRESS!! ??):

third_party/abseil-cpp/absl/log/libabsl_log_internal_message.a(log_message.cc.o): In function `.L0 ':
log_message.cc:(.text._ZNSt6atomicIbE23compare_exchange_strongERbbSt12memory_order[_ZNSt6atomicIbE23compare_exchange_strongERbbSt12memory_order]+0xbe): undefined reference to `__atomic_compare_exchange_1'
third_party/abseil-cpp/absl/log/libabsl_log_internal_globals.a(globals.cc.o): In function `.L0 ':
globals.cc:(.text._ZNSt6atomicIbE8exchangeEbSt12memory_order[_ZNSt6atomicIbE8exchangeEbSt12memory_order]+0x44): undefined reference to `__atomic_exchange_1'
collect2: error: ld returned 1 exit status

Looks like it is a protobuf or absl bug, but caused by riscv being a snowflake? https://github.com/protocolbuffers/protobuf/issues/14549 and https://github.com/abseil/abseil-cpp/issues/1561 The "fix" looks like a workaround. Unfortunately, that means just upgrading a component isn't likely to fix this.
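
The two undefined symbols in the log, `__atomic_compare_exchange_1` and `__atomic_exchange_1`, are provided by GCC's libatomic. One way to tell whether a given toolchain is affected is to link a tiny std::atomic<bool> program with and without -latomic. The sketch below does that. The function name probe_atomic_flag and the probe program are illustrative assumptions, not part of grpc-java, protobuf, or abseil, and this is a diagnostic rather than a confirmed fix for this PR.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic: report whether a C++ compiler needs an explicit
# -latomic to link 1-byte atomic operations.
# Prints "none" (links as-is), "-latomic" (links only with the flag),
# or "unknown" (compiler not found, or neither attempt linked).
probe_atomic_flag() {
  local cxx="$1" tmp result="unknown"
  if ! command -v "$cxx" >/dev/null 2>&1; then
    echo "$result"
    return 0
  fi
  tmp="$(mktemp -d)" || return 1
  # The probe uses the same two operations that failed to link in the CI log.
  cat > "$tmp/probe.cc" <<'EOF'
#include <atomic>
std::atomic<bool> flag{false};
int main() {
  bool expected = false;
  flag.exchange(true);
  return flag.compare_exchange_strong(expected, true) ? 1 : 0;
}
EOF
  if "$cxx" -std=c++14 "$tmp/probe.cc" -o "$tmp/probe" 2>/dev/null; then
    result="none"
  elif "$cxx" -std=c++14 "$tmp/probe.cc" -o "$tmp/probe" -latomic 2>/dev/null; then
    result="-latomic"
  fi
  rm -rf "$tmp"
  echo "$result"
}

# Example call, using the compiler name from the earlier CMake log:
#   probe_atomic_flag riscv64-unknown-linux-gnu-g++
```

If the probe prints "-latomic" on one image and "none" on another, that would support the idea that the failure depends on the toolchain version rather than on the PR itself.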

ejona86 avatar Sep 02 '25 18:09 ejona86

> Looks like it is a protobuf or absl bug, but caused by riscv being a snowflake? protocolbuffers/protobuf#14549 and abseil/abseil-cpp#1561 The "fix" looks like a workaround. Unfortunately, that means just upgrading a component isn't likely to fix this.

There is a possibility this is caused by a particular version of GCC or the like. If you don't see this on your machine, then maybe Ubuntu 18.04 (multiarch image) has the trouble but Ubuntu 20.04 or 22.04 won't?

ejona86 avatar Sep 02 '25 19:09 ejona86

> There is a possibility this is caused by a particular version of GCC or the like. If you don't see this on your machine, then maybe Ubuntu 18.04 (multiarch image) has the trouble but Ubuntu 20.04 or 22.04 won't?

Both Ubuntu 18.04 and 20.04 are out of their regular support window. I'll need to cross-check with the glibc version on Debian, but I might just upgrade those images and see if this problem fixes itself.

ejona86 avatar Sep 03 '25 04:09 ejona86

Interesting....

It passed, and all I did was go back to HEAD^.

17999824wyj avatar Sep 03 '25 13:09 17999824wyj

The Linux Artifacts CI hadn't run yet. I have to start it for you. I've started it.

ejona86 avatar Sep 03 '25 14:09 ejona86

So sorry about that, I found a bug in my code.

diff --git a/compiler/check-artifact.sh b/compiler/check-artifact.sh
index a2632be81..4b78beeb5 100755
--- a/compiler/check-artifact.sh
+++ b/compiler/check-artifact.sh
@@ -61,10 +61,10 @@ checkArch ()
         assertEq "$format" "elf64-x86-64" $LINENO
       elif [[ "$ARCH" == aarch_64 ]]; then
         assertEq "$format" "elf64-little" $LINENO
-      elif [[ "$ARCH" == loongarch_64 ]]; then
-        echo $format
       elif [[ "$ARCH" == riscv64 ]]; then
         assertEq "$format" "elf64-littleriscv" $LINENO
+      elif [[ "$ARCH" == loongarch_64 ]]; then
+        echo $format
         assertEq "$format" "elf64-loongarch" $LINENO
       elif [[ "$ARCH" == ppcle_64 ]]; then
         format="$(powerpc64le-linux-gnu-objdump -f "$1" | grep -o "file format .*$" | grep -o "[^ ]*$")"

The cause was that I applied my changes from a patch file that was based on 1.5x.x. When I applied it to 1.7x.x, the line numbers no longer matched, so the riscv64 branch was inserted in the wrong place and split the loongarch_64 branch. So sorry about that.
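
As an illustration of why the ordering matters: in an if/elif chain, each branch's body is spread over several lines, so a hunk landing a few lines off can separate a condition from its assertion. A case table keeps each architecture and its expected format on one line. The sketch below is an assumption about how such a table could look, not how check-artifact.sh is actually written; the function name expected_format is hypothetical, and only the three pairs visible in the diff are included.

```shell
#!/usr/bin/env bash
# Hypothetical alternative to the elif chain: map an ARCH value to the
# objdump "file format" string it is expected to produce.
expected_format() {
  case "$1" in
    aarch_64)     echo "elf64-little" ;;
    riscv64)      echo "elf64-littleriscv" ;;
    loongarch_64) echo "elf64-loongarch" ;;
    *)            return 1 ;;  # unknown architecture
  esac
}

# Example: look up the expectation once, then assert against it.
#   want="$(expected_format "$ARCH")" || exit 1
#   assertEq "$format" "$want" $LINENO
```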

17999824wyj avatar Sep 04 '25 16:09 17999824wyj