
RISC-V to Experimental tier

Open drom opened this issue 3 years ago • 56 comments

What is the problem this feature will solve?

Support growing number of RISC-V based devices

What is the feature you are proposing to solve the problem?

Provide RISC-V release of node-js

What alternatives have you considered?

No response

drom avatar Feb 16 '22 02:02 drom

@nodejs/build @nodejs/releasers

Trott avatar Feb 16 '22 02:02 Trott

Without knowing enough about the current RISC-V landscape, it would be interesting to hear about distributions, operating systems and devices that are common. Are there lowest common denominators that we can leverage? Or is it a scattered landscape where we'd need to be shipping 5 different binaries to keep everyone happy? Already having musl as a competitor to glibc is slightly tricky to support on x64.

Beyond that - how would we test this in our CI system? Then how do we build binaries? Are we emulating everything? Is emulation a straightforward proposition? Are there any infra providers that might be convinced to provide RISC-V hardware?

If someone just wants to have a stab at it, you could try and contribute a recipe to unofficial-builds @ https://github.com/nodejs/unofficial-builds/; using a container on x64 to cross-compile a binary to RISC-V somehow. That'd probably be where such a binary would end up without significant CI and people resources anyway.

rvagg avatar Feb 16 '22 04:02 rvagg

Hi @rvagg great to hear from you.

Without knowing enough about the current RISC-V landscape, it would be interesting to hear about distributions, operating systems and devices that are common. Are there lowest common denominators that we can leverage? Or is it a scattered landscape where we'd need to be shipping 5 different binaries to keep everyone happy? Already having musl as a competitor to glibc is slightly tricky to support on x64.

Major Linux distributions declared RISC-V support: openSUSE, Debian/Ubuntu, Arch, Gentoo, Fedora,...

Most of them converge on the RV64GC baseline.
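To make the RV64GC baseline concrete: G is shorthand for the I, M, A, F and D extensions, and C adds compressed instructions. The sketch below checks an ISA string, of the kind Linux reports in the `isa` field of /proc/cpuinfo on RISC-V systems, against that baseline. The helper name `isRv64gc` and the sample strings are illustrative assumptions, not part of any Node.js tooling.

```javascript
'use strict';

// Hypothetical helper: returns true when an ISA string such as
// "rv64imafdc_zicsr_zifencei" covers the RV64GC baseline.
function isRv64gc(isa) {
  const s = String(isa).trim().toLowerCase();
  // Capture only the single-letter extensions that follow the "rv64" base,
  // stopping at "_" or at the first multi-letter prefix (z, s, x).
  const match = /^rv64([^_zsx]*)/.exec(s);
  if (match === null) return false; // not a 64-bit RISC-V ISA string
  // "g" is shorthand for the general-purpose set i, m, a, f, d.
  const letters = match[1].replace('g', 'imafd');
  return [...'imafdc'].every((ext) => letters.includes(ext));
}

console.log(isRv64gc('rv64imafdc')); // full baseline present
console.log(isRv64gc('rv64imac'));   // no F/D, so below the baseline
```

A single binary built for RV64GC should then run on any board whose ISA string passes this check, which is why a common baseline avoids shipping several variants.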

Beyond that - how would we test this in our CI system? Then how do we build binaries? Are we emulating everything? Is emulation a straightforward proposition? Are there any infra providers that might be convinced to provide RISC-V hardware?

I represent SiFive, and we would gladly provide real hardware for regression testing and CI. Here is the platform: https://www.sifive.com/boards/hifive-unmatched

If someone just wants to have a stab at it, you could try and contribute a recipe to unofficial-builds @ https://github.com/nodejs/unofficial-builds/; using a container on x64 to cross-compile a binary to RISC-V somehow. That'd probably be where such a binary would end up without significant CI and people resources anyway.

That sounds awesome!

drom avatar Feb 16 '22 08:02 drom

Hi @drom I'm very interested in this. The question would be how such boards could be hosted if we had them - do you have hosting for boards (Possibly via the PLCT lab?) or would you just be providing the hardware to the project that we'd have to host? Cross-compiling would likely be the preferred option with the testing completed on real boards, although with ccache enabled it could be done natively on the boards too in a reasonably timely manner (it's under 10 minutes for a "full" rebuild when the source hasn't changed using the onboard SD card for storage)

@rvagg I've used RISC-V systems with Ubuntu, Debian and Fedora and all of those are usable. As Aliaksei says there is a fairly good baseline that we can target for this that most boards that anyone would be likely to run on support so that shouldn't be a concern at this time. FYI I have previously built a RISC-V port of Node.js from https://github.com/v8-riscv/node on a SiFive Unleashed board that I have for another project so this is very much feasible (I just haven't had the cycles to push it forward since there's been little to drive it, but with SiFive's interest I'm sure we can make it happen and I'm very keen to help push it!)

sxa avatar Feb 16 '22 10:02 sxa

@sxa Great to hear from you.

Hi @drom I'm very interested in this. The question would be how such boards could be hosted if we had them - do you have hosting for boards (Possibly via the PLCT lab?) or would you just be providing the hardware to the project that we'd have to host?

We (SiFive) are open to any of these options, whatever is most convenient and comfortable for the Node.js project. We can send some "Unmatched" boards to a specific location, or we can host dedicated nodes. I also have a dedicated board on my desk where I will be trying to run Node.js, and I can share it. I just want this project to be successful.

Cross-compiling would likely be the preferred option with the testing completed on real boards, although with ccache enabled it could be done natively on the boards too in a reasonably timely manner (it's under 10 minutes for a "full" rebuild when the source hasn't changed using the onboard SD card for storage)

Great. I will try both options and report.

@rvagg I've used RISC-V systems with Ubuntu, Debian and Fedora and all of those are usable. As Aliaksei says there is a fairly good baseline that we can target for this that most boards that anyone would be likely to run on support so that shouldn't be a concern at this time. FYI I have previously built a RISC-V port of Node.js from https://github.com/v8-riscv/node on a SiFive Unleashed board that I have for another project so this is very much feasible (I just haven't had the cycles to push it forward since there's been little to drive it, but with SiFive's interest I'm sure we can make it happen and I'm very keen to help push it!)

Thank you for your help.

drom avatar Feb 16 '22 16:02 drom

Great. I will try both options and report.

I'm also now trying to build the current nodejs/node master branch on my unleashed to see what state it's in

sxa avatar Feb 16 '22 18:02 sxa

Great. I will try both options and report.

I'm also now trying to build the current nodejs/node master branch on my unleashed to see what state it's in

Took about eight hours to build on the unleashed board using --with-intl=none --verbose --without-node-snapshot --shared-openssl (Those were the options I used in the past - we'd likely want to make sure we can remove some of those before declaring a release, although I would imagine that wouldn't block us continuing as an unofficial build)

sxa@unleashed-sid:~/node-main$ ./node --version
v18.0.0-pre
sxa@unleashed-sid:~/node-main$

sxa avatar Feb 17 '22 10:02 sxa

Also builds ok natively in about ten hours with just --shared-openssl from those above options, which is good. Attempting to build with the embedded openssl fails because it tries to use invalid compiler options:

cc: error: unrecognized command-line option '-m64'

but a shared openssl for a first pass shouldn't be too much of a problem :-)

For reference, a sample command line being used for the attempt at compiling openssl is as follows - it has configured itself for x64 instead of riscv64 which probably isn't too hard to resolve: cc -o /home/sxa/node-main/out/Release/obj.target/openssl/deps/openssl/openssl/ssl/bio_ssl.o ../deps/openssl/openssl/ssl/bio_ssl.c '-DV8_DEPRECATION_WARNINGS' '-DV8_IMMINENT_DEPRECATION_WARNINGS' '-D_GLIBCXX_USE_CXX11_ABI=1' '-DNODE_OPENSSL_HAS_QUIC' '-D__STDC_FORMAT_MACROS' '-DOPENSSL_NO_PINSHARED' '-DOPENSSL_THREADS' '-DOPENSSL_NO_HW' '-DOPENSSL_API_COMPAT=0x10100001L' '-DSTATIC_LEGACY' '-DNDEBUG' '-DOPENSSL_USE_NODELETE' '-DL_ENDIAN' '-DOPENSSL_BUILDING_OPENSSL' '-DAES_ASM' '-DBSAES_ASM' '-DCMLL_ASM' '-DECP_NISTZ256_ASM' '-DGHASH_ASM' '-DKECCAK1600_ASM' '-DMD5_ASM' '-DOPENSSL_BN_ASM_GF2m' '-DOPENSSL_BN_ASM_MONT' '-DOPENSSL_BN_ASM_MONT5' '-DOPENSSL_CPUID_OBJ' '-DOPENSSL_IA32_SSE2' '-DPADLOCK_ASM' '-DPOLY1305_ASM' '-DSHA1_ASM' '-DSHA256_ASM' '-DSHA512_ASM' '-DVPAES_ASM' '-DWHIRLPOOL_ASM' '-DX25519_ASM' '-DOPENSSL_PIC' '-DMODULESDIR="/home/sxa/node-main/out/Release/obj.target/deps/openssl/lib/openssl-modules"' '-DOPENSSLDIR="/home/sxa/node-main/out/Release/obj.target/deps/openssl"' '-DENGINESDIR="/dev/null"' '-DTERMIOS' -I../deps/openssl/openssl -I../deps/openssl/openssl/include -I../deps/openssl/openssl/crypto -I../deps/openssl/openssl/crypto/include -I../deps/openssl/openssl/crypto/modes -I../deps/openssl/openssl/crypto/ec/curve448 -I../deps/openssl/openssl/crypto/ec/curve448/arch_32 -I../deps/openssl/openssl/providers/common/include -I../deps/openssl/openssl/providers/implementations/include -I../deps/openssl/config -I../deps/openssl/config/archs/linux-x86_64/asm -I../deps/openssl/config/archs/linux-x86_64/asm/include -I../deps/openssl/config/archs/linux-x86_64/asm/crypto -I../deps/openssl/config/archs/linux-x86_64/asm/crypto/include/internal -I../deps/openssl/config/archs/linux-x86_64/asm/providers/common/include -pthread -Wall -Wextra -Wno-unused-parameter 
-Wa,--noexecstack -Wall -O3 -pthread -m64 -Wall -O3 -Wno-missing-field-initializers -Wno-old-style-declaration -O3 -fno-omit-frame-pointer -MMD -MF /home/sxa/node-main/out/Release/.deps//home/sxa/node-main/out/Release/obj.target/openssl/deps/openssl/openssl/ssl/bio_ssl.o.d.raw -c

sxa avatar Feb 18 '22 10:02 sxa

v16.x and v17.x both build, but v17.x hits a lot of segmentation faults. Maybe some V8 patches have not been merged.

luyahan avatar Feb 18 '22 11:02 luyahan

Also builds ok natively in about ten hours with just --shared-openssl from those above options, which is good.

Wow. @sxa you're just killing it! :+1: When you say "natively", is it on x86-64 or in RV64 QEMU?

drom avatar Feb 18 '22 16:02 drom

Wow. @sxa you're just killing it! +1 When you say "natively", is it on x86-64 or in RV64 QEMU?

Neither :-) It's on an Unleashed board that I have access to for another project (so sadly no M.2 drives unlike the Unmatched ones). I've used an RV64 qemu in the past but that would take days to build Node.js I expect!

sxa avatar Feb 18 '22 18:02 sxa

I moved the issue to nodejs/build as I think it's the most relevant repository to continue this discussion (since there'll be work around setting up infrastructure). As discussed on Slack we can start with setting up one Jenkins agent and having a test job running on it. Once we have that setup we can discuss how many nodes would be convenient for both the Node.js project and for SiFive.

Question for @nodejs/build-infra: would we prefer RISC-V nodes being hosted by SiFive, or do we want to host those ourselves?

mmarchini avatar Feb 19 '22 18:02 mmarchini

Assuming we'd have full access to them, I think having them hosted by SiFive would generally be the preferred option, although I could probably host one too. My concern is that, unlike most cloud machines, I'm not sure the Unmatched boards can be recovered after a crash without someone physically pushing the reset button, so any issues with them could be a problem.

sxa avatar Feb 20 '22 22:02 sxa

Managed hosting where we have reason to trust the provider is the preferred option, by far. I've tried to minimise our reliance on infra hosted by individuals wherever possible and it would be good to reduce that even further over time.

Let's try and avoid signing up our infra for more boxes on desks. It's a big weak point in terms of reliability and, to some degree, security.

rvagg avatar Feb 20 '22 23:02 rvagg

Also builds ok natively in about ten hours with just --shared-openssl

A brief update on this - while it does build ok with the ICU support, it does seem to get stuck when building the test-doc target, so the ICU support is probably not working correctly on this platform.

sxa avatar Feb 21 '22 12:02 sxa

I'm looking to get this building with a cross-compiler although I'm currently getting compiler failures when doing so. This PR doesn't resolve the issue I'm seeing. I'm going to build natively on the RISC-V board to see if it's a specific issue with the cross compile. If it can be convinced to work (and it should do) then we can likely get it added as an unofficial build using the existing system fairly quickly afterwards.

sxa avatar Feb 24 '22 17:02 sxa

I've got a clean build with a cross compiler now so we should be able to incorporate this into a suitable dockerfile now. I've got one based on the arm_cross Dockerfile that will do the job using a tarball of a cross-compiler I use elsewhere for now.

sxa avatar Feb 25 '22 16:02 sxa

"Unofficial build" support is now live (although the server that's building it is creaking a bit under space limitations), and we have builds at https://unofficial-builds.nodejs.org/download/release/v17.9.0/ so we just need to get that formalised in the docs.

The builds don't pass all the tests (mainly failures in crypto, perhaps due to openssl-no-asm, and in the addons tests), but it fundamentally works, so I'm comfortable that it's good enough for experimental. It is built with full-icu (there's a version of 17.7.1 without ICU), so it may well hit the hang in the doc target described above. We should also consider whether we wish to perform some formal testing on real boards other than the ones I'm using elsewhere for now :-)

sxa avatar Apr 16 '22 10:04 sxa

fancy, good work @sxa .. I guess we need to deal with those space problems

rvagg avatar Apr 19 '22 02:04 rvagg

I see RISC-V now included in the unofficial builds, great! Seems that the space issues from April are resolved. What steps are needed, and what is the progress of, moving the RISC-V support from unofficial to official?

olof-nord avatar Sep 25 '22 21:09 olof-nord

The next step would be to look at getting the build passing all the tests that come with node, since they weren't all passing last time I tried. Since this is pretty much a spare time project for me, I haven't had too much time to look at progressing that.

sxa avatar Sep 26 '22 09:09 sxa

FYI There's a build break on 19.0.0 so there is no unofficial build for RISC-V in the latest version yet: https://github.com/nodejs/node/issues/45059

sxa avatar Oct 18 '22 19:10 sxa

Ref the last comment, 19.1.0 has RISC-V support back again at https://unofficial-builds.nodejs.org/download/release/v19.1.0/

While it's not part of the build pipelines yet (performance would likely be an issue), I have connected a RISC-V Ubuntu system into the CI and set up a job to build from the main branch daily and run the tests against it, in case any committers are interested in keeping an eye on it and/or fixing any failures. The job is at https://ci.nodejs.org/job/sxa-rvnodetest/ (on its first run through as I write this, but it should complete the build ok).

It would be good to get openssl support building without --openssl-no-asm at some point.

sxa avatar Nov 24 '22 19:11 sxa

Great update!

With the CI it builds way (way) faster than on the HiFive Unmatched (also with Ubuntu). I have done a few builds (with --shared-openssl), and the build times average around 8h. The fastest builds I could get were around 2h with --without-npm --without-corepack --without-ssl --without-node-options --without-intl --without-inspector, which is nothing close to the 30 minutes of the Jenkins job you linked.

With SSL being the seemingly biggest task open, is what is needed for Node.js risc-v support to provide the assembly for OpenSSL? What is needed to organise this? Is there any documentation, further information what needs doing?

And on a side note, what is the difference of building Node.js with --openssl-no-asm vs --shared-openssl?

olof-nord avatar Nov 24 '22 23:11 olof-nord

With the CI it builds way (way) faster than on the HiFive Unmatched (also with Ubuntu)

Actually it doesn't - it was a labelling issue on my part and the first builds I left running yesterday built on an x64 system instead of RISC-V 😳 The system that has been added to the Jenkins CI is an Unmatched board so you should see it running with a similar performance to yours. I've had a build complete in about 5 minutes using ccache (second time through obviously) and that's what I'll leave it using, but obviously the times will vary depending on what changes have gone in on any given day which affects how much of the cache can be reused. We could cross compile from an x64 server (I've done this before) and run the testing only on the RISC-V system if we wanted to speed things up, but for now since this is just going to be building daily rather than on every PR I'm going to keep it all native on the board.

what is needed for Node.js risc-v support to provide the assembly for OpenSSL? What is needed to organise this?

In terms of openssl it would be a case of running the build without any of the openssl options and seeing what breaks. It's been a while since I tried it, but it was a fairly obvious compile failure as I recall and likely not too hard to resolve. Note that https://github.com/nodejs/node/blob/1a83ad6a693f851199608ae957ac5d4f76871485/deps/openssl/config/Makefile#L18 may be relevant, as it looks like the Makefiles in the node source tree currently don't expect to be building without no-asm on this platform.

The other thing that will need to be looked at is any test failures that show up in that job. I don't think there were too many the last time I ran it on my machine.

And on a side note, what is the difference of building Node.js with --openssl-no-asm vs --shared-openssl?

The Node.js codebase has its own copy of the openssl source (deps/openssl), which is the one we build with and statically link into the node.js binary.

--openssl-no-asm tells the Node build process to run the openssl ./Configure with the no-asm parameter, which tells it not to use any of openssl's integrated assembly language optimisations. So while it works, it will likely not perform as well as a build that uses them.

--shared-openssl tells the Node build process not to build the version of openssl in deps/openssl within the codebase, but to take the SSL header files and libraries from the operating system and build and link against those instead.
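As a rough way to see which of these options a given binary was built with, the sketch below summarises the OpenSSL-related entries of `process.config.variables`. The key names `node_shared_openssl` and `openssl_no_asm` are assumptions about what the configure step records and may be absent in some builds, so a missing key is reported as "unknown" rather than guessed. The helper name `describeOpenSsl` is hypothetical.

```javascript
'use strict';

// Summarise OpenSSL-related build settings. The key names used here
// (node_shared_openssl, openssl_no_asm) are assumed, not guaranteed;
// a missing key is reported as "unknown".
function describeOpenSsl(variables, versions) {
  const isTrue = (v) => v === true || v === 'true' || v === 1 || v === '1';
  const has = (k) => Object.prototype.hasOwnProperty.call(variables, k);

  let linkage = 'unknown';
  if (has('node_shared_openssl')) {
    linkage = isTrue(variables.node_shared_openssl)
      ? 'shared (system library)'
      : 'bundled (deps/openssl, statically linked)';
  }

  let asm = 'unknown';
  if (has('openssl_no_asm')) {
    asm = isTrue(variables.openssl_no_asm) ? 'disabled (no-asm)' : 'enabled';
  }

  return { opensslVersion: versions.openssl ?? 'unknown', linkage, asm };
}

// Report on the binary that is running this script.
console.log(describeOpenSsl(process.config.variables, process.versions));
```

Running this with an unofficial RISC-V build and with a distribution-packaged build should make the difference between the two options visible without reading build logs.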

sxa avatar Nov 25 '22 16:11 sxa

Looks like we have five failures in the main test bucket on the first run

  • parallel/test-crypto-keygen (TIMEOUT)
  • parallel/test-fs-watch-recursive (TIMEOUT)
  • test-net-socket-connect-without-cb (ENOTFOUND localhost - potentially a machine specific problem)
  • parallel/test-tcp-wrap-listen (Also ENOTFOUND localhost)
  • parallel/test-tls-dhe (TIMEOUT)

The other 3775 tests in the suite passed. In the second run, only the last three of those failed, so the first two probably just need a larger timeout to pass consistently (and probably true of the third too)
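One way to give slow boards a larger timeout is to scale it by architecture. The following is only a minimal sketch of that idea: the multiplier of 4 for `riscv64` is a made-up placeholder that would need tuning on real hardware, and `scaledTimeout` is a hypothetical helper rather than the mechanism the Node.js test harness actually uses.

```javascript
'use strict';

// Hypothetical per-architecture multipliers; the value for riscv64 is a
// placeholder, not a measured figure.
const TIMEOUT_FACTORS = new Map([['riscv64', 4]]);

// Scale a base timeout (in milliseconds) for the given architecture.
// Architectures without an entry keep the base timeout unchanged.
function scaledTimeout(ms, arch = process.arch) {
  const factor = TIMEOUT_FACTORS.get(arch) ?? 1;
  return ms * factor;
}

// Effective timeout for a two-minute test on the current machine.
console.log(scaledTimeout(120000));
```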

sxa avatar Nov 26 '22 12:11 sxa

Additionally there are a few issues in the testing outside the main set of tests:

Linter checks:

Oops! Something went wrong! :(

ESLint: 8.28.0

/home/iojs/workspace/sxa-rvnodetest/tools/node_modules/eslint/node_modules/jsdoc-type-pratt-parser/dist/index.js:94
    const identifierStartRegex = /[$_\p{ID_Start}]|\\u\p{Hex_Digit}{4}|\\u\{0*(?:\p{Hex_Digit}{1,5}|10\p{Hex_Digit}{4})\}/u;
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

SyntaxError: Invalid regular expression: /[$_\p{ID_Start}]|\\u\p{Hex_Digit}{4}|\\u\{0*(?:\p{Hex_Digit}{1,5}|10\p{Hex_Digit}{4})\}/: Invalid property name in character class

Doc related tests

=== release test-doctool-html ===
Path: doctool/test-doctool-html
/home/iojs/workspace/sxa-rvnodetest/tools/doc/node_modules/highlight.js/lib/languages/xml.js:18
  const TAG_NAME_RE = regex.concat(/[\p{L}_]/u, regex.optional(/[\p{L}0-9_.-]*:/u), /[\p{L}0-9_.-]*/u);
                                   ^^^^^^^^^^^

SyntaxError: Invalid regular expression: /[\p{L}_]/: Invalid property name in character class
    at internalCompileFunction (node:internal/vm:74:18)

test-make-doc failure:

=== release test-make-doc ===
Path: doctool/test-make-doc
node:internal/process/esm_loader:108
    internalBinding('errors').triggerUncaughtException(
                              ^

AssertionError [ERR_ASSERTION]: false == true
    at file:///home/iojs/workspace/sxa-rvnodetest/test/doctool/test-make-doc.mjs:18:8
    at ModuleJob.run (node:internal/modules/esm/module_job:194:25) {
  generatedMessage: true,
  code: 'ERR_ASSERTION',
  actual: false,
  expected: true,
  operator: '=='
}

Node.js v20.0.0-pre
Command: out/Release/node /home/iojs/workspace/sxa-rvnodetest/test/doctool/test-make-doc.mjs

TracedValue.Escaping(Object|Array):

[ RUN      ] TracedValue.EscapingObject
../test/cctest/test_traced_value.cc:83: Failure
Expected equality of these values:
  check
    Which is: "{\"a\":\"1\\u00E2\\u0082\\u00AC23\\\"\\u0001\\b\\f\\n\\r\\t\\\\\"}"
  string
    Which is: "{\"a\":\"1\xE2\x82\xAC" "23\\\"\\u0001\\b\\f\\n\\r\\t\\\\\"}"
    As Text: "{"a":"1€23\"\u0001\b\f\n\r\t\\"}"
With diff:
@@ -1,1 +1,2 @@
-{\"a\":\"1\\u00E2\\u0082\\u00AC23\\\"\\u0001\\b\\f\\n\\r\\t\\\\\"}
+{\"a\":\"1\xE2\x82\xAC" "23\\\"\\u0001\\b\\f\\n\\r\\t\\\\\"}"
    As Text: "{"a":"1€23\"\u0001\b\f
+\r\t\\"}

[  FAILED  ] TracedValue.EscapingObject (0 ms)
[ RUN      ] TracedValue.EscapingArray
../test/cctest/test_traced_value.cc:95: Failure
Expected equality of these values:
  check
    Which is: "[\"1\\u00E2\\u0082\\u00AC23\\\"\\u0001\\b\\f\\n\\r\\t\\\\\"]"
  string
    Which is: "[\"1\xE2\x82\xAC" "23\\\"\\u0001\\b\\f\\n\\r\\t\\\\\"]"
    As Text: "["1€23\"\u0001\b\f\n\r\t\\"]"
With diff:
@@ -1,1 +1,2 @@
-[\"1\\u00E2\\u0082\\u00AC23\\\"\\u0001\\b\\f\\n\\r\\t\\\\\"]
+[\"1\xE2\x82\xAC" "23\\\"\\u0001\\b\\f\\n\\r\\t\\\\\"]"
    As Text: "["1€23\"\u0001\b\f
+\r\t\\"]

[  FAILED  ] TracedValue.EscapingArray (0 ms)

sxa avatar Nov 26 '22 12:11 sxa

@sxa I suspect those regexp failures are because you're compiling without ICU.
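A quick way to check that suspicion on the board is to ask the built binary directly. This sketch reports whether an ICU version is listed in `process.versions`, whether `Intl` is available, and whether a Unicode property escape of the kind used in the failing regular expressions compiles. The helper name `regexCompiles` is illustrative.

```javascript
'use strict';

// Returns true if the pattern compiles, false on a SyntaxError.
function regexCompiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything else is unexpected
  }
}

const report = {
  // Undefined in builds configured without ICU, so fall back to null.
  icuVersion: process.versions.icu ?? null,
  hasIntl: typeof Intl === 'object',
  // Same character class as the highlight.js pattern in the failure above.
  unicodePropertyEscapes: regexCompiles('[\\p{L}_]', 'u'),
};

console.log(report);
```

If `icuVersion` is null and `unicodePropertyEscapes` is false, the linter and doctool failures are most likely a consequence of the build configuration rather than a RISC-V specific bug.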

richardlau avatar Nov 26 '22 15:11 richardlau

Great to see the test errors listed out. Regarding the timeout adjustments, there has already been some work carried out by @ArchFeh to accommodate less powerful RISC-V testing devices.

I looked into the OpenSSL assembly language optimisations for RISC-V, and there is some initial work for aes and modes/ghash in place, but I could not find it included in any release yet.

From what I can tell, Node.js additionally needs bn, chacha, ec, poly1305 as well as sha group (keccak1600, sha1, sha256 as well as sha512) asm for an architecture to be complete. Is that correct?

I do not know how much effort is needed here, but it looks like there is quite a lot of work left before this is done. Assuming all of the modules listed are needed, the current status is 20% complete (2 of 10).

olof-nord avatar Nov 27 '22 13:11 olof-nord

Yeah I hadn't realised when I wrote that statement earlier that the upstream openssl likely does not have the asm support in it, so that's not an issue that's specific to the Node.js build process (other than the fact we configure it by default without no-asm)

From what I can tell, Node.js additionally needs bn, chacha, ec, poly1305 as well as sha group (keccak1600, sha1, sha256 as well as sha512) asm for an architecture to be complete. Is that correct?

@richardlau @mhdawson Do you know if that is correct or if it can be considered a "complete" port while it's still using no-asm?

sxa avatar Nov 28 '22 12:11 sxa