
DataFusion Substrait blog post

Open andygrove opened this issue 2 years ago • 11 comments

andygrove avatar Feb 22 '23 14:02 andygrove

@jdye64 @nseekhao fyi

andygrove avatar Feb 22 '23 14:02 andygrove

The build scripts seem to be broken

https://github.com/apache/arrow-site/actions/runs/4243705123/jobs/7376801156


17 packages are looking for funding
  run `npm fund` for details

found 0 vulnerabilities
npx webpack --mode=production
rm -f javascript/main.js
node:internal/crypto/hash:71
  this[kHandle] = new _Hash(algorithm, xofLen);

alamb avatar Feb 23 '23 13:02 alamb

@alamb - it seems like the Webpack issues linked below may be causing the build failure:

  1. https://github.com/webpack/webpack/issues/14532#issuecomment-1434095919
  2. https://github.com/webpack/webpack/issues/13572
  3. https://github.com/webpack/webpack/pull/14306

My high-level understanding is as follows. The build output shows that Node.js v18.14.1 is being used. Node 18 appears to use OpenSSL 3.0, in which the md4 hashing algorithm is deprecated, and the version of Webpack used by apache/arrow-site (v5.21.2) seems to default to md4.
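One way to check this explanation locally is a small Node.js probe that tries to create an md4 hash and reports whether the runtime accepts it. This is only a sketch: the helper name hashAvailable is made up for illustration, and the result depends on the OpenSSL build bundled with the Node.js version in use.

```javascript
// Probe whether the current Node.js/OpenSSL build can create a given hash.
// hashAvailable is a hypothetical helper, not part of Node.js or Webpack.
const crypto = require('crypto');

function hashAvailable(algorithm) {
  try {
    // createHash throws if the algorithm is unknown or disabled.
    crypto.createHash(algorithm).update('probe').digest('hex');
    return true;
  } catch (err) {
    return false;
  }
}

console.log(`Node ${process.version}, OpenSSL ${process.versions.openssl}`);
console.log('md4 available:', hashAvailable('md4'));
console.log('sha256 available:', hashAvailable('sha256'));
```

If md4 is reported as unavailable while sha256 is available, that would be consistent with the failure in node:internal/crypto/hash shown in the build log.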

Webpack v5.61.0 added a WASM md4 implementation as a fallback. However, the advice in link 1 above recommends setting output.hashFunction in the Webpack config to use an alternative hashing algorithm instead. Specifically, it recommends xxhash64, which is planned to be the default hashing algorithm when Webpack 6 is released.

So, to summarize, it seems that a combination of upgrading to the latest version of Webpack and switching over to xxhash64 may resolve this issue.
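As a rough illustration of that change, a Webpack config with output.hashFunction set could look like the sketch below. The real arrow-site config is not shown in this thread, so everything except the hashFunction setting is a placeholder, and as far as I know xxhash64 is only accepted by Webpack 5.54.0 or later, so the upgrade would have to come first.

```javascript
// webpack.config.js (sketch only; the actual arrow-site config is not shown here)
const config = {
  mode: 'production',
  output: {
    // Use xxhash64 instead of Webpack's md4 default, which fails
    // when the underlying OpenSSL build no longer provides md4.
    // Assumption: the installed Webpack version supports 'xxhash64'.
    hashFunction: 'xxhash64',
  },
};

module.exports = config;
```

Any existing entry, output path, and filename settings would stay as they are; only the hashFunction line is new.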

kevingurney avatar Feb 28 '23 11:02 kevingurney

After some more investigation, I discovered that the default Node.js version was switched to 18 for the ubuntu-latest GitHub Actions runner image on February 13th. This appears to explain why this build failure started appearing a few weeks ago.

Given this information, an alternative to the approach detailed in my previous comment would be to pin Node.js to version 16 in the GitHub Actions workflow using the setup-node action. Of course, this would mean continuing to rely on an outdated version of Node.js, which doesn't seem ideal in the long term.

kevingurney avatar Feb 28 '23 11:02 kevingurney

I've captured this issue as a Bug with the Component set to [Website] in #34379 in the apache/arrow project.

kevingurney avatar Feb 28 '23 12:02 kevingurney

After some more investigation, I discovered that...

Personally, I would recommend that we switch away from the proprietary, arbitrarily changing, and non-locally debuggable ubuntu-latest runner image to the official, locally runnable Docker image ubuntu:latest where possible, as we did here:

https://github.com/apache/arrow-ballista/blob/b61cfbf54705f4cbfcbc7103f87509e49cd01fda/.github/workflows/rust.yml#L79

avantgardnerio avatar Feb 28 '23 14:02 avantgardnerio

Thanks @avantgardnerio! Agreed - this seems like a good solution. It would be great to not have to worry about things changing suddenly in ubuntu-latest. I've added this to the list of potential workarounds in #34379.

kevingurney avatar Feb 28 '23 14:02 kevingurney

Thanks @avantgardnerio! Agreed - this seems like a good solution. It would be great to not have to worry about things changing suddenly in ubuntu-latest. I've added this to the list of potential workarounds in https://github.com/apache/arrow-site/issues/325.

Awesome -- could someone make a PR to this repo?

alamb avatar Feb 28 '23 16:02 alamb

@alamb - yes, I just assigned the issue to myself and will make a PR

kevingurney avatar Feb 28 '23 17:02 kevingurney

Update: I've opened pull request #326 to address the build issues.

kevingurney avatar Mar 02 '23 11:03 kevingurney

The doc build has been fixed -- do we still want to publish this content?

alamb avatar Mar 31 '23 12:03 alamb