BIGTOP-3921: Add ZSTD Codec Support for hadoop

Open rzuo opened this issue 1 year ago • 10 comments

Description of PR

Compile ZSTD codec support into the Hadoop native library by default.

How was this patch tested?

For code changes:

  • [x] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'BIGTOP-3638. Your PR title ...')?
  • [x] Make sure that newly added files do not have any licensing issues. When in doubt refer to https://www.apache.org/licenses/

rzuo avatar Mar 28 '23 07:03 rzuo

We need a fix for the toolchain too, to support this on all supported distros/platforms.

iwasakims avatar Mar 28 '23 07:03 iwasakims

Is Zstandard licensed under terms that allow us to bundle it by default? https://www.apache.org/legal/resolved.html

iwasakims avatar Mar 28 '23 08:03 iwasakims

We need a fix for the toolchain too, to support this on all supported distros/platforms.

libzstd-devel is already included in bigtop_toolchain/manifests/packages.pp, so does anything else need to be modified?

Thanks, Robin

rzuo avatar Mar 28 '23 08:03 rzuo

Is Zstandard licensed under terms that allow us to bundle it by default? https://www.apache.org/legal/resolved.html

As I understand it, this fix only compiles the Hadoop native library against a pre-installed ZSTD binary; no ZSTD (GPL-2.0 license) code is included in this repository. If no zstd library is present, the script compiles as before.

Thanks, Robin
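To illustrate the "compile as before if zstd is absent" behavior described above, here is a minimal sketch. It is not the code from this patch. The helper name `zstd_mvn_flags` and the probed directories are hypothetical; `-Drequire.zstd` and `-Dzstd.lib` are the build options described in Hadoop's BUILDING.txt.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Bigtop patch): print extra Maven
# flags for the Hadoop native build when libzstd.so is found in one of
# the directories passed as arguments.
zstd_mvn_flags() {
  for dir in "$@"; do
    if [ -e "$dir/libzstd.so" ]; then
      # Found: require zstd and point the build at this library directory.
      echo "-Drequire.zstd -Dzstd.lib=$dir"
      return 0
    fi
  done
  # Not found: print nothing, so the build proceeds without zstd as before.
  return 0
}

# Example invocation (the directories are assumptions, adjust per distro):
# mvn package -Pnative -DskipTests $(zstd_mvn_flags /usr/lib64 /usr/lib/x86_64-linux-gnu)
```

After a build, `hadoop checknative -a` reports whether the native library was built with zstd support.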

rzuo avatar Mar 28 '23 08:03 rzuo

Is Zstandard licensed under terms that allow us to bundle it by default? https://www.apache.org/legal/resolved.html

zstd's license is BSD (https://github.com/facebook/zstd/blob/dev/LICENSE). As far as I know, the Apache 2.0 license is compatible with BSD. (Please correct me if I have misunderstood.)

guyuqi avatar Mar 28 '23 08:03 guyuqi

Is Zstandard licensed under terms that allow us to bundle it by default? https://www.apache.org/legal/resolved.html

zstd's license is BSD (https://github.com/facebook/zstd/blob/dev/LICENSE). As far as I know, the Apache 2.0 license is compatible with BSD. (Please correct me if I have misunderstood.)

Aha, you are correct. I mistakenly thought that zstd was GPL-licensed.

rzuo avatar Mar 28 '23 08:03 rzuo

libzstd-devel is already included in bigtop_toolchain/manifests/packages.pp, so does anything else need to be modified?

Oops. I forgot about BIGTOP-3535.

iwasakims avatar Mar 28 '23 13:03 iwasakims

As I understand it, this fix only compiles the Hadoop native library against a pre-installed ZSTD binary; no ZSTD (GPL-2.0 license) code is included in this repository.

We publish pre-built packages for users' convenience. We cannot link them against a pre-built library if it is distributed under the GPL. Zstandard seems to be dual-licensed under BSD and GPLv2. I'm not sure which license applies to the pre-built libzstd.so redistributed by OS distros.

iwasakims avatar Mar 28 '23 13:03 iwasakims

As I understand it, this fix only compiles the Hadoop native library against a pre-installed ZSTD binary; no ZSTD (GPL-2.0 license) code is included in this repository.

We publish pre-built packages for users' convenience. We cannot link them against a pre-built library if it is distributed under the GPL. Zstandard seems to be dual-licensed under BSD and GPLv2. I'm not sure which license applies to the pre-built libzstd.so redistributed by OS distros.

I completely understand your concerns about the license. The license terms are complicated and hard to interpret. Fortunately, Apache Spark appears to support ZSTD, and the Spark developers updated LICENSE and NOTICE in SPARK-24654. They added an extra license statement to the source tree: LICENSE-binary-zstd. Shall we create a similar LICENSE file for the zstd binary in Bigtop?

What do you think, @iwasakims?

guyuqi avatar Apr 17 '23 08:04 guyuqi

Spark bundles a pre-built binary of zstd-jni (which contains zstd). My concern here is the license of the libzstd binary distributed by OS distros.

iwasakims avatar Apr 18 '23 02:04 iwasakims