
Propose to support building and testing Accumulo on ARM64 platform

Open liusheng opened this issue 3 years ago • 7 comments

Is your feature request related to a problem? Please describe.

Much software now supports running on the ARM64 platform, and we have put a lot of effort into making big-data projects support it as well. For example, Hadoop has published ARM64-specific packages: https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-3.3.0/hadoop-3.3.0-aarch64.tar.gz

and also has an ARM-specific CI job configured: https://ci-hadoop.apache.org/job/Hive-trunk-linux-ARM/

It would be good for Accumulo to support the ARM64 platform as well.

Describe the solution you'd like

  1. Try to get Accumulo to build successfully on an ARM64 server.
  2. Try to run Accumulo on an ARM64 server and get all the test cases passing.
  3. Add ARM-specific CI jobs to guarantee ARM platform support during development.

Describe alternatives you've considered N/A

Additional context N/A

liusheng avatar Jan 26 '21 07:01 liusheng

@liusheng While most users probably deploy Accumulo on x86_64 servers, Accumulo doesn't do a lot that could be considered architecture dependent, so we're probably not aware of any specific problems on ARM64 or other architectures.

Are you aware of any specific problems with building or running Accumulo on ARM64? If not, I'm not sure what there is to be done here. If you are aware of such problems, it would be good to create specific bug reports or PRs for each problem.

If you'd like to discuss ARM64 support with other users who may be interested in that, I encourage you to discuss it on our user mailing list. If it becomes necessary to discuss large changes to support ARM64 (for smaller changes, a PR is fine), that discussion would be appropriate for our dev mailing list. Information about both mailing lists can be found at https://accumulo.apache.org/contact-us

Regarding the proposed work to run test cases and CI jobs on ARM64, I think it might be a little premature to devote effort into that, since we're a relatively small community compared to Hadoop and it's not yet clear there is a significant interest or volunteers available to perform that work.

ctubbsii avatar Jan 26 '21 19:01 ctubbsii

+1 to add CI on ARM64!

@ctubbsii Accumulo is coded mostly in Java but still there is server/native that produces a .so (Linux native library). I have built Accumulo on ARM64 hardware and there is no problem, but with a CI we could be sure that there are no regressions. Also, in the past I've had problems with third party dependencies like Protobuf-Java and Brotli. That is, the project itself is platform independent but some of the dependencies may lead to problems.
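To illustrate why the native library is the architecture-sensitive part: a native .so is built for one CPU architecture, and the JVM reports the architecture it is running on through the standard `os.arch` system property. Below is a minimal sketch of normalizing that value. The class `ArchNames` and its `normalize` method are hypothetical names made up for this example and are not part of Accumulo; the exact `os.arch` spellings vary by JVM and OS, so the mapping is an assumption covering only common values.

```java
import java.util.Locale;

/** Hypothetical helper (not part of Accumulo): normalizes the JVM's os.arch value. */
final class ArchNames {

  private ArchNames() {}

  /**
   * Maps common os.arch spellings to one canonical name.
   * Unknown values are passed through, trimmed and lower-cased.
   */
  static String normalize(String osArch) {
    String arch = osArch.trim().toLowerCase(Locale.ROOT);
    switch (arch) {
      case "aarch64":
      case "arm64":
        return "arm64";
      case "x86_64":
      case "amd64":
        return "x86_64";
      default:
        return arch;
    }
  }

  public static void main(String[] args) {
    // os.arch is a standard JVM system property.
    String raw = System.getProperty("os.arch");
    System.out.println("JVM reports os.arch=" + raw + " (normalized: " + normalize(raw) + ")");
  }
}
```

A CI job on an ARM64 machine could run a check like this before loading the native library, to confirm the build really ran on the intended architecture.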

Accumulo uses GitHub Actions and there is no support for ARM64 there yet (it is planned for Q1 2021 though) but you may use TravisCI just for testing on ARM64 (and PPC64, s390x). If you like the idea I'd be happy to provide a Pull Request!

gancho-ivanov avatar Feb 08 '21 14:02 gancho-ivanov

> Accumulo uses GitHub Actions and there is no support for ARM64 there yet (it is planned for Q1 2021 though) but you may use TravisCI just for testing on ARM64 (and PPC64, s390x). If you like the idea I'd be happy to provide a Pull Request!

TravisCI was previously used, but we discontinued that for several reasons. If support for these is coming in Q1 2021, then I would prefer we wait until then to add testing for other arches, rather than reintroduce TravisCI.

ctubbsii avatar Feb 08 '21 14:02 ctubbsii

Hi @ctubbsii, sorry for the late reply, and thanks to @gancho-ivanov for helping to clarify. As with other Java projects such as Hadoop and Spark, most of Accumulo's code can be built and run directly on the ARM64 platform, but there is also native code that produces a .so (Linux native library), which is architecture-sensitive, so it would be better to add ARM CI to ensure ARM platform support in future development. As mentioned, ARM64 support in GitHub Actions is planned for Q1 2021, so let's wait and then configure ARM CI with GitHub Actions. Thank you!

liusheng avatar Feb 10 '21 07:02 liusheng

Unfortunately, GitHub Actions has postponed the addition of Linux ARM64 runner nodes. It is not clear when they will be available.

Recently I wrote a blog post about how to run GitHub Actions jobs on ARM64 at Huawei Cloud: https://martin-grigorov.medium.com/githubactions-build-and-test-on-huaweicloud-arm64-af9d5c97b766. But honestly, in my opinion using TravisCI or CircleCI is the easiest! A few other Apache projects recently re-introduced TravisCI for testing on Linux ARM64 and s390x: Parquet, Drill, Tomcat & Wicket.

martin-g avatar Jun 03 '21 08:06 martin-g

Using TravisCI or CircleCI requires additional user registration and privileges to manage builds, so I'm not comfortable doing that myself, and would prefer to avoid it. If some other PMC member wanted to do that, I wouldn't be opposed, so long as it didn't increase the burden on the other active PMC/committers to maintain. I'd be opposed to using something like Huawei Cloud, or anything else that requires storing credentials to a third party service in the repo settings, as that's a lot more work to manage, and harder for the PMC to secure and protect as a collective, often requiring interaction with ASF INFRA to maintain and update, and much harder to self-service. Every 3rd party GitHub action, like huaweicloud/cce-cluster-credentials@v1, would also require INFRA approval, since they have a curated allow-list, for security and project safety.

Aside from the security of using 3rd-party services, my main concern is the disproportionate work required to maintain a workflow that has marginal value to the Accumulo community. I wouldn't want us to spend more time and effort on maintaining a CI workflow than the value the community gets from having support for those architectures.

If there is a small community of ARM64 or s390x users that would benefit from having Accumulo support those, another option is for that community to run CI on their own against the public repo, and report back bugs they find. If that community grows sufficiently to warrant a bigger investment of time/effort from the PMC to maintain CI for those architectures, the interest of that community will be reflected in their ongoing contributions to the project, and we can add support for CI on those later. Currently, given the lack of bugs being reported that are specific to s390x/ARM64, my personal opinion is that there isn't sufficient interest or risk of problems to warrant the effort at this time.

ctubbsii avatar Jun 03 '21 13:06 ctubbsii

At https://martin-grigorov.medium.com/github-actions-arm64-runner-on-oracle-cloud-a77cdf7a325a I've explained how to use GitHub Actions self-hosted ARM64 runners on Oracle Cloud for free. Again, you will need to create an account at Oracle Cloud to be able to create the VM, but apart from that the build setup would be the same as now. ASF Infra help is needed only to get the token for the ./config.sh step and to set up the security.

Let me know if you are interested in this approach and if you need any kind of help!

martin-g avatar Nov 16 '21 13:11 martin-g