[BUG] Local docker failed to run spark-shell on Mac M1
Willingness to contribute
Yes. I can contribute a fix for this bug independently.
OpenHouse version
v0.5.62
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 20.04): Apple M1, macOS Sonoma 14.5, Docker Desktop v4.30
- JDK version: 1.8
Describe the problem
While running the spark-shell commands in SETUP.md, the shell always crashes with a fatal error from the Java Runtime Environment.
After investigation, I found that this is a common Docker issue on Apple Silicon MacBooks, caused by a bug in Rosetta (the x86/amd64 emulation layer on Apple Silicon). More details are available at https://github.com/docker/for-mac/issues/7006.
While waiting for a fix from Apple, there are several workarounds. For me, downgrading Docker Desktop to [version 4.27.2](https://docs.docker.com/desktop/release-notes/#4272) works. Other methods are described in https://github.com/docker/for-mac/issues/7006#issuecomment-2122869966.
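As a quick triage step before applying a workaround, the host architecture and installed Docker version can be checked from a terminal (a sketch; these are generic commands, not part of the OpenHouse setup docs):

```shell
# Print the host CPU architecture: "arm64" indicates Apple Silicon,
# where the Rosetta x86/amd64 emulation bug can be triggered.
uname -m

# Print the Docker client/server versions to confirm whether a
# downgrade to 4.27.2 (or a later release with the fix) is needed.
docker version --format '{{.Client.Version}} / {{.Server.Version}}' 2>/dev/null || true
```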
Stacktrace, metrics and logs
```
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007ffffe0b8e1e, pid=692, tid=0x00007fffe86e6700
#
# JRE version: OpenJDK Runtime Environment (8.0_232-b09) (build 1.8.0_232-8u232-b09-1~deb9u1-b09)
# Java VM: OpenJDK 64-Bit Server VM (25.232-b09 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V [libjvm.so+0x628e1e]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /opt/spark/hs_err_pid692.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
```
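Following the JVM's own hint in the log above, core dumps can be re-enabled in the container's shell before relaunching spark-shell, which makes the crash easier to inspect (a sketch; the effective limit may be capped by the container's hard limit):

```shell
# Lift the core-dump size limit for the current shell session only;
# the crash log suggests exactly this before starting Java again.
ulimit -c unlimited

# Verify the new soft limit; this should print "unlimited" (or the
# hard limit configured for the container, if that is lower).
ulimit -c
```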
Code to reproduce bug
```shell
bin/spark-shell --packages org.apache.iceberg:iceberg-spark-runtime-3.1_2.12:1.2.0 \
  --jars openhouse-spark-runtime_2.12-*-all.jar \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,com.linkedin.openhouse.spark.extensions.OpenhouseSparkSessionExtensions \
  --conf spark.sql.catalog.openhouse=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.openhouse.catalog-impl=com.linkedin.openhouse.spark.OpenHouseCatalog \
  --conf spark.sql.catalog.openhouse.metrics-reporter-impl=com.linkedin.openhouse.javaclient.OpenHouseMetricsReporter \
  --conf spark.sql.catalog.openhouse.uri=http://openhouse-tables:8080 \
  --conf spark.sql.catalog.openhouse.auth-token=$(cat /var/config/$(whoami).token) \
  --conf spark.sql.catalog.openhouse.cluster=LocalHadoopCluster
```
What component does this bug affect?
- [ ] **Table Service**: This is the RESTful catalog service that stores table metadata. (`:services:tables`)
- [ ] **Jobs Service**: This is the job orchestrator that submits data services for table maintenance. (`:services:jobs`)
- [ ] **Data Services**: These are the jobs that perform table maintenance. (`apps:spark`)
- [ ] **Iceberg internal catalog**: This is the internal Iceberg catalog for the OpenHouse Catalog Service. (`:iceberg:openhouse`)
- [ ] **Spark Client Integration**: This is the Apache Spark integration for the OpenHouse catalog. (`:integration:spark`)
- [ ] **Documentation**: This is the documentation for OpenHouse. (`docs`)
- [X] **Local Docker**: This is the local Docker environment for OpenHouse. (`infra/recipes/docker-compose`)
- [ ] **Other**: Please specify the component.